Next week (starting on Wednesday, to be more precise), I will begin my class on Perelman’s proof of the Poincaré conjecture. As I only have ten weeks in which to give this proof, I will have to move rapidly through some of the more basic aspects of Riemannian geometry which will be needed throughout the course. In this preliminary lecture, I will quickly review the basic notions of infinitesimal (or microlocal) Riemannian geometry, and in particular define the Riemann, Ricci, and scalar curvatures of a Riemannian manifold. (The more “global” aspects of Riemannian geometry, for instance concerning the relationship between distance, curvature, injectivity radius, and volume, will be discussed later in this course.) This is a review only, and omits any leisurely discussion of examples or motivation for Riemannian geometry; it is impossible to compress this subject into a single lecture, and I will have to refer you to a textbook on the subject for a more complete treatment (I myself am using the text “Riemannian geometry” by my colleague here at UCLA, Peter Petersen).
— Smooth manifolds —
Riemannian geometry takes place on smooth manifolds M of some dimension $d$. Recall that a d-dimensional manifold (or d-manifold for short) M consists of the following structures:
- A topological space M (which for technical reasons we assume to be Hausdorff and second countable);
- An atlas of charts $\phi_\alpha: U_\alpha \to V_\alpha$, which are homeomorphisms from open sets $U_\alpha$ in M to open sets $V_\alpha$ in $\mathbb{R}^d$, such that the $U_\alpha$ cover M.
We say that the manifold M is smooth if the charts define a consistent smooth structure, in the sense that the maps $\phi_\alpha \circ \phi_\beta^{-1}$ are smooth (i.e. infinitely differentiable) on $\phi_\beta(U_\alpha \cap U_\beta)$ for every $\alpha, \beta$. One can then assert that a function $f: M \to X$ from M to another space X with a smooth structure (e.g. $\mathbb{R}$ or $\mathbb{C}$) is smooth if $f \circ \phi_\alpha^{-1}$ is smooth on $V_\alpha$ for every $\alpha$; a smooth map with an inverse which is also smooth is known as a diffeomorphism. The space of all smooth functions $f: M \to \mathbb{R}$ is denoted $C^\infty(M)$; this is a topological algebra over the reals. More generally, we have the algebra $C^\infty(U)$ for any open subset U of M. (It is possible to view smooth manifolds more abstractly (and in a fully coordinate-independent fashion) by using the structure sheaf of algebras $C^\infty(U)$ to define the smooth structure, rather than the atlas of charts, but we will not need to take this perspective here.)
Remark 1. The most intuitive way to view manifolds is from an extrinsic viewpoint: as subsets of some larger-dimensional space (e.g. viewing curves as subsets of the plane, surfaces as subsets of a Euclidean space such as $\mathbb{R}^3$). While every smooth manifold can be viewed this way (thanks to the Whitney embedding theorem), we will in fact not use the extrinsic perspective at all in this course! Instead, we will rely exclusively on the intrinsic perspective – by studying the various structures on a smooth manifold M purely in terms of objects that can be defined in terms of the atlas. In fact, once we set up the most basic such structure – the tangent bundle – we will often not use the atlas directly at all (thus working in a “coordinate-free” fashion). However, the “local coordinates” provided by the charts in an atlas will be useful for computations at various junctures.
Remark 2. It is a surprising and unintuitive fact that a single topological manifold can have two distinct smooth structures which are not diffeomorphic to each other! This is most famously the case for 7-spheres $S^7$, giving rise to exotic spheres. However, in the case of 3-manifolds – which is the focus of this course – all smooth structures are diffeomorphic (a result of Munkres and Whitehead; see also Smale for higher-dimensional variants), and so this subtlety need not concern us. [Aside: I was unable to find the relevant reference of Smale – does anyone know it?]
Remark 3. As $C^\infty(M)$ is commutative, we will multiply by functions in this space on the left or on the right interchangeably. In noncommutative geometry, this algebra is replaced by a noncommutative algebra, and one has to take substantially more care with the order of multiplication, but we will not use noncommutative geometry here.
We will be interested in various vector bundles over a smooth manifold M. A vector bundle V is a collection of (real) vector spaces $V_x$ of a fixed dimension k (the fibres of the bundle) associated to each point $x \in M$, whose disjoint union $V = \biguplus_{x \in M} V_x$ can itself be given the structure of a smooth (d+k-dimensional) manifold, in such a way that for all sufficiently small neighbourhoods U of any given point x, the set $\biguplus_{y \in U} V_y$ has a trivialisation, i.e. there is a diffeomorphism between $\biguplus_{y \in U} V_y$ and $U \times \mathbb{R}^k$, with each fibre $V_y$ being identified in a linearly isomorphic way with the vector space $\{y\} \times \mathbb{R}^k \equiv \mathbb{R}^k$. A (global) section of a vector bundle V is a smooth map $f: M \to V$ such that $f(x) \in V_x$ for every $x \in M$. The space of all sections is denoted $\Gamma(V)$; it is a vector space over $\mathbb{R}$, and furthermore is a module over $C^\infty(M)$. We will sometimes also be interested in local sections $f: U \to V$ on some open subset U of M; the space of such sections (which form a module over $C^\infty(U)$) will be denoted $\Gamma(U \to V)$. All of the discussion below on the global manifold M can be easily adapted to local open sets U in this manifold (indeed, one can interpret U itself as a manifold); as all our computations will be entirely local (and because of the ready availability of smooth cutoff functions), the theory on M and the theory on U will be completely compatible.
Example 1. The space $C^\infty(M)$ can be canonically identified with the space of sections $\Gamma(M \times \mathbb{R})$ of the trivial line bundle $M \times \mathbb{R}$.
In Riemannian geometry, the most fundamental vector bundle over a manifold M is the tangent bundle TM, defined by letting the tangent space $T_x M$ at a point $x \in M$ be the space of all tangent vectors in M at x. A tangent vector can be defined as a vector $v$ which can be expressed as the (formal) derivative $v = \gamma'(0)$ of some smooth curve $\gamma: (-\epsilon,\epsilon) \to M$ which passes through x at time zero, thus $\gamma(0) = x$. One can express these tangent vectors concretely by using any chart that covers x.
To be somewhat informal, given any point $x \in M$ and tangent vector $v \in T_x M$, one can define a trajectory of points $x + tv \in M$ for all “infinitesimal” t, which is only defined up to an error of $O(t^2)$ (as measured, for instance, in some coordinate chart), but whose derivative at $t=0$ is equal to v. Thus, while the global manifold M need not have any reasonable notion of vector addition, we do have this infinitesimal notion of translation by a tangent vector which is well-defined up to second-order errors.
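Given a tangent vector $v \in T_x M$ and a smooth function $f \in C^\infty(M)$, we can define the directional derivative $\nabla_v f(x) \in \mathbb{R}$ by the formula
$\nabla_v f(x) := \frac{d}{dt} f(x+tv)|_{t=0}$ (1)
(or, a bit more formally, $\nabla_v f(x) := \frac{d}{dt} f(\gamma(t))|_{t=0}$ for any curve $\gamma$ with $\gamma(0)=x$ and $\gamma'(0)=v$). This is a linear functional on $C^\infty(M)$ which annihilates constants and obeys the Leibniz rule
$\nabla_v(fg)(x) = (\nabla_v f(x))\, g(x) + f(x)\, \nabla_v g(x)$. (2)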
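Conversely, one can define the tangent space $T_x M$ to be the space of all linear functionals on $C^\infty(M)$ with the above two properties, though we will not need to do so here.
A section $X \in \Gamma(TM)$ of TM is known as a vector field; it assigns a tangent vector $X(x) \in T_x M$ to each point $x \in M$. A vector field X determines a first-order differential operator $\nabla_X: C^\infty(M) \to C^\infty(M)$, defined by setting $\nabla_X f(x) := \nabla_{X(x)} f(x)$. From (2), we see that $\nabla_X$ is a derivation, i.e. it is linear over $\mathbb{R}$ and obeys the Leibniz rule
$\nabla_X(fg) = (\nabla_X f)\, g + f\, (\nabla_X g)$. (3)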
Conversely, one can easily show that every derivation on $C^\infty(M)$ arises uniquely in this manner. This provides a convenient means to define new types of vector fields. For example, if X and Y are two vector fields, one can easily see (from (3)) that the commutator $[\nabla_X, \nabla_Y] := \nabla_X \nabla_Y - \nabla_Y \nabla_X$ is also a derivation, and must thus be given by another vector field [X,Y], thus
$\nabla_{[X,Y]} f = \nabla_X \nabla_Y f - \nabla_Y \nabla_X f$ (4)
for all vector fields X, Y and all scalar fields f.
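Example 2. Suppose we have a local coordinate chart $\phi: U \to V \subset \mathbb{R}^d$. The standard first-order differential operators $\frac{\partial}{\partial x^1}, \ldots, \frac{\partial}{\partial x^d}$ induced by the coordinates on $\mathbb{R}^d$ can be viewed as vector fields, and pulled back via $\phi$ to vector fields $\partial_1, \ldots, \partial_d$ on U. These in fact form a frame for U since they span the tangent space at every point. Since $\frac{\partial}{\partial x^a}$ and $\frac{\partial}{\partial x^b}$ commute in $\mathbb{R}^d$, we see that $[\partial_a, \partial_b] = 0$.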
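Exercise 1. Show that the map $(X,Y) \mapsto [X,Y]$ endows the space $\Gamma(TM)$ of vector fields with the structure of an abstract Lie algebra. Also establish the Leibniz rule
$[X, fY] = (\nabla_X f)\, Y + f\, [X,Y]$ (5)
for all $X, Y \in \Gamma(TM)$ and $f \in C^\infty(M)$.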
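As a quick illustration of definition (4) (a sketch only; the particular fields are chosen here for convenience and are not from the lecture), take $M = \mathbb{R}^2$ with coordinates $(x^1, x^2)$ and the vector fields $X = \partial_1$ and $Y = x^1 \partial_2$. The second-order terms cancel and the commutator is again a first-order operator:

```latex
\begin{aligned}
\nabla_X \nabla_Y f &= \partial_1 ( x^1 \partial_2 f ) = \partial_2 f + x^1 \partial_1 \partial_2 f, \\
\nabla_Y \nabla_X f &= x^1 \partial_2 \partial_1 f, \\
\nabla_{[X,Y]} f &= \nabla_X \nabla_Y f - \nabla_Y \nabla_X f = \partial_2 f,
\qquad \text{so } [X,Y] = \partial_2 .
\end{aligned}
```

This is also consistent with the Leibniz rule in Exercise 1: with $f = x^1$ one has $[\partial_1, x^1 \partial_2] = (\partial_1 x^1)\, \partial_2 + x^1 [\partial_1, \partial_2] = \partial_2$.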
Various operations on finite-dimensional vector spaces generalise easily to vector bundles. For instance, every finite-dimensional vector space V has a dual $V^*$, and similarly every vector bundle V also has a dual bundle $V^*$, whose fibres $(V^*)_x = (V_x)^*$ are the dual to the fibres of V; one can also view $\Gamma(V^*)$ as the space of $C^\infty(M)$-linear functionals from $\Gamma(V)$ to $C^\infty(M)$. Similarly, given two vector bundles V, W over M, one can define the direct sum $V \oplus W$, the tensor product $V \otimes W$, the space $\mathrm{Hom}(V,W) \equiv V^* \otimes W$ of fibre-wise linear transformations from V to W, the symmetric powers $\mathrm{Sym}^k(V)$ and exterior powers $\bigwedge^k V$, and so forth. The construction of all of these concepts is straightforward but rather tedious, and will be omitted here.
Applying these constructions to the tangent bundle TM, one gets a variety of useful bundles for doing Riemannian geometry:
- The bundle $T^* M := (TM)^*$ is the cotangent bundle; elements of $T^*_x M$ are cotangent vectors.
- Sections of $\bigwedge^k T^* M$ are known as k-forms.
- Sections of $(TM)^{\otimes k} \otimes (T^* M)^{\otimes l}$ are known as rank (k,l) tensor fields, and individual elements of this bundle are rank (k,l) tensors. Many tensors of interest obey various symmetry or antisymmetry properties, for instance k-forms are totally anti-symmetric rank (0,k) tensors. (To fully enumerate the various symmetry properties available to tensors is essentially equivalent to the finite-dimensional representation theory of the permutation group, which is a beautiful and important subject but will not be discussed here.)
It is convenient to use abstract index notation, denoting rank (k,l) tensor fields using k superscripted Greek indices and l subscripted Greek indices, thus for instance $R_{\alpha\beta\gamma}^\delta$ denotes a rank (1,3) tensor. One should think of these indices as placeholders; if one chooses a frame $(e_a)_{a \in A}$ for the tangent bundle (i.e. a collection of vector fields which form a basis for the tangent space at every point), which induces the associated dual frame $(e^a)_{a \in A}$ for the cotangent bundle, then this notation can be viewed as describing the coefficients of the tensor in terms of the basis generated by such frames, thus for instance
$R = \sum_{a,b,c,d \in A} R_{abc}^d\, e_d \otimes e^a \otimes e^b \otimes e^c$. (6)
But it is perhaps better to view a tensor such as $R_{\alpha\beta\gamma}^\delta$ as existing independently of any choice of frame, in which case the labels $\alpha, \beta, \gamma, \delta$ are abstract placeholders.
Example 3. We continue Example 2. A local coordinate chart $\phi: U \to \mathbb{R}^d$ generates a (local) frame $(\partial_a)_{a=1}^d$ with an associated dual frame $(dx^a)_{a=1}^d$. These frames can be slightly easier to work with for computations than general frames, because we automatically have $[\partial_a, \partial_b] = 0$ as already noted in Example 2. On the other hand, it is often convenient to work in frames that don’t come from coordinate charts in order to obtain other good properties; in particular, it is very convenient to work in orthonormal frames, which are usually unavailable if one restricts attention to frames arising from coordinate charts.
We use the usual (and very handy) Einstein summation convention: repeated indices (with each repeated index appearing exactly once as a superscript and once as a subscript) are implicitly summed over a choice of frame (the exact choice is not important). For instance, the rank (0,4) tensor $g_{\delta\sigma} R_{\alpha\beta\gamma}^\sigma$ is defined to be the tensor whose coefficients are given by the formula
$g_{ds} R_{abc}^s := \sum_{s \in A} g_{ds} R_{abc}^s$ (7)
for any choice of frame (one can easily verify that this definition is independent of the choice of frame). We will also apply this summation convention when the Greek labels are replaced with concrete counterparts arising from a frame, thus for instance we can now abbreviate (6) as
$R = R_{abc}^d\, e_d \otimes e^a \otimes e^b \otimes e^c$.
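The frame-independence of a contraction can be checked by hand in the simplest case, the trace $T^\alpha_\alpha$ of a rank (1,1) tensor. The following Python sketch does this with exact rational arithmetic in dimension 2. It assumes the usual transformation rule that a change of frame $e'_a = P^b_a e_b$ turns the coefficient matrix $T$ into $P^{-1} T P$; the helper names and the sample matrices are made up for illustration.

```python
from fractions import Fraction


def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inverse_2x2(p):
    """Inverse of an invertible 2x2 matrix."""
    det = p[0][0] * p[1][1] - p[0][1] * p[1][0]
    return [[p[1][1] / det, -p[0][1] / det],
            [-p[1][0] / det, p[0][0] / det]]


def contract(t):
    """Einstein contraction T^a_a: sum over the repeated index a."""
    return sum(t[a][a] for a in range(len(t)))


# Coefficients T^a_b of a rank (1,1) tensor in some frame (sample values).
T = [[Fraction(2), Fraction(-1)],
     [Fraction(5), Fraction(3)]]

# Change-of-frame matrix P (sample values; any invertible matrix would do).
P = [[Fraction(1), Fraction(2)],
     [Fraction(3), Fraction(7)]]

# Coefficients of the same tensor in the new frame.
T_new = matmul(matmul(inverse_2x2(P), T), P)

print(contract(T), contract(T_new))
```

The individual coefficients change under the change of frame, but the contracted quantity does not; the same mechanism (each factor of $P$ from a lower index cancelling a factor of $P^{-1}$ from an upper index) is behind the frame-independence of (7).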
— Connections —
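We have seen that vector fields $X \in \Gamma(TM)$ allow us to differentiate scalar functions $f \in C^\infty(M)$ to obtain a differentiated function $\nabla_X f \in C^\infty(M)$. Furthermore, this concept obeys the Leibniz rule (3), and is linear over $C^\infty(M)$ in X, or in other words
$\nabla_{fX + gY} h = f\, \nabla_X h + g\, \nabla_Y h$ (8)
for all $f, g, h \in C^\infty(M)$ and $X, Y \in \Gamma(TM)$. As a consequence, one can interpret $\nabla f: X \mapsto \nabla_X f$ as a $C^\infty(M)$-linear functional on $\Gamma(TM)$, which is identified with a section of the cotangent bundle, thus $\nabla f \in \Gamma(T^* M)$.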
Now suppose one wants to define a derivative $\nabla_X f$, where $f \in \Gamma(V)$ is now a section of a bundle V. It turns out that there is now more than one good notion of differentiation. Each such notion can be formalised by the concept of a (linear) connection:
Definition 1. A connection $\nabla$ on a bundle V is an assignment of a section $\nabla_X f \in \Gamma(V)$ (the covariant derivative of f in the direction X via the connection $\nabla$) to each vector field $X \in \Gamma(TM)$ and section $f \in \Gamma(V)$, in such a way that $(X, f) \mapsto \nabla_X f$ is bilinear in f and X, that the Leibniz rule (3) is obeyed for $f \in C^\infty(M)$ and $g \in \Gamma(V)$ (or vice versa), and the linearity rule (8) is obeyed for all $f, g \in C^\infty(M)$ and $h \in \Gamma(V)$.
If $f \in \Gamma(V)$ is such that $\nabla_X f = 0$ for all vector fields X, we say that f is parallel to the connection $\nabla$.
A connection on the tangent bundle TM is known as an affine connection.
Remark 4. Informally, a connection assigns an infinitesimal linear isomorphism $\Gamma_{x \to x+tv}: V_x \to V_{x+tv}$ (the parallel transport map) to each infinitesimal tangent vector $tv \in T_x M$, in a manner which is linear in v for fixed x. The connection between this informal definition and the above formal one is given by the formula $\nabla_v f(x) = \frac{d}{dt} \Gamma_{x \to x+tv}^{-1} f(x+tv)|_{t=0}$. One can make this informal definition more precise (e.g. using non-standard analysis) but we will not do so here. An alternate definition of a connection is as a complementary subbundle to the vertical bundle in TV, known as a horizontal bundle, obeying some additional linearity conditions in the vertical variable.
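Once one has a connection $\nabla$ on a bundle V, one automatically can define a connection on the dual bundle $V^*$ and more generally on tensor powers $V^{\otimes k} \otimes (V^*)^{\otimes l}$, by enforcing all possible instances of the Leibniz rule, e.g.
$\nabla_X( f^{\alpha\beta}_\gamma g^\gamma_\delta ) = (\nabla_X f^{\alpha\beta}_\gamma)\, g^\gamma_\delta + f^{\alpha\beta}_\gamma\, (\nabla_X g^\gamma_\delta)$ (9)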
for all rank (2,1) tensors f and rank (1,1) tensors g. (It is a straightforward but tedious task to verify that all the Leibniz rules are consistent with each other, and that (9) and its relatives uniquely define a connection on every tensor power of V.) In particular, any connection on the tangent bundle (which is the case of importance in Riemannian geometry) naturally induces a connection on the cotangent bundle and the bundle of rank (k,l) tensors.
Here it is important to note that the indices are abstract, rather than corresponding to some frame: for instance, if $\nabla$ is a connection on the tangent bundle TM, then after choosing a frame $(e_a)_{a \in A}$, it is usually not the case that the $e_a$ coefficient $(\nabla_X f)^a$ of the vector field $\nabla_X f$ is equal to the derivative $\nabla_X (f^a)$ of that component of f. Instead, writing $\nabla_b := \nabla_{e_b}$, one has a relationship of the form
$(\nabla_b f)^a = \nabla_b (f^a) + \Gamma^a_{bc} f^c$ (10)
where for each $a, b, c \in A$, the Christoffel symbol $\Gamma^a_{bc}$ of the connection relative to the frame is a smooth function on M. It is important to note that Christoffel symbols are not tensors, because the expression $\Gamma^a_{bc}$ turns out to depend on the choice of frame.
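Using the Leibniz rule repeatedly, it is not hard to use (10) to give a formula for the components of derivatives of other tensors, e.g.
$(\nabla_b \omega)_a = \nabla_b (\omega_a) - \Gamma^c_{ba} \omega_c$ (11)
for any 1-form $\omega_\alpha$,
$(\nabla_c g)_{ab} = \nabla_c (g_{ab}) - \Gamma^d_{ca} g_{db} - \Gamma^d_{cb} g_{ad}$ (12)
for any rank (0,2) tensor g, and so forth.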
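To see concretely that Christoffel symbols depend on the frame, here is a small worked example (my own illustration, under the index convention $\nabla_{e_b} e_c = \Gamma^a_{bc} e_a$, which is an assumption about the ordering of the lower indices). Take the flat connection on $\mathbb{R}^2$, i.e. ordinary componentwise differentiation in Cartesian coordinates. In the Cartesian frame all Christoffel symbols vanish. In the polar coordinate frame $\partial_r = (\cos\theta, \sin\theta)$, $\partial_\theta = (-r\sin\theta, r\cos\theta)$ they do not:

```latex
\begin{aligned}
\nabla_{\partial_r} \partial_r &= \tfrac{\partial}{\partial r}(\cos\theta, \sin\theta) = 0, \\
\nabla_{\partial_r} \partial_\theta &= \tfrac{\partial}{\partial r}(-r\sin\theta, r\cos\theta)
   = (-\sin\theta, \cos\theta) = \tfrac{1}{r}\,\partial_\theta, \\
\nabla_{\partial_\theta} \partial_r &= \tfrac{\partial}{\partial \theta}(\cos\theta, \sin\theta)
   = (-\sin\theta, \cos\theta) = \tfrac{1}{r}\,\partial_\theta, \\
\nabla_{\partial_\theta} \partial_\theta &= \tfrac{\partial}{\partial \theta}(-r\sin\theta, r\cos\theta)
   = (-r\cos\theta, -r\sin\theta) = -r\,\partial_r .
\end{aligned}
```

Thus $\Gamma^\theta_{r\theta} = \Gamma^\theta_{\theta r} = 1/r$ and $\Gamma^r_{\theta\theta} = -r$, with all other symbols zero. The connection is the same in both descriptions; only the frame has changed, which is exactly why the $\Gamma^a_{bc}$ cannot be the coefficients of a tensor.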
We have remarked that Christoffel symbols are not tensors. On the other hand, because $\nabla_X f$ is linear in X, we can legitimately define a tensor field $\nabla f = \nabla_\alpha f$, which is a section of $V \otimes T^* M$, thus $\nabla_X f = X^\alpha \nabla_\alpha f$. It is also possible to express the difference of two connections as a tensor:
Exercise 2. Let $\nabla, \nabla'$ be two connections on TM. Show that there exists a unique rank (1,2) tensor $\Gamma^\alpha_{\beta\gamma}$ such that
$\nabla'_\beta X^\alpha - \nabla_\beta X^\alpha = \Gamma^\alpha_{\beta\gamma} X^\gamma$ (13)
for all vector fields $X^\alpha$. Now interpret the Christoffel symbol of a connection on TM relative to a frame as the difference of that connection with the flat connection induced by the trivialisation of the tangent bundle induced by that frame.
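Let $\nabla$ be a connection on TM. We say that this connection is torsion-free if we have the pleasant identity
$\nabla_\alpha \nabla_\beta f = \nabla_\beta \nabla_\alpha f$ (14)
for all scalar fields $f \in C^\infty(M)$.
Exercise 3. Show that $\nabla$ is torsion-free if and only if
$[X,Y]^\alpha = X^\beta \nabla_\beta Y^\alpha - Y^\beta \nabla_\beta X^\alpha$ (15)
for all vector fields X, Y (or in coordinate-free notation, $[X,Y] = \nabla_X Y - \nabla_Y X$).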
Remark 5. Roughly speaking, the torsion-free connections are those which have a good notion of an infinitesimal parallelogram with corners $x, x+tv, x+tw, x+tv+tw$ for some infinitesimal t, such that each edge is the parallel transport of the opposing edge to error $o(t^2)$. (Without the torsion-free hypothesis, the error is merely $O(t^2)$.)
It would be nice if (14) extended to tensor fields f. This is true for flat connections, but false in general. The defect in (14) for such fields is measured by the curvature tensor $R$ of the connection $\nabla$, defined by the formula
$R(X,Y) Z := \nabla_X \nabla_Y Z - \nabla_Y \nabla_X Z - \nabla_{[X,Y]} Z$ (16)
for all vector fields X,Y,Z (cf. (4)). One easily sees that R is indeed a section of $\mathrm{Hom}(\bigwedge^2 TM, \mathrm{Hom}(TM, TM))$ and can thus be viewed as a rank (1,3) tensor.
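Exercise 4. If $\nabla$ is a torsion-free connection on TM, and $R_{\alpha\beta\gamma}^\delta$ is the tensor form of the curvature R, defined by requiring that
$(R(X,Y) Z)^\delta = R_{\alpha\beta\gamma}^\delta X^\alpha Y^\beta Z^\gamma$, (17)
then show that
$\nabla_\alpha \nabla_\beta Z^\delta - \nabla_\beta \nabla_\alpha Z^\delta = R_{\alpha\beta\gamma}^\delta Z^\gamma$ (18)
for all vector fields $Z^\gamma$. What is the analogue of (18) if $Z^\delta$ is replaced by a rank (k,l) tensor?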
Connections describe a way to transport tensors as one moves from point to point in the manifold. There is another way to transport tensors, which is induced by diffeomorphisms $\phi: M \to M$ of the base manifold; this transportation procedure maps points $x \in M$ to points $\phi(x) \in M$, maps tangent vectors $v \in T_x M$ to tangent vectors $\phi_* v \in T_{\phi(x)} M$ (defined by requiring that the chain rule $(\phi \circ \gamma)'(0) = \phi_*(\gamma'(0))$ hold for all curves $\gamma$) and then maps other tensors in the unique manner consistent with the tensor operations (e.g. $\phi_*(v \otimes w) = \phi_* v \otimes \phi_* w$). This procedure is important for describing symmetries of tensor fields (consider, for instance, what it means for a vector field in $\mathbb{R}^3$ to be invariant under rotations around the origin). To relate this diffeomorphism transport to infinitesimal differential geometry, though, we have to look at an infinitesimal diffeomorphism $\frac{d}{dt} \phi_t|_{t=0}$, which we can view as the derivative of a smoothly varying family $\phi_t: M \to M$ of diffeomorphisms, with $\phi_0$ equal to the identity. By chasing all the definitions we see that $\frac{d}{dt} \phi_t|_{t=0}$ is just a vector field X. The infinitesimal rate of change $\frac{d}{dt} \phi_t^* v|_{t=0}$ of a tensor field v under this diffeomorphism (where $\phi_t^* := (\phi_t^{-1})_*$ denotes the pullback) is known as the Lie derivative $\mathcal{L}_X v$ of v with respect to the vector field X (it does not depend on any aspect of $\phi_t$ other than its infinitesimal vector field). On scalars f, it agrees with directional derivative
$\mathcal{L}_X f = \nabla_X f$ (19)
while on vector fields Y, it agrees with the commutator:
$\mathcal{L}_X Y = [X,Y]$ (20)
and its action on all other tensors can be given by the Leibniz rule (as is the case for connections). It should be emphasised, though, that the Lie derivative is not a connection, because it is not linear (over $C^\infty(M)$) in X; in general, $\mathcal{L}_{fX} v \neq f \mathcal{L}_X v$.
— Riemannian manifolds and curvature tensors —
We now specialise our attention from smooth manifolds to our main topic of interest, namely Riemannian manifolds. Informally, a Riemannian manifold is a manifold equipped with notions of length, angle, area, etc. which are infinitesimally isomorphic at every point to the corresponding notions in Euclidean space. In Euclidean space, all these geometric notions can be defined in terms of a positive definite inner product, and Riemannian manifolds are similarly founded on a positive definite Riemannian metric.
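Definition 2. A Riemannian manifold (M,g) is a smooth manifold M, together with a Riemannian metric $g = g_{\alpha\beta}$ on M, i.e. a section of $\mathrm{Sym}^2(T^* M)$ which is positive definite in the sense that $g(x): T_x M \times T_x M \to \mathbb{R}$ is a positive-definite inner product on $T_x M$ for every point x.
We now use the metric g to build several other tensors of interest. Firstly, we have the inverse metric $g^{-1} = g^{\alpha\beta}$, which is the unique rank (2,0) tensor that inverts the (0,2) tensor g in the sense that $g^{\alpha\beta} g_{\beta\gamma} = \delta^\alpha_\gamma$ is the identity section of $\mathrm{Hom}(TM, TM)$; this tensor is also symmetric and positive-definite. One can use these tensors to raise and lower the indices of other tensors; for instance, given a rank (0,2) tensor $\pi_{\alpha\beta}$, one can define the rank (1,1) tensors $\pi^\alpha_\beta := g^{\alpha\gamma} \pi_{\gamma\beta}$ and $\pi_\alpha^\beta := g^{\beta\gamma} \pi_{\alpha\gamma}$ and the rank (2,0) tensor $\pi^{\alpha\beta} := g^{\alpha\gamma} g^{\beta\delta} \pi_{\gamma\delta}$. We will generally only use these conventions when there is enough symmetry that there is no danger of ambiguity.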
Remark 6. All Riemannian manifolds can be viewed extrinsically (locally, at least) as subsets of a Euclidean space, thanks to the famous Nash embedding theorem. But we will not need this extrinsic viewpoint in this course.
After the metric, the next fundamental object in Riemannian geometry is the Levi-Civita connection.
Fundamental theorem of Riemannian geometry. Let (M,g) be a Riemannian manifold. Then there exists a unique affine connection $\nabla$ (which is known as the Levi-Civita connection) which is torsion-free and respects the metric g in the sense that $\nabla g = 0$.
Exercise 5. Prove this theorem. (Hint: one can either (a) use abstract index notation and study expressions such as $\nabla_\alpha g_{\beta\gamma}$, (b) use coordinate-free notation and study expressions such as $\nabla_X (g(Y,Z))$, or (c) use local coordinates (e.g. use a frame arising from a chart as in Example 2) and work with the Christoffel symbols $\Gamma^a_{bc}$. It is instructive to do this exercise in all three possible ways in order to appreciate the equivalence (and relative advantages and disadvantages) between these three perspectives.)
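Route (c) of the hint leads to the standard coordinate formula $\Gamma^a_{bc} = \frac{1}{2} g^{ad} ( \partial_b g_{dc} + \partial_c g_{bd} - \partial_d g_{bc} )$ for the Christoffel symbols of the Levi-Civita connection in a coordinate frame. As a rough numerical sanity check (a sketch, not part of the lecture), the following Python snippet evaluates this formula by central differences for the round metric $d\theta^2 + \sin^2\theta\, d\phi^2$ on the unit 2-sphere. The function names, the choice of example, the sample point and the step size h are all illustrative assumptions.

```python
import math


def metric(x):
    """Round metric on the unit 2-sphere in coordinates x = (theta, phi)."""
    theta, _phi = x
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]


def inverse_2x2(m):
    """Inverse of an invertible 2x2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


def metric_derivative(g, x, c, h=1e-5):
    """Central-difference approximation to d g_{ab} / d x^c at the point x."""
    xp, xm = list(x), list(x)
    xp[c] += h
    xm[c] -= h
    gp, gm = g(xp), g(xm)
    n = len(x)
    return [[(gp[a][b] - gm[a][b]) / (2 * h) for b in range(n)]
            for a in range(n)]


def christoffel(g, x):
    """Return G with G[a][b][c] approximating Gamma^a_{bc} at x (2D only).

    Uses Gamma^a_{bc} = 1/2 g^{ad} (d_b g_{dc} + d_c g_{bd} - d_d g_{bc}).
    """
    n = len(x)
    ginv = inverse_2x2(g(x))
    dg = [metric_derivative(g, x, c) for c in range(n)]  # dg[c][a][b] = d_c g_{ab}
    return [[[0.5 * sum(ginv[a][d] * (dg[b][d][c] + dg[c][b][d] - dg[d][b][c])
                        for d in range(n))
              for c in range(n)]
             for b in range(n)]
            for a in range(n)]


theta0 = 1.0
G = christoffel(metric, [theta0, 0.3])
# Index 0 is theta, index 1 is phi.
print("Gamma^theta_{phi phi} ~", G[0][1][1])
print("Gamma^phi_{theta phi} ~", G[1][0][1])
```

For this metric the exact values are $\Gamma^\theta_{\phi\phi} = -\sin\theta\cos\theta$ and $\Gamma^\phi_{\theta\phi} = \Gamma^\phi_{\phi\theta} = \cot\theta$, with all other symbols zero, so the printed numbers should match these up to the finite-difference error. The symmetry in the two lower indices reflects the torsion-free condition in a coordinate frame.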
Geometrically, the condition $\nabla g = 0$ asserts that parallel transport by the Levi-Civita connection is an isometry. At a computational level, it means (in conjunction with the Leibniz rule) that covariant differentiation using the Levi-Civita connection commutes with the raising and lowering operations, for instance given a vector field $X^\alpha$ we have
$\nabla_\alpha X_\beta = \nabla_\alpha ( g_{\beta\gamma} X^\gamma ) = g_{\beta\gamma} \nabla_\alpha X^\gamma$ (21)
and so we may safely use raising and lowering operations in the presence of Levi-Civita covariant derivatives without much risk of serious error. We can also raise and lower the covariant derivative itself, defining
$\nabla^\alpha := g^{\alpha\beta} \nabla_\beta$. (22)
This leads to the covariant Laplacian (or Bochner Laplacian)
$\Delta := \nabla^\alpha \nabla_\alpha = g^{\alpha\beta} \nabla_\alpha \nabla_\beta$ (23)
defined on all tensor fields (for instance, when applied to scalar fields it becomes the trace of the Hessian, and is known as the Laplace-Beltrami operator). When applied to non-scalar fields, the covariant Laplacian differs slightly from the Hodge Laplacian (or Laplace-de Rham operator) by a lower order term which is given by the Weitzenböck identity.
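As discussed earlier, all connections on TM have a curvature tensor in $\Gamma(\mathrm{Hom}(\bigwedge^2 TM, \mathrm{Hom}(TM, TM)))$. The curvature of the Levi-Civita connection is known as the Riemann curvature tensor $\mathrm{Riem}_{\alpha\beta\delta}^\gamma$, thus
$\nabla_\alpha \nabla_\beta X^\gamma - \nabla_\beta \nabla_\alpha X^\gamma = \mathrm{Riem}_{\alpha\beta\delta}^\gamma X^\delta$. (24)
One also writes $\mathrm{Riem}$ in co-ordinate free notation by defining $\mathrm{Riem}(X,Y) Z$ for vector fields X,Y,Z by the formula
$\mathrm{Riem}(X,Y) Z := \nabla_X \nabla_Y Z - \nabla_Y \nabla_X Z - \nabla_{[X,Y]} Z$
or equivalently as $\mathrm{Riem}(X,Y) = [\nabla_X, \nabla_Y] - \nabla_{[X,Y]}$.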
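Because $\nabla$ respects g, one eventually deduces from (24) and the Leibniz rule that $\mathrm{Riem}_{\alpha\beta\delta}^\gamma$ is skew-adjoint in the $\gamma, \delta$ indices:
$g_{\gamma\sigma} \mathrm{Riem}_{\alpha\beta\delta}^\sigma = - g_{\delta\sigma} \mathrm{Riem}_{\alpha\beta\gamma}^\sigma$. (25)
It is also clearly skew-symmetric in the $\alpha, \beta$ indices. Also, from the analogue of (24) for 1-forms, i.e.
$\nabla_\alpha \nabla_\beta \omega_\gamma - \nabla_\beta \nabla_\alpha \omega_\gamma = - \mathrm{Riem}_{\alpha\beta\gamma}^\delta \omega_\delta$ (26)
and the torsion-free nature of the connection, we have
$\nabla_\alpha \nabla_\beta \nabla_\gamma f - \nabla_\beta \nabla_\alpha \nabla_\gamma f = - \mathrm{Riem}_{\alpha\beta\gamma}^\delta \nabla_\delta f$ (27)
for all scalar fields f. Cyclically summing this in $\alpha, \beta, \gamma$ we obtain the first Bianchi identity
$\mathrm{Riem}_{\alpha\beta\gamma}^\delta + \mathrm{Riem}_{\beta\gamma\alpha}^\delta + \mathrm{Riem}_{\gamma\alpha\beta}^\delta = 0$. (28)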
Exercise 6. Show that the above three symmetries of $\mathrm{Riem}$ imply that $\mathrm{Riem}$ is a self-adjoint section of $\mathrm{Hom}(\bigwedge^2 TM, \bigwedge^2 TM)$, and that these conditions are in fact equivalent in three and fewer dimensions. (The claim fails in four and higher dimensions; see comments.)
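Exercise 7. By differentiating (24) and cyclically summing, establish the second Bianchi identity
$\nabla_\alpha \mathrm{Riem}_{\beta\gamma\mu}^\delta + \nabla_\beta \mathrm{Riem}_{\gamma\alpha\mu}^\delta + \nabla_\gamma \mathrm{Riem}_{\alpha\beta\mu}^\delta = 0$. (29)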
Exercise 8. Show that a Riemannian manifold (M,g) is locally isomorphic (as Riemannian manifolds) to Euclidean space if and only if the Riemann curvature tensor vanishes. (Hint: one direction is easy. For the other direction, the quickest way is to apply the Frobenius theorem to obtain a local trivialisation of the tangent bundle which is flat with respect to the Levi-Civita connection.) This illustrates the point that the Riemann curvature captures all the local obstructions that prevent a Riemannian manifold from being flat. (Compare this situation with the superficially similar subject of symplectic geometry, in which Darboux’s theorem guarantees that there are no local obstructions whatsoever to a symplectic manifold being flat.)
The Riemann curvature measures the “infinitesimal monodromy” of parallel transport. For our applications we will need to study a slightly different curvature, the Ricci curvature $\mathrm{Ric}_{\alpha\beta}$, which measures how much the volume-radius relationship on infinitesimal sectors has been distorted from the Euclidean one. (This will not be obvious presently, as we have not yet defined the volume measure on a Riemannian manifold.) It is defined as the trace of the Riemann tensor, or more precisely as
$\mathrm{Ric}_{\alpha\beta} := \mathrm{Riem}_{\gamma\alpha\beta}^\gamma$. (30)
(One could also contract other indices than these, but due to the various symmetry properties of the Riemann tensor, one ends up with essentially the same tensor as a consequence.) We also write $\mathrm{Ric}(X,Y)$ for $\mathrm{Ric}_{\alpha\beta} X^\alpha Y^\beta$ when X, Y are vector fields. The symmetries of $\mathrm{Riem}$ easily imply that $\mathrm{Ric}$ is a symmetric rank (0,2) tensor – just like the metric g! This observation will of course be vital for defining Ricci flow later. (This observation, as well as a similar observation for the stress-energy tensor, was also decisive in leading Einstein to the equations of general relativity, but that’s a whole other story.)
We can take the trace of the Ricci tensor to form the scalar curvature
$R := g^{\alpha\beta} \mathrm{Ric}_{\alpha\beta}$; (31)
up to normalisations, R can also be viewed as the trace of the Riemann tensor (viewed as a section of $\mathrm{Hom}(\bigwedge^2 TM, \bigwedge^2 TM)$). The scalar curvature measures how the relationship of volume of infinitesimal balls to their radius is distorted by the geometry.
The relationship between the Riemann, Ricci, and scalar curvatures depends on the dimension:
- In one dimension, all three curvatures vanish; there are no degrees of freedom.
- In two dimensions, the Riemann and Ricci curvatures are just multiples of the scalar curvature (by some tensor depending algebraically on the metric); there is only one degree of freedom.
- In three dimensions, the Riemann tensor is determined linearly by the Ricci curvature (see also Exercise 9 below). On the other hand, the scalar curvature does not control Ricci (or Riemann); the Ricci tensor contains an additional trace-free component. (However, once we start evolving by Ricci flow, we shall see that the Hamilton-Ivey pinching phenomenon will allow us to use the scalar curvature to mostly control Ricci and hence Riemann near singularities.)
- In four and higher dimensions, the Riemann tensor is not fully controlled by the Ricci curvature; there is an additional component to the Riemann tensor, namely the Weyl tensor. Similarly, the Ricci curvature is not fully controlled by the scalar curvature.
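Exercise 9. (Ricci controls Riemann in three dimensions) In three dimensions, suppose that the (necessarily real) eigenvalues of the Riemann curvature $\mathrm{Riem}(x)$ at a point x (viewed as an element of $\mathrm{Hom}(\bigwedge^2 T_x M, \bigwedge^2 T_x M)$) are $\lambda, \mu, \nu$. Show that the eigenvalues of the Ricci curvature $\mathrm{Ric}(x)$ at x (viewed as an element of $\mathrm{Hom}(T_x M, T_x M)$) are $\lambda+\mu, \mu+\nu, \nu+\lambda$. Conclude in particular that
$\| \mathrm{Riem}(x) \| = O( \| \mathrm{Ric}(x) \| )$ (32)
where we endow the (fibres of the) spaces $\mathrm{Hom}(\bigwedge^2 TM, \bigwedge^2 TM)$ and $\mathrm{Hom}(TM, TM)$ with the Hilbert (or Hilbert-Schmidt) structure induced by the metric g.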
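The final step of Exercise 9 is pure linear algebra, which can be sketched as follows, taking for granted the claim that the Ricci eigenvalues are the pairwise sums of the Riemann eigenvalues (the names $a, b, c$ are introduced here only for illustration). Writing $a := \lambda+\mu$, $b := \mu+\nu$, $c := \nu+\lambda$, the linear system can be inverted explicitly:

```latex
\begin{aligned}
\lambda &= \tfrac{1}{2}(a - b + c), \qquad
\mu = \tfrac{1}{2}(a + b - c), \qquad
\nu = \tfrac{1}{2}(-a + b + c), \\
\max(|\lambda|, |\mu|, |\nu|) &\le \tfrac{3}{2} \max(|a|, |b|, |c|).
\end{aligned}
```

Since the Hilbert-Schmidt norm of a self-adjoint operator is the $\ell^2$ norm of its eigenvalues, this gives a bound of the Riemann norm by a constant multiple of the Ricci norm. The inversion is special to three dimensions, where the map from three eigenvalues to their three pairwise sums is an invertible $3 \times 3$ linear map.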
Remark 7. The fact that Ricci controls Riemann in three dimensions, without itself degenerating into scalar curvature or zero, seems to explain why Ricci flow is especially powerful in three dimensions; it is still useful, but harder to work with, in two dimensions, useless in one dimension, and too weak to fully control the geometry in four and higher dimensions. It seems to me that the special nature of three dimensions stems from the fact that it is the unique number of dimensions in which 2-forms (which are naturally associated with curvature) are Hodge dual to vector fields (as opposed to scalars, or to higher-rank tensors); this is the same special feature of three dimensions which gives us the cross product (as opposed to the more general wedge product).
Because of the variety of curvatures, there are various notions of what it means for a manifold to have “non-negative curvature” at some point.
Definition 3. Let x be a point on a Riemannian manifold (M,g). We say that x has
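- non-negative scalar curvature if $R(x) \geq 0$;
- non-negative Ricci curvature if $\mathrm{Ric}(x) \geq 0$ as a quadratic form on $T_x M$, i.e. $\mathrm{Ric}(X,X) \geq 0$ for all vectors $X \in T_x M$;
- non-negative sectional curvature if $g( \mathrm{Riem}(X,Y) Y, X ) \geq 0$ for all vectors $X, Y \in T_x M$;
- non-negative Riemann curvature if $\mathrm{Riem}(x) \geq 0$ as a quadratic form on $\bigwedge^2 T_x M$, thus $\langle \mathrm{Riem}(\omega), \omega \rangle \geq 0$ for all two-forms $\omega$.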
It is not hard to show that, in arbitrary dimension, non-negative Riemann curvature implies non-negative sectional curvature, which implies non-negative Ricci curvature, which in turn implies non-negative scalar curvature. In one dimension, these conditions are vacuously true; in two dimensions, these conditions are all equivalent; and in three dimensions, non-negative Riemann curvature is equivalent to non-negative sectional curvature (because every 2-form is the wedge product of two one-forms in this case) but these conditions are otherwise distinct. In four and higher dimensions all of these conditions are distinct. One can also define the analogous notions of positive curvature (or negative curvature, or non-positive curvature) in the usual manner.
[Geometrically, positive scalar curvature means that infinitesimal balls have slightly less volume than in the Euclidean case; positive Ricci curvature means that infinitesimal sectors have slightly less volume than in the Euclidean case; and positive sectional curvature means that all infinitesimally geodesic two-dimensional surfaces have positive Gaussian curvature. I don’t know of a geometrically simple way to describe positive Riemann curvature.]
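A couple of lectures from now, we shall compute these curvatures explicitly in a number of model cases (such as that of a homogeneous space). For now, we give a “cartoon” or “schematic” description of these curvatures when viewed in some local coordinate system $x^1, \ldots, x^d$, using the associated frame $\partial_a$ as in Example 2 to express all tensors as arrays of numbers. Writing $g$ for a generic coefficient $g_{ab}$ of the metric, $g^{-1}$ for a generic coefficient $g^{ab}$ of the inverse metric, and $\partial$ for a generic coordinate derivative $\partial_a$, we thus schematically have the following relationships:
- The Christoffel symbols $\Gamma^a_{bc}$ are schematically of the form $g^{-1} \partial g$. Thus a covariant derivative $\nabla w$ of a tensor w looks schematically like $\partial w + g^{-1} (\partial g) w$, and the Laplacian $\Delta w$ looks like $g^{-1} \partial^2 w + g^{-2} (\partial g) \partial w + g^{-2} (\partial^2 g) w + g^{-3} (\partial g)^2 w$.
- The Riemann curvature tensor $\mathrm{Riem}_{abc}^d$ and the Ricci curvature tensor $\mathrm{Ric}_{ab}$ schematically take the form $g^{-1} \partial^2 g + g^{-2} (\partial g)^2$.
- The scalar curvature R schematically takes the form $g^{-2} \partial^2 g + g^{-3} (\partial g)^2$. (Thus the scalar curvature has the same scaling as the Laplacian.)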
Remark 8. Note how in all of these expressions, the “number of derivatives” and “number of g’s” stays fixed among all terms in a given expression. This can be viewed as an example of dimensional analysis in action, and is useful for catching errors in manipulations with these sorts of expressions. From a more representation-theoretic viewpoint, what is going on is that all of the above expressions have constant weight with respect to the joint (commuting) actions of the dilation operation $x \mapsto \lambda x$ on the underlying coordinate chart (which essentially controls the number of derivatives that appear) and the homogeneity operation $g \mapsto \lambda g$ (which, naturally enough, controls the number of g’s that appear).
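As a concrete instance of this bookkeeping (a sketch, with the constant $\lambda > 0$ and the tilde notation introduced here purely for illustration), consider replacing the metric $g$ by a constant multiple $\tilde g := \lambda g$. Counting the net number of g's in each schematic expression predicts the following scalings, which agree with the standard behaviour of these objects under constant rescaling of the metric:

```latex
\begin{aligned}
\tilde\Gamma^a_{bc} &= \Gamma^a_{bc}
   && (g^{-1} \partial g : \text{net zero factors of } g), \\
\widetilde{\mathrm{Riem}}{}_{abc}^{d} &= \mathrm{Riem}_{abc}^{d}, \qquad
\widetilde{\mathrm{Ric}}_{ab} = \mathrm{Ric}_{ab}
   && (g^{-1} \partial^2 g + g^{-2} (\partial g)^2 : \text{net zero}), \\
\tilde R &= \lambda^{-1} R, \qquad
\tilde\Delta = \lambda^{-1} \Delta
   && (g^{-2} \partial^2 g, \; g^{-1} \partial^2 : \text{net } -1).
\end{aligned}
```

In particular the scalar curvature and the Laplacian pick up the same factor $\lambda^{-1}$, which is the precise sense in which they have the same scaling.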
[Update, Mar 27: more remarks added; various corrections.]