Next week (starting on Wednesday, to be more precise), I will begin my class on Perelman’s proof of the Poincaré conjecture. As I only have ten weeks in which to give this proof, I will have to move rapidly through some of the more basic aspects of Riemannian geometry which will be needed throughout the course. In this preliminary lecture, I will quickly review the basic notions of infinitesimal (or microlocal) Riemannian geometry, and in particular define the Riemann, Ricci, and scalar curvatures of a Riemannian manifold. (The more “global” aspects of Riemannian geometry, for instance concerning the relationship between distance, curvature, injectivity radius, and volume, will be discussed later in this course.) This is a review only, omitting any leisurely discussion of examples or motivation for Riemannian geometry; it is impossible to compress this subject into a single lecture, and I will have to refer you to a textbook on the subject for a more complete treatment (I myself am using the text “Riemannian geometry” by my colleague here at UCLA, Peter Petersen).

— Smooth manifolds —

Riemannian geometry takes place on smooth manifolds M of some dimension $d = 0,1,2,\ldots$. Recall that a d-dimensional manifold (or d-manifold for short) M consists of the following structures:

• A topological space M (which for technical reasons we assume to be Hausdorff and second countable);
• An atlas of charts $\phi_\alpha: U_\alpha \to V_\alpha$, which are homeomorphisms from open sets $U_\alpha$ in M to open sets $V_\alpha$ in ${\Bbb R}^d$, such that the $U_\alpha$ cover M.

We say that the manifold M is smooth if the charts $\phi_\alpha$ define a consistent smooth structure, in the sense that the maps $\phi_\alpha \circ \phi_\beta^{-1}$ are smooth (i.e. infinitely differentiable) on $\phi_\beta( U_\alpha \cap U_\beta )$ for every $\alpha, \beta$. One can then assert that a function $f: M \to X$ from M to another space with a smooth structure (e.g. ${\Bbb R}$ or ${\Bbb C}$) is smooth if $f \circ \phi_\alpha^{-1}$ is smooth on $V_\alpha$ for every $\alpha$; a smooth map with an inverse which is also smooth is known as a diffeomorphism. The space of all smooth functions $f: M \to {\Bbb R}$ is denoted $C^\infty(M)$; this is a topological algebra over the reals. More generally, we have the algebra $C^\infty(U)$ for any open subset U of M. (It is possible to view smooth manifolds more abstractly (and in a fully coordinate-independent fashion) by using the structure sheaf of algebras $C^\infty(U)$ to define the smooth structure, rather than the atlas of charts, but we will not need to take this perspective here.)

Remark 1. The most intuitive way to view manifolds is from an extrinsic viewpoint: as subsets of some larger-dimensional space (e.g. viewing curves as subsets of the plane, surfaces as subsets of a Euclidean space such as ${\Bbb R}^3$). While every smooth manifold can be viewed this way (thanks to the Whitney embedding theorem), we will in fact not use the extrinsic perspective at all in this course! Instead, we will rely exclusively on the intrinsic perspective – by studying the various structures on a smooth manifold M purely in terms of objects that can be defined in terms of the atlas. In fact, once we set up the most basic such structure – the tangent bundle – we will often not use the atlas directly at all (thus working in a “coordinate-free” fashion). However, the “local coordinates” provided by the charts in an atlas will be useful for computations at various junctures. $\diamond$

Remark 2. It is a surprising and unintuitive fact that a single topological manifold can have two distinct smooth structures which are not diffeomorphic to each other! This is most famously the case for 7-spheres $S^7$, giving rise to exotic spheres. However, in the case of 3-manifolds – which is the focus of this course – all smooth structures are diffeomorphic (a result of Munkres and Whitehead; see also Smale for higher-dimensional variants), and so this subtlety need not concern us. $\diamond$ [Aside: I was unable to find the relevant reference of Smale – does anyone know it?]

Remark 3. As $C^\infty(M)$ is commutative, we will multiply by functions in this space on the left or on the right interchangeably. In noncommutative geometry, this algebra is replaced by a noncommutative algebra, and one has to take substantially more care with the order of multiplication, but we will not use noncommutative geometry here. $\diamond$

We will be interested in various vector bundles over a smooth manifold M. A vector bundle V is a collection of (real) vector spaces $V_x$ of a fixed dimension k (the fibres of the bundle) associated to each point $x \in M$, whose disjoint union $V = \biguplus_{x \in M} V_x$ can itself be given the structure of a smooth (d+k-dimensional) manifold, in such a way that for all sufficiently small neighbourhoods U of any given point x, the set $\biguplus_{x \in U} V_x$ has a trivialisation, i.e. there is a diffeomorphism between $\biguplus_{x \in U} V_x$ and $U \times {\Bbb R}^k$, with each fibre $V_x$ being identified in a linearly isomorphic way with the vector space $\{x\} \times {\Bbb R}^k \equiv {\Bbb R}^k$. A (global) section of a vector bundle V is a smooth map $f: M \to V$ such that $f(x) \in V_x$ for every $x \in M$. The space of all sections is denoted $\Gamma(V)$; it is a vector space over ${\Bbb R}$, and furthermore is a module over $C^\infty(M)$. We will sometimes also be interested in local sections $f: U \to V$ on some open subset U of M; the space of such sections (which form a module over $C^\infty(U)$) will be denoted $\Gamma(U \to V)$. All of the discussion below on the global manifold M can be easily adapted to local open sets U in this manifold (indeed, one can interpret U itself as a manifold); as all our computations will be entirely local (and because of the ready availability of smooth cutoff functions), the theory on M and the theory on U will be completely compatible.

Example 1. The space $C^\infty(M)$ can be canonically identified with the space of sections $\Gamma(M \times {\Bbb R})$ of the trivial line bundle $M \times {\Bbb R}$. $\diamond$

In Riemannian geometry, the most fundamental vector bundle over a manifold M is the tangent bundle TM, defined by letting the tangent space $T_x M$ at a point $x \in M$ be the space of all tangent vectors in M at x. A tangent vector $v \in T_x M$ can be defined as a vector which can be expressed as the (formal) derivative $v=\gamma'(0)$ of some smooth curve $\gamma: (-\varepsilon, \varepsilon) \to M$ which passes through x at time zero, thus $\gamma(0) = x$. One can express these tangent vectors concretely by using any chart that covers x.

To be somewhat informal, given any point $x \in M$ and tangent vector $v \in T_x M$, one can define a trajectory of points $x + tv + O(t^2) \in M$ for all “infinitesimal” t, which is only defined up to an error of $O(t^2)$ (as measured, for instance, in some coordinate chart), but whose derivative at $t=0$ is equal to v. Thus, while the global manifold M need not have any reasonable notion of vector addition, we do have this infinitesimal notion of translation by a tangent vector which is well-defined up to second-order errors.

Given a tangent vector $v \in T_x M$ and a smooth function $f \in C^\infty(M)$, we can define the directional derivative $\nabla_v f(x)$ by the formula

$\nabla_v f(x) := \lim_{t \to 0} \frac{f(x+vt+O(t^2)) - f(x)}{t}$ (1)

(or, a bit more formally, $\nabla_v f(x) = \frac{d}{dt} f(\gamma(t))|_{t=0}$ for any curve $\gamma: (-\varepsilon, \varepsilon) \to M$ with $\gamma(0)=x$ and $\gamma'(0)=v$). This is a linear functional on $C^\infty(M)$ which annihilates constants and obeys the Leibniz rule

$\nabla_v(fg)(x) = f(x) \nabla_v g(x) + \nabla_v f(x) g(x)$. (2)

Conversely, one can define the tangent space $T_x M$ to be the space of all linear functionals on $C^\infty(M)$ with the above two properties, though we will not need to do so here.

A section $X \in \Gamma(TM)$ of the tangent bundle TM is known as a vector field; it assigns a tangent vector $X(x) \in T_x M$ to each point $x \in M$. A vector field X determines a first-order differential operator $\nabla_X: C^\infty(M) \to C^\infty(M)$, defined by setting $\nabla_X f(x) := \nabla_{X(x)} f(x)$. From (2), we see that $\nabla_X$ is a derivation, i.e. it is linear over ${\Bbb R}$ and obeys the Leibniz rule

$\nabla_X (fg) = f \nabla_X g + (\nabla_X f) g$. (3)

Conversely, one can easily show that every derivation on $C^\infty(M)$ arises uniquely in this manner. This provides a convenient means to define new types of vector fields. For example, if X and Y are two vector fields, one can easily see (from (3)) that the commutator ${}[\nabla_X, \nabla_Y] := \nabla_X \nabla_Y - \nabla_Y \nabla_X$ is also a derivation, and must thus be given by another vector field [X,Y], thus

$\nabla_X \nabla_Y f - \nabla_Y \nabla_X f - \nabla_{[X,Y]} f = 0$ (4)

for all vector fields X, Y and all scalar fields f.
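As a concrete instance of (4), here is a short worked computation on ${\Bbb R}^2$ with the standard coordinates $(x,y)$. The particular vector fields $X = \frac{d}{dx}$ and $Y = x \frac{d}{dy}$, and the subscript notation $f_x, f_y, f_{xy}$ for partial derivatives, are chosen here purely for illustration.

```latex
% Illustrative choice (an assumption, not taken from the lecture): on R^2 take
%   X = d/dx,   Y = x d/dy,   so that   \nabla_X f = f_x   and   \nabla_Y f = x f_y.
\begin{align*}
\nabla_X \nabla_Y f &= \frac{d}{dx}\left( x f_y \right) = f_y + x f_{xy}, \\
\nabla_Y \nabla_X f &= x \frac{d}{dy}\left( f_x \right) = x f_{xy}, \\
\nabla_X \nabla_Y f - \nabla_Y \nabla_X f &= f_y .
\end{align*}
% The second-order terms cancel, leaving a first-order operator, and so
%   [X,Y] = d/dy.
```

The cancellation of the second-order terms is exactly what makes the commutator a derivation; it happens for any pair of vector fields, not just this one.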

Example 2. Suppose we have a local coordinate chart $\phi: U \to V \subset {\Bbb R}^d$. The standard first-order differential operators $\frac{d}{dx^1},\ldots,\frac{d}{dx^d}$ induced by the coordinates $x^1,\ldots,x^d$ on ${\Bbb R}^d$ can be viewed as vector fields, and pulled back via $\phi$ to vector fields $\phi^* \frac{d}{dx^1}, \ldots, \phi^* \frac{d}{dx^d}$ on U. These in fact form a frame for U since they span the tangent space at every point. Since $\frac{d}{dx^i}$ and $\frac{d}{dx^j}$ commute in ${\Bbb R}^d$, we see that ${}[ \phi^* \frac{d}{dx^i}, \phi^* \frac{d}{dx^j} ] = 0$. $\diamond$

Exercise 1. Show that the map $(X, Y) \mapsto [X,Y]$ endows the space $\Gamma(TM)$ of vector fields with the structure of an abstract Lie algebra. Also establish the Leibniz rule

${}[X, fY] = (\nabla_X f) Y + f [X,Y]$ (5)

for all $X, Y \in \Gamma(TM)$ and $f \in C^\infty(M)$. $\diamond$

Various operations on finite-dimensional vector spaces generalise easily to vector bundles. For instance, every finite-dimensional vector space V has a dual $V^*$, and similarly every vector bundle V also has a dual bundle $V^*$, whose fibres $V_x^*$ are the dual to the fibres $V_x$ of V; one can also view the space of sections $\Gamma(V^*)$ as the space of $C^\infty(M)$-linear functionals from $\Gamma(V)$ to $C^\infty(M)$. Similarly, given two vector bundles $V, W$ over M, one can define the direct sum $V \oplus W$, the tensor product $V \otimes W$, the space $\hbox{Hom}(V,W)$ of fibre-wise linear transformations from V to W, the symmetric powers $\hbox{Sym}^k(V)$ and exterior powers $\bigwedge^k(V)$, and so forth. The construction of all of these concepts is straightforward but rather tedious, and will be omitted here.

Applying these constructions to the tangent bundle TM, one gets a variety of useful bundles for doing Riemannian geometry:

• The bundle $T^* M := (TM)^*$ is the cotangent bundle; elements of $T^*_x M$ are cotangent vectors.
• Sections of $\bigwedge^k(T^* M)$ are known as k-forms.
• Sections of $(TM)^{\otimes k} \otimes (T^* M)^{\otimes l}$ are known as rank (k,l) tensor fields, and individual elements of this bundle are rank (k,l) tensors. Many tensors of interest obey various symmetry or antisymmetry properties, for instance k-forms are totally anti-symmetric rank (0,k) tensors. (To fully enumerate the various symmetry properties available to tensors is essentially equivalent to the finite-dimensional representation theory of the permutation group, which is a beautiful and important subject but will not be discussed here.)

It is convenient to use abstract index notation, denoting rank (k,l) tensor fields using k superscripted Greek indices and l subscripted Greek indices, thus for instance $\hbox{Riem} = \hbox{Riem}_{\alpha \beta \gamma}^\delta$ denotes a rank (1,3) tensor. One should think of these indices as placeholders; if one chooses a frame $(e_a)_{a \in A}$ for the tangent bundle (i.e. a collection of vector fields which form a basis for the tangent space at every point), which induces the associated dual frame $(e^a)_{a \in A}$ for the cotangent bundle, then this notation can be viewed as describing the coefficients of the tensor in terms of the basis generated by such frames, thus for instance

$\hbox{Riem} = \sum_{a,b,c,d \in A} \hbox{Riem}_{abc}^d e^a \otimes e^b \otimes e^c \otimes e_d$. (6)

But it is perhaps better to view a tensor such as $\hbox{Riem}_{\alpha \beta \gamma}^\delta$ as existing independently of any choice of frame, in which case the labels $\alpha, \beta, \gamma, \delta$ are abstract placeholders.

Example 3. We continue Example 2. A local coordinate chart $\phi: U \to {\Bbb R}^d$ generates a (local) frame $e_a := \phi^* \frac{d}{dx^a}$ with an associated dual frame $e^a := \phi^*(dx^a)$. These frames can be slightly easier to work with for computations than general frames, because we automatically have ${}[e_a,e_b] = 0$ as already noted in Example 2. On the other hand, it is often convenient to work in frames that don’t come from coordinate charts in order to obtain other good properties; in particular, it is very convenient to work in orthonormal frames, which are usually unavailable if one restricts attention to frames arising from coordinate charts. $\diamond$

We use the usual (and very handy) Einstein summation convention: repeated indices (with each repeated index appearing exactly once as a superscript and once as a subscript) are implicitly summed over a choice of frame (the exact choice is not important). For instance, the rank (0,4) tensor $X_{\alpha \beta \sigma \mu} := \hbox{Riem}_{\alpha \beta \gamma}^\delta \hbox{Riem}_{\delta \sigma \mu}^\gamma$ is defined to be the tensor which is given by the formula

$X_{absm} = \sum_{g,d \in A} \hbox{Riem}_{abg}^d \hbox{Riem}_{dsm}^g$ (7)

for any choice of frame $(e_a)_{a \in A}$ (one can easily verify that this definition is independent of the choice of frame). We will also apply this summation convention when the Greek labels are replaced with concrete counterparts arising from a frame, thus for instance we can now abbreviate (6) as

$\hbox{Riem} = \hbox{Riem}_{abc}^d e^a \otimes e^b \otimes e^c \otimes e_d$. (6′)
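The frame-independence of contractions claimed above can be checked by hand in the simplest case, the contraction $T^\alpha_\alpha$ of a rank (1,1) tensor. The following sketch assumes a second frame $e'_a = A_a^b e_b$ for some invertible matrix of smooth functions $A_a^b$; the names T, A, B are introduced here only for illustration.

```latex
% Assumed notation: e'_a = A_a^b e_b, with dual frame e'^a = B^a_b e^b,
% where B is the inverse matrix of A:
%   B^a_d A_b^d = \delta^a_b   and   A_a^d B^a_c = \delta^d_c.
\begin{align*}
T &= T^c_d \, e_c \otimes e^d = T'^a_b \, e'_a \otimes e'^b,
  \qquad T'^a_b = B^a_c \, T^c_d \, A_b^d, \\
T'^a_a &= B^a_c \, A_a^d \, T^c_d = \delta^d_c \, T^c_d = T^c_c .
\end{align*}
```

The same mechanism (each contracted pair contributes one factor of A and one of its inverse) handles the double contraction in (7).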

— Connections —

We have seen that vector fields $X \in \Gamma(TM)$ allow us to differentiate scalar functions $f \in C^\infty(M)$ to obtain a differentiated function $\nabla_X f$. Furthermore, this concept obeys the Leibniz rule (3), and is linear over $C^\infty(M)$ in X, or in other words

$\nabla_{gX+Y} f = g \nabla_X f + \nabla_Y f$ (8)

for all $g \in C^\infty(M)$ and $X, Y \in \Gamma(TM)$. As a consequence, one can interpret $X \mapsto \nabla_X f$ as a $C^\infty(M)$-linear functional on $\Gamma(TM)$, which is identified with a section $df \in \Gamma(T^* M)$ of the cotangent bundle, thus $\nabla_X f = df(X)$.

Now suppose one wants to define a derivative $\nabla_X f$ when $f \in \Gamma(V)$ is a section of a bundle V rather than a scalar function. It turns out that there is now more than one good notion of differentiation. Each such notion can be formalised by the concept of a (linear) connection:

Definition 1. A connection $\nabla$ on a bundle V is an assignment of a section $\nabla_X f \in \Gamma(V)$ (the covariant derivative of f in the direction X via the connection $\nabla$) to each vector field $X \in \Gamma(TM)$ and section $f \in \Gamma(V)$, in such a way that $(f,X) \mapsto \nabla_X f$ is bilinear in f and X, that the Leibniz rule (3) is obeyed for $f \in C^\infty(M)$ and $g \in \Gamma(V)$ (or vice versa), and the linearity rule (8) is obeyed for all $g \in C^\infty(M)$ and $X,Y \in \Gamma(TM)$.

If $f \in \Gamma(V)$ is such that $\nabla_X f = 0$ for all vector fields X, we say that f is parallel to the connection $\nabla$.

A connection on the tangent bundle TM is known as an affine connection.

Remark 4. Informally, a connection assigns an infinitesimal linear isomorphism $\phi_v: V_x \to V_{x+v}$ (the parallel transport map) to each infinitesimal tangent vector $v \in T_x M$, in a manner which is linear in v for fixed x. The connection between this informal definition and the above formal one is given by the formula $\nabla_X f(x) = \lim_{t \to 0} \frac{\phi_{t X(x)}^{-1}(f(x+tX(x))) - f(x)}{t}$. One can make this informal definition more precise (e.g. using non-standard analysis) but we will not do so here. An alternate definition of a connection is as a complementary subbundle to the vertical bundle $\biguplus_{x \in M} TV_x$ in TV, known as a horizontal bundle, obeying some additional linearity conditions in the vertical variable. $\diamond$

Once one has a connection on a bundle V, one automatically can define a connection on the dual bundle $V^*$ and more generally on tensor powers $V^{\otimes k} \otimes (V^*)^{\otimes l}$, by enforcing all possible instances of the Leibniz rule, e.g.

$\nabla_X(f^{\alpha \beta}_\gamma g^{\gamma}_{\delta}) = (\nabla_X f^{\alpha \beta}_\gamma) g^\gamma_\delta + f^{\alpha \beta}_\gamma \nabla_X g^\gamma_\delta$ (9)

for all rank (2,1) tensors f and rank (1,1) tensors g. (It is a straightforward but tedious task to verify that all the Leibniz rules are consistent with each other, and that (9) and its relatives uniquely define a connection on every tensor power of V.) In particular, any connection on the tangent bundle (which is the case of importance in Riemannian geometry) naturally induces a connection on the cotangent bundle and the bundle of rank (k,l) tensors.

Here it is important to note that the indices are abstract, rather than corresponding to some frame: for instance, if $\nabla$ is a connection on the tangent bundle TM, then after choosing a frame $(e_a)_{a \in A}$, it is usually not the case that the coefficient $(\nabla_X f)^a$ of a vector field $f \in \Gamma(TM)$ at a is equal to the derivative $\nabla_X (f^a)$ of that component of f. Instead, one has a relationship of the form

$(\nabla_X f)^a = \nabla_X (f^a) + \Gamma^a_{bc} X^b f^c$ (10)

where for each $a,b,c$, the Christoffel symbol $\Gamma^a_{bc} := e^a(\nabla_{e_b} e_c)$ of the connection relative to the frame $(e_a)_{a \in A}$ is a smooth function on M. It is important to note that Christoffel symbols are not tensors, because the expression $\Gamma^a_{bc} e_a \otimes e^b \otimes e^c$ turns out to depend on the choice of frame.
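To see concretely why Christoffel symbols depend on the frame, one can compare two frames for a single connection. The sketch below assumes the standard flat connection on ${\Bbb R}^2$ minus the origin (the one for which the Cartesian coordinate vector fields are parallel) and the polar coordinate frame $e_r, e_\theta$; this example is an added illustration and is not used later.

```latex
% Assumptions: \nabla is the flat connection on R^2 \ {0}, so \nabla_X e_x = \nabla_X e_y = 0
% for the Cartesian frame e_x = d/dx, e_y = d/dy; all Cartesian Christoffel symbols vanish.
% Polar frame:
%   e_r      = \cos\theta \, e_x + \sin\theta \, e_y,
%   e_\theta = -r\sin\theta \, e_x + r\cos\theta \, e_y.
\begin{align*}
\nabla_{e_r} e_r &= 0, \\
\nabla_{e_r} e_\theta &= -\sin\theta \, e_x + \cos\theta \, e_y = \tfrac{1}{r} e_\theta, \\
\nabla_{e_\theta} e_r &= -\sin\theta \, e_x + \cos\theta \, e_y = \tfrac{1}{r} e_\theta, \\
\nabla_{e_\theta} e_\theta &= -r\cos\theta \, e_x - r\sin\theta \, e_y = -r \, e_r .
\end{align*}
% Hence, with \Gamma^a_{bc} = e^a(\nabla_{e_b} e_c), the only non-zero symbols are
%   \Gamma^\theta_{r\theta} = \Gamma^\theta_{\theta r} = 1/r,
%   \Gamma^r_{\theta\theta} = -r.
```

The same connection thus has vanishing symbols in one frame and non-vanishing symbols in another, which is impossible for the components of a tensor.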

Using the Leibniz rule repeatedly, it is not hard to use (10) to give a formula for the components of derivatives of other tensors, e.g.

$(\nabla_X \omega)_a = \nabla_X (\omega_a) - \Gamma^c_{ba} X^b \omega_c$ (11)

for any 1-form $\omega$,

$(\nabla_X g)_{ab} = \nabla_X (g_{ab}) - \Gamma^c_{da} X^d g_{cb} - \Gamma^c_{db} X^d g_{ac}$ (12)

for any rank (0,2) tensor g, and so forth.
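For instance, (11) follows from (10) by applying the Leibniz rule to the scalar $\omega_a f^a$, where f is an arbitrary vector field. A sketch of the computation:

```latex
% Contract the 1-form \omega with an arbitrary vector field f, and differentiate
% the resulting scalar \omega_a f^a in two ways.
\begin{align*}
\nabla_X(\omega_a f^a)
  &= (\nabla_X \omega)_a f^a + \omega_a (\nabla_X f)^a
     && \text{(Leibniz rule for the contraction)} \\
  &= (\nabla_X \omega)_a f^a + \omega_a \nabla_X(f^a) + \Gamma^a_{bc} X^b f^c \omega_a
     && \text{(by (10))}, \\
\nabla_X(\omega_a f^a)
  &= \nabla_X(\omega_a) f^a + \omega_a \nabla_X(f^a)
     && \text{(Leibniz rule (3) for scalars)}.
\end{align*}
% Subtracting, and swapping the dummy labels a and c in the Christoffel term:
%   (\nabla_X \omega)_a f^a = \nabla_X(\omega_a) f^a - \Gamma^c_{ba} X^b \omega_c f^a.
% Since f is arbitrary, this is (11).
```

The formula (12) is obtained in the same way, contracting g with two arbitrary vector fields and picking up one Christoffel term for each lower index.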

We have remarked that Christoffel symbols are not tensors. On the other hand, because $\nabla_X f$ is linear in X, we can legitimately define a tensor field $\nabla_\alpha f$, which is a section of $T^* M \otimes V \equiv \hbox{Hom}(TM, V)$, thus $\nabla_X f = X^\alpha \nabla_\alpha f$. It is also possible to express the difference of two connections as a tensor:

Exercise 2. Let $\nabla, \nabla'$ be two connections on TM. Show that there exists a unique rank (1,2) tensor $\Gamma^\alpha_{\beta \gamma} = \nabla' - \nabla$ such that

$\nabla'_\beta f^\alpha - \nabla_\beta f^\alpha = \Gamma^\alpha_{\beta \gamma} f^\gamma$ (13)

for all vector fields $f^\alpha$. Now interpret the Christoffel symbol $\Gamma^a_{bc}$ of a connection $\nabla$ on TM relative to a frame $e = (e_a)_{a \in A}$ as the difference $\nabla - \nabla^{(e)}$ of that connection with the flat connection $\nabla^{(e)}$ induced by the trivialisation of the tangent bundle induced by that frame. $\diamond$

Let $\nabla$ be a connection on TM. We say that this connection is torsion-free if we have the pleasant identity

$\nabla_\alpha \nabla_\beta f = \nabla_\beta \nabla_\alpha f$ (14)

(cf. Clairaut’s theorem) for all scalar fields $f \in C^\infty(M)$, or in other words that the Hessian $\hbox{Hess}(f)_{\alpha \beta} := \nabla_\alpha \nabla_\beta f$ of f is a symmetric rank (0,2) tensor.

Exercise 3. Show that $\nabla$ is torsion-free if and only if

${}[X,Y]^\alpha = X^\beta \nabla_\beta Y^\alpha - Y^\beta \nabla_\beta X^\alpha$ (15)

for all vector fields X, Y (or in coordinate-free notation, ${}[X,Y] = \nabla_X Y - \nabla_Y X$). $\diamond$

Remark 5. Roughly speaking, the torsion-free connections are those which have a good notion of an infinitesimal parallelogram with corners $x, x+tv+O(t^2), x+tw+O(t^2), x+tv+tw+O(t^2)$ for some infinitesimal t, such that each edge is the parallel transport of the opposing edge to error $O(t^3)$. (Without the torsion-free hypothesis, the error is merely $O(t^2)$.) $\diamond$

It would be nice if (14) extended to tensor fields f. This is true for flat connections, but false in general. The defect in (14) for such fields is measured by the curvature tensor $R \in \Gamma( \hbox{Hom}( \bigwedge^2 T M, \hbox{Hom}(TM, TM) ) )$ of the connection $\nabla$, defined by the formula

$\nabla_X \nabla_Y Z - \nabla_Y \nabla_X Z - \nabla_{[X,Y]} Z =: R(X,Y) Z$ (16)

for all vector fields X,Y,Z (cf. (4)). One easily sees that R is indeed a section of $\hbox{Hom}( \bigwedge^2 T M, \hbox{Hom}(TM, TM) )$ and can thus be viewed as a rank (1,3) tensor.
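The least obvious part of this claim is that $R(X,Y)Z$ is linear over $C^\infty(M)$ in Z, even though each individual term in (16) differentiates Z. A sketch of the verification, using only the Leibniz rule for $\nabla$ and the identity (4):

```latex
% Replace Z by fZ in (16), with f a scalar field.
\begin{align*}
\nabla_X \nabla_Y (fZ)
  &= (\nabla_X \nabla_Y f) Z + (\nabla_Y f) \nabla_X Z + (\nabla_X f) \nabla_Y Z + f \nabla_X \nabla_Y Z, \\
\nabla_Y \nabla_X (fZ)
  &= (\nabla_Y \nabla_X f) Z + (\nabla_X f) \nabla_Y Z + (\nabla_Y f) \nabla_X Z + f \nabla_Y \nabla_X Z, \\
\nabla_{[X,Y]} (fZ)
  &= (\nabla_{[X,Y]} f) Z + f \nabla_{[X,Y]} Z .
\end{align*}
% The cross terms cancel on subtracting, and the terms multiplying Z cancel by (4):
\begin{align*}
R(X,Y)(fZ)
  &= (\nabla_X \nabla_Y f - \nabla_Y \nabla_X f - \nabla_{[X,Y]} f) Z + f R(X,Y) Z \\
  &= f R(X,Y) Z .
\end{align*}
```

Linearity in X and Y can be checked along the same lines, using (8) and (5) to handle the term $\nabla_{[X,Y]} Z$.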

Exercise 4. If $\nabla$ is a torsion-free connection on TM, and $R_{\alpha \beta \gamma}^\delta$ is the tensor form of the curvature R, defined by requiring that

$(R(X,Y) Z)^\delta = R_{\alpha \beta \gamma}^\delta X^\alpha Y^\beta Z^\gamma$, (17)

then show that

$\nabla_\alpha \nabla_\beta X^\delta - \nabla_\beta \nabla_\alpha X^\delta = R_{\alpha \beta \gamma}^\delta X^\gamma$ (18)

for all vector fields $X^\delta$. What is the analogue of (18) if $X^\delta$ is replaced by a rank (k,l) tensor? $\diamond$

Connections describe a way to transport tensors as one moves from point to point in the manifold. There is another way to transport tensors, which is induced by diffeomorphisms $\phi: M \to M$ of the base manifold; this transportation procedure maps points $x \in M$ to points $\phi(x) \in M$, maps tangent vectors $v \in T_x M$ to tangent vectors $\phi_*(v) \in T_{\phi(x)} M$ (defined by requiring that the chain rule $\frac{d}{dt} (\phi \circ \gamma) = \phi_* (\frac{d}{dt} \gamma)$ hold for all curves $\gamma$) and then maps other tensors in the unique manner consistent with the tensor operations (e.g. $\phi_*(v \otimes w) = \phi_*(v) \otimes\phi_*(w)$). This procedure is important for describing symmetries of tensor fields (consider, for instance, what it means for the vector field $(y,-x)$ in ${\Bbb R}^2$ to be invariant under rotations around the origin). To relate this diffeomorphism transport to infinitesimal differential geometry, though, we have to look at an infinitesimal diffeomorphism, which we can view as the derivative $\frac{d}{dt} \phi_t|_{t=0}$ of a smoothly varying family $\phi_t$ of diffeomorphisms, with $\phi_0$ equal to the identity. By chasing all the definitions we see that $\frac{d}{dt} \phi_t|_{t=0}$ is just a vector field X. The infinitesimal rate of change $\frac{d}{dt} \phi_t^*(v)|_{t=0}$ of a tensor field v under the pullback $\phi_t^* := (\phi_t^{-1})_*$ by this family is known as the Lie derivative ${\mathcal L}_X v$ of v with respect to the vector field X (it does not depend on any aspect of $\phi_t$ other than its infinitesimal vector field X). On scalars f, it agrees with the directional derivative

${\mathcal L}_X f = \nabla_X f$, (19)

while on vector fields Y, it agrees with the commutator:

${\mathcal L}_X Y = [X,Y]$, (20)

and its action on all other tensors can be given by the Leibniz rule (as is the case for connections). It should be emphasised, though, that the Lie derivative is not a connection, because it is not linear (over $C^\infty(M)$) in X; ${\mathcal L}_{fX} w \neq f {\mathcal L}_X w$ in general.
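The failure of linearity can be made explicit when w is a vector field Y, using (20), the antisymmetry of the bracket, and the Leibniz rule (5):

```latex
\begin{align*}
{\mathcal L}_{fX} Y = [fX, Y] = -[Y, fX]
  &= -(\nabla_Y f) X - f [Y,X] \\
  &= f [X,Y] - (\nabla_Y f) X \\
  &= f {\mathcal L}_X Y - (\nabla_Y f) X .
\end{align*}
% The extra term -(\nabla_Y f) X involves a derivative of f, and vanishes at a point
% only if \nabla_Y f = 0 or X = 0 there; so {\mathcal L}_{fX} Y and f {\mathcal L}_X Y differ in general.
```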

— Riemannian manifolds and curvature tensors —

We now specialise our attention from smooth manifolds to our main topic of interest, namely Riemannian manifolds. Informally, a Riemannian manifold is a manifold equipped with notions of length, angle, area, etc. which are infinitesimally isomorphic at every point to the corresponding notions in Euclidean space. In Euclidean space, all these geometric notions can be defined in terms of a positive definite inner product, and Riemannian manifolds are similarly founded on a positive definite Riemannian metric.

Definition 2. A Riemannian manifold (M,g) is a smooth manifold M, together with a Riemannian metric $g = g_{\alpha \beta}$ on M, i.e. a section of $\hbox{Sym}^2( T^* M )$ which is positive definite in the sense that $g(v,w) := \langle v,w \rangle_{g(x)} := g_{\alpha \beta}(x) v^\alpha w^\beta$ is a positive-definite inner product on $T_x M$ for every point x.

We now use the metric g to build several other tensors of interest. Firstly, we have the inverse metric $g^{-1} = g^{\alpha \beta}$, which is the unique rank (2,0) tensor that inverts the (0,2) tensor g in the sense that $g^{\alpha \beta} g_{\beta \gamma} = g_{\gamma \beta} g^{\beta \alpha} = \delta^{\alpha}_\gamma$ is the identity section of $\hbox{Hom}(TM,TM)$; this tensor is also symmetric and positive-definite. One can use these tensors to raise and lower the indices of other tensors; for instance, given a rank (0,2) tensor $\pi_{\alpha \beta}$, one can define the rank (1,1) tensors $\pi_\alpha^{\ \beta} = g^{\beta \gamma} \pi_{\alpha \gamma}$ and $\pi_{\ \alpha}^{\beta} = g^{\beta \gamma} \pi_{\gamma \alpha}$ and the rank (2,0) tensor $\pi^{\alpha \beta} := g^{\alpha \gamma} g^{\beta \delta} \pi_{\gamma \delta}$. We will generally only use these conventions when there is enough symmetry that there is no danger of ambiguity.

Remark 6. All Riemannian manifolds can be viewed extrinsically (locally, at least) as subsets of a Euclidean space, thanks to the famous Nash embedding theorem. But we will not need this extrinsic viewpoint in this course. $\diamond$

After the metric, the next fundamental object in Riemannian geometry is the Levi-Civita connection.

Fundamental theorem of Riemannian geometry. Let (M,g) be a Riemannian manifold. Then there exists a unique affine connection $\nabla$ (which is known as the Levi-Civita connection) which is torsion-free and respects the metric g in the sense that $\nabla g = 0$.

Exercise 5. Prove this theorem. (Hint: one can either (a) use abstract index notation and study expressions such as $\nabla_\alpha X^\beta$, (b) use coordinate-free notation and study expressions such as $g( \nabla_X Y, Z )$, or (c) use local coordinates (e.g. use a frame $e_a := \phi^* \frac{d}{dx^a}$ arising from a chart $\phi$ as in Example 2) and work with the Christoffel symbols $\Gamma^a_{bc}$. It is instructive to do this exercise in all three possible ways in order to appreciate the equivalence (and relative advantages and disadvantages) between these three perspectives.) $\diamond$
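For orientation on the local-coordinate approach, here is a sketch of how the two conditions pin down the Christoffel symbols in a coordinate frame. It assumes the frame $e_a = \phi^* \frac{d}{dx^a}$ of Example 2, writes $\partial_a$ as a shorthand (introduced here) for $\nabla_{e_a}$ acting on scalars, and addresses uniqueness only; existence still has to be checked separately.

```latex
% Torsion-free, together with [e_b, e_c] = 0, gives via (15) the symmetry
%   \Gamma^a_{bc} = \Gamma^a_{cb}.
% \nabla g = 0 gives, via (12) with X = e_d:
\[
\partial_d g_{ab} = \Gamma^c_{da} g_{cb} + \Gamma^c_{db} g_{ac} .
\]
% Write this identity three times with the indices permuted,
% add the first two copies and subtract the third:
\begin{align*}
\partial_b g_{dc} + \partial_c g_{bd} - \partial_d g_{bc}
  &= \left( \Gamma^e_{bd} g_{ec} + \Gamma^e_{bc} g_{de} \right)
   + \left( \Gamma^e_{cb} g_{ed} + \Gamma^e_{cd} g_{be} \right)
   - \left( \Gamma^e_{db} g_{ec} + \Gamma^e_{dc} g_{be} \right) \\
  &= 2 \Gamma^e_{bc} g_{de}, \\
\Gamma^a_{bc}
  &= \tfrac{1}{2} g^{ad} \left( \partial_b g_{dc} + \partial_c g_{bd} - \partial_d g_{bc} \right) .
\end{align*}
```

Since the right-hand side involves only the metric, any two torsion-free connections respecting g must have the same Christoffel symbols in this frame, and hence agree.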

Geometrically, the condition $\nabla g = 0$ asserts that parallel transport by the Levi-Civita connection is an isometry. At a computational level, it means (in conjunction with the Leibniz rule) that covariant differentiation using the Levi-Civita connection commutes with the raising and lowering operations, for instance given a vector field $X^\alpha$ we have

$(\nabla_\alpha X)_\beta = g_{\beta \gamma} \nabla_\alpha X^\gamma = \nabla_\alpha (g_{\beta \gamma} X^\gamma) = \nabla_\alpha (X_\beta)$ (21)

and so we may safely use raising and lowering operations in the presence of Levi-Civita covariant derivatives without much risk of serious error. We can also raise and lower the covariant derivative itself, defining

$\nabla^\alpha := g^{\alpha \beta} \nabla_\beta = \nabla_\beta g^{\alpha \beta}$. (22)

This leads to the covariant Laplacian (or Bochner Laplacian)

$\Delta := \nabla_\alpha \nabla^\alpha = \nabla^\alpha \nabla_\alpha = g^{\alpha \beta} \nabla_\alpha \nabla_\beta$ (23)

defined on all tensor fields (for instance, when applied to scalar fields it becomes the trace of the Hessian, and is known as the Laplace-Beltrami operator). When applied to non-scalar fields, the covariant Laplacian differs slightly from the Hodge Laplacian (or Laplace-de Rham operator) by a lower order term which is given by the Weitzenböck identity.

As discussed earlier, all connections on TM have a curvature tensor in $\hbox{Hom}( \bigwedge^2 TM, \hbox{Hom}(TM,TM) )$. The curvature of the Levi-Civita connection is known as the Riemann curvature tensor $\hbox{Riem} = \hbox{Riem}_{\alpha \beta \gamma}^\delta$, thus

$\nabla_\alpha \nabla_\beta X^\delta - \nabla_\beta \nabla_\alpha X^\delta = \hbox{Riem}_{\alpha \beta \gamma}^\delta X^\gamma$. (24)

One also writes $\hbox{Riem}$ in co-ordinate free notation by defining $\hbox{Riem}(X,Y)Z$ for vector fields X,Y,Z by the formula

$\nabla_X \nabla_Y Z - \nabla_Y \nabla_X Z - \nabla_{[X,Y]} Z = \hbox{Riem}(X,Y) Z$ (24′)

or equivalently as $(\hbox{Riem}(X,Y)Z)^\delta = \hbox{Riem}_{\alpha \beta \gamma}^\delta X^\alpha Y^\beta Z^\gamma$.

Because $\nabla$ respects g, one eventually deduces from (24) and the Leibniz rule that $\hbox{Riem}_{\alpha \beta \gamma}^\delta$ is skew-adjoint in the $\gamma, \delta$ indices:

$\hbox{Riem}_{\alpha \beta \gamma}^\delta = - g_{\gamma \mu} g^{\delta \sigma} \hbox{Riem}_{\alpha \beta \sigma}^\mu$. (25)
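One route to (25) is to apply the commutator $\nabla_\alpha \nabla_\beta - \nabla_\beta \nabla_\alpha$ to the scalar $g_{\gamma \delta} X^\gamma Y^\delta$ for arbitrary vector fields X, Y. The choice of test scalar, and the auxiliary index $\nu$, are introduced here for the sketch; other routes are possible.

```latex
% Torsion-free: the commutator annihilates scalars, by (14).
% \nabla g = 0 and the Leibniz rule: the commutator falls only on X and Y, and the
% cross terms such as g_{\gamma\delta} (\nabla_\alpha X^\gamma)(\nabla_\beta Y^\delta) cancel in pairs.
\begin{align*}
0 &= (\nabla_\alpha \nabla_\beta - \nabla_\beta \nabla_\alpha)(g_{\gamma\delta} X^\gamma Y^\delta) \\
  &= g_{\gamma\delta} \hbox{Riem}_{\alpha\beta\mu}^\gamma X^\mu Y^\delta
   + g_{\gamma\delta} X^\gamma \hbox{Riem}_{\alpha\beta\mu}^\delta Y^\mu
   && \text{(by (24))} .
\end{align*}
% Since X and Y are arbitrary, relabelling the dummy indices in the second term gives
\[
g_{\gamma\delta} \hbox{Riem}_{\alpha\beta\mu}^\gamma = - g_{\mu\sigma} \hbox{Riem}_{\alpha\beta\delta}^\sigma ,
\]
% and contracting both sides with the inverse metric g^{\delta\nu}, then renaming indices, yields (25).
```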

It is also clearly skew-symmetric in the $\alpha,\beta$ indices. Also, from the analogue of (24) for 1-forms, i.e.

$\nabla_\alpha \nabla_\beta \omega_\delta - \nabla_\beta \nabla_\alpha \omega_\delta = -\hbox{Riem}_{\alpha \beta \delta}^\gamma \omega_\gamma$ (24″)

and the torsion-free nature of the connection, we have

$\nabla_\alpha \nabla_\beta \nabla_\delta f - \nabla_\beta \nabla_\alpha \nabla_\delta f = -\hbox{Riem}_{\alpha \beta \delta}^\gamma \nabla_\gamma f$ (26)

for all scalar fields f. Cyclically summing this in $\alpha,\beta,\delta$ we obtain the first Bianchi identity

$\hbox{Riem}_{\alpha \beta \delta}^\gamma + \hbox{Riem}_{\beta \delta \alpha}^\gamma + \hbox{Riem}_{\delta \alpha \beta}^\gamma = 0$. (27)
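To spell out the cyclic summation, write $f_{\alpha\beta\delta}$ (a shorthand introduced here) for $\nabla_\alpha \nabla_\beta \nabla_\delta f$. The torsion-free property (14) says that the Hessian is symmetric, so $f_{\alpha\beta\delta} = f_{\alpha\delta\beta}$, and the left-hand sides of the three cyclic copies of (26) cancel in pairs:

```latex
\begin{align*}
& (f_{\alpha\beta\delta} - f_{\beta\alpha\delta})
 + (f_{\beta\delta\alpha} - f_{\delta\beta\alpha})
 + (f_{\delta\alpha\beta} - f_{\alpha\delta\beta}) \\
&\quad = (f_{\alpha\beta\delta} - f_{\alpha\delta\beta})
 + (f_{\beta\delta\alpha} - f_{\beta\alpha\delta})
 + (f_{\delta\alpha\beta} - f_{\delta\beta\alpha}) = 0 .
\end{align*}
% The right-hand sides of the three copies of (26) therefore sum to zero:
\[
- \left( \hbox{Riem}_{\alpha\beta\delta}^\gamma + \hbox{Riem}_{\beta\delta\alpha}^\gamma
       + \hbox{Riem}_{\delta\alpha\beta}^\gamma \right) \nabla_\gamma f = 0 ,
\]
% and since \nabla_\gamma f can be an arbitrary cotangent vector at any given point, (27) follows.
```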

Exercise 6. Show that the above three symmetries of $\hbox{Riem}$ imply that $\hbox{Riem}$ is a self-adjoint section of $\hbox{Hom}( \bigwedge^2 TM, \bigwedge^2 TM )$, and that these conditions are in fact equivalent in three and fewer dimensions. (The claim fails in four and higher dimensions; see comments.) $\diamond$

Exercise 7. By differentiating (24) and cyclically summing, establish the second Bianchi identity

$\nabla_\mu \hbox{Riem}_{\alpha \beta \delta}^\gamma + \nabla_\beta \hbox{Riem}_{\mu \alpha \delta}^\gamma + \nabla_\alpha \hbox{Riem}_{\beta \mu \delta}^\gamma = 0$. $\diamond$ (28)

Exercise 8. Show that a Riemannian manifold (M,g) is locally isomorphic (as Riemannian manifolds) to Euclidean space if and only if the Riemann curvature tensor vanishes. (Hint: one direction is easy. For the other direction, the quickest way is to apply the Frobenius theorem to obtain a local trivialisation of the tangent bundle which is flat with respect to the Levi-Civita connection.) This illustrates the point that the Riemann curvature captures all the local obstructions that prevent a Riemannian manifold from being flat. (Compare this situation with the superficially similar subject of symplectic geometry, in which Darboux’s theorem guarantees that there are no local obstructions whatsoever to a symplectic manifold $(M,\omega)$ being flat.) $\diamond$

The Riemann curvature measures the “infinitesimal monodromy” of parallel transport. For our applications we will need to study a slightly different curvature, the Ricci curvature $\hbox{Ric}_{\alpha \beta}$, which measures how much the volume-radius relationship on infinitesimal sectors has been distorted from the Euclidean one. (This will not be obvious presently, as we have not yet defined the volume measure $dg$ on a Riemannian manifold.) It is defined as the trace of the Riemann tensor, or more precisely as

$\hbox{Ric}_{\alpha \beta} := \hbox{Riem}_{\gamma \alpha \beta}^\gamma$. (29)

(One could also contract other indices than these, but due to the various symmetry properties of the Riemann tensor, one ends up with essentially the same tensor as a consequence.) We also write $\hbox{Ric}(X,Y)$ for $\hbox{Ric}_{\alpha \beta} X^\alpha Y^\beta$ when X, Y are vector fields. The symmetries of $\hbox{Riem}$ easily imply that $\hbox{Ric}$ is a symmetric rank (0,2) tensor – just like the metric g! This observation will of course be vital for defining Ricci flow later. (This observation, as well as a similar observation for the stress-energy tensor, was also decisive in leading Einstein to the equations of general relativity, but that’s a whole other story.)

We can take the trace of the Ricci tensor to form the scalar curvature

$R := g^{\alpha \beta} \hbox{Ric}_{\alpha \beta} = g^{\alpha \beta} \hbox{Riem}_{\gamma \alpha \beta}^{\gamma}$; (30)

up to normalisations, R can also be viewed as the trace of the Riemann tensor (viewed as a section of $\hbox{Hom}(\bigwedge^2 TM, \bigwedge^2 TM)$). The scalar curvature measures how the relationship of volume of infinitesimal balls to their radius is distorted by the geometry.

The relationship between the Riemann, Ricci, and scalar curvatures depends on the dimension:

1. In one dimension, all three curvatures vanish; there are no degrees of freedom.
2. In two dimensions, the Riemann and Ricci curvatures are just multiples of the scalar curvature (by some tensor depending algebraically on the metric); there is only one degree of freedom.
3. In three dimensions, the Riemann tensor can be expressed linearly in terms of the Ricci curvature, with coefficients depending algebraically on the metric (see also Exercise 9 below). On the other hand, the scalar curvature does not control Ricci (or Riemann); the Ricci tensor contains an additional trace-free component. (However, once we start evolving by Ricci flow, we shall see that the Hamilton-Ivey pinching phenomenon will allow us to use the scalar curvature to mostly control Ricci and hence Riemann near singularities.)
4. In four and higher dimensions, the Riemann tensor is not fully controlled by the Ricci curvature; there is an additional component to the Riemann tensor, namely the Weyl tensor. Similarly, the Ricci curvature is not fully controlled by the scalar curvature.
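The dimension count behind this list can be tabulated with a short script. This is only a sketch: it relies on the standard fact (not derived in these notes) that a tensor with the symmetries of the Riemann tensor has $d^2(d^2-1)/12$ algebraically independent components in dimension d, and the function names are hypothetical.

```python
def riemann_dof(d):
    # Assumed standard count of independent components of a tensor with the
    # symmetries of the Riemann tensor in dimension d (not stated in the text).
    return d * d * (d * d - 1) // 12

def ricci_dof(d):
    # Ricci is symmetric, so it has at most d(d+1)/2 components; being a
    # contraction of Riemann, it cannot carry more freedom than Riemann does.
    return min(d * (d + 1) // 2, riemann_dof(d))

def scalar_dof(d):
    # The scalar curvature is a single number, and vanishes when Riemann does.
    return min(1, riemann_dof(d))

for d in range(1, 6):
    riem, ric = riemann_dof(d), ricci_dof(d)
    # The last column is the freedom in Riemann not seen by Ricci (the Weyl part).
    print(d, riem, ric, scalar_dof(d), riem - ric)
```

In this count the last column is zero up to dimension three and first becomes positive in dimension four, which matches the appearance of the Weyl tensor in item 4.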

Exercise 9. (Ricci controls Riemann in three dimensions) In three dimensions, suppose that the (necessarily real) eigenvalues of the Riemann curvature at a point x (viewed as an element of $\hbox{Hom}(\bigwedge^2 TM, \bigwedge^2 TM)$) are $\lambda, \mu, \nu$. Show that the eigenvalues of the Ricci curvature at x (viewed as an element of $\hbox{Hom}( TM, TM )$) are $\lambda+\mu, \mu+\nu, \nu+\lambda$. Conclude in particular that

$\|\hbox{Riem}\|_g = O( \|\hbox{Ric}\|_g )$ (31)

where we endow the (fibres of the) spaces $\hbox{Hom}(\bigwedge^2 TM, \bigwedge^2 TM)$ and $\hbox{Hom}( TM, TM )$ with the Hilbert (or Hilbert-Schmidt) structure induced by the metric g. $\diamond$
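A numerical sanity check of the eigenvalue claim (not a proof) can be sketched as follows. The assumptions here are mine: we work in an orthonormal frame $e_0, e_1, e_2$ in which the curvature operator is diagonal on $e_1 \wedge e_2, e_2 \wedge e_0, e_0 \wedge e_1$, and the sign is normalised so that $R_{ijji}$ is the eigenvalue attached to the plane spanned by $e_i, e_j$; the helper names are hypothetical.

```python
import math

def riemann_from_operator(lam, mu, nu):
    """Return R[i][j][k][l] in an orthonormal frame e0, e1, e2, assuming the
    curvature operator is diagonal on e1^e2, e2^e0, e0^e1 with eigenvalues
    lam, mu, nu respectively."""
    R = [[[[0.0] * 3 for _ in range(3)] for _ in range(3)] for _ in range(3)]
    planes = {(1, 2): lam, (2, 0): mu, (0, 1): nu}
    for (i, j), k in planes.items():
        # Assumed sign convention: R_{ijji} = k, extended by the antisymmetries
        # in the first pair and in the last pair of indices.
        R[i][j][j][i] = R[j][i][i][j] = k
        R[i][j][i][j] = R[j][i][j][i] = -k
    return R

def ricci(R):
    # Contract the first slot against the last, as in (29); in an orthonormal
    # frame raising and lowering indices does not change the components.
    return [[sum(R[g][a][b][g] for g in range(3)) for b in range(3)]
            for a in range(3)]

lam, mu, nu = 1.0, -2.0, 5.0
Ric = ricci(riemann_from_operator(lam, mu, nu))
ricci_eigs = sorted(Ric[a][a] for a in range(3))  # Ric is diagonal in this frame
riem_norm = math.sqrt(lam ** 2 + mu ** 2 + nu ** 2)
ric_norm = math.sqrt(sum(Ric[a][b] ** 2 for a in range(3) for b in range(3)))
print(ricci_eigs, riem_norm, ric_norm)
```

For this sample the Ricci matrix comes out diagonal with entries given by the pairwise sums of $\lambda, \mu, \nu$, and the norm of the curvature operator is no larger than that of Ricci, consistent with (31).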

Remark 7. The fact that Ricci controls Riemann in three dimensions, without itself degenerating into scalar curvature or zero, seems to explain why Ricci flow is especially powerful in three dimensions; it is still useful, but harder to work with, in two dimensions, useless in one dimension, and too weak to fully control the geometry in four and higher dimensions. It seems to me that the special nature of three dimensions stems from the fact that it is the unique number of dimensions in which 2-forms (which are naturally associated with curvature) are Hodge dual to vector fields (as opposed to scalars, or to higher-rank tensors); this is the same special feature of three dimensions which gives us the cross product (as opposed to the more general wedge product). $\diamond$

Because of the variety of curvatures, there are various notions of what it means for a manifold to have “non-negative curvature” at some point.

Definition 3. Let x be a point on a Riemannian manifold (M,g). We say that x has

1. non-negative scalar curvature if $R(x) \geq 0$;
2. non-negative Ricci curvature if $\hbox{Ric}(x) \geq 0$ as a quadratic form on TM, i.e. $\hbox{Ric}_{\alpha\beta}(x) v^\alpha v^\beta \geq 0$ for all vectors $v \in T_x M$;
3. non-negative sectional curvature if $g( \hbox{Riem}(x)(X,Y) X, Y)(x) = \hbox{Riem}_{\alpha\beta \gamma}^\delta(x) X^\alpha Y^\beta X^\gamma Y_\delta \geq 0$ for all vectors $X, Y \in T_x M$;
4. non-negative Riemann curvature if $\hbox{Riem}(x) \geq 0$ as a quadratic form on $\bigwedge^2 TM$, thus $\hbox{Riem}_{\alpha \beta \gamma}^\delta(x) \omega^{\alpha \beta}(x) \omega^{\gamma}_{\ \delta}(x) \geq 0$ for all two-forms $\omega$.

It is not hard to show that, in arbitrary dimension, 4. implies 3. implies 2. implies 1. In one dimension, these conditions are vacuously true; in two dimensions, these conditions are all equivalent; and in three dimensions, non-negative Riemann curvature is equivalent to non-negative sectional curvature (because every 2-form is the wedge product of two one-forms in this case) but these conditions are otherwise distinct. In four and higher dimensions all of these conditions are distinct. One can also define the analogous notions of positive curvature (or negative curvature, or non-positive curvature) in the usual manner.
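The chain of implications can be sketched as follows. The notation is an assumption made only for this sketch: $e_1,\ldots,e_d$ is an orthonormal frame of $T_x M$, and $K(e_i,e_j)$ denotes the sectional curvature of the plane spanned by $e_i, e_j$, with signs normalised so that the round sphere has $K > 0$. That 4. implies 3. follows by restricting the quadratic form on $\bigwedge^2 TM$ to decomposable elements $X \wedge Y$. For the remaining two implications, one has

$\hbox{Ric}(e_1,e_1) = \sum_{i=2}^d K(e_1,e_i)$ and $R = \sum_{i=1}^d \hbox{Ric}(e_i,e_i)$.

Since any unit vector can be taken as $e_1$, the first identity shows that non-negative sectional curvature forces $\hbox{Ric}(v,v) \geq 0$ for all v, and the second shows that non-negative Ricci curvature forces $R \geq 0$.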

[Geometrically, positive scalar curvature means that infinitesimal balls have slightly less volume than in the Euclidean case; positive Ricci curvature means that infinitesimal sectors have slightly less volume than in the Euclidean case; and positive sectional curvature means that all infinitesimally geodesic two-dimensional surfaces have positive Gaussian curvature. I don’t know of a geometrically simple way to describe positive Riemann curvature.]

A couple of lectures from now, we shall compute these curvatures explicitly in a number of model cases (such as that of a homogeneous space). For now, we give a “cartoon” or “schematic” description of these curvatures when viewed in some local coordinate system $\phi$, using the associated frame $e_a := \phi^* \frac{d}{dx^a}$ as in Example 2 to express all tensors as arrays of numbers. Writing $g_{ab} = O(g)$, we thus schematically have the following relationships:

1. The Christoffel symbols $\Gamma^a_{bc}$ are schematically of the form $O( g^{-1} \partial g )$. Thus a covariant derivative $\nabla_a w$ of a tensor w looks schematically like $O( \partial w + g^{-1} (\partial g) w )$, and the Laplacian $\Delta w$ looks like $O( g^{-1} \partial^2 w + g^{-2} (\partial g) \partial w + g^{-2} (\partial^2 g) w + g^{-3} (\partial g)^2 w)$.
2. The Riemann curvature tensor $\hbox{Riem}_{abc}^d$ and the Ricci curvature tensor $\hbox{Ric}_{ab}$ schematically take the form $O( g^{-1} \partial^2 g + g^{-2} (\partial g)^2 )$.
3. The scalar curvature R schematically takes the form $O( g^{-2} \partial^2 g + g^{-3} (\partial g)^2 )$. (Thus the scalar curvature has the same scaling as the Laplacian.)

Remark 8. Note how in all of these expressions, the “number of derivatives” and “number of g’s” stays fixed among all terms in a given expression. This can be viewed as an example of dimensional analysis in action, and is useful for catching errors in manipulations with these sorts of expressions. From a more representation-theoretic viewpoint, what is going on is that all of the above expressions have constant weight with respect to the joint (commuting) actions of the dilation operation $x^i \mapsto \lambda x^i$ on the underlying coordinate chart (which essentially controls the number of derivatives $\partial$ that appear) and the homogeneity operation $g \mapsto cg$ (which, naturally enough, controls the number of g’s that appear). $\diamond$
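As a quick illustration of this bookkeeping (a sketch only, with $c > 0$ a constant introduced here for the purpose): under the homogeneity operation $g \mapsto cg$ one has $g^{-1} \mapsto c^{-1} g^{-1}$, so counting the powers of g in the schematic forms above gives

$\Gamma^a_{bc} \mapsto \Gamma^a_{bc}$, $\hbox{Riem}_{abc}^d \mapsto \hbox{Riem}_{abc}^d$, $\hbox{Ric}_{ab} \mapsto \hbox{Ric}_{ab}$, $R \mapsto c^{-1} R$, $\Delta \mapsto c^{-1} \Delta$.

Thus the Christoffel symbols, the Riemann tensor (with one raised index) and the Ricci tensor have weight zero in g, while the scalar curvature and the Laplacian both have weight $-1$. A term such as $g^{-1} (\partial g)^2$ appearing in a formula for R would have the wrong weight, and would therefore signal an error.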

[Update, Mar 27: more remarks added; various corrections.]