on the zeroes of a time-dependent family of polynomials , with a particular focus on the case when the polynomials had real zeroes. Here (inspired by some discussions I had during a recent conference on the Riemann hypothesis in Bristol) we record the analogous theory in which the polynomials instead have zeroes on a circle , with the heat flow slightly adjusted to compensate for this. As we shall discuss shortly, a key example of this situation arises when is the numerator of the zeta function of a curve.

More precisely, let be a natural number. We will say that a polynomial

of degree (so that ) obeys the *functional equation* if the are all real and

for all , thus

and

for all non-zero . This means that the zeroes of (counting multiplicity) lie in and are symmetric with respect to complex conjugation and inversion across the circle . We say that this polynomial *obeys the Riemann hypothesis* if all of its zeroes actually lie on the circle . For instance, in the case, the polynomial obeys the Riemann hypothesis if and only if .

Such polynomials arise in number theory as follows: if is a projective curve of genus over a finite field , then, as famously proven by Weil, the associated local zeta function (as defined for instance in this previous blog post) is known to take the form

where is a degree polynomial obeying both the functional equation and the Riemann hypothesis. In the case that is an elliptic curve, then and takes the form , where is the number of -points of minus . The Riemann hypothesis in this case is a famous result of Hasse.

Another key example of such polynomials arises from rescaled characteristic polynomials

of matrices in the compact symplectic group . These polynomials obey both the functional equation and the Riemann hypothesis. The Sato-Tate conjecture (in higher genus) asserts, roughly speaking, that “typical” polynomials arising from the number-theoretic situation above are distributed like the rescaled characteristic polynomials (1), where is drawn uniformly from with Haar measure.

Given a polynomial of degree with coefficients

we can evolve it in time by the formula

thus for . Informally, as one increases , this evolution accentuates the effect of the extreme monomials, particularly, and at the expense of the intermediate monomials such as , and conversely as one decreases . This family of polynomials obeys the heat-type equation

In view of the results of Marcus, Spielman, and Srivastava, it is also very likely that one can interpret this flow in terms of expected characteristic polynomials involving conjugation over the compact symplectic group , and should also be tied to some sort of “” version of Brownian motion on this group, but we have not attempted to work this connection out in detail.

It is clear that if obeys the functional equation, then so does for any other time . Now we investigate the evolution of the zeroes. Suppose at some time that the zeroes of are distinct, then

From the inverse function theorem we see that for times sufficiently close to , the zeroes of continue to be distinct (and vary smoothly in ), with

Differentiating this at any not equal to any of the , we obtain

and

and

Inserting these formulae into (2) (expanding as ) and canceling some terms, we conclude that

for sufficiently close to , and not equal to . Extracting the residue at , we conclude that

which we can rearrange as

If we make the change of variables (noting that one can make depend smoothly on for sufficiently close to ), this becomes

Intuitively, this equation asserts that the phases repel each other if they are real (and attract each other if their difference is imaginary). If obeys the Riemann hypothesis, so that the are all real at time , then the Picard uniqueness theorem (applied to and its complex conjugate) shows that the are also real for sufficiently close to . If we then define the entropy functional

then the above equation becomes a gradient flow

which implies in particular that is non-increasing in time. This shows that as one evolves time forward from , there is a uniform lower bound on the separation between the phases , and hence the equation can be solved indefinitely; in particular, obeys the Riemann hypothesis for all if it does so at time . Our argument here assumed that the zeroes of were simple, but this assumption can be removed by the usual limiting argument.

For any polynomial obeying the functional equation, the rescaled polynomials converge locally uniformly to as . By Rouché’s theorem, we conclude that the zeroes of converge to the equally spaced points on the circle . Together with the symmetry properties of the zeroes, this implies in particular that obeys the Riemann hypothesis for all sufficiently large positive . In the opposite direction, when , the polynomials converge locally uniformly to , so if , of the zeroes converge to the origin and the other converge to infinity. In particular, fails the Riemann hypothesis for sufficiently large negative . Thus (if ), there must exist a real number , which we call the *de Bruijn-Newman constant* of the original polynomial , such that obeys the Riemann hypothesis for and fails the Riemann hypothesis for . The situation is a bit more complicated if vanishes; if is the first natural number such that (or equivalently, ) does not vanish, then by the above arguments one finds in the limit that of the zeroes go to the origin, go to infinity, and the remaining zeroes converge to the equally spaced points . In this case the de Bruijn-Newman constant remains finite except in the degenerate case , in which case .

For instance, consider the case when and for some real with . Then the quadratic polynomial

has zeroes

and one easily checks that these zeroes lie on the circle when , and are on the real axis otherwise. Thus in this case we have (with if ). Note how as increases to , the zeroes repel each other and eventually converge to , while as decreases to , the zeroes collide and then separate on the real axis, with one zero going to the origin and the other to infinity.
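The quadratic example can be checked numerically. The following is a minimal Python sketch, not a definitive implementation. It assumes (the displayed formula is not reproduced in the text here) that the flow multiplies the coefficient of `z**j` by `exp(t*(j-n)**2)`, so that for degree two with coefficients `1, a, 1` the flowed polynomial is `exp(t)*z**2 + a*z + exp(t)`. The function names are hypothetical.

```python
import cmath
import math


def flowed_quadratic_zeroes(a, t):
    """Zeroes of exp(t)*z**2 + a*z + exp(t) (ASSUMED form of the flowed quadratic)."""
    lead = math.exp(t)
    # Discriminant of the quadratic; negative once exp(t) exceeds |a|/2.
    disc = cmath.sqrt(a * a - 4.0 * lead * lead)
    return (-a + disc) / (2.0 * lead), (-a - disc) / (2.0 * lead)


def on_unit_circle(z, tol=1e-9):
    """True if |z| equals 1 up to the given tolerance."""
    return abs(abs(z) - 1.0) <= tol


def threshold_time(a):
    """Time at which the discriminant vanishes under the ASSUMED normalisation."""
    return -math.inf if a == 0 else math.log(abs(a) / 2.0)
```

Under this assumed normalisation, evaluating `flowed_quadratic_zeroes(3.0, t)` on either side of `threshold_time(3.0)` shows the zeroes moving from the real axis onto the unit circle; their product equals 1 throughout, as the inversion symmetry requires.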

The arguments in my paper with Brad Rodgers (discussed in this previous post) indicate that for a “typical” polynomial of degree that obeys the Riemann hypothesis, the expected time to relaxation to equilibrium (in which the zeroes are equally spaced) should be comparable to , basically because the average spacing is and hence by (3) the typical velocity of the zeroes should be comparable to , and the diameter of the unit circle is comparable to , thus requiring time comparable to to reach equilibrium. Taking contrapositives, this suggests that the de Bruijn-Newman constant should typically take on values comparable to (since typically one would not expect the initial configuration of zeroes to be close to evenly spaced). I have not attempted to formalise or prove this claim, but presumably one could do some numerics (perhaps using some of the examples of given previously) to explore this further.

for some , then one has the lower bound

In the other direction, for any , there are examples of operators obeying (1) such that

In this paper we improve the upper bound to come closer to the lower bound:

Theorem 1For any , and any infinite-dimensional , there exist operators obeying (1) such that

One can probably improve the exponent somewhat by a modification of the methods, though it does not seem likely that one can lower it all the way to without a substantially new idea. Nevertheless I believe it plausible that the lower bound (2) is close to optimal.

We now sketch the methods of proof. The construction giving (3) proceeded by first identifying with the algebra of matrices that have entries in . It is then possible to find two matrices whose commutator takes the form

for some bounded operator (for instance one can take to be an isometry). If one then conjugates by the diagonal operator , one can ensure that (1) and (3) both hold.

It is natural to adapt this strategy to matrices rather than matrices, where is a parameter at one’s disposal. If one can find matrices that are almost upper triangular (in that only the entries on or above the lower diagonal are non-zero), whose commutator only differs from the identity in the top right corner, thus

for some , then by conjugating by a diagonal matrix such as for some and optimising in , one can improve the bound in (3) to ; if the bounds in the implied constant in the are polynomial in , one can then optimise in to obtain a bound of the form (4) (perhaps with the exponent replaced by a different constant).

The task is then to find almost upper triangular matrices whose commutator takes the required form. The lower diagonals of must then commute; it took me a while to realise then that one could (usually) conjugate one of the matrices, say by a suitable diagonal matrix, so that the lower diagonal consisted entirely of the identity operator, which would make the other lower diagonal consist of a single operator, say . After a lot of further lengthy experimentation, I eventually realised that one could conjugate further by unipotent upper triangular matrices so that all remaining entries other than those on the far right column vanished. Thus, without too much loss of generality, one can assume that takes the normal form

for some , solving the system of equations

It turns out to be possible to solve this system of equations by a contraction mapping argument if one takes to be a “Hilbert’s hotel” pair of isometries as in the previous post, though the contraction is very slight, leading to polynomial losses in in the implied constant.

There is a further question raised in Popa’s paper which I was unable to resolve. As a special case of one of the main theorems (Theorem 2.1) of that paper, the following result was shown: if obeys the bounds

(where denotes the space of all operators of the form with and compact), then there exist operators with such that . (In fact, Popa’s result covers a more general situation in which one is working in a properly infinite algebra with non-trivial centre.) We sketch a proof of this result as follows. Suppose that and for some . A standard greedy algorithm argument (see this paper of Brown and Pearcy) allows one to find orthonormal vectors for such that for each , one has for some comparable to , and some orthogonal to all of the . After some conjugation (and a suitable identification of with ), one can thus place in a normal form

where is an isometry with infinite deficiency, and have norm . Setting , it then suffices to solve the commutator equation

with ; note the similarity with (3).

By the usual Hilbert’s hotel construction, one can complement with another isometry obeying the “Hilbert’s hotel” identity

and also , . Proceeding as in the previous post, we can try the ansatz

for some operators , leading to the system of equations

Using the first equation to solve for , the second to then solve for , and the third to then solve for , one can obtain matrices with the required properties.

Thus far, my attempts to extend this construction to larger matrices with good bounds on have been unsuccessful. A model problem would be to express

as a commutator with significantly smaller than . The construction in my paper achieves something like this, but with replaced by a more complicated operator. One would also need variants of this result in which one is allowed to perturb the above operator by an arbitrary finite rank operator of bounded operator norm.

In these notes, random variables will be denoted in boldface.

Definition 1 A real random variable is said to be normally distributed with mean and variance if one has

for all test functions . Similarly, a complex random variable is said to be normally distributed with mean and variance if one has

for all test functions , where is the area element on .

A *real Brownian motion* with base point is a random, almost surely continuous function (using the locally uniform topology on continuous functions) with the property that (almost surely) , and for any sequence of times , the increments for are independent real random variables that are normally distributed with mean zero and variance . Similarly, a *complex Brownian motion* with base point is a random, almost surely continuous function with the property that and for any sequence of times , the increments for are independent complex random variables that are normally distributed with mean zero and variance .
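To make the definition concrete, here is a minimal Python sketch that samples a real Brownian motion at finitely many times by summing independent normal increments, each with mean zero and variance equal to the length of the time step. It is only an illustration at grid times, not a construction of the continuous process, and the function names are hypothetical.

```python
import math
import random


def sample_real_brownian_path(x0, times, rng):
    """Sample a real Brownian motion with base point x0 at increasing times (times[0] == 0)."""
    path = [x0]
    for t_prev, t_next in zip(times, times[1:]):
        # Independent increment: normal with mean zero and variance t_next - t_prev.
        path.append(path[-1] + rng.gauss(0.0, math.sqrt(t_next - t_prev)))
    return path


def endpoint_mean_and_variance(x0, T, steps, samples, seed=0):
    """Empirical mean and variance of the sampled value at time T."""
    rng = random.Random(seed)
    times = [T * k / steps for k in range(steps + 1)]
    ends = [sample_real_brownian_path(x0, times, rng)[-1] for _ in range(samples)]
    mean = sum(ends) / samples
    var = sum((e - mean) ** 2 for e in ends) / (samples - 1)
    return mean, var
```

Since the increments are independent, the value at time `T` should have mean close to `x0` and variance close to `T`, which `endpoint_mean_and_variance` lets one check empirically.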

Remark 2 Thanks to the central limit theorem, the hypothesis that the increments be normally distributed can be dropped from the definition of a Brownian motion, so long as one retains the independence and the normalisation of the mean and variance (technically one also needs some uniform integrability on the increments beyond the second moment, but we will not detail this here). A similar statement is also true for the complex Brownian motion (where now we need to normalise the variances and covariances of the real and imaginary parts of the increments).

Real and complex Brownian motions exist from any base point or ; see e.g. this previous blog post for a construction. We have the following simple invariances:

Exercise 3

- (i) (Translation invariance) If is a real Brownian motion with base point , and , show that is a real Brownian motion with base point . Similarly, if is a complex Brownian motion with base point , and , show that is a complex Brownian motion with base point .
- (ii) (Dilation invariance) If is a real Brownian motion with base point , and is non-zero, show that is also a real Brownian motion with base point . Similarly, if is a complex Brownian motion with base point , and is non-zero, show that is also a complex Brownian motion with base point .
- (iii) (Real and imaginary parts) If is a complex Brownian motion with base point , show that and are independent real Brownian motions with base point . Conversely, if are independent real Brownian motions of base point , show that is a complex Brownian motion with base point .

The next lemma is a special case of the optional stopping theorem.

Lemma 4 (Optional stopping identities)

- (i) (Real case) Let be a real Brownian motion with base point . Let be a bounded stopping time – a bounded random variable with the property that for any time , the event that is determined by the values of the trajectory for times up to (or more precisely, this event is measurable with respect to the σ-algebra generated by this portion of the trajectory). Then
and

and

- (ii) (Complex case) Let be a complex Brownian motion with base point . Let be a bounded stopping time – a bounded random variable with the property that for any time , the event that is determined by the values of the trajectory for times up to . Then

*Proof:* (Slightly informal) We just prove (i) and leave (ii) as an exercise. By translation invariance we can take . Let be an upper bound for . Since is a real normally distributed variable with mean zero and variance , we have

and

and

By the law of total expectation, we thus have

and

and

where the inner conditional expectations are with respect to the event that attains a particular point in . However, from the independent increment nature of Brownian motion, once one conditions to a fixed point , the random variable becomes a real normally distributed variable with mean and variance . Thus we have

and

and

which give the first two claims, and (after some algebra) the identity

which then also gives the third claim.

Exercise 5 Prove the second part of Lemma 4.

** — 1. Conformal invariance of Brownian motion — **

Let be an open subset of , and a point in . We can define the *complex Brownian motion with base point restricted to * to be the restriction of a complex Brownian motion with base point to the first time in which the Brownian motion exits (or if no such time exists). We have a fundamental conformal invariance theorem of Lévy:

Theorem 6 (Lévy’s theorem on conformal invariance of Brownian motion) Let be a conformal map between two open subsets of , and let be a complex Brownian motion with base point restricted to . Define a rescaling by

Note that this is almost surely a continuous strictly monotone increasing function. Set (so that is a homeomorphism from to ), and let be the function defined by the formula

Then is a complex Brownian motion with base point restricted to .

Note that this significantly generalises the translation and dilation invariance of complex Brownian motion.

*Proof:* (Somewhat informal – to do things properly one should first set up Ito calculus) To avoid technicalities we will assume that is bounded above and below on , so that the map is uniformly bilipschitz; the general case can be obtained from this case by a limiting argument that is not detailed here. With this assumption, we see that almost surely extends continuously to the endpoint time if this time is finite. Once one conditions on the value of and up to this time , we then extend this motion further (if ) by declaring for to be a complex Brownian motion with base point , translated in time by . Now is defined on all of , and it will suffice to show that this is a complex Brownian motion based at . The basing is clear, so it suffices to show for all times , the random variable is normally distributed with mean and variance .

Let be a test function. It will suffice to show that

If we define the field

for and , with , then it will suffice to prove the more general claim

for all and (with the convention that is just Brownian motion based at if lies outside of ), where

As is well known, is smooth on and solves the backwards heat equation

on this domain. The strategy will be to show that also solves this equation.

Let and . If then clearly . If instead and , then is a Brownian motion and then we have . Now suppose that is small enough that , where is an upper bound for on . Let be the first time such that either or

Then if we let be the quantity

then and . Let us now condition on a specific value of , and on the trajectory up to time . Then the (conditional) distribution of is that of , and hence the conditional expectation is . By the law of total expectation, we conclude the identity

Next, we obtain the analogous estimate

From Taylor expansion we have

Taking expectations and applying Lemma 4, (2) and Hölder’s inequality (which can interpolate between the bounds and to conclude ), we obtain the desired claim (3). Subtracting, we now have

The expression in the expectation vanishes unless , hence by the triangle inequality

Iterating this using the fact that vanishes at , and sending to zero (noting that the cumulative error term will go to zero since ), we conclude that for all , giving the claim.

One can use Lévy’s theorem (or variants of this theorem) to prove various results in complex analysis rather efficiently. As a quick example, we sketch a Brownian motion-based proof of Liouville’s theorem (omitting some technical steps). Suppose for contradiction that we have a nonconstant bounded entire function . If is a complex Brownian motion based at , then a variant of Lévy’s theorem can be used to show that the image is a time parameterisation of Brownian motion. But it is easy to show that Brownian motion is almost surely unbounded, so the image cannot be bounded.

If is an open subset of whose complement contains an arc, then one can show that for any , the complex Brownian motion based at will hit the boundary of in a finite time . The location where this motion first hits the boundary is then a random variable in ; the law of this variable is called the *harmonic measure* of with base point , and we will denote it by ; it is a probability measure on . The reason for the terminology “harmonic measure” comes from the following:

Theorem 7 Let be a bounded open subset of , and let be a harmonic (or holomorphic) function that extends continuously to . Then for any , one has the representation formula

*Proof:* (Informal) For simplicity let us assume that extends smoothly to some open neighbourhood of . Let be the motion that is equal to up to time , and then is constant at for all later times. A variant of the Taylor expansion argument used to prove Lévy’s theorem shows that

for any , which on iterating and sending to zero implies that is independent of time. Since this quantity converges to as and to as , the claim follows.
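The representation formula lends itself to a Monte Carlo sanity check. The sketch below, in Python, runs a time-discretised complex Brownian motion from a base point inside the unit disk until it leaves the disk, and averages a harmonic function over the exit points. The choice of domain (the unit disk), the test function (the real part of `z**2`), the step size, and all function names are illustrative assumptions; the discretisation also introduces a small bias, so agreement is only approximate.

```python
import math
import random


def exit_point_from_unit_disk(z0, dt, rng):
    """Run a discretised complex Brownian motion from z0 until it leaves the unit disk."""
    # ASSUMPTION: real and imaginary parts are independent, each with variance dt per step.
    step = math.sqrt(dt)
    z = z0
    while abs(z) < 1.0:
        z += complex(rng.gauss(0.0, step), rng.gauss(0.0, step))
    # Project the small overshoot back onto the unit circle.
    return z / abs(z)


def harmonic_average(u, z0, paths=3000, dt=2e-3, seed=0):
    """Average u over the (approximate) exit points of `paths` independent motions."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(paths):
        total += u(exit_point_from_unit_disk(z0, dt, rng))
    return total / paths


def u(z):
    """Real part of z**2, which is harmonic on the whole plane."""
    return (z * z).real
```

For the base point `0.6 + 0.0j`, the value `u(z0)` is `0.36`, and `harmonic_average(u, 0.6 + 0.0j)` should land near that value up to sampling and discretisation error.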

This theorem can also extend to unbounded domains provided that does not grow too fast at infinity (for instance if is bounded, basically thanks to the neighbourhood recurrent properties of complex Brownian motion); we do not give a precise statement here. Among other things, this theorem gives an immediate proof of the maximum principle for harmonic functions, since if on the boundary then from the triangle inequality one has for all . It also gives an alternate route to Liouville’s theorem: if is entire and bounded, then applying the maximum principle to the complement of a small disk we see that for all distinct .

When the boundary is sufficiently nice (e.g. analytic), the harmonic measure becomes absolutely continuous with respect to one-dimensional Lebesgue measure; however, we will not pay too much attention to these sorts of regularity issues in this set of notes.

From Lévy’s theorem on the conformal invariance of Brownian motion we deduce the conformal invariance of harmonic measure, thus for any conformal map that extends continuously to the boundaries and any , the harmonic measure of with base point is the pushforward of the harmonic measure of with base point , thus

for any continuous compactly supported test function , and also

for any (Borel) measurable .

Exercise 8

- (i) If and , show that the measure on the unit circle is given by
where is arclength measure. In particular, when , then is the uniform measure on the unit circle.

- (ii) If and , show that the measure on the real line is given by
(For this exercise one can assume that harmonic measure is well defined for unbounded domains, and that the representation formula (4) continues to hold for bounded harmonic or holomorphic functions.)

Exercise 9 (Brownian motion description of conformal mapping) Let be the region enclosed by a Jordan curve , and let be three distinct points on in anticlockwise order. Let be three distinct points on the boundary of the unit disk , again traversed in anticlockwise order. Let be the conformal map that takes to for (the existence and uniqueness of this map follows from the Riemann mapping theorem). Let , and for , let be the probability that the terminal point of Brownian motion at with base point lies in the arc between and (here we use the fact that the endpoints are hit with probability zero, or in other words that the harmonic measure is continuous; see Exercise 15 below). Thus are non-negative and sum to . Let be the complex numbers , , . Show the cross-ratio identity

In principle, this allows one to describe conformal maps purely in terms of Brownian motion.

We remark that the link between Brownian motion and conformal mapping can help gain an intuitive understanding of the Carathéodory kernel theorem (Theorem 12 from Notes 3). Consider for instance the example in Exercise 13 from those notes. It is intuitively clear that a Brownian motion based at the origin will very rarely pass through the slit between and , instead hitting the right side of the boundary of first. As such, the harmonic measure of the left side of the boundary should be very small, and in fact one can use this to show that the preimage under of the region to the left of the boundary goes to zero in diameter as , which helps explain why the limiting function does not map to this region at all.

Exercise 10 (Brownian motion description of conformal radius)

- (i) Let and with . Show that the probability that the Brownian motion hits the circle before it hits is equal to . (Hint: is harmonic away from the origin.)
- (ii) Let be a simply connected proper subset of , let be a point in , and let be the conformal radius of around . Show that for small , the probability that a Brownian motion based at a point with will hit the circle before it hits the boundary is equal to , where denotes a quantity that goes to zero as .

Exercise 11 Let be a connected subset of , let be a Brownian motion based at the origin, and let be the first time this motion exits . Show that the probability that hits is at least for some absolute constant . (Hint: one can control the event that makes a “loop” around a point in at radius less than , which is enough to force intersection with , at least if one works some distance away from the boundary of the disk.)

We now sketch the proof of a basic Brownian motion estimate that is useful in applications. We begin with a lemma that says, roughly speaking, that “folding” a set reduces the probability of it being hit by Brownian motion.

Lemma 12 Let , and let be a closed subset of the unit disk . Write and , and write (i.e. reflected onto the upper half-plane). Let be a complex Brownian motion based at , and let be the first time this motion hits the boundary of . Then

*Proof:* (Informal) To illustrate the argument at a heuristic level, let us make the (almost surely false) assumption that the Brownian motion only crosses the real axis at a finite set of times before hitting the disk. Then the Brownian motion would split into subcurves for , with the convention that . Each subcurve would lie in either the upper half-plane or the lower half-plane, with equal probability of each; furthermore, one could arbitrarily apply complex conjugation to one or more of these subcurves and still obtain a motion with the same law. Observe that if one conditions on the Brownian motion up to time , and the subcurve has a probability of hitting when it lies in the upper half-plane, and a probability of hitting when it lies in the lower half-plane, then it will have a probability of at most of hitting when it lies in the upper half-plane, and probability of hitting when it lies in the lower half-plane; thus the probability of this subcurve hitting is less than or equal to that of it hitting . In principle, the lemma now follows from repeatedly applying the law of total expectation.

This naive argument does not quite work because a Brownian motion starting at a real number will in fact almost surely cross the real axis an infinite number of times. However it is possible to adapt this argument by redefining the so that after each time , the Brownian motion is forced to move some small distance before one starts looking for the next time it hits the real axis. See the proof of Lemma 6.1 of these notes of Lawler for a complete proof along these lines.

This gives an inequality similar in spirit to the Grötzsch modulus estimate from Notes 2:

Corollary 13 (Beurling projection theorem) Let , and let be a compact connected subset of the annulus that intersects both boundary circles of the annulus. Let be a complex Brownian motion based at , and let be the first time this motion hits the outer boundary of the annulus. Then the probability that intersects is greater than or equal to the probability that intersects the interval .

*Proof:* (Sketch) One can use the above lemma to fold around the real axis without increasing the probability of being hit by Brownian motion. By rotation, one can similarly fold around any other line through the origin. By repeatedly folding in this fashion to reduce its angular variation, one can eventually replace with a set that lies inside the sector for any . However, by the monotone convergence theorem, the probability that intersects this sector converges to the probability that it intersects in the limit , and the claim follows.

Exercise 14 With the notation as in the above corollary, show that the probability that intersects the interval is . (Hint: apply a square root conformal map to the disk with removed, and then compare with the half-plane harmonic measure from Exercise 8(ii).)

The following consequence of the above estimate, giving a sort of Hölder regularity of Brownian measure, is particularly useful in applications.

Exercise 15 (Beurling estimate) Let be an open set not containing , with the property that the connected component of containing intersects the unit circle . Let be such that . Then for any , one has ; that is to say, the probability that a Brownian motion based at exits at a point within from the origin is . (Hint: one can use conformal mapping to show that the probability appearing at the end of Corollary 13 is .) Conclude in particular that harmonic measures are always continuous (they assign zero to any point).

Exercise 16 Let be a region bounded by a Jordan curve, let , let be the Brownian motion based at , and let be the first time this motion exits . Then for any , show that the probability that the curve has diameter at least is at most .

Exercise 17 Let be a conformal map with , and let be a curve with and for . Show that

(Hint: use Exercise 11.)

** — 2. Half-plane capacity — **

One can use Brownian motion to construct other close relatives of harmonic measure, such as Green’s functions and excursion measures. See for instance these lecture notes of Lawler for more details. We will focus on one such use of Brownian motion, to interpret the concept of *half-plane capacity*; this is a notion that is particularly well adapted to the study of chordal Loewner equations (it plays a role analogous to that of conformal radius for the radial Loewner equation).

Let be the upper half-plane. A subset of the upper half-plane is said to be a *compact hull* if it is bounded, closed in , and the complement is simply connected. By the Riemann mapping theorem, for any compact hull , there is a unique conformal map which is normalised at infinity in the sense that

for some complex numbers . The quantity is particularly important and will be called the *half-plane capacity* of and denoted .

In general, we have the following Brownian motion characterisation of half-plane capacity:

Proposition 19 Let be a compact hull, with conformal map and half-plane capacity .

- (i) If is complex Brownian motion based at some point , and is the first time this motion exits , then
- (ii) We have

*Proof:* (Sketch) Part (i) follows from applying Theorem 7 to the bounded harmonic function . Part (ii) follows from part (i) by setting for a large , rearranging, and sending using (5).

Among other things, this proposition demonstrates that for all , and that the half-plane capacity is always non-negative (in fact it is not hard to show from the above proposition that it is strictly positive as long as is non-empty).

If are two compact hulls with , then will map conformally to the complement of in . Thus is also a compact hull, and by the uniqueness of Riemann maps we have the identity

which on comparing Laurent expansions leads to the further identity

In particular we have the monotonicity , with equality if and only if . One may verify that these claims are consistent with Exercise 18.

Exercise 20 (Submodularity of half-plane capacity) Let be two compact hulls.

- (i) If , show that
(Hint: use Proposition 19, and consider how the times in which a Brownian motion exits , , , and are related.)
- (ii) Show that

Exercise 21 Let be a compact hull bounded in a disk . For any , show that

as , where is complex Brownian motion based at and is the first time it exits . Similarly, for any , show that

This formula gives a Brownian motion interpretation for on the portion of the boundary of . It can be used to give useful quantitative estimates for in this region; see Section 3.4 of Lawler’s book.

** — 3. The chordal Loewner equation — **

We now develop (in a rather informal fashion) the theory of the chordal Loewner equation, which roughly speaking is to conformal maps from the upper half-plane to the complement of compact hulls as the radial Loewner equation is to conformal maps from the unit disk to subsets of the complex plane. A more rigorous treatment can be found in Lawler’s book.

Suppose one has a simple curve such that and . There are important and delicate issues regarding the regularity hypotheses on this curve (which become particularly important in SLE, when the regularity is quite limited), but for this informal discussion we will ignore all of these issues.

For each time , the set forms a compact hull, and so has some half-plane capacity . From the monotonicity of capacity, this half-plane capacity is increasing in . It is traditional to normalise the curve so that

this is analogous to normalising the Loewner chains from Notes 3 to have conformal radius at time . A basic example of such normalised curves would be the curves for some fixed , since the normalisation follows from (6).

Let be the conformal maps associated to these compact hulls. From (8) we will have

for any and , where is the conformal map associated to the compact hull . From (9) this hull has half-plane capacity , thus we have the Laurent expansion

It can be shown (using the Beurling estimate) that extends continuously to the tip of the curve , and attains a real value at that point; furthermore, depends continuously on . See Lemma 4.2 of Lawler’s book. As such, should be a short arc (of length ) starting at . If , it is possible to use a quantitative version of Exercise 21 (again using the Beurling estimate) to obtain an estimate basically of the form

for any fixed . If is non-zero, we instead have

For instance, if , then for all , and from Exercise 18 we have the exact formula

Inserting (12) into (11) and using the chain rule, we obtain

and we then arrive at the *(chordal) Loewner equation*

for all and . This equation can be justified rigorously for any simple curve : see Proposition 4.4 of Lawler’s book. Note that the imaginary part of is negative, which is consistent with the observation made previously that the imaginary part of is decreasing in .
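As a numerical sanity check, the following Python sketch integrates the chordal Loewner equation for the zero driving term and compares the result with the closed form for a vertical slit. It assumes the standard normalisation in which the hull at time $t$ has half-plane capacity $2t$, so that the equation reads $\partial_t g_t(z) = 2/(g_t(z) - U_t)$ and the slit map is $g_t(z) = \sqrt{z^2+4t}$; the function names `loewner_flow` and `slit_map` are illustrative only.

```python
import cmath

def loewner_flow(z, driving, t_final, steps=2000):
    """Integrate dg/dt = 2 / (g - U_t) from g_0 = z with classical RK4.

    Assumes the normalisation hcap(K_t) = 2t; `driving` maps t to the real U_t.
    """
    g, dt = complex(z), t_final / steps
    rhs = lambda t, w: 2.0 / (w - driving(t))
    for k in range(steps):
        t = k * dt
        k1 = rhs(t, g)
        k2 = rhs(t + dt / 2, g + dt * k1 / 2)
        k3 = rhs(t + dt / 2, g + dt * k2 / 2)
        k4 = rhs(t + dt, g + dt * k3)
        g += dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return g

def slit_map(z, t):
    """Closed form for U_t = 0: sqrt(z^2 + 4t), on the branch with Im >= 0."""
    w = cmath.sqrt(z * z + 4 * t)
    return w if w.imag >= 0 else -w

z0, t1 = 1.0 + 1.0j, 0.5
numeric = loewner_flow(z0, lambda t: 0.0, t1)
exact = slit_map(z0, t1)
print(abs(numeric - exact))  # discrepancy between the ODE solve and the closed form
```

The imaginary part of the computed value is smaller than that of the starting point, in line with the remark that the imaginary part of the flow decreases in time.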

We have started with a chain of compact hulls associated to a simple curve, and shown that the resulting conformal maps obey the Loewner equation for some continuous driving term . Conversely, suppose one is given a continuous driving term . It follows from the Picard existence and uniqueness theorem that for each there is a unique maximal time of existence such that the ODE (13) with initial data can be solved for time . One can then show that for each time , is a conformal map from to with the Laurent expansion

hence the complements are an increasing family of compact hulls with half-plane capacity . Proving complex differentiability of can be done from first principles, and the Laurent expansion near infinity is also not hard; the main difficulty is to show that the map is surjective, which requires solving (13) backwards in time (and here one can do this indefinitely as now one is moving away from the real axis instead of towards it). See Theorem 4.6 of Lawler’s book for details (in fact a more general theorem is proven, in which the single point is replaced by a probability measure, analogously to how the radial Loewner equation uses Herglotz functions instead of a single driving function when not restricted to slit domains).

However, there is a subtlety, in that the hulls are not necessarily the image of simple curves . This is often the case for short times if the driving function does not oscillate too wildly, but it can happen that the curve that one would expect to trace out eventually intersects itself, in which case the region it then encloses must be absorbed into the hull (cf. the “pinching off” phenomenon in the Carathéodory kernel theorem). Nevertheless, it is still possible to have Loewner chains that are “generated” by non-simple paths , in the sense that consists of the unbounded connected component of the complement .

There are some symmetries of the transform from the to the . If one translates by a constant, , then the resulting domains are also translated, , and . Slightly less trivially, for any , if one performs a rescaled dilation , then one can check using (13) that , and the corresponding conformal maps are given by . On the other hand, just performing a scalar multiple on the driving force can transform the behavior of dramatically; the transform from to is very definitely not linear!
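The scaling symmetry can be checked numerically. The sketch below assumes the equation in the form $\partial_t g_t(z) = 2/(g_t(z)-U_t)$ and the rescaling rule that the driving term $\tilde U_t = \lambda U_{t/\lambda^2}$ produces the maps $\tilde g_t(z) = \lambda g_{t/\lambda^2}(z/\lambda)$; the driving term $\sin(3t)$ and all names are arbitrary choices made for illustration.

```python
import math

def loewner_flow(z, driving, t_final, steps=4000):
    """RK4 integration of dg/dt = 2 / (g - U_t) starting from g_0 = z."""
    g, dt = complex(z), t_final / steps
    rhs = lambda t, w: 2.0 / (w - driving(t))
    for k in range(steps):
        t = k * dt
        k1 = rhs(t, g)
        k2 = rhs(t + dt / 2, g + dt * k1 / 2)
        k3 = rhs(t + dt / 2, g + dt * k2 / 2)
        k4 = rhs(t + dt, g + dt * k3)
        g += dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return g

lam = 2.0
U = lambda t: math.sin(3 * t)             # arbitrary smooth driving term
U_scaled = lambda t: lam * U(t / lam**2)  # rescaled driving term

z, t = 0.7 + 1.3j, 0.4
lhs = loewner_flow(z, U_scaled, t)                # flow driven by the rescaled term
rhs = lam * loewner_flow(z / lam, U, t / lam**2)  # rescaled copy of the original flow
scaling_error = abs(lhs - rhs)
print(scaling_error)
```

The starting point is chosen far enough from the real axis that the flow exists up to the final time in both computations.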

** — 4. Schramm-Loewner evolution — **

In the previous section, we have indicated that every continuous driving function gives rise to a family of conformal maps obeying the Loewner equation (13). The (chordal) Schramm-Loewner evolution () with parameter is the special case in which the driving function takes the form for some real Brownian motion based at the origin. Thus is now a random conformal map from a random domain , defined by solving the Schramm-Loewner equation

with initial condition for , and with defined as the set of all for which the above ODE can be solved up to time taking values in . The parameter cannot be scaled away by simple renormalisations such as scaling, and in fact the behaviour of is rather sensitive to the value of , with special behaviour or significance at various values such as ; there is also a duality relationship between and which we will not discuss here.

The case is rather boring, in which is deterministic, and is just with the line segment between and removed. The cases are substantially more interesting. It is a non-trivial theorem (particularly at the special value ) that is almost surely generated by some random path ; see Theorem 6.3 of Lawler’s book. The nature of this path is sensitive to the choice of parameter :

- For , the path is almost surely simple and goes to infinity as ; it also avoids the real line (except at time ).
- For , the path is almost surely neither simple nor space-filling; it also has non-trivial intersection with the real line.
- For , the path is almost surely space-filling (which of course also implies that ), and also hits every point on .

See Section 6.2 of Lawler’s book. The path becomes increasingly fractal as increases: it is a result of Rohde and Schramm and Beffara that the image almost surely has Hausdorff dimension .
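The trichotomy and the dimension formula are simple enough to tabulate. The following Python sketch assumes the standard thresholds $\kappa \leq 4$, $4 < \kappa < 8$, $\kappa \geq 8$ for the three cases listed above, and the dimension formula $\min(2, 1+\kappa/8)$; the function names are illustrative.

```python
def sle_phase(kappa):
    """Qualitative behaviour of the chordal SLE path for a given kappa >= 0."""
    if kappa < 0:
        raise ValueError("kappa must be non-negative")
    if kappa <= 4:
        return "simple"
    if kappa < 8:
        return "neither simple nor space-filling"
    return "space-filling"

def sle_dimension(kappa):
    """Almost sure Hausdorff dimension of the path: min(2, 1 + kappa/8)."""
    if kappa < 0:
        raise ValueError("kappa must be non-negative")
    return min(2.0, 1.0 + kappa / 8.0)

for kappa in (0, 2, 4, 6, 8, 12):
    print(kappa, sle_phase(kappa), sle_dimension(kappa))
```

Note that the dimension saturates at 2 exactly when the path becomes space-filling.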

We have asserted that defines a random path in that starts at the origin and generally “wanders off” to infinity (though for it keeps recurring back to bounded sets infinitely often). By the Riemann mapping theorem, we can now extend this to other domains. Let be a simply connected open proper subset of whose boundary we will assume for simplicity to be a Jordan curve (this hypothesis can be relaxed). Let be two distinct points on the boundary . By the Riemann mapping theorem and Carathéodory’s theorem (Theorem 20 from Notes 2), there is a conformal map whose continuous extension maps and to and respectively; this map is unique up to rescalings for . One can then define the Schramm-Loewner evolution on from to to be the family of conformal maps for , where is the usual Schramm-Loewner evolution with parameter . The Schramm-Loewner evolution on is well defined up to a time reparameterisation .

The Markovian and stationary nature of Brownian motion translates to an analogous Markovian and conformally invariant property of . Roughly speaking, it is the following: if is any reasonable domain with two boundary points , is on this domain from to with associated path , and is any time, then after conditioning on the path up to time , the remainder of the path has the same image as the path on the domain from to . Conversely, under suitable regularity hypotheses, the processes are the *only* random path processes on domains with this property (much as Brownian motion is the only Markovian stationary process, once one normalises the mean and variance).

As a consequence, whenever one now has a random path process that is known or suspected to enjoy some conformal invariance properties, it has become natural to conjecture that it obeys the law of (though in some cases it is more natural to work with other flavours of SLE than the chordal SLE discussed here, such as radial SLE or whole-plane SLE). For instance, in the pioneering work of Schramm, this line of reasoning was used to conjecture that the loop-erased random walk in a domain has the law of (radial) ; this conjecture was then established by Lawler, Schramm, and Werner. Many further processes have since been either proven or conjectured to be linked to one of the SLE processes, such as the limiting law of a uniform spanning tree (proven to be ), interfaces of the Ising model (proven to be ), or the scaling limit of self-avoiding random walks (conjectured to be ). Further discussion of these topics is beyond the scope of this course, and we refer the interested reader to Lawler’s book for more details.

We have now tentatively improved the upper bound of the de Bruijn-Newman constant to . Among the technical improvements in our approach, we now are able to use Taylor expansions to efficiently compute the approximation to for many values of in a given region, thus speeding up the computations in the barrier considerably. Also, by using the heuristic that behaves somewhat like the partial Euler product , we were able to find a good location to place the barrier in which is larger than average, hence easier to keep away from zero.

The main remaining bottleneck is that of computing the Euler mollifier bounds that keep bounded away from zero for larger values of beyond the barrier. In going below we are beginning to need quite complicated mollifiers with somewhat poor tail behavior; we may be reaching the point where none of our bounds will succeed in keeping bounded away from zero, so we may be close to the natural limits of our methods.

Participants are also welcome to add any further summaries of the situation in the comments below.

Clearly, a univalent function on the unit disk is a conformal map from to the image ; in particular, is simply connected, and not all of (since otherwise the inverse map would violate Liouville’s theorem). In the converse direction, the Riemann mapping theorem tells us that every open simply connected proper subset of the complex numbers is the image of a univalent function on . Furthermore, if contains the origin, then the univalent function with this image becomes unique once we normalise and . Thus the Riemann mapping theorem provides a one-to-one correspondence between open simply connected proper subsets of the complex plane containing the origin, and univalent functions with and . We will focus particular attention on the univalent functions with the normalisation and ; such functions will be called schlicht functions.

One basic example of a univalent function on is the Cayley transform , which is a Möbius transformation from to the right half-plane . (The slight variant is also referred to as the Cayley transform, as is the closely related map , which maps to the upper half-plane.) One can square this map to obtain a further univalent function , which now maps to the complex numbers with the negative real axis removed. One can normalise this function to be schlicht to obtain the Koebe function

which now maps to the complex numbers with the half-line removed. A little more generally, for any we have the *rotated Koebe function*

that is a schlicht function that maps to the complex numbers with the half-line removed.

Every schlicht function has a convergent Taylor expansion

for some complex coefficients with . For instance, the Koebe function has the expansion

and similarly the rotated Koebe function has the expansion

Intuitively, the Koebe function and its rotations should be the “largest” schlicht functions available. This is formalised by the famous Bieberbach conjecture, which asserts that for any schlicht function, the coefficients should obey the bound for all . After a large number of partial results, this conjecture was eventually solved by de Branges; see for instance this survey of Korevaar or this survey of Koepf for a history.
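To see the extremal coefficients concretely, the following Python sketch recovers the Taylor coefficients of the Koebe function $z/(1-z)^2$ by squaring a geometric series, and checks that they saturate the Bieberbach bound $|a_n| \leq n$. The rotated form $z/(1-e^{i\theta}z)^2$ used here is an assumed convention for the rotated Koebe function, and the helper names are illustrative.

```python
import cmath

def series_mul(a, b, n):
    """First n coefficients of the Cauchy product of two power series."""
    return [sum(a[j] * b[k - j] for j in range(k + 1)) for k in range(n)]

def koebe_coefficients(n_max, theta=0.0):
    """Return a list c with c[n] = a_n for z / (1 - e^{i theta} z)^2, 1 <= n <= n_max."""
    w = cmath.exp(1j * theta)
    geometric = [w**k for k in range(n_max)]          # 1/(1 - wz) = sum_k w^k z^k
    square = series_mul(geometric, geometric, n_max)  # coefficients of 1/(1 - wz)^2
    return [0] + square                               # multiplying by z shifts the index

plain = koebe_coefficients(8)
rotated = koebe_coefficients(8, theta=0.7)
print([round(abs(c), 6) for c in plain[1:]])    # magnitudes of a_1, ..., a_8
print([round(abs(c), 6) for c in rotated[1:]])  # rotation leaves the magnitudes unchanged
```

The rotation only changes the phase of each coefficient, which is why every rotated Koebe function is extremal for the bound simultaneously for all $n$.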

It turns out that to resolve these sorts of questions, it is convenient to restrict attention to schlicht functions that are *odd*, thus for all , and the Taylor expansion now reads

for some complex coefficients with . One can transform a general schlicht function to an odd schlicht function by observing that the function , after removing the singularity at zero, is a non-zero function that equals at the origin, and thus (as is simply connected) has a unique holomorphic square root that also equals at the origin. If one then sets

it is not difficult to verify that is an odd schlicht function which additionally obeys the equation

Conversely, given an odd schlicht function , the formula (4) uniquely determines a schlicht function .

For instance, if is the Koebe function (1), becomes

which maps to the complex numbers with two slits removed, and if is the rotated Koebe function (2), becomes

De Branges established the Bieberbach conjecture by first proving an analogous conjecture for odd schlicht functions known as Robertson’s conjecture. More precisely, we have

Theorem 1 (de Branges’ theorem) Let be a natural number.

- (i) (Robertson conjecture) If is an odd schlicht function, then
- (ii) (Bieberbach conjecture) If is a schlicht function, then

It is easy to see that the Robertson conjecture for a given value of implies the Bieberbach conjecture for the same value of . Indeed, if is schlicht, and is the odd schlicht function given by (3), then from extracting the coefficient of (4) we obtain a formula

for the coefficients of in terms of the coefficients of . Applying the Cauchy-Schwarz inequality, we derive the Bieberbach conjecture for this value of from the Robertson conjecture for the same value of . We remark that Littlewood and Paley had conjectured a stronger form of Robertson’s conjecture, but this was disproved for by Fekete and Szegő.
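The coefficient extraction and the Cauchy-Schwarz step can be made explicit in a short Python sketch. It assumes the relation $f(z^2) = g(z)^2$ with $g(z) = \sum_{k \geq 1} b_{2k-1} z^{2k-1}$ and $b_1 = 1$, so that $a_n = \sum_{k=1}^n b_{2k-1} b_{2(n-k)+1}$ and hence $|a_n| \leq \sum_{k=1}^n |b_{2k-1}|^2$; the function name and the test coefficients are illustrative.

```python
def schlicht_from_odd(b_odd):
    """Given [b_1, b_3, ..., b_{2n-1}], return [a_1, ..., a_n] where f(z^2) = g(z)^2."""
    n = len(b_odd)
    # a_{m+1} is the coefficient of z^{2m+2} in g(z)^2, a convolution over j + k = m
    return [sum(b_odd[j] * b_odd[m - j] for j in range(m + 1)) for m in range(n)]

# Odd function z / (1 - z^2), the square root of the Koebe function at z^2: all b equal 1.
a_koebe = schlicht_from_odd([1] * 6)
print(a_koebe)

# Cauchy-Schwarz: |a_n| is bounded by the partial sum of |b_{2k-1}|^2 up to k = n.
b = [1, 0.3 - 0.2j, -0.1 + 0.4j, 0.25j, 0.05]
a = schlicht_from_odd(b)
partial_sums = [sum(abs(c) ** 2 for c in b[: m + 1]) for m in range(len(b))]
cauchy_schwarz_ok = all(abs(a[m]) <= partial_sums[m] + 1e-12 for m in range(len(b)))
print(cauchy_schwarz_ok)
```

The second set of coefficients is arbitrary test data rather than the coefficients of an actual schlicht function; the Cauchy-Schwarz bound holds for any sequence.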

To prove the Robertson and Bieberbach conjectures, one first takes a logarithm and deduces both conjectures from a similar conjecture about the Taylor coefficients of , known as the *Milin conjecture*. Next, one continuously enlarges the image of the schlicht function to cover all of ; done properly, this places the schlicht function as the initial function in a sequence of univalent maps known as a Loewner chain. The functions obey a useful differential equation known as the Loewner equation, which involves an unspecified forcing term (or , in the case that the image is a slit domain) coming from the boundary; this in turn gives useful differential equations for the Taylor coefficients of , , or . After some elementary calculus manipulations to “integrate” these equations, the Bieberbach, Robertson, and Milin conjectures are then reduced to establishing the non-negativity of a certain explicit hypergeometric function, which is non-trivial to prove (and will not be done here, except for small values of ) but for which several proofs exist in the literature.

The theory of Loewner chains subsequently became fundamental to a more recent topic in complex analysis, that of the Schramm-Loewner evolution (SLE), which is the focus of the next and final set of notes.

** — 1. The area theorem and its consequences — **

We begin with the area theorem of Grönwall.

Theorem 2 (Grönwall area theorem) Let be a univalent function with a convergent Laurent expansion Then

*Proof:* By shifting we may normalise . By hypothesis we have for any ; by replacing with and using a limiting argument, we may assume without loss of generality that the have some exponential decay as (in order to justify some of the manipulations below).

Let be a large parameter. If , then and . The area enclosed by the simple curve is equal to

crucially, the error term here goes to zero as . Meanwhile, by the change of variables formula (using monotone convergence if desired to work in compact subsets of the annulus initially) and Plancherel's theorem, the area of the region is

Comparing these bounds we conclude that

sending to infinity, we obtain the claim.

Exercise 3 Let be a univalent function with Taylor expansion Show that the area of is equal to . (In particular, has finite area if and only if .)
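The area formula of Exercise 3 is easy to test numerically. The sketch below assumes the formula reads $\pi \sum_n n |a_n|^2$ (the standard statement), and uses the illustrative polynomial $f(z) = z + z^2/4$, which is univalent on the unit disk because $f(z)-f(w) = (z-w)(1 + (z+w)/4)$ and $|z+w| < 2$. The area of the image is approximated by applying the shoelace formula to a fine polygon along the image of the unit circle.

```python
import cmath
import math

def polygon_area(points):
    """Shoelace formula for a closed polygon with complex vertices (counterclockwise)."""
    shifted = points[1:] + points[:1]
    return 0.5 * sum((p.conjugate() * q).imag for p, q in zip(points, shifted))

coeffs = {1: 1.0, 2: 0.25}  # f(z) = z + z^2/4, an illustrative univalent polynomial
f = lambda z: sum(a * z**n for n, a in coeffs.items())

N = 4000
boundary = [f(cmath.exp(2j * math.pi * k / N)) for k in range(N)]
numeric_area = polygon_area(boundary)
series_area = math.pi * sum(n * abs(a) ** 2 for n, a in coeffs.items())
print(numeric_area, series_area)
```

The two values agree up to the discretisation error of the polygon, which decays quadratically in the number of vertices.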

Corollary 4 (Bieberbach inequality)

- (i) If is an odd schlicht function, then .
- (ii) If is a schlicht function, then .

*Proof:* For (i), we apply Theorem 2 to the univalent function defined by , which has a Laurent expansion , to give the claim. For (ii), apply part (i) to the square root of with first term .

Exercise 5 Show that equality occurs in Corollary 4(i) if and only if takes the form for some , and in Corollary 4(ii) if and only if takes the form of a rotated Koebe function for some .

The Bieberbach inequality can be rescaled to bound the second coefficient of univalent functions:

Exercise 6 (Rescaled Bieberbach inequality) If is a univalent function, show that When does equality hold?

The Bieberbach inequality gives a useful lower bound for the image of a univalent function, known as the Koebe quarter theorem:

Corollary 7 (Koebe quarter theorem) Let be a univalent function. Then contains the disk .

*Proof:* By applying a translation and rescaling, we may assume without loss of generality that is a schlicht function, with Taylor expansion

Our task is now to show that for every , the equation has a solution in . If this were not the case, then the function is invertible on , with inverse being univalent and having the Taylor expansion

Applying Exercise 6 we then have

while from the Bieberbach inequality one also has . Hence by the triangle inequality , which is incompatible with the hypothesis .
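The constant $1/4$ can be seen numerically for the Koebe function $z/(1-z)^2$: its boundary values $-1/(4\sin^2(\theta/2))$ sweep out the half-line from $-1/4$ to $-\infty$, so the omitted point closest to the origin is $-1/4$. The Python sketch below samples the unit circle (skipping the pole at $z=1$) and locates that closest point; it is an illustration, not part of the proof.

```python
import cmath
import math

def koebe(z):
    """The Koebe function z / (1 - z)^2."""
    return z / (1 - z) ** 2

N = 1000  # even, so that theta = pi is one of the sample angles
boundary = [koebe(cmath.exp(2j * math.pi * k / N)) for k in range(1, N)]
closest = min(boundary, key=abs)
print(closest, abs(closest))
```

The minimum is attained at $z = -1$, where the Koebe function takes the value $-1/4$.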

Exercise 8 Show that the radius is best possible in Corollary 7 (thus, does not contain any disk with ) if and only if takes the form for some complex numbers and real .

Remark 9 The univalence hypothesis is crucial in the Koebe quarter theorem. Consider for instance the functions defined by . These are locally univalent functions (since is holomorphic with non-zero derivative) and , , but avoids the point .

Exercise 10 (Koebe distortion theorem) Let be a schlicht function, and let have magnitude .

- (i) Show that (Hint: compose on the right with a Möbius automorphism of that sends to and then apply the rescaled Bieberbach inequality.)
- (ii) Show that (Hint: use (i) to control the radial derivative of .)
- (iii) Show that
- (iv) Show that (This cannot be directly derived from (ii) and (iii). Instead, compose on the right with a Möbius automorphism that sends to and to , rescale it to be schlicht, and apply (iii) to this function at .)

- (v) Show that the space of schlicht functions is a normal family. In other words, if is any sequence of schlicht functions, then there is a subsequence that converges locally uniformly on compact sets.
- (vi) (Qualitative Bieberbach conjecture) Show that for each natural number there is a constant such that whenever is a schlicht function with Taylor expansion

Exercise 11 (Conformal radius) If is a non-empty simply connected open subset of that is not all of , and is a point in , define the *conformal radius* of at to be the quantity , where is any conformal map from to that maps to (the existence and uniqueness of this radius follows from the Riemann mapping theorem). Thus for instance the disk has conformal radius around .

- (i) Show that the conformal radius is strictly monotone in : if are non-empty simply connected open subsets of , and , then the conformal radius of around is strictly greater than that of .
- (ii) Show that the conformal radius of a disk around an element of the disk is given by the formula .
- (iii) Show that the conformal radius of around lies between and , where is the radius of the maximal disk that is contained in .
- (iv) If the conformal radius of around is equal to , show that for all sufficiently small , the ring domain has modulus , where denotes a quantity that goes to zero as , and the modulus of a ring domain was defined in Notes 2.

We can use the distortion theorem to obtain a nice criterion for when univalent maps converge to a given limit, known as the Carathéodory kernel theorem.

Theorem 12 (Carathéodory kernel theorem) Let be a sequence of simply connected open proper subsets of containing the origin, and let be a further simply connected open proper subset of containing . Let and be the conformal maps with and (the existence and uniqueness of these maps are given by the Riemann mapping theorem). Then the following are equivalent:

- (i) converges locally uniformly on compact sets to .
- (ii) For every subsequence of the , is the set of all such that there is an open connected set containing and that is contained in for all sufficiently large .

If conclusion (ii) holds, is known as the *kernel* of the domains .

*Proof:* Suppose first that converges locally uniformly on compact sets to . If , then for some . If , then the holomorphic functions converge uniformly on to the function , which is not identically zero but has a zero in . By Hurwitz’s theorem we conclude that also has a zero in for all sufficiently large ; indeed the same argument shows that one can replace by any element of a small neighbourhood of to obtain the same conclusion, uniformly in . From compactness we conclude that for sufficiently large , has a zero in for all , thus for sufficiently large . Since is open connected and contains and , we see that is contained in the set described in (ii).

Conversely, suppose that is a subsequence of the and is such that there is an open connected set containing and that is contained in for sufficiently large . The inverse maps are holomorphic and bounded, hence form a normal family by Montel’s theorem. By refining the subsequence we may thus assume that the converge locally uniformly to a holomorphic limit . The function takes values in , but by the open mapping theorem it must in fact map to . In particular, . Since converges to , and converges locally uniformly to , we conclude that converges to , thus and hence . This establishes the derivation of (ii) from (i).

Now suppose that (ii) holds. It suffices to show that every subsequence of has a further subsequence that converges locally uniformly on compact sets to (this is an instance of the *Urysohn subsequence principle*). Then (as contains ) in particular there is a disk that is contained in the for all sufficiently large ; on the other hand, as is not all of , there is also a disk which is *not* contained in the for all sufficiently large . By Exercise 11, this implies that the conformal radii of the around zero are bounded above and below, thus is bounded above and below.

By Exercise 10(v), and rescaling, the functions then form a normal family, thus there is a subsequence of the that converges locally uniformly on compact sets to some limit . Since is positive and bounded away from zero, is also positive, so is non-constant. By Hurwitz’s theorem, is therefore also univalent, and thus maps to some region . By the implication of (ii) from (i) (with replaced by ) we conclude that is the set of all such that there is an open connected set containing and that is contained in for all sufficiently large ; but by hypothesis, this set is also . Thus , and then by the uniqueness part of the Riemann mapping theorem, as desired.

The condition in Theorem 12(ii) indicates that “converges” to in a rather complicated sense, in which large parts of are allowed to be “pinched off” from and disappear in the limit. This is illustrated in the following explicit example:

Exercise 13 (Explicit example of kernel convergence) Let be the function from (5), thus is a univalent function from to with the two vertical rays from to , and from to , removed. For any natural number , let and let , and define the transformed functions .

- (i) Show that is a univalent function from to with the two vertical rays from to , and from to , removed, and that and .
- (ii) Show that converges locally uniformly to the function , and that this latter map is a univalent map from to the half-plane . (Hint: one does not need to compute everything exactly; for instance, any terms of the form can be written using the notation instead of expanded explicitly.)
- (iii) Explain why these facts are consistent with the Carathéodory kernel theorem.

As another illustration of the theorem, let be two distinct convex open proper subsets of containing the origin, and let be the associated conformal maps from to respectively with and . Then the alternating sequence does not converge locally uniformly to any limit. The set is the set of all points that lie in a connected open set containing the origin that eventually is contained in the sequence ; but if one passes to the subsequence , this set of points enlarges to , and so the sequence does not in fact have a kernel.

However, the kernel theorem simplifies significantly when the are monotone increasing, which is already an important special case:

Corollary 14 (Monotone increasing case of kernel theorem) Let the notation and assumptions be as in Theorem 12. Assume furthermore that and that . Then converges locally uniformly on compact sets to .

Loewner observed that the kernel theorem can be used to approximate univalent functions by functions mapping into slit domains. More precisely, define a *slit domain* to be an open simply connected subset of formed by deleting a half-infinite Jordan curve connecting some finite point to infinity; for instance, the image of the Koebe function is a slit domain.

Theorem 15 (Loewner approximation theorem) Let be a univalent function. Then there exists a sequence of univalent functions whose images are slit domains, and which converge locally uniformly on compact subsets to .

*Proof:* First suppose that extends to a univalent function on a slightly larger disk for some . Then the image of the unit circle is a Jordan curve enclosing the region in the interior. Applying the Jordan curve theorem (and the Möbius inversion ), one can find a half-infinite Jordan curve from to infinity that stays outside of . For any , one can concatenate this curve with the arc to obtain another half-infinite Jordan curve , whose complement is a slit domain which has as kernel (why?). If we let be the conformal maps from to with and , we conclude from the Carathéodory kernel theorem that converges locally uniformly on compact sets to .

If is just univalent on , then it is the locally uniform limit of the dilations , which are univalent on the slightly larger disks . By the previous arguments, each is in turn the locally uniform limit of univalent functions whose images are slit domains, and the claim now follows from a diagonalisation argument.

** — 2. Loewner chains — **

The material in this section is based on these lecture notes of Contreras.

An important tool in analysing univalent functions is to study one-parameter families of univalent functions, parameterised by a time parameter , in which the images are increasing in ; roughly speaking, these families allow one to study an arbitrary univalent function by “integrating” along such a family from back to . Traditionally, we normalise these families into (radial) Loewner chains, which we now define:

Definition 16 (Loewner chain) A (radial) Loewner chain is a family of univalent maps with and (so in particular is schlicht), such that for all . (In these notes we use the prime notation exclusively for differentiation in the variable; we will use later for differentiation in the variable.)

A key example of a Loewner chain is the family

of dilated Koebe functions; note that the image of each is the slit domain , which is clearly monotone increasing in . More generally, we have the rotated Koebe chains

Whenever one has a family of simply connected proper open subsets of containing with for , one can define to be the conformal maps with and . By definition, is then the conformal radius of around , which is a strictly increasing function of by Exercise 11. If this conformal radius is equal to at and increases continuously to infinity as , then one can reparameterise the variable so that , at which point one obtains a Loewner chain.

From the Koebe quarter theorem we see that each image in a Loewner chain contains the disk . In particular the increase to fill out all of : .

Let be a Loewner chain, and let . The relation is sometimes expressed as the assertion that is *subordinate* to . It has the consequence that one has a composition law of the form

for a univalent function , uniquely defined as , noting that is well-defined on . By construction, we have and

as well as the composition laws

for . We will refer to the as *transition functions*.

From the Schwarz lemma, we have

for , with strict inequality when . In particular, if we introduce the function

for and , then (after removing the singularity at infinity and using (10)) we see that is a holomorphic map to the right half-plane , normalised so that

Define a Herglotz function to be a holomorphic function , thus is a Herglotz function for all . A key family of examples of Herglotz functions is given by the Möbius transforms for . In fact, all other Herglotz functions are basically just averages of these:

Exercise 17 (Herglotz representation theorem) Let be a Herglotz function, normalised so that .

- (i) For any , show that for . (Hint: the real part of is harmonic, and so has a Poisson kernel representation. Alternatively, one can use a Taylor expansion of .)
- (ii) Show that there exists a (Radon) probability measure on such that for all . (One will need a measure-theoretic tool such as Prokhorov’s theorem, the Riesz representation theorem, or the Helly selection principle.) Conversely, show that every probability measure on generates a Herglotz function with by the above formula.
- (iii) Show that the measure constructed in (ii) is unique.

This has a useful corollary, namely a version of the Harnack inequality:

Exercise 18 (Harnack inequality) Let be a Herglotz function, normalised so that . Show that for all .
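The Harnack inequality can be tested on a concrete Herglotz function. The Python sketch below assumes the inequality in its standard form $\frac{1-|z|}{1+|z|} \leq \mathrm{Re}\, p(z) \leq |p(z)| \leq \frac{1+|z|}{1-|z|}$, and builds $p$ as a finite convex combination of the Möbius transforms $(e^{i\theta}+z)/(e^{i\theta}-z)$, in the spirit of the Herglotz representation theorem; the atoms and weights are arbitrary illustrative choices.

```python
import cmath

def herglotz(z, atoms):
    """p(z) = sum_j w_j (e^{i theta_j} + z) / (e^{i theta_j} - z) over atoms (theta_j, w_j)."""
    return sum(w * (cmath.exp(1j * th) + z) / (cmath.exp(1j * th) - z) for th, w in atoms)

atoms = [(0.0, 0.5), (2.0, 0.3), (4.5, 0.2)]  # a discrete probability measure on the circle

def harnack_holds(z, tol=1e-12):
    """Check (1-r)/(1+r) <= Re p(z) <= |p(z)| <= (1+r)/(1-r) with r = |z| < 1."""
    r = abs(z)
    p = herglotz(z, atoms)
    lower, upper = (1 - r) / (1 + r), (1 + r) / (1 - r)
    return lower - tol <= p.real <= abs(p) <= upper + tol

samples = [0.9 * (j / 5) * cmath.exp(1j * k) for k in range(12) for j in range(5)]
all_ok = all(harnack_holds(z) for z in samples)
print(all_ok)
```

Each individual Möbius transform attains the bounds along the ray through its pole, so the inequality cannot be improved in general.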

This gives some useful Lipschitz regularity properties of the transition functions and univalent functions in the variable:

Lemma 19 (Lipschitz regularity) Let be a compact subset of , and let . Use to denote a quantity bounded in magnitude by , where depends only on .

- (i) For any and , one has
- (ii) For any and , one has

One can make the bounds much more explicit if desired (see e.g. Lemma 2.3 of these notes of Contreras), but for our purposes any Lipschitz bound will suffice.

*Proof:* To prove (i), it suffices from (11) and the Schwarz-Pick lemma (Exercise 13 from Notes 2) to establish this claim when . We can also assume that since the claim is trivial when . From the Harnack inequality one has

for , which by (12) and some computation gives

Now we prove (ii). We may assume without loss of generality that is convex. From Exercise 10 (normalising to be schlicht) we see that for , and hence has a Lipschitz constant of on . Since , the claim now follows from (13).

As a first application of this we show that every schlicht function starts a Loewner chain.

Lemma 20 Let be schlicht. Then there exists a Loewner chain with .

*Proof:* This will be similar to the proof of Theorem 15. First suppose that extends to be univalent on for some , then is a Jordan curve. Then by Carathéodory’s theorem (Theorem 20 of Notes 2) (and the Möbius inversion ) one can find a conformal map from the exterior of to the exterior of that sends infinity to infinity. If we define for to be the region enclosed by the Jordan curve , then the are increasing in with conformal radius going to infinity as . If one sets to be the conformal maps with and , then (by the uniqueness of Riemann mapping) and by the Carathéodory kernel theorem, converges locally uniformly to as . In particular, the conformal radii are continuous in . Reparameterising in one can then obtain the required Loewner chain.

Now suppose is only univalent on . As in the proof of Theorem 15, one can express as the locally uniform limit of schlicht functions , each of which extends univalently to some larger disk . By the preceding discussion, each of the extends to a Loewner chain . From the Lipschitz bounds (and the Koebe distortion theorem) one sees that these chains are locally uniformly equicontinuous in and , uniformly in , and hence by the Arzelà-Ascoli theorem we can pass to a subsequence that converges locally uniformly in to a limit ; one can also assume that the transition functions converge locally uniformly to limits . It is then not difficult by Hurwitz’s theorem to verify the limiting relations (9), (11), and that is a Loewner chain with as desired.

Suppose that are close to each other: . Then one heuristically has the approximations

and hence by (12) and some rearranging

and hence on applying , (9), and the Newton approximation

This suggests that the should obey the *Loewner equation*

for some Herglotz function . This is essentially the case:

Theorem 21 (Loewner equation) Let be a Loewner chain. Then, for outside of an exceptional set of Lebesgue measure zero, the functions are differentiable in time for each , and obey the equation (14) for all and , and some Herglotz function for each with . Furthermore, the maps are measurable for every .

*Proof:* Let be a countable dense subset of . From Lemma 19, the function is Lipschitz continuous, and thus differentiable almost everywhere, for each . Thus there exists a Lebesgue measure zero set such that is differentiable in outside of for each . From the Koebe distortion theorem is also locally Lipschitz (hence locally uniformly equicontinuous) in the variable, so in fact is differentiable in outside of for all . Without loss of generality we may assume contains zero.

Let , and let . Then as approaches from below, we have

uniformly; from (9) and Newton approximation we thus have

which implies that

Also we have

and hence by (12)

Taking limits, we see that the function is Herglotz with , giving the claim. It is also easy to verify the measurability (because derivatives of Lipschitz functions are measurable).

Example 22 The Loewner chain (7) solves the Loewner equation with the Herglotz function . With the rotated Koebe chains (8), we instead have .
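As a sanity check on Example 22, the following sketch verifies the Loewner equation coefficient by coefficient for the Koebe chain. It rests on three assumptions that are not recoverable from the text here: that the chain (7) is $f_t(z) = e^t z/(1-z)^2$, that the Loewner equation (14) reads $\partial_t f_t(z) = z f_t'(z) p_t(z)$, and that the Herglotz function is $p(z) = (1-z)/(1+z)$. Under these assumptions the factor $e^t$ is common to both sides, so it suffices to check $z K'(z) p(z) = K(z)$ for the Koebe function $K(z) = z/(1-z)^2$.

```python
from fractions import Fraction


def cauchy_product(a, b, n_terms):
    """First n_terms coefficients of the product of two power series."""
    return [sum(a[k] * b[n - k] for k in range(n + 1)) for n in range(n_terms)]


N = 12  # number of Taylor coefficients to compare

# K(z) = z/(1-z)^2 = sum_{n>=1} n z^n  (assumed form of the Koebe function).
koebe = [Fraction(n) for n in range(N)]

# z K'(z) = sum_{n>=1} n^2 z^n.
z_koebe_prime = [Fraction(n * n) for n in range(N)]

# p(z) = (1-z)/(1+z) = 1 - 2z + 2z^2 - 2z^3 + ...  (assumed Herglotz function).
herglotz = [Fraction(1)] + [Fraction(2 * (-1) ** n) for n in range(1, N)]

# Right-hand side of the (assumed) Loewner equation with e^t factored out.
rhs = cauchy_product(z_koebe_prime, herglotz, N)

print(rhs == koebe)
```

Exact rational arithmetic is used so that the comparison is an identity of coefficients rather than a floating-point approximation.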

Although we will not need it in this set of notes, there is also a converse implication that for every family of Herglotz functions depending measurably on , one can associate a Loewner chain.

Let us now Taylor expand a Loewner chain at each time as

as , we have . As is differentiable in almost every for each , and is locally uniformly continuous in , we see from the Cauchy integral formulae that the are also differentiable almost everywhere in . If we similarly write

for all outside of , then , and we obtain the equations

and so forth. For instance, for the Loewner chain (7) one can verify that and for solve these equations. For (8) one instead has and .

We have the following bounds on the first few coefficients of :

Exercise 23 Let be a Herglotz function with . Let be the measure coming from the Herglotz representation theorem.

- (i) Show that for all . In particular, for all . Use this to give an alternate proof of the upper bound in the Harnack inequality.
- (ii) Show that .
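The Herglotz representation can be explored numerically. The sketch below is only an illustration under stated assumptions: it takes the representation in the standard form $p(z) = \int \frac{e^{i\theta}+z}{e^{i\theta}-z}\, d\mu(\theta)$ with $\mu$ a probability measure (so that $p(0)=1$), models $\mu$ by a hypothetical finite sum of point masses, and checks the Harnack bounds $\frac{1-|z|}{1+|z|} \le \mathrm{Re}\, p(z) \le \frac{1+|z|}{1-|z|}$ together with the bound $2$ on the Taylor coefficients of $p$.

```python
import cmath
import math
import random

random.seed(0)

# A hypothetical discrete probability measure on the unit circle.
thetas = [random.uniform(0.0, 2.0 * math.pi) for _ in range(5)]
raw = [random.random() for _ in range(5)]
weights = [r / sum(raw) for r in raw]


def herglotz(z):
    """p(z) = sum_j w_j (e^{i theta_j} + z) / (e^{i theta_j} - z)."""
    return sum(
        w * (cmath.exp(1j * t) + z) / (cmath.exp(1j * t) - z)
        for t, w in zip(thetas, weights)
    )


def taylor_coefficient(n):
    """n-th Taylor coefficient of p for n >= 1: 2 * sum_j w_j e^{-i n theta_j}."""
    return 2.0 * sum(w * cmath.exp(-1j * n * t) for t, w in zip(thetas, weights))


def harnack_holds(z, tol=1e-12):
    """Check (1-|z|)/(1+|z|) <= Re p(z) <= (1+|z|)/(1-|z|)."""
    r = abs(z)
    re = herglotz(z).real
    return (1 - r) / (1 + r) - tol <= re <= (1 + r) / (1 - r) + tol


samples = [0.9 * random.random() * cmath.exp(2j * math.pi * random.random())
           for _ in range(200)]
print(all(harnack_holds(z) for z in samples))
print(max(abs(taylor_coefficient(n)) for n in range(1, 20)) <= 2.0 + 1e-12)
```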

We can use this to establish the first two cases of the Bieberbach conjecture:

Theorem 24 ($n=2,3$ cases of Bieberbach) If $f(z) = z + a_2 z^2 + a_3 z^3 + \dots$ is schlicht, then $|a_2| \leq 2$ and $|a_3| \leq 3$.

The bound $|a_2| \leq 2$ is not new, and indeed was implicitly used many times in the above arguments, but we include it to illustrate the use of the equations (15), (16).

*Proof:* By Lemma 20, we can write (and ) for some Loewner chain .

We can write (15) as . On the other hand, from the Koebe distortion theorem applied to the schlicht functions , we have , so in particular goes to zero at infinity. We can integrate from to infinity to obtain

From Harnack’s inequality we have , giving the required bound .

In a similar vein, writing (16) as

we obtain

As , we may integrate from to infinity to obtain the identity

Taking real parts using Exercise 23(ii) and (17), we have

Since , we thus have

where . By Cauchy-Schwarz, we have , and from the bound , we thus have

Replacing by the schlicht function (which rotates by ) and optimising in , we obtain the claim .

Exercise 25 Show that equality in the above bound is only attained when is a rotated Koebe function.

The Loewner equation (14) takes a special form in the case of slit domains. Indeed, let be a slit domain not containing the origin, with conformal radius around , and let be the Loewner chain with . We can parameterise so that the sets have conformal radius around for every , in which case we see that must be the unique conformal map from to with and . For instance, for the chain (7) we would have .

Theorem 26 (Loewner equation for slit domains) In the above situation, we have the Loewner equation holding with

*Proof:* Let be a time where the Loewner equation holds. For , the function extends continuously to the boundary, and is two-to-one on the slit , except at the tip where there is a single preimage on the unit circle; this can be seen by taking a holomorphic square root of , using a Möbius transformation to map the resulting image to a set bounded by a Jordan curve, and applying Carathéodory’s theorem (Theorem 20 from Notes 2) to the resulting conformal map. The image is then with a Jordan arc removed, where is a point on the boundary of the sphere. Applying Carathéodory’s theorem to a holomorphic square root of , we see that extends continuously to be a map from to , with an arc on the boundary mapping (in two-to-one fashion) to the arc , and the endpoints of this arc mapping to . From this and (12), we see that converges to zero outside of the arc , which by the Herglotz representation theorem implies that the measure associated to is supported on the arc . An inspection of the proof of Carathéodory’s theorem also reveals that the are equicontinuous on as , and thus converge uniformly to (which is the identity function) as . This implies that must converge to the point as approaches , and so converges vaguely to the Dirac mass at . Since converges locally uniformly to , we conclude the formula (18). As depends measurably on , we conclude that does also.

In fact one can show that extends to a continuous function , and that the Loewner equation holds for all , but this is a bit trickier to show (it requires some further distortion estimates on conformal maps, related to the arguments used to prove Carathéodory’s theorem in the previous notes) and will not be done here. One can think of the function as a “driving force” that incrementally enlarges the slit via the Loewner equation; this perspective is often used when studying the Schramm-Loewner evolution, which is the topic of the next (and final) set of notes.

** — 3. The Bieberbach conjecture — **

We now turn to the resolution of the Bieberbach (and Robertson) conjectures. We follow the simplified treatment of de Branges’ original proof, due to FitzGerald and Pommerenke, though we omit the proof of one key ingredient, namely the non-negativity of a certain hypergeometric function.

The first step is to work not with the Taylor coefficients of a schlicht function or with an odd schlicht function , but rather with the (normalised) logarithm of a schlicht function , as the coefficients end up obeying more tractable equations. To transfer to this setting we need the following elementary inequalities relating the coefficients of a power series with the coefficients of its exponential.

Lemma 27 (Second Lebedev-Milin inequality) Let be a formal power series with complex coefficients and no constant term, and let be its formal exponential, thus

*Proof:* If we formally differentiate (19) in , we obtain the identity

extracting the coefficient for any , we obtain the formula

By Cauchy-Schwarz, we thus have

Using and telescoping series, it thus suffices to prove the identity

But this follows from observing that

and that

for all .
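The recurrence obtained by differentiating the exponential relation is easy to experiment with numerically. The sketch below is hedged accordingly: it writes the series as $\sum_{k \ge 1} a_k z^k$ with exponential $\sum_{n \ge 0} b_n z^n$ (names chosen here for illustration), uses the recurrence $n b_n = \sum_{k=1}^n k a_k b_{n-k}$, and tests the inequality in the form commonly stated in the literature, $\sum_{k=0}^n |b_k|^2 \le (n+1) \exp\bigl(\frac{1}{n+1} \sum_{m=1}^n \sum_{k=1}^m (k|a_k|^2 - \frac{1}{k})\bigr)$. That this matches the display (20) of the lemma is an assumption.

```python
import math
import random


def formal_exp(a, n_max):
    """Coefficients b[0..n_max] of exp(sum_{k>=1} a[k] z^k).

    a[0] is ignored; the recurrence is n*b[n] = sum_{k=1}^n k*a[k]*b[n-k].
    """
    b = [1.0 + 0j] + [0j] * n_max
    for n in range(1, n_max + 1):
        b[n] = sum(k * a[k] * b[n - k] for k in range(1, n + 1)) / n
    return b


def second_lebedev_milin_sides(a, n):
    """Return (lhs, rhs) of the assumed form of the second Lebedev-Milin inequality."""
    b = formal_exp(a, n)
    lhs = sum(abs(b[k]) ** 2 for k in range(n + 1))
    double_sum = sum(
        sum(k * abs(a[k]) ** 2 - 1.0 / k for k in range(1, m + 1))
        for m in range(1, n + 1)
    )
    rhs = (n + 1) * math.exp(double_sum / (n + 1))
    return lhs, rhs


# Extremal example: a_k = 1/k, i.e. exp(-log(1-z)) = 1/(1-z), so every b_n = 1.
n = 8
extremal = [0j] + [1.0 / k for k in range(1, n + 1)]
print(second_lebedev_milin_sides(extremal, n))

# A random example with complex coefficients.
random.seed(1)
generic = [0j] + [complex(random.uniform(-1, 1), random.uniform(-1, 1)) / k
                  for k in range(1, n + 1)]
print(second_lebedev_milin_sides(generic, n))
```

In the extremal example both sides equal $n+1$ up to rounding, which is consistent with the equality case described in the exercise that follows.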

Exercise 28 Show that equality holds in (20) for a given if and only if there is such that for all .

Exercise 29 (First Lebedev-Milin inequality) With the notation as in the above lemma, and under the additional assumption , prove that

(Hint: using the Cauchy-Schwarz inequality as above, first show that the power series is bounded term-by-term by the power series of .) When does equality occur?

Exercise 30 (Third Lebedev-Milin inequality) With the notation as in the above lemma, show that

(Hint: use the second Lebedev-Milin inequality and (21), together with the calculus inequality for all .) When does equality occur?

Using these inequalities, one can reduce the Robertson and Bieberbach conjectures to the following conjecture of Milin, also proven by de Branges:

Theorem 31 (Milin conjecture) Let be a schlicht function. Let be the branch of the logarithm of that equals at the origin, thus one has for some complex coefficients . Then one has

for all .

Indeed, if

is an odd schlicht function, let be the schlicht function given by (4), then

Applying Lemma 27 with , we obtain the Robertson conjecture, and the Bieberbach conjecture follows.

Example 32 If is the Koebe function (1), then so in this case and . Similarly, for the rotated Koebe function (2) one has and again . If one works instead with the dilated Koebe function , we have , thus the time parameter only affects the constant term in . This is already a hint that the coefficients of could be worth studying further in this problem.
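The logarithmic coefficients in this example can be recomputed mechanically. The sketch below assumes the Koebe function is $f(z) = z/(1-z)^2$, so that $f(z)/z = \sum_{n \ge 0} (n+1) z^n$, and computes the Taylor coefficients $g_n$ of $\log (f(z)/z)$ from the relation $F' = g' F$; it makes no assumption about how the notes normalise the coefficients (for instance whether a factor of $2$ is pulled out). The names `F` and `g` are chosen here for illustration.

```python
from fractions import Fraction


def log_series(F, n_max):
    """Coefficients g[0..n_max] of log F(z), for a power series F with F[0] = 1.

    Uses n*F[n] = sum_{k=1}^n k*g[k]*F[n-k], which follows from F' = g' * F.
    """
    assert F[0] == 1
    g = [Fraction(0)] * (n_max + 1)
    for n in range(1, n_max + 1):
        partial = sum(k * g[k] * F[n - k] for k in range(1, n))
        g[n] = (n * F[n] - partial) / n
    return g


N = 10
# f(z)/z for the (assumed) Koebe function z/(1-z)^2.
F = [Fraction(n + 1) for n in range(N + 1)]
g = log_series(F, N)

# log(1/(1-z)^2) = -2 log(1-z) = sum_{n>=1} (2/n) z^n.
print(g[1:] == [Fraction(2, n) for n in range(1, N + 1)])
```

Multiplying $f$ by a positive constant such as $e^t$ only adds a constant to $\log (f(z)/z)$, which matches the remark that the time parameter only affects the constant term.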

To prove the Milin conjecture, we use the Loewner chain method. It suffices by Theorem 15 and a limiting argument to do so in the case that is a slit domain. Then, by Theorem 26, is the initial function of a Loewner chain that solves the Loewner equation

for all and almost every , and some function .

We can transform this into an equation for . Indeed, for non-zero we may divide by to obtain

(for any local branch of the logarithm) and hence

Since , is equal to at the origin (for an appropriate branch of the logarithm). Thus we can write

The are locally Lipschitz in (basically thanks to Lemma 19) and for almost every we have the Taylor expansions

and

Comparing coefficients, we arrive at the system of ordinary differential equations

Fix (we will not need to use any induction on here). We would like to use the system (22) to show that

The most naive attempt to do this would be to show that one has a monotonicity formula

for all , and that the expression goes to zero as , as the claim would then follow from the fundamental theorem of calculus. This turns out to not quite work; however it turns out that a slight modification of this idea does work. Namely, we introduce the quantities

where for each , is a continuously differentiable function to be chosen later. If we have the initial condition

for all , then the Milin conjecture is equivalent to asking that . On the other hand, if we impose a boundary condition

for , then we also have as , since is schlicht and hence is a normal family, implying that the are bounded in for each . Thus, to solve the Milin, Robertson, and Bieberbach conjectures, it suffices to find a choice of weights obeying the initial and boundary conditions (23), (24), and such that

for almost every (note that will be Lipschitz, so the fundamental theorem of calculus applies).

Let us now try to establish (25) using (22). We first write , and drop the explicit dependence on , thus

for . To simplify this equation, we make a further transformation, introducing the functions

(with the convention ); then we can write the above equation as

We can recover the from the by the formula

It may be worth recalling at this point that in the example of the rotated Koebe Loewner chain (2) one has , , and , for some real constant . Observe that has a simpler form than in this example, suggesting again that the decision to transform the problem to one about the rather than the is on the right track.

We now calculate

Conveniently, the unknown function no longer appears explicitly! Some simple algebra shows that

and hence by summation by parts

with the convention .

In the example of the rotated Koebe function, with , the factors and both vanish, which is consistent with the fact that vanishes in this case regardless of the choice of weights . So these two factors look to be related to each other. On the other hand, for more general choices of , these two expressions do not have any definite sign. For comparison, the quantity also vanishes when , and has a definite sign. So it is natural to see if these three factors are related to each other. After a little bit of experimentation, one eventually discovers the following elementary identity giving such a connection:

Inserting this identity into the above equation, we obtain

which can be rearranged as

We can kill the first summation by fiat, by imposing the requirement that the obey the system of differential equations

Hence if we also have the non-negativity condition

for all and , we will have obtained the desired monotonicity (25).

To summarise, in order to prove the Milin conjecture for a fixed value of , we need to find functions obeying the initial condition (23), the boundary condition (24), the differential equation (26), and the nonnegativity condition (27), with the convention . This is a significant reduction of the problem, as one just has to write down an explicit formula for such functions and verify all the properties.

Let us work out some simple cases. First consider the case . Now our task is to solve the system

for all . This is easy: we just take (indeed this is the unique choice). This gives the case of the Milin conjecture (which corresponds to the case of Bieberbach).

Next consider the case . The system is now

Again, a routine computation shows that there is a unique solution here, namely and . This gives the case of the Milin conjecture (which corresponds to the case of Bieberbach). One should compare this argument to that in Theorem 24, in particular one should see very similar weight functions emerging.

Let us now move on to . The system is now

A slightly lengthier calculation gives the unique explicit solution

to the above conditions.
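These small cases can be explored numerically. The sketch below is an illustration under a loudly stated assumption: it takes the system (26) to be the de Branges system as usually written in the literature, $\beta_k - \beta_{k+1} = -\bigl(\frac{\beta_k'}{k} + \frac{\beta_{k+1}'}{k+1}\bigr)$ for $1 \le k \le n$ with $\beta_{n+1} = 0$, and the initial condition (23) to be $\beta_k(0) = n+1-k$. Whether this matches the normalisation of the notes exactly cannot be confirmed from the text here. Under these assumptions one can solve by hand to get $\beta_1 = e^{-t}$ for $n=1$, and $\beta_1 = 4e^{-t} - 2e^{-2t}$, $\beta_2 = e^{-2t}$ for $n=2$; the code integrates the system with a classical Runge-Kutta step and compares.

```python
import math


def weight_derivatives(beta):
    """Derivatives (beta_1', ..., beta_n') for the assumed de Branges system.

    Solved downwards from k = n using
    beta_k' = -k*(beta_k - beta_{k+1}) - (k/(k+1))*beta_{k+1}',  beta_{n+1} = 0.
    """
    n = len(beta)
    ext = list(beta) + [0.0]      # ext[k-1] = beta_k, ext[n] = beta_{n+1} = 0
    d = [0.0] * (n + 1)           # d[k-1] = beta_k', d[n] = beta_{n+1}' = 0
    for k in range(n, 0, -1):
        d[k - 1] = -k * (ext[k - 1] - ext[k]) - (k / (k + 1)) * d[k]
    return d[:n]


def solve_weights(n, t_end, steps=2000):
    """Integrate from beta_k(0) = n + 1 - k up to time t_end with RK4."""
    beta = [float(n + 1 - k) for k in range(1, n + 1)]
    h = t_end / steps
    for _ in range(steps):
        k1 = weight_derivatives(beta)
        k2 = weight_derivatives([b + 0.5 * h * d for b, d in zip(beta, k1)])
        k3 = weight_derivatives([b + 0.5 * h * d for b, d in zip(beta, k2)])
        k4 = weight_derivatives([b + h * d for b, d in zip(beta, k3)])
        beta = [b + h * (d1 + 2 * d2 + 2 * d3 + d4) / 6
                for b, d1, d2, d3, d4 in zip(beta, k1, k2, k3, k4)]
    return beta


t = 1.0
print(solve_weights(1, t), [math.exp(-t)])
print(solve_weights(2, t), [4 * math.exp(-t) - 2 * math.exp(-2 * t), math.exp(-2 * t)])
# For n = 3, the weights should be non-increasing (the analogue of (27)).
print(weight_derivatives(solve_weights(3, 0.5)))
```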

These simple cases already indicate that there is basically only one candidate for the weights that will work. A calculation can give the explicit formula:

Exercise 33 Let .

- (i) Show there is a unique choice of continuously differentiable functions that solve the differential equations (26) with initial condition (23), with the convention . (Use the Picard existence theorem.)
- (ii) For any , show that the expression
is equal to when is even and when is odd.

- (iii) Show that the functions
for obey the properties (23), (26), (24). (Hint: for (23), first use (ii) to show that is equal to when is even and when is odd, then use (26).)

The Bieberbach conjecture is then reduced to the claim that

for any and . This inequality can be directly verified for any fixed ; for general it follows from general inequalities on Jacobi polynomials by Askey and Gasper, with an alternate proof given subsequently by Gasper. A further proof of (28), based on a variant of the above argument due to Weinstein that avoids explicit use of (28), appears in this article of Koepf. We will not detail these arguments here.

Significant progress has been made since the last update; by implementing the “barrier” method to establish zero free regions for by leveraging the extensive existing numerical verification of the Riemann hypothesis (which establishes zero free regions for ), we have been able to improve our upper bound on from 0.48 to 0.28. Furthermore, there appears to be a bit more room to improve the bounds by tweaking the parameters used in the argument (we are currently using ); the most recent idea is to try to use exponential sum estimates to improve the bounds on the derivative of the approximation to that is used in the barrier method, which is currently the most computationally intensive step of the argument.

In previous quarters, we proved a fundamental theorem about this concept, the Riemann mapping theorem:

Theorem 1 (Riemann mapping theorem) Let be a simply connected open subset of that is not all of . Then is conformally equivalent to the unit disk .

This theorem was proven in these 246A lecture notes, using an argument of Koebe. At a very high level, one can sketch Koebe’s proof of the Riemann mapping theorem as follows: among all the injective holomorphic maps from to that map some fixed point to , pick one that maximises the magnitude of the derivative (ignoring for this discussion the issue of proving that a maximiser exists). If avoids some point in , one can compose with various holomorphic maps and use Schwarz’s lemma and the chain rule to increase without destroying injectivity; see the previous lecture notes for details. The conformal map is unique up to Möbius automorphisms of the disk; one can fix the map by picking two distinct points in , and requiring to be zero and to be positive real.

It is a beautiful observation of Thurston that the concept of a conformal mapping has a discrete counterpart, namely the mapping of one circle packing to another. Furthermore, one can run a version of Koebe’s argument (using now a discrete version of Perron’s method) to prove the Riemann mapping theorem through circle packings. In principle, this leads to a mostly elementary approach to conformal geometry, based on extremely classical mathematics that goes all the way back to Apollonius. However, in order to *prove* the basic existence and uniqueness theorems of circle packing, as well as the convergence to conformal maps in the continuous limit, it seems to be necessary (or at least highly convenient) to use much more modern machinery, including the theory of quasiconformal mapping, and also the Riemann mapping theorem itself (so in particular we are not structuring these notes to provide a completely independent proof of that theorem, though this may well be possible).

To make the above discussion more precise we need some notation.

Definition 2 (Circle packing) A (finite) *circle packing* is a finite collection of circles in the complex numbers indexed by some finite set , whose interiors are all disjoint (but which are allowed to be tangent to each other), and whose union is connected. The *nerve* of a circle packing is the finite graph whose vertices are the centres of the circle packing, with two such centres connected by an edge if the circles are tangent. (In these notes all graphs are undirected, finite and simple, unless otherwise specified.)

It is clear that the nerve of a circle packing is connected and planar, since one can draw the nerve by placing each vertex (tautologically) in its location in the complex plane, and drawing each edge by the line segment between the centres of the circles it connects (this line segment will pass through the point of tangency of the two circles). Later in these notes we will also have to consider some infinite circle packings, most notably the infinite regular hexagonal circle packing.

The first basic theorem in the subject is the following converse statement:

Theorem 3 (Circle packing theorem) Every connected planar graph is the nerve of a circle packing.

Of course, there can be multiple circle packings associated to a given connected planar graph; indeed, since reflections across a line and Möbius transformations map circles to circles (or lines), they will map circle packings to circle packings (unless one or more of the circles is sent to a line). It turns out that once one adds enough edges to the planar graph, the circle packing is otherwise rigid:

Theorem 4 (Koebe-Andreev-Thurston theorem) If a connected planar graph is maximal (i.e., no further edge can be added to it without destroying planarity), then the circle packing given by the above theorem is unique up to reflections and Möbius transformations.

Exercise 5 Let $G$ be a connected planar graph with $n \geq 3$ vertices. Show that the following are equivalent:

- (i) $G$ is a maximal planar graph.
- (ii) $G$ has $3n-6$ edges.
- (iii) Every drawing of $G$ divides the plane into faces that have three edges each. (This includes one unbounded face.)
- (iv) At least one drawing of $G$ divides the plane into faces that have three edges each.

(Hint: use Euler’s formula $V - E + F = 2$, where $F$ is the number of faces including the unbounded face.)
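The counting in this exercise can be illustrated on a concrete example. The sketch below uses the octahedron graph (a standard maximal planar graph, chosen here purely for illustration) and checks that its edge count is $3n-6$, and that the face count predicted by Euler's formula $V - E + F = 2$ is consistent with every face having three edges, that is, $3F = 2E$. It only checks the arithmetic; it does not test planarity or maximality.

```python
def face_count_from_euler(num_vertices, num_edges):
    """Number of faces (including the unbounded one) of a connected planar drawing."""
    return 2 - num_vertices + num_edges


# Octahedron: vertices 0..5, with antipodal pairs (0,1), (2,3), (4,5).
# Every pair of vertices is adjacent except antipodal ones; i ^ 1 is the antipode of i.
n = 6
octahedron_edges = {
    frozenset((i, j))
    for i in range(n)
    for j in range(i + 1, n)
    if j != i ^ 1
}

num_edges = len(octahedron_edges)
num_faces = face_count_from_euler(n, num_edges)

print(num_edges == 3 * n - 6)          # edge count of a maximal planar graph
print(3 * num_faces == 2 * num_edges)  # each face has 3 edges, each edge borders 2 faces
```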

Thurston conjectured that circle packings can be used to approximate the conformal map arising in the Riemann mapping theorem. Here is an informal statement:

Conjecture 6 (Informal Thurston conjecture) Let be a simply connected domain, with two distinct points . Let be the conformal map from to that maps to the origin and to a positive real. For any small , let be the portion of the regular hexagonal circle packing by circles of radius that are contained in , and let be a circle packing of with all “boundary circles” tangent to , giving rise to an “approximate map” defined on the subset of consisting of the circles of , their interiors, and the interstitial regions between triples of mutually tangent circles. Normalise this map so that is zero and is a positive real. Then converges to as .

A rigorous version of this conjecture was proven by Rodin and Sullivan. Besides some elementary geometric lemmas (regarding the relative sizes of various configurations of tangent circles), the main ingredients are a rigidity result for the regular hexagonal circle packing, and the theory of quasiconformal maps. Quasiconformal maps are what seem on the surface to be a very broad generalisation of the notion of a conformal map. Informally, conformal maps take infinitesimal circles to infinitesimal circles, whereas quasiconformal maps take infinitesimal circles to infinitesimal ellipses of bounded eccentricity. In terms of Wirtinger derivatives, conformal maps obey the Cauchy-Riemann equation , while (sufficiently smooth) quasiconformal maps only obey an inequality . As such, quasiconformal maps are considerably more plentiful than conformal maps, and in particular it is possible to create piecewise smooth quasiconformal maps by gluing together various simple maps such as affine maps or Möbius transformations; such piecewise maps will naturally arise when trying to rigorously build the map alluded to in the above conjecture. On the other hand, it turns out that quasiconformal maps still have many vestiges of the rigidity properties enjoyed by conformal maps; for instance, there are quasiconformal analogues of fundamental theorems in conformal mapping such as the Schwarz reflection principle, Liouville’s theorem, or Hurwitz’s theorem. Among other things, these quasiconformal rigidity theorems allow one to create conformal maps from the limit of quasiconformal maps in many circumstances, and this will be how the Thurston conjecture will be proven. A key technical tool in establishing these sorts of rigidity theorems will be the theory of an important quasiconformal (quasi-)invariant, the *conformal modulus* (or, equivalently, the extremal length, which is the reciprocal of the modulus).

** — 1. Proof of the circle packing theorem — **

We loosely follow the treatment of Beardon and Stephenson. It is slightly more convenient to temporarily work in the Riemann sphere rather than the complex plane , in order to more easily use Möbius transformations. (Later we will make another change of venue, working in the Poincaré disk instead of the Riemann sphere.)

Define a *Riemann sphere circle* to be either a circle in or a line in together with , together with one of the two components of the complement of this circle or line designated as the “interior”. In the case of a line, this “interior” is just one of the two half-planes on either side of the line; in the case of the circle, this is either the usual interior or the usual exterior plus the point at infinity; in the last case, we refer to the Riemann sphere circle as an *exterior circle*. (One could also equivalently work with an orientation on the circle rather than assigning an interior, since the interior could then be described as the region to (say) the left of the circle as one traverses the circle along the indicated orientation.) Note that Möbius transforms map Riemann sphere circles to Riemann sphere circles. If one views the Riemann sphere as a geometric sphere in Euclidean space , then Riemann sphere circles are just circles on this geometric sphere, which then have a centre on this sphere that lies in the region designated as the interior of the circle. We caution though that this “Riemann sphere” centre does not always correspond to the Euclidean notion of the centre of a circle. For instance, the real line, with the upper half-plane designated as interior, will have as its Riemann sphere centre; if instead one designates the lower half-plane as the interior, the Riemann sphere centre will now be . We can then define a Riemann sphere circle packing in exact analogy with circle packings in , namely finite collections of Riemann sphere circles whose interiors are disjoint and whose union is connected; we also define the nerve as before. This is now a graph that can be drawn in the Riemann sphere, using great circle arcs in the Riemann sphere rather than line segments; it is also planar, since one can apply a Möbius transformation to move all the points and edges of the drawing away from infinity.

By Exercise 5, a maximal planar graph with at least three vertices can be drawn as a triangulation of the Riemann sphere. If there are at least four vertices, then it is easy to see that each vertex has degree at least three (a vertex of degree zero, one or two in a triangulation with simple edges will lead to a connected component of at most three vertices). It is a topological fact, not established here, that any two triangulations of such a graph are homotopic up to reflection (to reverse the orientation). If a Riemann sphere circle packing has the nerve of a maximal planar graph of at least four vertices, then we see that this nerve induces an explicit triangulation of the Riemann sphere by connecting the centres of any pair of tangent circles with the great circle arc that passes through the point of tangency. If was not maximal, one no longer gets a triangulation this way, but one still obtains a partition of the Riemann sphere into spherical polygons.

We remark that the triangles in this triangulation can also be described purely from the abstract graph . Define a *triangle* in to be a triple of vertices in which are all adjacent to each other, and such that the removal of these three vertices from does not disconnect the graph. One can check that there is a one-to-one correspondence between such triangles in a maximal planar graph and the triangles in any Riemann sphere triangulation of this graph.

Theorems 3, 4 are then a consequence of

Theorem 7 (Riemann sphere circle packing theorem) Let be a maximal planar graph with at least four vertices, drawn as a triangulation of the Riemann sphere. Then there exists a Riemann sphere circle packing with nerve whose triangulation is homotopic to the given triangulation. Furthermore, this packing is unique up to Möbius transformations.

Exercise 8 Deduce Theorems 3, 4 from Theorem 7. (Hint: If one has a non-maximal planar graph for Theorem 3, add a vertex at the interior of each non-triangular face of a drawing of that graph, and connect that vertex to the vertices of the face, to create a maximal planar graph to which Theorem 4 or Theorem 7 can be applied. Then delete these “helper vertices” to create a packing of the original planar graph that does not contain any “unwanted” tangencies. You may use without proof the above assertion that any two triangulations of a maximal planar graph are homotopic up to reflection.)

Exercise 9 Verify Theorem 7 when has exactly four vertices. (Hint: for the uniqueness, one can use Möbius transformations to move two of the circles to become parallel lines.)

To prove this theorem, we will make a reduction with regards to the existence component of Theorem 7. For technical reasons we will need to introduce a notion of non-degeneracy. Let be a maximal planar graph with at least four vertices, and let be a vertex in . As discussed above, the degree of is at least three. Writing the neighbours of in clockwise or counterclockwise order (with respect to a triangulation) as (starting from some arbitrary neighbour), we see that each is adjacent to and (with the conventions and ). We say that is *non-degenerate* if there are no further adjacencies between the , and if there is at least one further vertex in besides . Here is another characterisation:

Exercise 10 Let be a maximal planar graph with at least four vertices, let be a vertex in , and let be the neighbours of . Show that the following are equivalent:

- (i) is non-degenerate.
- (ii) The graph is connected and non-empty, and every vertex in is adjacent to at least one vertex in .

We will then derive Theorem 7 from

Theorem 11 (Inductive step) Let be a maximal planar graph with at least four vertices , drawn as a triangulation of the Riemann sphere. Let be a non-degenerate vertex of , and let be the graph formed by deleting (and edges emanating from ) from . Suppose that there exists a Riemann sphere circle packing whose nerve is *at least* (that is, and are tangent whenever are adjacent in , although we also allow additional tangencies), and whose associated subdivision of the Riemann sphere into spherical polygons is homotopic to the given triangulation with removed. Then there is a Riemann sphere circle packing with nerve whose triangulation is homotopic to the given triangulation. Furthermore this circle packing is unique up to Möbius transformations.

Let us now see how Theorem 7 follows from Theorem 11. Fix as in Theorem 7. By Exercise 9 and induction we may assume that has at least five vertices, and that the claim has been proven for any smaller number of vertices.

First suppose that contains a non-degenerate vertex . Let be the neighbours of . One can then form a new graph with one fewer vertex by deleting , and then connecting to (one can think of this operation as contracting the edge to a point). One can check that this is still a maximal planar graph that can triangulate the Riemann sphere in a fashion compatible with the original triangulation of (in that all the common vertices, edges, and faces are unchanged). By induction hypothesis, is the nerve of a circle packing that is compatible with this triangulation, and hence this circle packing has nerve at least . Applying Theorem 11, we then obtain the required claim for .

Now suppose that contains a degenerate vertex . Let be the neighbours of traversed in order. By hypothesis, there is an additional adjacency between the ; by relabeling we may assume that is adjacent to for some . The vertices in can then be partitioned as

where denotes those vertices in that lie in the region enclosed by the loop that does not contain , and denotes those vertices in that lie in the region enclosed by the loop that does not contain . One can then form two graphs , formed by restricting to the vertices and respectively; furthermore, these graphs are also maximal planar (with triangulations that are compatible with those of ). By induction hypothesis, we can find a circle packing with nerve , and a circle packing with nerve . Note that the circles are mutually tangent, as are . By applying a Möbius transformation one may assume that these circles agree, thus (cf. Exercise 9) , . The complement of these three circles (and their interiors) determines two connected “interstitial” regions (that are in the shape of an arbelos, up to Möbius transformation); one can check that the remaining circles in will lie in one of these regions, and the remaining circles in lie in the other. Hence one can glue these circle packings together to form a single circle packing with nerve , which is homotopic to the given triangulation. Also, since a Möbius transformation that fixes three mutually tangent circles has to be the identity, the uniqueness of this circle packing up to Möbius transformations follows from the uniqueness for the two component circle packings , .

It remains to prove Theorem 11. To help fix the freedom to apply Möbius transformations, we can normalise the target circle packing so that is the exterior circle , thus all the other circles in the packing will lie in the closed unit disk . Similarly, by applying a suitable Möbius transformation one can assume that lies outside of the interior of all the circles in the original packing, and after a scaling one may then assume that all the circles lie in the unit disk .

At this point it becomes convenient to switch from the “elliptic” conformal geometry of the Riemann sphere to the “hyperbolic” conformal geometry of the unit disk . Recall that the Möbius transformations that preserve the disk are given by the maps

for real and (see Theorem 19 of these notes). It comes with a natural metric that interacts well with circles:

Exercise 12 Define the Poincaré distance between two points of by the formula

Given a measurable subset of , define the *hyperbolic area* of to be the quantity

where is the Euclidean area element on .

- (i) Show that the Poincaré distance is invariant with respect to Möbius automorphisms of , thus whenever is a transformation of the form (1). Similarly show that the hyperbolic area is invariant with respect to such transformations.
- (ii) Show that the Poincaré distance defines a metric on . Furthermore, show that any two distinct points are connected by a unique geodesic, which is a portion of either a line or a circle that meets the unit circle orthogonally at two points. (Hint: use the symmetries of (i) to normalise the points one is studying.)
- (iii) If is a circle in the interior of , show that there exists a point in and a positive real number (which we call the *hyperbolic center* and *hyperbolic radius* respectively) such that . (In general, the hyperbolic center and radius will not quite agree with their familiar Euclidean counterparts.) Conversely, show that for any and , the set is a circle in .
- (iv) If two circles in are externally tangent, show that the geodesic connecting the hyperbolic centers passes through the point of tangency, orthogonally to the two tangent circles.
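As a numerical sanity check on part (i), here is a minimal Python sketch. It assumes the curvature -1 normalisation d(z,w) = 2 artanh(|z-w| / |1 - z conj(w)|) for the Poincaré distance and the form z -> e^{i theta} (z - alpha) / (1 - conj(alpha) z) for the disk automorphisms; both formulas, and all function names, are assumptions of this sketch rather than something taken from the notes.

```python
import cmath
import math

def poincare_distance(z, w):
    """Poincare distance on the unit disk (assumed curvature -1 normalisation)."""
    return 2.0 * math.atanh(abs(z - w) / abs(1 - z * w.conjugate()))

def disk_automorphism(theta, alpha):
    """Return the map z -> e^{i theta} (z - alpha) / (1 - conj(alpha) z), with |alpha| < 1."""
    rotation = cmath.exp(1j * theta)

    def phi(z):
        return rotation * (z - alpha) / (1 - alpha.conjugate() * z)

    return phi

def euclidean_radius_at_origin(r):
    """Euclidean radius of the hyperbolic circle of hyperbolic radius r centred at 0."""
    return math.tanh(r / 2.0)

z, w = 0.3 + 0.2j, -0.5 + 0.1j
phi = disk_automorphism(0.7, 0.4 - 0.3j)
# The two printed distances should agree up to rounding error.
print(poincare_distance(z, w), poincare_distance(phi(z), phi(w)))
```

The helper `euclidean_radius_at_origin` illustrates part (iii) in the one case where the hyperbolic and Euclidean centres coincide; for circles centred elsewhere one can first move the hyperbolic centre to the origin with an automorphism.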

Exercise 13 (Schwarz-Pick theorem) Let be a holomorphic map. Show that for all . If , show that equality occurs if and only if is a Möbius automorphism (1) of . (This result is known as the Schwarz-Pick theorem.)
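The contraction property in this exercise can be probed numerically. The sketch below assumes the same curvature -1 formula for the Poincaré distance as is standard (an assumption of the sketch), samples random pairs of points in the disk, and records the largest observed ratio between the distance of the images and the distance of the points. The function names are hypothetical.

```python
import math
import random

def poincare_distance(z, w):
    """Poincare distance on the unit disk (assumed curvature -1 normalisation)."""
    return 2.0 * math.atanh(abs(z - w) / abs(1 - z * w.conjugate()))

def random_disk_point(rng, max_modulus=0.95):
    """Random point of the disk of radius max_modulus, uniform in Euclidean area."""
    radius = max_modulus * math.sqrt(rng.random())
    angle = 2.0 * math.pi * rng.random()
    return complex(radius * math.cos(angle), radius * math.sin(angle))

def max_distance_ratio(f, samples=2000, seed=0):
    """Largest observed value of d(f(z), f(w)) / d(z, w) over random pairs."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(samples):
        z, w = random_disk_point(rng), random_disk_point(rng)
        if z == w:
            continue
        ratio = poincare_distance(f(z), f(w)) / poincare_distance(z, w)
        worst = max(worst, ratio)
    return worst

# z -> z^2 maps the disk to itself but is not an automorphism,
# so the ratio should stay below 1; a rotation should give ratio 1.
print(max_distance_ratio(lambda z: z * z))
print(max_distance_ratio(lambda z: 1j * z))
```

A numerical experiment of this kind proves nothing, of course, but it is a quick way to catch a wrong normalisation of the distance.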

We will refer to circles that lie in the closure of the unit disk as *hyperbolic circles*. These can be divided into the finite radius hyperbolic circles, which lie in the interior of the unit disk (as per part (iii) of the above exercise), and the horocycles, which are internally tangent to the unit circle. By convention, we view horocycles as having infinite radius, and having center at their point of tangency to the unit circle; they can be viewed as the limiting case of finite radius hyperbolic circles when the radius goes to infinity and the center goes off to the boundary of the disk (at the same rate as the radius, as measured with respect to the Poincaré distance). We write for the hyperbolic circle with hyperbolic centre and hyperbolic radius (thus either and , or and is on the unit circle); there is an annoying caveat that when there is more than one horocycle with hyperbolic centre , but we will tolerate this breakdown of functional dependence of on and in order to simplify the notation. A *hyperbolic circle packing* is a circle packing in which all circles are hyperbolic circles.

We also observe that the geodesic structure extends to the boundary of the unit disk: for any two distinct points in , there is a unique geodesic that connects them.

In view of the above discussion, Theorem 7 may now be formulated as follows:

Theorem 14 (Inductive step, hyperbolic formulation) Let be a maximal planar graph with at least four vertices , let be a non-degenerate vertex of , and let be the vertices adjacent to . Suppose that there exists a hyperbolic circle packing whose nerve is at least . Then there is a hyperbolic circle packing homotopic to such that the boundary circles , are all horocycles. Furthermore, this packing is unique up to Möbius automorphisms (1) of the disk .

Indeed, once one adjoins the exterior unit circle to , one obtains a Riemann sphere circle packing whose nerve is at least , and hence equal to since is maximal.

To prove this theorem, the intuition is to “inflate” the hyperbolic radius of the circles of until the boundary circles all become infinite radius (i.e., horocycles). The difficulty is that one cannot just arbitrarily increase the radius of any given circle without destroying the required tangency properties. The resolution to this difficulty given in the work of Beardon and Stephenson that we are following here was inspired by Perron’s method of subharmonic functions, in which one faced an analogous difficulty that one could not easily manipulate a harmonic function without destroying its harmonicity. There, the solution was to work instead with the more flexible class of subharmonic functions; here we similarly work with the concept of a *subpacking*.

We will need some preliminaries to define this concept precisely. We first need some hyperbolic trigonometry. We define a hyperbolic triangle to be the solid (and closed) region in enclosed by three distinct points in and the geodesic arcs connecting them. (Note that we allow one or more of the vertices to be on the boundary of the disk, so that the sides of the triangle could have infinite length.) Let be the space of triples with and not all of infinite. We say that a hyperbolic triangle with vertices is a *-triangle* if there are hyperbolic circles with the indicated hyperbolic centres and hyperbolic radii that are externally tangent to each other; note that this implies that the sidelengths opposite have length respectively (see Figure 3 of Beardon and Stephenson). It is easy to see that for any , there exists a unique -triangle in up to reflections and Möbius automorphisms (use Möbius transforms to fix two of the hyperbolic circles, and consider all the circles externally tangent to both of these circles; the case when one or two of the are infinite may need to be treated separately). As a consequence, there is a well-defined angle for subtended by the vertex of an triangle. We need some basic facts from hyperbolic geometry:

Exercise 15 (Hyperbolic trigonometry)

- (i) (Hyperbolic cosine rule) For any , show that the quantity is equal to the ratio
Furthermore, establish the limiting angles

(Hint: to facilitate computations, use a Möbius transform to move the vertex to the origin when the radius there is finite.) Conclude in particular that is continuous (using the topology of the extended real line for each component of ). Discuss how this rule relates to the Euclidean cosine rule in the limit as go to zero. Of course, by relabeling one obtains similar formulae for and .

- (ii) (Area rule) Show that the area of a hyperbolic triangle is given by , where are the angles of the hyperbolic triangle. (Hint: there are several ways to proceed. For instance, one can prove this for small hyperbolic triangles (of diameter ) up to errors of size after normalising as in (i), and then establish the general case by subdividing a large hyperbolic triangle into many small hyperbolic triangles. This rule is also a special case of the Gauss-Bonnet theorem in Riemannian geometry. One can also first establish the case when several of the radii are infinite, and use that to derive finite cases.) In particular, the area of a -triangle is given by the formula
- (iii) Show that the area of the interior of a hyperbolic circle with is equal to .
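To experiment with these rules, here is a minimal Python sketch for finite radii. It assumes the curvature -1 normalisation, in which the hyperbolic cosine rule for a triangle with sides a, b, c and angle alpha opposite a reads cosh a = cosh b cosh c - sinh b sinh c cos alpha, the area of a triangle is pi minus the sum of its angles, and the area of a disk of hyperbolic radius r is 4 pi sinh^2(r/2). These formulas and the function names are assumptions of the sketch; infinite radii are not handled.

```python
import math

def angle(r1, r2, r3):
    """Angle at the first vertex of an (r1, r2, r3)-triangle, all radii finite.

    The sides adjacent to that vertex have lengths r1 + r2 and r1 + r3,
    and the opposite side has length r2 + r3.
    """
    b, c, a = r1 + r3, r1 + r2, r2 + r3
    cos_alpha = (math.cosh(b) * math.cosh(c) - math.cosh(a)) / (
        math.sinh(b) * math.sinh(c)
    )
    # Clamp to [-1, 1] to guard against rounding error.
    return math.acos(max(-1.0, min(1.0, cos_alpha)))

def triangle_area(r1, r2, r3):
    """Area of an (r1, r2, r3)-triangle via the angle-defect formula."""
    return math.pi - angle(r1, r2, r3) - angle(r2, r3, r1) - angle(r3, r1, r2)

def disk_area(r):
    """Area of a hyperbolic disk of hyperbolic radius r."""
    return 4.0 * math.pi * math.sinh(r / 2.0) ** 2

# For very small radii the geometry is nearly Euclidean: three equal
# circles give an almost equilateral triangle with angles close to pi/3.
print(angle(1e-3, 1e-3, 1e-3), math.pi / 3)
print(triangle_area(1.0, 2.0, 3.0), disk_area(1.0))
```

The small-radius check reflects the last part of (i): as the radii go to zero, the hyperbolic cosine rule degenerates to the Euclidean one.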

Henceforth we fix as in Theorem 14. We refer to the vertices as *boundary vertices* of and the remaining vertices as *interior vertices*; edges between boundary vertices are *boundary edges*, all other edges will be called *interior edges* (including edges that have one vertex on the boundary). Triangles in that involve two boundary vertices (and thus necessarily one interior vertex) will be called *boundary triangles*; all other triangles (including ones that involve one boundary vertex) will be called *interior triangles*. To any triangle of , we can form the hyperbolic triangle with vertices ; this is an -triangle. Let denote the collection of such hyperbolic triangles; because is a packing, we see that these triangles have disjoint interiors. They also fit together in the following way: if is a side of a hyperbolic triangle in , then there will be another hyperbolic triangle in that shares that side precisely when is associated to an interior edge of . The union of all these triangles is homeomorphic to the region formed by starting with a triangulation of the Riemann sphere by and removing the triangles containing as a vertex, and is therefore homeomorphic to a disk. One can think of the collection of hyperbolic triangles, together with the vertices and edges shared by these triangles, as a two-dimensional (hyperbolic) simplicial complex, though we will not develop the full machinery of such complexes here.

Our objective is to find another hyperbolic circle packing homotopic to the existing circle packing , such that all the boundary circles (circles centred at boundary vertices) are horocycles. We observe that such a hyperbolic circle packing is completely described (up to Möbius transformations) by the hyperbolic radii of these circles. Indeed, suppose one knows the values of these hyperbolic radii. Then each hyperbolic triangle in is associated to a hyperbolic triangle whose sides and angles are known from Exercise 15. As the orientation of each hyperbolic triangle is fixed, each hyperbolic triangle is determined up to a Möbius automorphism of . Once one fixes one hyperbolic triangle, the adjacent hyperbolic triangles (that share a common side with the first triangle) are then also fixed; continuing in this fashion we see that the entire hyperbolic circle packing is determined.

On the other hand, not every choice of radii will lead to a hyperbolic circle packing with the required properties. There are two obvious constraints that need to be satisfied:

- (i) (Local constraint) The angles of all the hyperbolic triangles around any given interior vertex must sum to exactly .
- (ii) (Boundary constraint) The radii associated to boundary vertices must be infinite.
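The local constraint can be made concrete as follows. This is a minimal sketch for finite radii only, assuming the curvature -1 hyperbolic cosine rule for the angles; the function names and the representation of the neighbours of a vertex as a cyclically ordered list of radii are hypothetical choices for illustration.

```python
import math

def angle(r1, r2, r3):
    """Angle at the first vertex of an (r1, r2, r3)-triangle (finite radii)."""
    b, c, a = r1 + r3, r1 + r2, r2 + r3
    cos_alpha = (math.cosh(b) * math.cosh(c) - math.cosh(a)) / (
        math.sinh(b) * math.sinh(c)
    )
    return math.acos(max(-1.0, min(1.0, cos_alpha)))

def angle_sum(center_radius, neighbour_radii):
    """Total angle at an interior vertex.

    neighbour_radii lists the radii of the neighbouring circles in cyclic
    order; consecutive neighbours (and the last with the first) form the
    triangles around the vertex.
    """
    n = len(neighbour_radii)
    return sum(
        angle(center_radius, neighbour_radii[i], neighbour_radii[(i + 1) % n])
        for i in range(n)
    )

def local_defect(center_radius, neighbour_radii):
    """Angle sum minus 2*pi; zero exactly when the local constraint holds."""
    return angle_sum(center_radius, neighbour_radii) - 2.0 * math.pi

# Tiny equal circles behave almost like Euclidean pennies: six of them
# nearly close up around a seventh, five leave a gap, seven overlap.
for degree in (5, 6, 7):
    print(degree, local_defect(1e-3, [1e-3] * degree))
```

A positive defect means the central circle is too small for its neighbours, which is the situation a subpacking (defined below in the notes) is allowed to be in.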

There could potentially also be a global constraint, in that one requires the circles of the packing to be disjoint – including circles that are not necessarily adjacent to each other. In general, one can easily create configurations of circles that are local circle packings but not global ones (see e.g., Figure 7 of Beardon-Stephenson). However, it turns out that one can use the boundary constraint and topological arguments to prevent this from happening. We first need a topological lemma:

Lemma 16 (Topological lemma) Let be bounded connected open subsets of with simply connected, and let be a continuous map such that and . Suppose furthermore that the restriction of to is a local homeomorphism. Then is in fact a global homeomorphism.

The requirement that the restriction of to be a local homeomorphism can in fact be relaxed to local injectivity thanks to the invariance of domain theorem. The complex numbers can be replaced here by any finite-dimensional vector space.

*Proof:* The preimage of any point in the interior of is closed, discrete, and disjoint from , and is hence finite. Around each point in the preimage, there is a neighbourhood on which is a homeomorphism onto a neighbourhood of . If one deletes the closure of these neighbourhoods, the image under is compact and avoids , and thus avoids a neighbourhood of . From this we can show that is a covering map from to . As the base is simply connected, it is its own universal cover, and hence (by the connectedness of ) must be a homeomorphism as claimed.

Proposition 17 Suppose we assign a radius to each that obeys the local constraint (i) and the boundary constraint (ii). Then there is a hyperbolic circle packing with nerve and the indicated radii.

*Proof:* We first create the hyperbolic triangles associated with the required hyperbolic circle packing, and then verify that this indeed arises from a circle packing.

Start with a single triangle in , and arbitrarily select a -triangle with the same orientation as . By Exercise 15(i), such a triangle exists (and is unique up to Möbius automorphisms of the disk). If a hyperbolic triangle has been fixed, and (say) is an adjacent triangle in , we can select to be the unique -triangle with the same orientation as that shares the side in common with (with the and vertices agreeing). Similarly for other permutations of the labels. As is a maximal planar graph with non-degenerate (so in particular the set of internal vertices is connected), we can continue this construction to eventually fix every triangle in . There is the potential issue that a given triangle may depend on the order in which one arrives at that triangle starting from , but one can check from a monodromy argument (in the spirit of the monodromy theorem) using the local constraint (i) and the simply connected nature of the triangulation associated to that there is in fact no dependence on the order. (The process resembles that of laying down jigsaw pieces in the shape of hyperbolic triangles together, with the local constraint ensuring that there is always a flush fit locally.)

Now we show that the hyperbolic triangles have disjoint interiors inside the disk . Let denote the topological space formed by taking the disjoint union of the hyperbolic triangles (now viewed as abstract topological spaces rather than subsets of the disk) and then gluing together all common edges, e.g. identifying the edge of with the same edge of if and are adjacent triangles in . This space is homeomorphic to the union of the original hyperbolic triangles , and is thus homeomorphic to the closed unit disk. There is an obvious projection map from to the union of the , which maps the abstract copy in of a given hyperbolic triangle to its concrete counterpart in in the obvious fashion. This map is continuous. It does not quite cover the full closed disk, mainly because (by the boundary condition (ii)) the boundary hyperbolic triangles touch the boundary of the disk at the vertices associated to and but do not follow the boundary arc connecting these vertices, being bounded instead by the geodesic from the vertex to the vertex; the missing region is a lens-shaped region bounded by two circular arcs. However, by applying another homeomorphism (that does not alter the edges from to or to ), one can “push out” the edge of this hyperbolic triangle across the lens to become the boundary arc from to . If one performs this modification for each boundary triangle, one arrives at a modified continuous map from to , which now has the property that the boundary of maps to the boundary of the disk, and the interior of maps to the interior of the disk. Also one can check that this map is a local homeomorphism. By Lemma 16, is injective; undoing the boundary modifications we conclude that is injective. Thus the hyperbolic triangles have disjoint interiors. 
Furthermore, the arguments show that for each boundary triangle , the lens-shaped regions between the boundary arc between the vertices associated to and the corresponding edge of the boundary triangle are also disjoint from the hyperbolic triangles and from each other. On the other hand, all of the hyperbolic circles and in and their interiors are contained in the union of the hyperbolic triangles and the lens-shaped regions, with each hyperbolic triangle containing portions only of the hyperbolic circles with hyperbolic centres at the vertices of the triangle, and similarly for the lens-shaped regions. From this one can verify that the interiors of the hyperbolic circles are all disjoint from each other, and give a hyperbolic circle packing with the required properties.

In view of the above proposition, the only remaining task is to find an assignment of radii obeying both the local condition (i) and the boundary condition (ii). This is analogous to finding a harmonic function with specified boundary data. To do this, we perform the following analogue of Perron’s method. Define a *subpacking* to be an assignment of radii obeying the following

- (i’) (Local sub-condition) The angles around any given interior vertex sum to at least .

This can be compared with the definition of a (smooth) subharmonic function as one where the Laplacian is always at least zero. Note that we always have at least one subpacking, namely the one provided by the radii of the original hyperbolic circle packing . Intuitively, in each subpacking, the radius at an interior vertex is either “too small” or “just right”.

We now need a key monotonicity property, analogous to how the maximum of two subharmonic functions is again subharmonic:

Exercise 18

- (i) Show that the angle (as defined in Exercise 15(i)) is strictly decreasing in and strictly increasing in or (if one holds the other two radii fixed). Do these claims agree with your geometric intuition?
- (ii) Conclude that whenever and are subpackings, that is also a subpacking.
- (iii) Let be such that for . Show that , with equality if and only if for all . (Hint: increase just one of the radii . One can either use calculus (after first disposing of various infinite radii cases) or one can argue geometrically.)
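The monotonicity claims are easy to test numerically for finite radii. The sketch below again assumes the curvature -1 cosine rule and angle-defect area formula; it is an illustration with hypothetical function names, not a proof.

```python
import math

def angle(r1, r2, r3):
    """Angle at the first vertex of an (r1, r2, r3)-triangle (finite radii)."""
    b, c, a = r1 + r3, r1 + r2, r2 + r3
    cos_alpha = (math.cosh(b) * math.cosh(c) - math.cosh(a)) / (
        math.sinh(b) * math.sinh(c)
    )
    return math.acos(max(-1.0, min(1.0, cos_alpha)))

def triangle_area(r1, r2, r3):
    """Area of an (r1, r2, r3)-triangle via the angle-defect formula."""
    return math.pi - angle(r1, r2, r3) - angle(r2, r3, r1) - angle(r3, r1, r2)

def is_strictly_monotone(values, increasing):
    """Check that consecutive entries of values strictly increase or decrease."""
    pairs = zip(values, values[1:])
    return all((x < y) if increasing else (x > y) for x, y in pairs)

radii = [0.25 * k for k in range(1, 21)]
own = [angle(r, 1.0, 1.0) for r in radii]          # vary the vertex's own radius
other = [angle(1.0, r, 1.0) for r in radii]        # vary a neighbouring radius
areas = [triangle_area(r, 1.0, 1.0) for r in radii]

print(is_strictly_monotone(own, increasing=False))
print(is_strictly_monotone(other, increasing=True))
print(is_strictly_monotone(areas, increasing=True))
```

Geometrically, inflating the circle at a vertex pushes its two neighbours apart more slowly than it pushes them away, so the angle they subtend shrinks, while inflating a neighbour widens that angle.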

As with Perron’s method, we can now try to construct a hyperbolic circle packing by taking the supremum of all the subpackings. To avoid degeneracies we need an upper bound:

Proposition 19 (Upper bound) Let be a subpacking. Then for any interior vertex of degree , one has .

The precise value of is not so important for our arguments, but the fact that it is finite will be. This boundedness of interior circles in a circle packing is a key feature of hyperbolic geometry that is not present in Euclidean geometry, and is one of the reasons why we moved to a hyperbolic perspective in the first place.

*Proof:* By the subpacking property and pigeonhole principle, there is a triangle in such that . The hyperbolic triangle associated to has area at most by (2); on the other hand, it contains a sector of a hyperbolic circle of radius and angle , and hence has area at least , thanks to Exercise 15(iii). Comparing the two bounds gives the claim.
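To see what the comparison of areas gives quantitatively, here is a small sketch under the curvature -1 normalisation, in which a hyperbolic triangle has area at most pi and a disk of hyperbolic radius r has area 4 pi sinh^2(r/2). With these assumed formulas, a sector of angle 2 pi / d fits inside area pi only if sinh^2(r/2) <= d/4. The resulting threshold below is what this particular computation yields and need not be the exact constant stated in the proposition; the function names are hypothetical.

```python
import math

def disk_area(r):
    """Area of a hyperbolic disk of hyperbolic radius r (curvature -1)."""
    return 4.0 * math.pi * math.sinh(r / 2.0) ** 2

def sector_area(r, d):
    """Area of a sector of angle 2*pi/d in a hyperbolic disk of radius r."""
    return disk_area(r) / d

def radius_threshold(d):
    """Largest r for which a sector of angle 2*pi/d has area at most pi."""
    return 2.0 * math.asinh(math.sqrt(d) / 2.0)

for d in range(3, 9):
    r = radius_threshold(d)
    # At the threshold the sector area equals pi; note also r <= sqrt(d).
    print(d, r, sector_area(r, d), math.sqrt(d))
```

The point emphasised in the text survives any choice of constant: the threshold is finite for each degree, which has no Euclidean analogue.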

Now define to be the (pointwise) supremum of all the subpackings. By the above proposition, is finite at every interior vertex. By Exercise 18, one can view as a monotone increasing limit of subpackings, and is thus again a subpacking (due to the continuity properties of as long as at least one of the radii stays bounded); thus is the maximal subpacking. On the other hand, if is finite at some boundary vertex, then by Exercise 18(i) one could replace that radius by a larger quantity without destroying the subpacking property, contradicting the maximality of . Thus all the boundary radii are infinite, that is to say the boundary condition (ii) holds. Finally, if the sum of the angles at an interior vertex is strictly greater than , then by Exercise 18 we could increase the radius at this vertex slightly without destroying the subpacking property at or at any other of the interior vertices, again contradicting the maximality of . Thus obeys the local condition (i), and we have demonstrated existence of the required hyperbolic circle packing.
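The "inflate until the angle sum is exactly right" idea can be tried out on the simplest configuration: a single interior vertex surrounded by d boundary vertices whose circles are all horocycles. The sketch below is an illustration under assumptions, not the argument of the notes. It assumes the curvature -1 cosine rule, from which the limiting angle at a vertex of radius r between two horocycle neighbours works out to 2 arctan(1 / sqrt(e^{2r} - 1)); this limit, the closed form -log(sin(pi/d)) used for comparison, and all names are derived for this sketch. Since the angle is strictly decreasing in r, bisection finds the radius at which the local constraint holds.

```python
import math

def angle_between_horocycles(r):
    """Limiting angle at a vertex of finite radius r whose two neighbours
    in the triangle are horocycles (assumed curvature -1 normalisation)."""
    return 2.0 * math.atan(1.0 / math.sqrt(math.expm1(2.0 * r)))

def solve_central_radius(d, lo=1e-9, hi=50.0, iterations=200):
    """Bisect for the radius r with d * angle_between_horocycles(r) = 2*pi.

    Requires d >= 3, so that the angle sum exceeds 2*pi at r = lo.
    """
    target = 2.0 * math.pi
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if d * angle_between_horocycles(mid) > target:
            lo = mid  # angle sum too large: the radius is too small, inflate it
        else:
            hi = mid  # angle sum too small: the radius has overshot
    return 0.5 * (lo + hi)

for d in range(3, 9):
    print(d, solve_central_radius(d), -math.log(math.sin(math.pi / d)))
```

For d = 3 the answer can be cross-checked in Euclidean terms: three mutually tangent horocycles placed symmetrically in the unit disk leave room for a central circle of Euclidean radius about 0.0718, whose hyperbolic radius 2 artanh(0.0718) is about 0.1438.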

Finally we establish uniqueness. It suffices to establish that is the unique tuple that obeys the local condition (i) and the boundary condition (ii). Suppose we had another tuple other than that obeyed these two conditions. Then by the maximality of , we have for all . By Exercise 18(iii), this implies that

for any triangle in . Summing over all triangles and using (2), we conclude that

where the inner sum is over the pairs such that forms a triangle in . But by the local condition (i) and the boundary condition (ii), the inner sum on either side is equal to for an interior vertex and for a boundary vertex. Thus the two sides agree, which by Exercise 18(iii) implies that for all . This proves Theorem 14 and thus Theorems 7, 3, 4.

** — 2. Quasiconformal maps — **

In this section we set up some of the foundational theory of quasiconformal maps, which generalise conformal maps by tolerating some deviation from perfect conformality while still retaining many of their good properties (such as being preserved under uniform limits). One notable caveat is that, in contrast to conformal maps, quasiconformal maps need not be smooth. As such, this theory will come in handy when proving convergence of circle packings to the Riemann map. The material here is largely drawn from the text of Lehto and Virtanen.

We first need the following refinement of the Riemann mapping theorem, known as Carathéodory’s theorem:

Theorem 20 (Carathéodory’s theorem) Let be a bounded simply connected domain in whose boundary is a Jordan curve, and let be a conformal map between and (as given by the Riemann mapping theorem). Then extends to a continuous homeomorphism from to .

The condition that be a Jordan curve is clearly necessary, since if is not simple then there are paths in that end up at different points in but have the same endpoint in after applying , which prevents being continuously extended to a homeomorphism.

*Proof:* We first prove continuous extension to the boundary. It suffices to show that for every point on the boundary of the unit disk, the diameters of the sets go to zero for some sequence of radii .

First observe from the change of variables formula that the area of is given by , where denotes Lebesgue measure (or the area element). In particular, this integral is finite. Expanding in polar coordinates around , we conclude that

Since diverges near , we conclude from the pigeonhole principle that there exists a sequence of radii decreasing to zero such that

and hence by Cauchy-Schwarz

If we let denote the circular arc , we conclude from this and the triangle inequality (and chain rule) that is a rectifiable curve with length going to zero as . Let denote the endpoints of this curve. Clearly they lie in . If (say) was in , then as is a homeomorphism from to , would have one endpoint in rather than , which is absurd. Thus lies in , and similarly for . Since the length of goes to zero, the distance between and goes to zero. Since is a Jordan curve, it can be parameterised homeomorphically by , and so by compactness we also see that the distance between the parameterisations of and in must also go to zero, hence (by uniform continuity of the inverse parameterisation) and are connected along by an arc whose diameter goes to zero. Combining this arc with , we obtain a Jordan curve of diameter going to zero which separates from the rest of . Sending to infinity, we see that (which decreases with ) must eventually map in the interior of this curve rather than the exterior, and so the diameter goes to zero as claimed.

The above construction shows that extends to a continuous map (which by abuse of notation we continue to call ) from to , and the proof also shows that maps to . As is a compact subset of that contains , it must surject onto . As both and are compact Hausdorff spaces, we will now be done if we can show injectivity. The only way injectivity can fail is if there are two distinct points on that map to the same point. Let be the line segment connecting with , then is a Jordan curve in that meets only at . divides into two regions; one of which must map to the interior of , which implies that there is an entire arc of which maps to the single point . But then by the Schwarz reflection principle, extends conformally across this arc and is constant in a non-isolated set, thus is constant everywhere by analytic continuation, which is absurd. This establishes the required injectivity.

This has the following consequence. Define a *Jordan quadrilateral* to be the open region enclosed by a Jordan curve with four distinct marked points on it in counterclockwise order, which we call the *vertices* of the quadrilateral. The arcs in connecting to or to will be called the *-sides*; the arcs connecting to or to will be called *-sides*. (Thus for instance each cyclic permutation of the vertices will swap the -sides and -sides, while keeping the interior region unchanged.) Key examples of Jordan quadrilaterals are the (Euclidean) rectangles, in which the vertices are the usual corners of the rectangle, traversed counterclockwise. The -sides then are line segments of some length , and the -sides are line segments of some length that are orthogonal to the -sides. A *vertex-preserving conformal map* from one Jordan quadrilateral to another will be a conformal map that extends to a homeomorphism from to that maps the corners of to the respective corners of (in particular, -sides get mapped to -sides, and similarly for -sides).

Exercise 21 Let be a Jordan quadrilateral with vertices .

- (i) Show that there exists and a conformal map to the upper half-plane (viewed as a subset of the Riemann sphere) that extends continuously to a homeomorphism and which maps to respectively. (Hint: first map to increasing elements of the real line, then use the intermediate value theorem to enforce .)
- (ii) Show that there is a vertex-preserving conformal map from to a rectangle. (Hint: use Schwarz-Christoffel mapping.)
- (iii) Show that the rectangle in part (ii) is unique up to affine transformations. (Hint: if one has a conformal map between rectangles that preserves the vertices, extend it via repeated use of the Schwarz reflection principle to an entire map.)

This allows for the following definition: the *conformal modulus* (or *modulus* for short, also called *module* in older literature) of a Jordan quadrilateral with vertices is the ratio , where are the lengths of the -sides and -sides of a rectangle that is conformal to in a vertex-preserving fashion. This is a number between and ; each cyclic permutation of the vertices replaces the modulus with its reciprocal. It is clear from construction that the modulus of a Jordan quadrilateral is unaffected by vertex-preserving conformal transformations.

Now we define quasiconformal maps. Informally, conformal maps are homeomorphisms that map infinitesimal circles to infinitesimal circles; quasiconformal maps are homeomorphisms that map infinitesimal circles to curves that differ from an infinitesimal circle by “bounded distortion”. However, for the purpose of setting up the foundations of the theory, it is slightly more convenient to work with rectangles instead of circles (it is easier to partition rectangles into subrectangles than disks into subdisks). We therefore introduce

Definition 22 Let . An orientation-preserving homeomorphism between two domains in is said to be *-quasiconformal* if one has for every Jordan quadrilateral in . (In these notes, we do not consider orientation-reversing homeomorphisms to be quasiconformal.)

Note that by cyclically permuting the vertices of , we automatically also obtain the inequality

or equivalently

for any Jordan quadrilateral. Thus it is not possible to have any -quasiconformal maps for (excluding the degenerate case when are empty), and a map is -conformal if and only if it preserves the modulus. In particular, conformal maps are -conformal; we will shortly establish that the converse claim is also true. It is also clear from the definition that the inverse of a -quasiconformal map is also -quasiconformal, and the composition of a -quasiconformal map and a -quasiconformal map is a -quasiconformal map.

It is helpful to have an alternate characterisation of the modulus that does not explicitly mention conformal mapping:

Proposition 23 (Alternate definition of modulus) Let be a Jordan quadrilateral with vertices . Then is the smallest quantity with the following property: for any Borel measurable one can find a curve in connecting one -side of to another, and which is locally rectifiable away from endpoints, such that

where denotes integration using the length element of (not to be confused with the contour integral ).

The reciprocal of this notion of modulus generalises to the concept of extremal length, which we will not develop further here.

*Proof:* Observe from the change of variables formula that if is a vertex-preserving conformal mapping between Jordan quadrilaterals , and is a locally rectifiable curve connecting one -side of to another, then is a locally rectifiable curve connecting one -side of to another, with

and

As a consequence, if the proposition holds for it also holds for . Thus we may assume without loss of generality that is a rectangle, which we may normalise to be with vertices , so that the modulus is . For any measurable , we have from Cauchy-Schwarz and Fubini’s theorem that

and hence by the pigeonhole principle there exists such that

On the other hand, if we set , then , and for any curve connecting the -side from to to the -side from to , we have

Thus is the best constant with the required property, proving the claim.
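The Cauchy-Schwarz and pigeonhole step in this proof has an exact discrete analogue that one can run. The sketch below uses its own labelling, which is an assumption and may not match the side conventions of the notes: the rectangle is [0, width] x [0, height], the competing curves are the horizontal segments crossing it, and the claim checked is that some segment has squared rho-length at most (width / height) times the integral of rho squared. All names are hypothetical.

```python
import random

def length_area_check(width, height, density=None, nx=40, ny=30, seed=0):
    """Return (best, bound) for a density sampled on an nx-by-ny grid.

    best  = smallest squared rho-length among the horizontal grid rows,
    bound = (width / height) * (Riemann sum of rho^2 over the rectangle).
    Cauchy-Schwarz on each row plus averaging over rows gives best <= bound.
    """
    rng = random.Random(seed)
    dx, dy = width / nx, height / ny
    if density is None:
        # Random nonnegative density, one value per grid cell.
        rho = [[rng.random() for _ in range(nx)] for _ in range(ny)]
    else:
        # Sample the supplied density at cell midpoints.
        rho = [
            [density((i + 0.5) * dx, (j + 0.5) * dy) for i in range(nx)]
            for j in range(ny)
        ]
    row_lengths = [sum(row) * dx for row in rho]
    area_integral = sum(v * v for row in rho for v in row) * dx * dy
    return min(row_lengths) ** 2, (width / height) * area_integral

print(length_area_check(3.0, 1.0))
# The constant density is extremal: the two numbers agree.
print(length_area_check(3.0, 1.0, density=lambda x, y: 1.0))
```

The second call mirrors the last step of the proof, where the constant weight shows that the constant cannot be improved.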

Here are some quick and useful consequences of this characterisation:

**Note: it would be more logical to reverse the order of the next two exercises, since Exercise 25 can be used to prove Exercise 24. I will do this at the conclusion of this course.**

Exercise 24 (Superadditivity)

- (i) If are disjoint Jordan quadrilaterals that share a common -side, and which can be glued together along this side to form a new Jordan quadrilateral , show that . If equality occurs, show that after conformally mapping to a rectangle (in a vertex-preserving fashion), , are mapped to subrectangles (formed by cutting the original parallel to the -side).
- (ii) If are disjoint Jordan quadrilaterals that share a common -side, and which can be glued together along this side to form a new Jordan quadrilateral , show that . If equality occurs, show that after conformally mapping to a rectangle (in a vertex-preserving fashion), , are mapped to subrectangles (formed by cutting the original parallel to the -side).

Exercise 25 (Rengel’s inequality) Let be a Jordan quadrilateral of area , let be the shortest (Euclidean) distance between a point on one -side and a point on the other -side, and similarly let be the shortest (Euclidean) distance between a point on one -side and a point on the other -side. Show that

and that equality in either case occurs if and only if is a rectangle.

Exercise 26 (Continuity from below) Suppose is a sequence of Jordan quadrilaterals which converge to another Jordan quadrilateral , in the sense that the vertices of converge to their respective counterparts in , each -side in converges (in the Hausdorff sense) to the -side of , and similarly for -sides. Suppose also that for all . Show that converges to . (Hint: map to a rectangle and use Rengel’s inequality.)

Proposition 27 (Local quasiconformality implies quasiconformality) Let , and let be an orientation-preserving homeomorphism between complex domains which is locally -quasiconformal in the sense that for every there is a neighbourhood of in such that is -quasiconformal from to . Then is -quasiconformal.

*Proof:* We need to show that for any Jordan quadrilateral in . The hypothesis gives this claim for all quadrilaterals in a sufficiently small neighbourhood of any point in . For any natural number , we can subdivide into quadrilaterals with modulus with adjacent -sides, by first conformally mapping to a rectangle and then doing an equally spaced vertical subdivision. Similarly, each quadrilateral can be subdivided into quadrilaterals of modulus by mapping to a rectangle and performing horizontal subdivision. By the local -quasiconformality of , we will have

for all , if is large enough. By superadditivity this implies that

for each , and hence

Applying superadditivity again we obtain

giving the claim.

We can now reverse the implication that conformal maps are -conformal:

*Proof:* By covering by quadrilaterals we may assume without loss of generality that (and hence also ) is a Jordan quadrilateral; by composing on left and right with conformal maps we may assume that and are rectangles. As is -conformal, the rectangles have the same modulus, so after a further affine transformation we may assume that is the rectangle with vertices for some modulus . If one subdivides into two rectangles along an intermediate vertical line segment connecting say to for some , the moduli of these rectangles are and . Applying the -conformal map and the converse portion of Exercise 24, we conclude that these rectangles must be preserved by , thus preserves the coordinate. Similarly preserves the coordinate, and is therefore the identity map, which is of course conformal.

Next, we can give a simple criterion for quasiconformality in the continuously differentiable case:

Theorem 29Let , and let be an orientation-preserving diffeomorphism (a continuously (real) differentiable homeomorphism whose derivative is always nondegenerate) between complex domains . Then the following are equivalent:

- (i) is -quasiconformal.
- (ii) For any point and phases , one has
where denotes the directional derivative.

*Proof:* Let us first show that (ii) implies (i). Let be a Jordan quadrilateral in ; we have to show that . From the chain rule one can check that condition (ii) is unchanged by composing with conformal maps on the left or right, so we may assume without loss of generality that and are rectangles; in fact we may normalise to have vertices and to have vertices where and . From the change of variables formula (and the singular value decomposition), followed by Fubini’s theorem and Cauchy-Schwarz, we have

and hence , giving the claim.

Now suppose that (ii) failed, then by the singular value decomposition we can find and a phase such that

for some real with . After translations and rotations we may normalise so that

But then from Rengel’s inequality and Taylor expansion one sees that will map a unit square with vertices to a quadrilateral of modulus converging to as , contradicting (i).

Exercise 30 Show that the conditions (i), (ii) in the above theorem are also equivalent to the bound for all , where

are the Wirtinger derivatives.
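As a numerical sanity check on the relation between directional derivatives and Wirtinger derivatives, here is a minimal sketch for a real-linear map $z \mapsto az + b\bar{z}$, whose Wirtinger derivatives are the constants $a$ and $b$. The displayed bounds are not reproduced above, so the sketch assumes the standard formulas: the derivative in the direction $e^{i\theta}$ is $a e^{i\theta} + b e^{-i\theta}$, and the ratio of the largest to the smallest directional derivative is $(|a|+|b|)/(|a|-|b|)$. All function names are hypothetical.

```python
import cmath
import math

def directional_derivative(a, b, theta):
    """Derivative of z -> a*z + b*conj(z) in the direction e^{i theta}."""
    v = cmath.exp(1j * theta)
    return a * v + b * v.conjugate()

def dilatation_by_sampling(a, b, samples=3600):
    """Ratio of the largest to the smallest directional derivative magnitude."""
    sizes = [abs(directional_derivative(a, b, 2 * math.pi * k / samples))
             for k in range(samples)]
    return max(sizes) / min(sizes)

def dilatation_from_wirtinger(a, b):
    """(|a| + |b|) / (|a| - |b|) for an orientation-preserving map (|a| > |b|)."""
    return (abs(a) + abs(b)) / (abs(a) - abs(b))

# Hypothetical example: the stretch x + iy -> K*x + iy with K = 3,
# whose Wirtinger derivatives are a = (K + 1)/2 and b = (K - 1)/2.
K = 3.0
a, b = (K + 1) / 2, (K - 1) / 2
print(dilatation_by_sampling(a, b))        # sampled ratio of stretch factors
print(dilatation_from_wirtinger(a, b))     # closed-form ratio
print(abs(b) / abs(a), (K - 1) / (K + 1))  # the two sides of |b| <= (K-1)/(K+1) |a|
```

For this stretch map the bound is attained with equality, which is why it is the usual model example of a $K$-quasiconformal map.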

We now prove a technical regularity result on quasiconformal maps.

Proposition 31 (Absolute continuity on lines) Let be a -quasiconformal map between two complex domains for some . Suppose that contains the closed rectangle with endpoints . Then for almost every , the map is absolutely continuous on .

*Proof:* For each , let denote the area of the image of the rectangle with endpoints . This is a bounded monotone function on and is hence differentiable almost everywhere. It will thus suffice to show that the map is absolutely continuous on whenever is a point of differentiability of .

Let , and let be disjoint intervals in of total length . To show absolute continuity, we need a bound on that goes to zero as uniformly in the choice of intervals. Let be a small number (that can depend on the intervals), and for each let be the rectangle with vertices , , , . This rectangle has modulus , and hence has modulus at most . On the other hand, by Rengel’s inequality this modulus is at least , where is a quantity that goes to zero as (holding the intervals fixed). We conclude that

On the other hand, we have

By Cauchy-Schwarz, we thus have

sending , we conclude

giving the claim.

Exercise 32 Let be a -quasiconformal map between two complex domains for some . Suppose that there is a closed set of Lebesgue measure zero such that is conformal on . Show that is -conformal (and hence conformal, by Proposition 28). (Hint: Arguing as in the proof of Theorem 29, it suffices to show that if maps the rectangle with endpoints to the rectangle with endpoints , then . Repeat the proof of that theorem, using the absolute continuity on lines at a crucial juncture to justify using the fundamental theorem of calculus.)

Recall Hurwitz’s theorem that the locally uniform limit of conformal maps is either conformal or constant. It turns out there is a similar result for quasiconformal maps. We will just prove a weak version of the result (see Theorem II.5.5 of Lehto-Virtanen for the full statement):

Theorem 33 Let , and let be a sequence of -quasiconformal maps that converge locally uniformly to an orientation-preserving homeomorphism . Then is also -quasiconformal.

It is important for this theorem that we do not insist that quasiconformal maps are necessarily differentiable. Indeed for applications to circle packing we will be working with maps that are only piecewise smooth, or possibly even worse, even though at the end of the day we will recover a smooth conformal map in the limit.

*Proof:* Let be a Jordan quadrilateral in . We need to show that . By restricting we may assume . By composing with a conformal map we may assume that is a rectangle. We can write as the increasing limit of rectangles of the same modulus, then for any we have . By choosing going to infinity sufficiently rapidly, stays inside and converges to in the sense of Exercise 26, and the claim then follows from that exercise.

Another basic property of conformal mappings (a consequence of Morera’s theorem) is that they can be glued along a common edge as long as the combined map is also a homeomorphism; this fact underlies for instance the Schwarz reflection principle. We have a quasiconformal analogue:

Theorem 34 Let , and let be an orientation-preserving homeomorphism. Let be a real analytic (and topologically closed) contour that lies in except possibly at the endpoints. If is -quasiconformal, then is -quasiconformal.

We will generally apply this theorem in the case when disconnects into two components, in which case can be viewed as the gluing of the restrictions of this map to the two components.

*Proof:* As in the proof of the previous theorem, we may take to be a rectangle , and it suffices to show that . We may normalise to have vertices where , and similarly normalise to be a rectangle of vertices , so we now need to show that . The real analytic contour meets in a finite number of curves, which can be broken up further into a finite number of horizontal line segments and graphs for various closed intervals and real analytic . For any , we can then use the uniform continuity of the to subdivide into a finite number of rectangles where on each such rectangle, meets the interior of in a bounded number of graphs whose horizontal variation is . This subdivides into a bounded number of Jordan quadrilaterals . If we let denote the distance between the -sides of , then by uniform continuity of and the triangle inequality we have

as . By Rengel’s inequality, we have

since , we conclude using superadditivity that

and hence by Cauchy-Schwarz

and thus

Summing in , we obtain

giving the desired bound after sending .

It will be convenient to study analogues of the modulus when quadrilaterals are replaced by generalisations of annuli. We define a *ring domain* to be a region bounded between two Jordan curves , where (the inner boundary) is contained inside the interior of (the outer boundary). For instance, the annulus is a ring domain for any and . In the spirit of Proposition 23, define the *modulus* of a ring domain to be the supremum of all the quantities with the following property: for any Borel measurable one can find a rectifiable curve in winding once around the inner boundary , such that

We record some basic properties of this modulus:

**these exercises should be in reverse order – this will be done after the course concludes**

Exercise 35

- (i) Show that the modulus of an annulus is given by .
- (ii) Show that if is -quasiconformal and is a ring domain in , then . In particular, the modulus is a conformal invariant. (There is also a converse to this statement that allows for a definition of -quasiconformality in terms of the modulus of ring domains; see e.g. Theorem 7.2 of Lehto-Virtanen.)
- (iii) Show that if one ring domain is contained inside another (with the inner boundary of in the interior of the inner boundary of ), then .

Exercise 36 Show that every ring domain is conformal to an annulus. (There are several ways to proceed here. One is to start by using Perron’s method to construct a harmonic function that is on one of the boundaries of the annulus and on the other. Another is to apply a logarithm map to transform the annulus to a simply connected domain with a “parabolic” group of discrete translation symmetries, use the Riemann mapping theorem to map this to a disc, and use the uniqueness aspect of the Riemann mapping theorem to figure out what happens to the symmetry.) Use this to give an alternate definition of the modulus of a ring domain that is analogous to the original definition of the modulus of a quadrilateral.

As a basic application of this concept we have the fact that the complex plane cannot be quasiconformal to any proper subset:

Proposition 37 Let be a -quasiconformal map for some ; then .

*Proof:* As is homeomorphic to , it is simply connected. Thus, if we assume for contradiction that , then by the Riemann mapping theorem is conformal to , so we may assume without loss of generality that .

By Exercise 35(i), the moduli of the annuli go to infinity as , and hence (by Exercise 35(ii), applied to ) the moduli of the ring domains must also go to infinity. However, as the inner boundary of this domain is fixed and the outer one is bounded, all these ring domains can be contained inside a common annulus, contradicting Exercise 35(iii).

For some further applications of the modulus of ring domains, we need the following result of Grötzsch:

Theorem 38 (Grötzsch modulus theorem) Let , and let be the ring domain formed from by deleting the line segment from to . [Technically, is not quite a ring domain as defined above, but one can check that the definition of modulus, and the fact that is conformal to an annulus, remains valid.] Let be another ring domain contained in whose inner boundary encloses both and . Then .

*Proof:* Let , then by Exercise 36 we can find a conformal map from to the annulus . As is symmetric around the real axis, and the only conformal automorphisms of the annulus that preserve the inner and outer boundaries are rotations (as can be seen for instance by using the Schwarz reflection principle repeatedly to extend such automorphisms to an entire function of linear growth), we may assume that obeys the symmetry . Let be the function , then is symmetric around the real axis. One can view as a measurable function on ; from the change of variables formula we have

so in particular is square-integrable. Our task is to show that ; by the definition of modulus, it suffices to show that

for any rectifiable curve that goes once around , and thus once around and in . By a limiting argument we may assume that is polygonal. By repeatedly reflecting around the real axis whenever crosses the line segment between and , we may assume that does not actually cross this segment, and then by perturbation we may assume it is contained in . But then by change of variables we have

by the Cauchy integral formula, and the claim follows.

Exercise 39 Let be a sequence of -quasiconformal maps for some , such that all the are uniformly bounded. Show that the are a normal family, that is to say every sequence in contains a subsequence that converges locally uniformly. (Hint: use an argument similar to that in the proof of Proposition 37, combined with Theorem 38, to establish some equicontinuity of the .)

There are many further basic properties of the conformal modulus for both quadrilaterals and annuli; we refer the interested reader to Lehto-Virtanen for details.

** — 3. Rigidity of the hexagonal circle packing — **

We return now to circle packings. In order to understand finite circle packings, it is convenient (in order to use some limiting arguments) to consider some infinite circle packings. A basic example of an infinite circle packing is the *regular hexagonal circle packing*

where is the hexagonal lattice

and is the unit circle centred at . This is clearly an (infinite) circle packing, with two circles in this packing (externally) tangent if and only if they differ by twice a sixth root of unity. Between any three mutually tangent circles in this packing is an open region that we will call an *interstice*. It is inscribed in a *dual circle* that meets the three original circles orthogonally and can be computed to have radius ; the interstice can then be viewed as a hyperbolic triangle in this dual circle in which all three sides have infinite length. Let denote the union of all the interstices.
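The tangency pattern and the dual circle can be checked numerically. The sketch below is a minimal illustration under an assumed normalisation: the lattice is taken to be $\{2n + 2\omega m : n, m \in \mathbb{Z}\}$ with $\omega = e^{2\pi i/3}$, which is consistent with the statement that tangent circles differ by twice a sixth root of unity, but the displayed definition itself is not reproduced above. The function names are hypothetical.

```python
import cmath
import math

OMEGA = cmath.exp(2j * math.pi / 3)  # primitive cube root of unity

def lattice_point(n, m):
    """Centre 2n + 2*omega*m of a unit circle (assumed lattice normalisation)."""
    return 2 * n + 2 * OMEGA * m

def are_tangent(z, w, tol=1e-9):
    """Unit circles centred at z and w are externally tangent iff |z - w| = 2."""
    return abs(abs(z - w) - 2) < tol

def neighbours_of_origin(radius=3):
    """Index pairs (n, m) whose circle is tangent to the circle at the origin."""
    return [(n, m) for n in range(-radius, radius + 1)
            for m in range(-radius, radius + 1)
            if are_tangent(0, lattice_point(n, m))]

def dual_circle(z1, z2, z3):
    """Centre and radius of the circle through the three tangency points."""
    centre = (z1 + z2 + z3) / 3      # centroid of the equilateral triangle
    tangency_point = (z1 + z2) / 2   # midpoint between two tangent unit circles
    return centre, abs(tangency_point - centre)

nbrs = neighbours_of_origin()
print(len(nbrs))  # each circle has six neighbours
centre, r = dual_circle(lattice_point(0, 0), lattice_point(1, 0), lattice_point(1, 1))
print(r, 1 / math.sqrt(3))
# Orthogonality: |centre - z1|^2 = 1^2 + r^2 for each original unit circle.
print(abs(centre - lattice_point(0, 0)) ** 2, 1 + r ** 2)
```

The last line checks the Pythagorean condition for two circles to meet at right angles, which is the orthogonality property of the dual circle mentioned above.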

For every circle in this circle packing, we can form the inversion map across this circle on the Riemann sphere, defined by setting

for and , with the convention that maps to and vice versa. These are conjugates of Möbius transformations; they preserve the circle and swap the interior with the exterior. Let be the group of transformations of generated by these inversions ; this is essentially a Schottky group (except for the fact that we are allowing for conjugate Möbius transformations in addition to ordinary Möbius transformations). Let be the union of the images of the interstitial regions under all of these transformations. We have the following basic fact:

Proposition 40 has Lebesgue measure zero.

*Proof:* (Sketch) I thank Mario Bonk for this argument. Let denote all the circles formed by applying an element of to the circles in . If lies in , then it lies inside one of the circles in , and then after inverting through that circle it lies in another circle in , and so forth; undoing the inversions, we conclude that lies in an infinite number of nested circles. Let be one of these circles. contains a union of six interstices bounded by and a cycle of six circles internally tangent to and consecutively externally tangent to each other. Applying the same argument used to establish the ring lemma (Lemma 41), we see that the six internal circles have radii comparable to that of , and hence has density in the disk enclosed by , which also contains . The ring lemma also shows that the radius of each circle in the nested sequence is at most times the one enclosing it for some absolute constant , so in particular the disks shrink to zero in size. Thus cannot be a point of density of , and hence by the Lebesgue density theorem this set has measure zero.
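The inversion maps used in this argument are easy to experiment with. The following is a minimal sketch, assuming the usual formula $z \mapsto z_0 + R^2/\overline{(z - z_0)}$ for inversion across the circle of centre $z_0$ and radius $R$ (equivalently, $z_0 + re^{i\theta} \mapsto z_0 + \frac{R^2}{r} e^{i\theta}$). The function name and the sample circle are hypothetical, and the point at infinity is not handled.

```python
def invert(z, centre, radius):
    """Inversion across the circle |z - centre| = radius (z != centre, finite)."""
    w = z - centre
    return centre + radius ** 2 / w.conjugate()

centre, radius = 1 + 2j, 1.5      # hypothetical circle
on_circle = centre + radius * 1j  # a point on the circle
inside = centre + 0.5             # a point in the interior

print(invert(on_circle, centre, radius))                        # fixed by the inversion
print(abs(invert(inside, centre, radius) - centre) > radius)    # moved to the exterior
print(invert(invert(inside, centre, radius), centre, radius))   # inversion is an involution
```

The three printed checks correspond to the three properties used above: the circle is preserved, the interior and exterior are swapped, and each inversion is its own inverse.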

**these two lemmas should be moved to the front of the section.**

Next we need two simple geometric lemmas, due to Rodin and Sullivan.

Lemma 41 (Ring lemma) Let be a circle that is externally tangent to a chain of circles with disjoint interiors, with each externally tangent to (with the convention ). Then there is a constant depending only on , such that the radii of each of the is at least times the radius of .

*Proof:* Without loss of generality we may assume that has radius and that the radius of is maximal among the radii of the . As the polygon connecting the centers of the has to contain , we see that . This forces , for if was too small then would be so deep in the cuspidal region between and that it would not be possible for to escape this cusp and go around . A similar argument then gives , and so forth, giving the claim.

Lemma 42 (Length-area lemma) Let , and let consist of those circles in that can be connected to the circle by a path of length at most (going through consecutively tangent circles in ). Let be a circle packing with the same nerve as that is contained in a disk of radius . Then the circle in associated to the circle in has radius .

The point of this bound is that when is bounded and , the radius of is forced to go to zero.

*Proof:* We can surround by disjoint chains of consecutively tangent circles , in . Each circle is associated to a corresponding circle in of some radius . The total area of these circles is at most the area of the disk of radius . Since , this implies from the pigeonhole principle that there exists for which

and hence by Cauchy-Schwarz

Connecting the centers of these circles, we obtain a polygonal path of length that goes around , and the claim follows.

We also need another simple geometric observation:

Exercise 43 Let be mutually externally tangent circles, and let be another triple of mutually externally tangent circles, with the same orientation (e.g. and both go counterclockwise around their interstitial region). Show that there exists a Möbius transformation that maps each to and which maps the interstice of conformally onto the interstice of .

Now we can give a rigidity result for the hexagonal circle packing, somewhat in the spirit of Theorem 4 (though it does not immediately follow from that theorem), and also due to Rodin and Sullivan:

Proposition 44 (Rigidity of infinite hexagonal packing) Let be an infinite circle packing in with the same nerve as the hexagonal circle packing . Then is in fact equal to the hexagonal circle packing up to affine transformations and reflections.

*Proof:* By applying a reflection we may assume that and have the same orientation. For each interstitial region of there is an associated interstitial region of , and by Exercise 43 there is a Möbius transformation . These can be glued together to form a map that is initially defined (and conformal) on the interstitial regions ; we would like to extend it to the entire complex plane by defining it also inside the circles .

Now consider a circle in . It is bounded by six interstitial regions , which map to six interstitial regions that lie between the circle corresponding to and six tangent circles . By the ring lemma, all of the circles have radii comparable to the radius of . As a consequence, the map , which is defined (and piecewise Möbius) on the boundary of as a map to the boundary of , has derivative comparable in magnitude to also. By extending this map radially (in the sense of defining for and , where is the centre of ), we see from Theorem 29 that we can extend to be -quasiconformal in the interior of except possibly at for some , and to a homeomorphism from to the region consisting of the union of the disks in and their interstitial regions. By many applications of Theorem 34, is now -quasiconformal on all of , and conformal in the interstitial regions . By Proposition 37, surjects onto , thus the circle packing and all of its interstitial regions cover the entire complex plane.

Next, we use a version of the Schwarz reflection principle to replace by another -quasiconformal map that is conformal on a larger region than . Namely, pick a circle in , and let be the corresponding circle in . Let and be the inversions across and respectively. Note that maps the circle to , with the interior mapping to the interior and exterior mapping to the exterior. We can then define a modified map by setting equal to on or outside , and equal to inside (with the convention that maps to ). This is still an orientation-preserving function ; by Theorem 34 it is still -quasiconformal. It remains conformal on the interstitial region , but is now also conformal on the additional interstitial region . Repeating this construction one can find a sequence of -quasiconformal maps that map each circle to their counterparts , and which are conformal on a sequence of sets that increase up to . By Exercise 39, the restriction of to any compact set forms a normal family (the fact that the circles map to the circles will give the required uniform boundedness for topological reasons), and hence (by the usual diagonalisation argument) the themselves are a normal family; similarly for . Thus, by passing to a subsequence, we may assume that the converge locally uniformly to a limit , and that also converge locally uniformly to a limit which must then invert . Thus is a homeomorphism, and thus -quasiconformal by Theorem 33. It is conformal on , and hence by Exercise 32 it is conformal. But the only conformal maps of the complex plane are the affine maps (see Proposition 15 of this previous blog post), and hence is an affine copy of as required.

By a standard limiting argument, the perfect rigidity of the infinite circle packing can be used to give approximate rigidity of finite circle packings:

Corollary 45 (Approximate rigidity of finite hexagonal packings) Let , and suppose that is sufficiently large depending on . Let and be as in Lemma 42. Let be the radius of the circle in associated to , and let be the radius of an adjacent circle . Then .

*Proof:* We may normalise and . Suppose for contradiction that the claim failed, then one can find a sequence tending to infinity, and circle packings with nerve with , such that the radius of the adjacent circle stays bounded away from . By many applications of the ring lemma, for each circle of , the corresponding circle in has radius bounded above, and bounded below away from zero. Passing to a subsequence using Bolzano-Weierstrass and using the Arzela-Ascoli diagonalisation argument, we may assume that the radii of these circles converge to a positive finite limit . Applying a rotation we may also assume that the circles converge to a limit circle (using the obvious topology on the space of circles); we can also assume that the orientation of the does not depend on . A simple induction then shows that converges to a limit circle , giving a circle packing with the same nerve as . But then by Proposition 44, is an affine copy of , which among other things implies that . Thus converges to , giving the required contradiction.

A more quantitative version of this corollary was worked out by He. There is also a purely topological proof of the rigidity of the infinite hexagonal circle packing due to Schramm.

** — 4. Approximating a conformal map by circle packing — **

Let be a simply connected bounded region in with two distinct distinguished points . By the Riemann mapping theorem, there is a unique conformal map that maps to and to a positive real. However, many proofs of this theorem are rather nonconstructive, and do not come with an effective algorithm to locate, or at least approximate, this map .

It was conjectured by Thurston, and later proven by Rodin and Sullivan, that one could achieve this by applying the circle packing theorem (Theorem 3) to a circle packing in by small circles. To formalise this, we need some more notation. Let be a small number, and let be the infinite hexagonal packing scaled by . For every circle in , define the *flower* to be the union of this circle, its interior, the six interstices bounding it, and the six circles tangent to the circle (together with their interiors). Let be a circle in such that lies in its flower. For small enough, this flower is contained in . Let denote all circles in that can be reached from by a finite chain of consecutively tangent circles in , whose flowers all lie in . Elements of will be called *inner circles*, and circles in that are not an inner circle but are tangent to it will be called *border circles*. Because is simply connected, the union of all the flowers of inner circles is also simply connected. As a consequence, one can traverse the border circles by a cycle of consecutively tangent circles, with the inner circles enclosed by this cycle. Let be the circle packing consisting of the inner circles and border circles. Applying Theorem 3 followed by a Möbius transformation, one can then find a circle packing in with the same nerve and orientation as , such that all the circles in associated to border circles of are internally tangent to . Applying a Möbius transformation, we may assume that the flower containing in is mapped to the flower containing , and the flower containing is mapped to a flower containing a positive real. (From the exercise below will lie in such a flower for small enough.)

Let be the union of all the solid closed equilateral triangles formed by the centres of mutually tangent circles in , and let be the corresponding union of the solid closed triangles from . Let be the piecewise affine map from to that maps each triangle in to the associated triangle in .

Exercise 46 Show that converges to as in the Hausdorff sense. In particular, lies in for sufficiently small .

Exercise 47 By modifying the proof of the length-area lemma, show that all the circles in have radius that goes uniformly to zero as . (Hint: for circles deep in the interior, the length-area lemma works as is; for circles near the boundary, one has to encircle by a sequence of chains that need not be closed, but may instead terminate on the boundary of . The argument may be viewed as a discrete version of the one used to prove Theorem 20.) Using this and the previous exercise, show that converges to in the Hausdorff sense.

From Corollary 45 we see that as , the circles in corresponding to adjacent circles of in a fixed compact subset of have radii differing by a ratio of . We conclude that in any compact subset of , adjacent circles in in also have radii differing by a ratio of , which implies by trigonometry that the triangles of in are approximately equilateral in the sense that their angles are . By Theorem 29 is -quasiconformal on each such triangle, and hence by Theorem 34 it is -quasiconformal on . By Exercise 39 every sequence of has a subsequence which converges locally uniformly on , and whose inverses converge locally uniformly on ; the limit is then a homeomorphism from to that maps to and to a positive real. By Theorem 33 the limit is locally -conformal and hence conformal, hence by uniqueness of the Riemann mapping it must equal . As is the unique limit point of all subsequences of the , this implies (by the Urysohn subsequence principle) that converges locally uniformly to , thus making precise the sense in which the circle packings converge to the Riemann map.

where is the identity operator and is the commutator. Among other things, this equation is fundamental in quantum mechanics, leading for instance to the Heisenberg uncertainty principle.

The operators are unbounded on spaces such as . One can ask whether the commutator equation (1) can be solved using bounded operators on a Hilbert space rather than unbounded ones. In the finite dimensional case when are just matrices for some , the answer is clearly negative, since the left-hand side of (1) has trace zero and the right-hand side does not. What about in infinite dimensions, when the trace is not available? As it turns out, the answer is still negative, as was first worked out by Wintner and Wielandt. A short proof can be given as follows. Suppose for contradiction that we can find bounded operators obeying (1). From (1) and an easy induction argument, we obtain the identity

for all natural numbers . From the triangle inequality, this implies that

Iterating this, we conclude that

for any . Bounding and then sending , we conclude that , which clearly contradicts (1). (Note the argument can be generalised without difficulty to the case when lie in a Banach algebra, rather than be bounded operators on a Hilbert space.)
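The induction step can be tested on the classical unbounded model, which is an assumption here since the opening of this discussion is not reproduced above: take $D$ to be differentiation and $X$ to be multiplication by $x$, acting on polynomials, so that $[D,X] = I$. The sketch below checks the identity $[D, X^n] = n X^{n-1}$ on one hypothetical test polynomial, stored as a coefficient list; all helper names are illustrative.

```python
def D(p):
    """Differentiate a polynomial given by its coefficient list [c0, c1, ...]."""
    return [k * c for k, c in enumerate(p)][1:] or [0]

def X(p):
    """Multiply a polynomial by x."""
    return [0] + list(p)

def power(op, n, p):
    """Apply the operator `op` to p a total of n times."""
    for _ in range(n):
        p = op(p)
    return p

def sub(p, q):
    """Coefficientwise difference p - q, padding the shorter list with zeros."""
    m = max(len(p), len(q))
    p = list(p) + [0] * (m - len(p))
    q = list(q) + [0] * (m - len(q))
    return [a - b for a, b in zip(p, q)]

def trim(p):
    """Drop trailing zero coefficients so that equal polynomials compare equal."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

def commutator_with_power(n, p):
    """[D, X^n] p = D X^n p - X^n D p."""
    return trim(sub(D(power(X, n, p)), power(X, n, D(p))))

p = [5, -1, 0, 2]   # 5 - x + 2x^3, a hypothetical test polynomial
n = 4
lhs = commutator_with_power(n, p)
rhs = trim([n * c for c in power(X, n - 1, p)])
print(lhs == rhs)   # [D, X^n] p agrees with n X^{n-1} p
```

This only illustrates the algebraic identity. The point of the argument above is that no pair of bounded operators can satisfy it for every $n$.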

It was observed by Popa that there is a quantitative version of this result:

Theorem 1 Let such that

*Proof:* By multiplying by a suitable constant and dividing by the same constant, we may normalise . Write with . Then the same induction that established (2) now shows that

and hence by the triangle inequality

We divide by and sum to conclude that

giving the claim.

Again, the argument generalises easily to any Banach algebra. Popa then posed the question of whether the quantity can be replaced by any substantially larger function of , such as a polynomial in . As far as I know, the above simple bound has not been substantially improved.

In the opposite direction, one can ask for constructions of operators that are not too large in operator norm, such that is close to the identity. Again, one cannot do this in finite dimensions: has trace zero, so at least one of its eigenvalues must lie outside the disk , and therefore for any finite-dimensional matrices .
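The trace obstruction is easy to see in a small experiment. The following sketch uses random integer matrices (a hypothetical choice, made so that the arithmetic is exact) and confirms that the commutator has trace zero, so that the commutator minus the identity has trace $-n$.

```python
import random

def matmul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(D, X):
    """[D, X] = DX - XD."""
    DX, XD = matmul(D, X), matmul(X, D)
    n = len(D)
    return [[DX[i][j] - XD[i][j] for j in range(n)] for i in range(n)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

random.seed(0)  # fixed seed so the run is reproducible
n = 4
D = [[random.randint(-5, 5) for _ in range(n)] for _ in range(n)]
X = [[random.randint(-5, 5) for _ in range(n)] for _ in range(n)]
C = commutator(D, X)
print(trace(C))      # integer arithmetic, so exactly 0
# [D, X] - I has trace -n, so its eigenvalues average to -1.
print(trace(C) - n)
```

Since the eigenvalues of the commutator sum to zero, at least one of them has non-positive real part and is therefore at distance at least one from the point $1$.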

However, it was shown in 1965 by Brown and Pearcy that in infinite dimensions, one can construct operators with arbitrarily close to in operator norm (in fact one can prescribe any operator for as long as it is not equal to a non-zero multiple of the identity plus a compact operator). In the above paper of Popa, a quantitative version of the argument (based in part on some earlier work of Apostol and Zsido) was given as follows. The first step is to observe the following Hilbert space version of Hilbert’s hotel: in an infinite dimensional Hilbert space , one can locate isometries obeying the equation

where denotes the adjoint of . For instance, if has a countable orthonormal basis , one could set

and

where denotes the linear functional on . Observe that (4) is again impossible to satisfy in finite dimension , as the left-hand side must have trace while the right-hand side has trace .
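To make the Hilbert's hotel construction concrete, here is a minimal sketch acting on finitely supported vectors. It assumes one particular choice of isometries, namely $u e_n = e_{2n}$ and $v e_n = e_{2n+1}$ with the basis indexed from $0$, which need not match the indexing of the displayed formulas; it also assumes that (4) is the relation $uu^* + vv^* = 1$. Vectors are stored as dictionaries from basis index to coefficient, and all names are hypothetical.

```python
def u(x):
    """Isometry e_n -> e_{2n}; vectors are dicts {index: coefficient}."""
    return {2 * n: c for n, c in x.items()}

def v(x):
    """Isometry e_n -> e_{2n+1}."""
    return {2 * n + 1: c for n, c in x.items()}

def u_adj(x):
    """Adjoint of u: e_{2n} -> e_n, e_{2n+1} -> 0."""
    return {n // 2: c for n, c in x.items() if n % 2 == 0}

def v_adj(x):
    """Adjoint of v: e_{2n+1} -> e_n, e_{2n} -> 0."""
    return {n // 2: c for n, c in x.items() if n % 2 == 1}

def add(x, y):
    """Sum of two vectors, dropping zero coefficients."""
    out = dict(x)
    for n, c in y.items():
        out[n] = out.get(n, 0) + c
    return {n: c for n, c in out.items() if c != 0}

x = {0: 2, 3: -1, 4: 7}   # hypothetical finitely supported vector
print(add(u(u_adj(x)), v(v_adj(x))) == x)      # u u* + v v* = 1
print(u_adj(u(x)) == x and v_adj(v(x)) == x)   # u* u = v* v = 1
print(u_adj(v(x)) == {})                       # the ranges of u and v are orthogonal
```

The even and odd basis vectors play the role of the two copies of the hotel: each of $u$, $v$ embeds the whole space isometrically into one of them.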

Multiplying (4) on the left by and right by , or on the left by and right by , then gives

From (4), (5) we see in particular that, while we cannot express as a commutator of bounded operators, we can at least express it as the sum of two commutators:

We can rewrite this somewhat strangely as

and hence there exists a bounded operator such that

Moving now to the Banach algebra of matrices with entries in (which can be equivalently viewed as ), a short computation then gives the identity

for some bounded operator whose exact form will not be relevant for the argument. Now, by Neumann series (and the fact that have unit operator norm), we can find another bounded operator such that

and then another brief computation shows that

Thus we can express the operator as the commutator of two operators of norm . Conjugating by for any , we may then express as the commutator of two operators of norm . This shows that the right-hand side of (3) cannot be replaced with anything that blows up faster than as . Can one improve this bound further?

In a similar fashion, the fundamental objects of study in complex differential geometry are the complex manifolds, in which the model space is rather than , and the transition maps are required to be holomorphic (and not merely smooth or continuous). In the real case, the one-dimensional manifolds (curves) are quite simple to understand, particularly if one requires the manifold to be connected; for instance, all compact connected one-dimensional real manifolds are homeomorphic to the unit circle (why?). However, in the complex case, the connected one-dimensional manifolds – the ones that look locally like subsets of – are much richer, and are known as Riemann surfaces. For the sake of completeness we give the (somewhat lengthy) formal definition:

Definition 1 (Riemann surface) If is a Hausdorff connected topological space, a (one-dimensional complex) atlas is a collection of homeomorphisms from open subsets of that cover to open subsets of the complex numbers , such that the transition maps defined by are all holomorphic. Here is an arbitrary index set. Two atlases , on are said to be *equivalent* if their union is also an atlas, thus the transition maps and their inverses are all holomorphic. A Riemann surface is a Hausdorff connected topological space equipped with an equivalence class of one-dimensional complex atlases.

A map from one Riemann surface to another is *holomorphic* if the maps are holomorphic for any charts , of an atlas of and respectively; it is not hard to see that this definition does not depend on the choice of atlas. It is also clear that the composition of two holomorphic maps is holomorphic (and in fact the class of Riemann surfaces with their holomorphic maps forms a category).

Here are some basic examples of Riemann surfaces.

Example 2 (Quotients of ) The complex numbers clearly form a Riemann surface (using the identity map as the single chart for an atlas). Of course, maps that are holomorphic in the usual sense will also be holomorphic in the sense of the above definition, and vice versa, so the notion of holomorphicity for Riemann surfaces is compatible with that of holomorphicity for complex maps. More generally, given any discrete additive subgroup of , the quotient is a Riemann surface. There are an infinite number of possible atlases to use here; one such is to pick a sufficiently small neighbourhood of the origin in and take the atlas where and for all . In particular, given any non-real complex number , the complex torus formed by quotienting by the lattice is a Riemann surface.

Example 3 Any open connected subset of is a Riemann surface. By the Riemann mapping theorem, all simply connected open , other than itself, are isomorphic (as Riemann surfaces) to the unit disk (or, equivalently, to the upper half-plane).

Example 4 (Riemann sphere) The Riemann sphere , as a topological manifold, is the one-point compactification of . Topologically, this is a sphere and is in particular connected. One can cover the Riemann sphere by the two open sets and , and give these two open sets the charts and defined by for , for , and . This is a complex atlas since the is holomorphic on .

An alternate way of viewing the Riemann sphere is as the projective line . Topologically, this is the punctured complex plane quotiented out by non-zero complex dilations, thus elements of this space are equivalence classes with the usual quotient topology. One can cover this space by two open sets and and give these two open sets the charts and defined by for , . This is a complex atlas, basically because for and is holomorphic on .

Exercise 5 Verify that the Riemann sphere is isomorphic (as a Riemann surface) to the projective line.

Example 6 (Smooth algebraic plane curves) Let be a complex polynomial in three variables which is homogeneous of some degree , thus

Define the complex projective plane to be the punctured space quotiented out by non-zero complex dilations, with the usual quotient topology. (There is another topology that one can place here, of fundamental importance in algebraic geometry, namely the Zariski topology, but we will ignore this topology here.) This is a compact space, whose elements are equivalence classes . Inside this plane we can define the (projective, degree ) algebraic curve

this is well defined thanks to (1). It is easy to verify that is a closed subset of and hence compact; it is non-empty thanks to the fundamental theorem of algebra.

Suppose that is *irreducible*, which means that it is not the product of polynomials of smaller degree. As we shall show in the appendix, this makes the algebraic curve connected. (Actually, algebraic curves remain connected even in the reducible case, thanks to Bezout’s theorem, but we will not prove that theorem here.) We will in fact make the stronger *nonsingularity* hypothesis: there is no triple such that the four numbers simultaneously vanish for . (This looks like four constraints, but is in fact essentially just three, due to the Euler identity that arises from differentiating (1) in . The fact that nonsingularity implies irreducibility is another consequence of Bezout’s theorem, which is not proven here.) For instance, the polynomial is irreducible but singular (there is a “cusp” singularity at ). With this hypothesis, we call the curve *smooth*.

Now suppose is a point in ; without loss of generality we may take non-zero, and then we can normalise . Now one can think of as an inhomogeneous polynomial in just two variables , and by the nonsingularity hypothesis we see that the gradient is non-zero whenever . By the (complexified) implicit function theorem, this ensures that the *affine algebraic curve* is a Riemann surface in a neighbourhood of ; we leave this as an exercise. This can be used to give a coordinate chart for in a neighbourhood of when . Similarly when is non-zero. This can be shown to give an atlas on , which (assuming the connectedness claim that we will prove later) gives the structure of a Riemann surface.
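As a quick sanity check of the Euler identity and of the nonsingularity condition, here is a minimal Python sketch. It assumes, purely for illustration, the cuspidal cubic P(x, y, z) = y^2 z - x^3 (homogeneous of degree 3) and a cusp at the point with homogeneous coordinates (0, 0, 1); the polynomial, the function names, and the hand-computed partial derivatives are all assumptions of this sketch rather than notation from the notes.

```python
# Assumed example (not taken from the notes): the cuspidal cubic
# P(x, y, z) = y^2 z - x^3, which is homogeneous of degree D = 3.
D = 3


def P(x, y, z):
    return y * y * z - x ** 3


def grad_P(x, y, z):
    """Partial derivatives (dP/dx, dP/dy, dP/dz), computed by hand."""
    return (-3 * x * x, 2 * y * z, y * y)


def euler_defect(x, y, z):
    """x*P_x + y*P_y + z*P_z - D*P; Euler's identity says this vanishes."""
    px, py, pz = grad_P(x, y, z)
    return x * px + y * py + z * pz - D * P(x, y, z)


def is_singular_point(x, y, z):
    """True if P and all three of its partial derivatives vanish at (x, y, z)."""
    return P(x, y, z) == 0 and grad_P(x, y, z) == (0, 0, 0)
```

With these definitions, `euler_defect` returns zero at every integer sample point, `is_singular_point(0, 0, 1)` holds (the cusp), while `(1, 1, 1)` lies on the curve but is not singular. The Euler identity is also why the four vanishing conditions amount to essentially three: once the three partial derivatives vanish, so does the polynomial itself.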

Exercise 7 State and prove a complex version of the implicit function theorem that justifies the above claim that the charts in the above example form an atlas, and an algebraic curve associated to a non-singular polynomial is a Riemann surface.

Exercise 8

- (i) Show that all (irreducible plane projective) algebraic curves of degree 1 are isomorphic to the Riemann sphere. (Hint: reduce to an explicit linear polynomial such as .)
- (ii) Show that all (irreducible plane projective) algebraic curves of degree 2 are isomorphic to the Riemann sphere. (Hint: to reduce computation, first use some linear algebra to reduce the homogeneous quadratic polynomial to a standard form, such as or .)

Exercise 9 If are complex numbers, show that the projective cubic curve is nonsingular if and only if the discriminant is non-zero. (When this occurs, the curve is called an elliptic curve (in Weierstrass form), which is a fundamentally important example of a Riemann surface in many areas of mathematics, and number theory in particular. One can also define the discriminant for polynomials of higher degree, but we will not do so here.)

A recurring theme in mathematics is that an object is often best studied by understanding spaces of “good” functions on . In complex analysis, there are two basic types of good functions:

Definition 10 Let be a Riemann surface. A *holomorphic function* on is a holomorphic map from to ; the space of all such functions will be denoted . A *meromorphic function* on is a holomorphic map from to the Riemann sphere , that is not identically equal to ; the space of all such functions will be denoted .

One can also define holomorphicity and meromorphicity in terms of charts: a function is holomorphic if and only if, for any chart , the map is holomorphic in the usual complex analysis sense; similarly, a function is meromorphic if and only if the preimage is discrete (otherwise, by analytic continuation and the connectedness of , will be identically equal to ) and for any chart , the map becomes a meromorphic function in the usual complex analysis sense, after removing the discrete set of complex numbers where this map is infinite. One consequence of this alternate definition is that the space of holomorphic functions is a commutative complex algebra (a complex vector space closed under pointwise multiplication), while the space of meromorphic functions is a complex field (a commutative complex algebra where every non-zero element has an inverse). Another consequence is that one can define the notion of a zero of given order , or a pole of order , for a holomorphic or meromorphic function, by composing with a chart map and using the usual complex analysis notions there, noting (from the holomorphicity of transition maps and their inverses) that this does not depend on the choice of chart. (However, one cannot similarly define the residue of a meromorphic function on this way, as the residue turns out to be chart-dependent thanks to the chain rule. Residues should instead be applied to meromorphic 1-forms, a concept we will introduce later.) A third consequence is analytic continuation: if two holomorphic or meromorphic functions on agree on a non-empty open set, then they agree everywhere.

On the complex numbers , there are of course many holomorphic functions and meromorphic functions; for instance any power series with an infinite radius of convergence will give a holomorphic function, and the quotient of any two such functions (with non-zero denominator) will give a meromorphic function. Furthermore, we have extremely wide latitude in how to specify the zeroes of the holomorphic function, or the zeroes and poles of the meromorphic function, thanks to tools such as the Weierstrass factorisation theorem or the Mittag-Leffler theorem (covered in previous quarters).

It turns out, however, that the situation changes dramatically when the Riemann surface is *compact*, with the holomorphic and meromorphic functions becoming much more rigid. First of all, compactness eliminates all holomorphic functions except for the constants:

Lemma 11 Let be a holomorphic function on a compact Riemann surface . Then is constant.

This result should be seen as a close sibling of Liouville’s theorem that all bounded entire functions are constant. (Indeed, in the case of a complex torus, this lemma is a corollary of Liouville’s theorem.)

*Proof:* As is continuous and is compact, must attain a maximum at some point . Working in a chart around and applying the maximum principle, we conclude that is constant in a neighbourhood of , and hence is constant everywhere by analytic continuation.

This dramatically cuts down the number of possible meromorphic functions – indeed, for an abstract Riemann surface, it is not immediately obvious that there are any non-constant meromorphic functions at all! As the poles are isolated and the surface is compact, a meromorphic function can only have finitely many poles, and if one prescribes the location of the poles and the maximum order at each pole, then we shall see that the space of meromorphic functions is now finite dimensional. The precise dimensions of these spaces are in fact rather interesting, and obey a basic duality law known as the Riemann-Roch theorem. We will give a mostly self-contained proof of the Riemann-Roch theorem in these notes, omitting only some facts about genus and Euler characteristic, as well as construction of certain meromorphic 1-forms (also known as Abelian differentials).

** — 1. Divisors — **

To discuss the zeroes and poles of meromorphic functions, it is convenient to introduce an abstraction of the concept of “a collection of zeroes and poles”, known as a divisor.

Definition 12 (Divisor) Let be a compact Riemann surface. A *divisor* on is a formal integer linear combination , where ranges over a finite collection of points in , and are integers, with the obvious additive group structure; equivalently, the space of divisors is the free abelian group with generators with (where we make the usual convention ). The number is the *degree* of the divisor; we call each the *order* of the divisor at , with the convention that the order is zero for points not appearing in the sum. A divisor is *non-negative* (or *effective*) if all the are non-negative, and we partially order the divisors by writing if is non-negative. This makes a lattice, so we can define the maximum or minimum of two divisors. Given a non-zero meromorphic function , the *principal divisor* associated to is the divisor , where ranges over the zeroes and poles of , and is the order of zero (or negative the order of pole) at . (Note that as zeroes and poles are isolated, and is compact, the number of zeroes and poles is automatically finite.)

Informally, one should think of as the abstraction of a zero of order at , or a pole of order if is negative.
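To make the bookkeeping in this definition concrete, here is a minimal Python sketch that models a divisor as a dictionary from point labels to integer orders. The representation, the string labels for points, and all function names are assumptions of this sketch, not notation from the notes; points that do not appear in the dictionary are treated as having order zero, matching the convention above.

```python
def normalise(div):
    """Drop points of order zero, so equal divisors have equal dictionaries."""
    return {p: n for p, n in div.items() if n != 0}


def add(d1, d2):
    """Sum of two divisors in the free abelian group."""
    points = set(d1) | set(d2)
    return normalise({p: d1.get(p, 0) + d2.get(p, 0) for p in points})


def negate(div):
    return {p: -n for p, n in div.items()}


def degree(div):
    """Sum of the orders over all points."""
    return sum(div.values())


def is_effective(div):
    """Non-negative (effective): every order is >= 0."""
    return all(n >= 0 for n in div.values())


def leq(d1, d2):
    """Partial order: d1 <= d2 exactly when d2 - d1 is effective."""
    return is_effective(add(d2, negate(d1)))


def maximum(d1, d2):
    """Lattice join: pointwise maximum of the orders."""
    points = set(d1) | set(d2)
    return normalise({p: max(d1.get(p, 0), d2.get(p, 0)) for p in points})


def minimum(d1, d2):
    """Lattice meet: pointwise minimum of the orders."""
    points = set(d1) | set(d2)
    return normalise({p: min(d1.get(p, 0), d2.get(p, 0)) for p in points})
```

For example, with `D1 = {"P": 2, "Q": -1}` and `D2 = {"P": 1, "R": 3}`, neither divisor dominates the other, but both lie between `minimum(D1, D2)` and `maximum(D1, D2)`, and the degrees of the minimum and maximum add up to the sum of the degrees of `D1` and `D2`.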

Example 13 Consider a rational function for some non-zero complex number and some complex numbers . This is a meromorphic function on , and is also meromorphic, so extends to a meromorphic function on the Riemann sphere . It has zeroes at and poles at , and also has a zero of order (or a pole of order ) at , as can be seen by inspection of near the origin (or the growth of near infinity), and thus

In particular, has degree zero.
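The degree-zero claim can be checked mechanically. The following Python sketch assumes the rational function is specified by its lists of finite zeroes and poles (repeated according to multiplicity), and uses the string `"inf"` as a hypothetical label for the point at infinity; the function name and the dictionary representation of the divisor are likewise assumptions of this sketch.

```python
def divisor_of_rational(zeros, poles):
    """Divisor on the Riemann sphere of c * prod(z - a) / prod(z - b).

    `zeros` and `poles` list the finite zeroes a and poles b with
    multiplicity; the non-zero constant c does not affect the divisor.
    """
    div = {}
    for a in zeros:
        div[a] = div.get(a, 0) + 1
    for b in poles:
        div[b] = div.get(b, 0) - 1
    # Near infinity the function grows like z^(len(zeros) - len(poles)),
    # so its order at infinity is len(poles) - len(zeros).
    div["inf"] = len(poles) - len(zeros)
    # Discard points of order zero (e.g. a zero cancelled by a pole).
    return {p: n for p, n in div.items() if n != 0}


def degree(div):
    return sum(div.values())
```

For instance, `divisor_of_rational([0, 0, 1j], [2])` has a double zero at the origin, a simple zero at `1j`, a simple pole at `2`, and a double pole at infinity, and its degree is zero.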

Exercise 14 Show that all meromorphic functions on the Riemann sphere come from rational functions as in the above example. In particular, every principal divisor on the Riemann sphere has degree zero. Give an alternate proof of this latter fact using the residue theorem. (We will generalise this fact to other Riemann surfaces shortly; see Proposition 24.)

It is easy to see (by working in a coordinate chart around ) that if are non-zero meromorphic functions, then one has the valuation axioms

for any (adopting the convention that the zero function has order everywhere); thus we have

again adopting the convention that is larger than every divisor. In particular, the space of principal divisors of is a subgroup of . We call two divisors *linearly equivalent* if they differ by a principal divisor; this is clearly an equivalence relation.
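The valuation axioms are easy to test in the simplest local model, namely polynomials in one variable with the order taken at the origin: the order of a product is the sum of the orders, and the order of a sum is at least the minimum of the orders. The Python sketch below is only an illustration under that assumption; the coefficient-list representation (lowest degree first) and the function names are choices made here, and the zero polynomial is given order `math.inf`.

```python
import math


def order_at_zero(coeffs):
    """Order of vanishing at 0 of a_0 + a_1 z + a_2 z^2 + ...

    `coeffs` is [a_0, a_1, ...]; the zero polynomial has order +infinity.
    """
    for k, a in enumerate(coeffs):
        if a != 0:
            return k
    return math.inf


def poly_add(f, g):
    n = max(len(f), len(g))
    return [(f[k] if k < len(f) else 0) + (g[k] if k < len(g) else 0)
            for k in range(n)]


def poly_mul(f, g):
    if not f or not g:
        return []
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out
```

Taking `f = [0, 0, 3, 1]` and `g = [0, 0, -3, 5]`, both of order 2, the product has order 4, while the sum has order 3: the inequality for sums can be strict when leading terms cancel.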

The properties (2) have the following consequence. Given a divisor , let be the space of all meromorphic functions such that (including, by convention, the zero function ); thus, if , then consists of functions that have at worst a pole of order at (or a zero of order or greater, if is negative). For instance, is the space of meromorphic functions that have at most a double pole at , a single pole at , and at least a simple zero at , if are distinct points in . From (2) (and the fact that non-zero constant functions have principal divisor zero) we see that each is a vector space. We clearly have the nesting properties if , and also if then .
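Membership in these spaces reduces to a pointwise inequality between orders. As a minimal sketch, suppose the principal divisor of a non-zero function and the divisor defining the space are both given as dictionaries from point labels to integers (a representation assumed here for illustration, with hypothetical labels `"P"`, `"Q"`, `"R"`, `"S"`); the function is in the space exactly when the sum of the two divisors is non-negative.

```python
def in_L(div_f, D):
    """True if a non-zero function with principal divisor div_f satisfies
    div_f + D >= 0, i.e. at each point its order is at least -D's order."""
    points = set(div_f) | set(D)
    return all(div_f.get(p, 0) + D.get(p, 0) >= 0 for p in points)


# The divisor from the example in the text: at most a double pole at P,
# at most a simple pole at Q, and at least a simple zero at R.
D = {"P": 2, "Q": 1, "R": -1}
```

A function with divisor `{"P": -2, "Q": -1, "R": 1, "S": 2}` passes this test, whereas one with a triple pole at `"P"`, or one that fails to vanish at `"R"`, does not. The sketch does not check that `div_f` is actually the divisor of some meromorphic function; it only encodes the inequality.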

Remark 15 In the language of vector bundles, one can identify a divisor with a certain holomorphic line bundle on , and can be identified with the space of sections of this bundle. This is arguably the more natural way to think about divisors; however, we will not adopt this language here.

If and , then is holomorphic on and hence (by Lemma 11) constant. We can thus easily compute for zero or negative divisors:

Corollary 16 Let be a compact Riemann surface. Then consists only of the constant functions, and consists only of if . In particular, has dimension when and when .

Exercise 17 If and are principal divisors with , show that is a constant multiple of with .

Exercise 18 Let be a divisor. Show that if and only if is linearly equivalent to an effective divisor.

The situation for (i.e., has positive order at at least one point) is more interesting. We first have a simple observation from linear algebra:

Lemma 19 Let be a compact Riemann surface, be a divisor, and be a point. Then has codimension at most in .

*Proof:* Let be a chart that maps to the origin, and suppose that already had order at (so that had order ). Then functions , when composed with the inverse