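in terms of the cardinality $|X|$ of $X$? A trivial upper bound would be $|X|^2$, since this is the number of possible pairs $(A_1,A_2)$, and $A_1,A_2$ clearly determine $A$. In our paper, we establish the improved bound
$$ |\{ (A_1,A_2,A) \in X \times X \times X: A = A_1 \uplus A_2 \}| \leq |X|^{3/p} $$
where $p$ is the somewhat strange exponent
$$ p := \log_3 \frac{27}{4} = 1.73814\dots $$
so that $3/p = 1.72598\dots < 2$. Furthermore, this exponent is best possible!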
Actually, the latter claim is quite easy to show: one takes $X$ to be all the subsets of $\{1,\dots,n\}$ of cardinality either $n/3$ or $2n/3$, for $n$ a multiple of $3$, and the claim follows readily from Stirling’s formula. So it is perhaps the former claim that is more interesting (since many combinatorial proof techniques, such as those based on inequalities like the Cauchy-Schwarz inequality, tend to produce exponents that are rational or at least algebraic). We follow the common, though unintuitive, trick of generalising a problem to make it simpler. Firstly, one generalises the bound to the “trilinear” bound
for arbitrary finite collections of sets. One can place all the sets in inside a single finite set such as , and then by replacing every set in by its complement in , one can phrase the inequality in the equivalent form
for arbitrary collections of subsets of . We generalise further by turning sets into functions, replacing the estimate with the slightly stronger convolution estimate
for arbitrary functions on the Hamming cube , where the convolution is on the integer lattice rather than on the finite field vector space . The advantage of working in this general setting is that it becomes very easy to apply induction on the dimension ; indeed, to prove this estimate for arbitrary it suffices to do so for . This reduces matters to establishing the elementary inequality
for all , which can be done by a combination of undergraduate multivariable calculus and a little bit of numerical computation. (The left-hand side turns out to have local maxima at , with the latter being the cause of the numerology (1).)
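As a sanity check on the one-dimensional reduction, here is a minimal numerical sketch. It rests on assumptions that are not spelled out in the text above: that the exponent is $p = \log_3 \frac{27}{4}$, that the convolution estimate bounds $f_1 * f_2 * f_3(1,\dots,1)$ by the product of the $\ell^p$ norms, and that in dimension one we normalise $f_i(0)^p = a_i$ and $f_i(1)^p = 1 - a_i$. Under those assumptions the elementary inequality reads $((1-a)bc)^{1/p} + (a(1-b)c)^{1/p} + (ab(1-c))^{1/p} \leq 1$ for $0 \leq a,b,c \leq 1$. The function names `lhs` and `grid_max` are purely illustrative.

```python
import itertools
import math

# Assumed exponent: p = log_3(27/4), roughly 1.73814.
P = math.log(27 / 4, 3)


def lhs(a, b, c, p=P):
    """Left-hand side of the assumed one-dimensional inequality."""
    e = 1.0 / p
    return (((1 - a) * b * c) ** e
            + (a * (1 - b) * c) ** e
            + (a * b * (1 - c)) ** e)


def grid_max(steps=60):
    """Largest value of lhs over a uniform grid on the cube [0, 1]^3."""
    pts = [i / steps for i in range(steps + 1)]
    return max(lhs(a, b, c) for a, b, c in itertools.product(pts, repeat=3))


if __name__ == "__main__":
    print("p =", P)
    print("value at (2/3, 2/3, 2/3):", lhs(2 / 3, 2 / 3, 2 / 3))
    print("value at (1, 1, 0):", lhs(1.0, 1.0, 0.0))
    print("grid maximum:", grid_max())
```

A grid search is of course not a proof; it only illustrates that, with this choice of $p$, the value $1$ is attained at the symmetric point $(2/3,2/3,2/3)$ as well as at a corner such as $(1,1,0)$, which is consistent with the exponent being forced by an interior extremum.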
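The same sort of argument also gives an energy bound
$$ E(A) \leq |A|^{\log_2 6} $$
for any subset $A$ of the Hamming cube $\{0,1\}^n$, where
$$ E(A) := |\{ (a_1,a_2,a_3,a_4) \in A^4: a_1 + a_2 = a_3 + a_4 \}| $$
is the additive energy of $A$. The example $A = \{0,1\}^n$ shows that the exponent $\log_2 6$ cannot be improved.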
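The extremal example can be checked by brute force in small dimensions. The sketch below assumes that the additive energy counts quadruples $(a_1,a_2,a_3,a_4)$ with $a_1+a_2=a_3+a_4$, where addition is coordinatewise over the integers (not modulo $2$); the helper names `additive_energy` and `hamming_cube` are illustrative only.

```python
from collections import Counter
from itertools import product


def additive_energy(points):
    """Count quadruples (a1, a2, a3, a4) in points^4 with a1 + a2 == a3 + a4.

    Addition is coordinatewise over the integers.
    """
    sums = Counter(
        tuple(x + y for x, y in zip(a, b)) for a in points for b in points
    )
    # A sum value hit by c ordered pairs contributes c * c quadruples.
    return sum(c * c for c in sums.values())


def hamming_cube(n):
    """All points of {0, 1}^n as tuples."""
    return list(product((0, 1), repeat=n))


if __name__ == "__main__":
    for n in range(1, 6):
        cube = hamming_cube(n)
        print(n, len(cube), additive_energy(cube), 6 ** n)
```

For the full cube the energy factorises over coordinates, and in one dimension the sums $0, 1, 2$ are hit by $1, 2, 1$ ordered pairs, giving $1 + 4 + 1 = 6$; hence the energy is $6^n$ while the cardinality is $2^n$.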
This is not one of the 1%.
Mathematical research is clearly an international activity. But actually a stronger claim is true: mathematical research is a transnational activity, in that the specific nationality of individual members of a research team or research community is (or should be) of no appreciable significance for the purpose of advancing mathematics. For instance, even during the height of the Cold War, there was no movement in (say) the United States to boycott Soviet mathematicians or theorems, or to only use results from Western literature (though the latter did sometimes happen by default, due to the limited avenues of information exchange between East and West, and the former did occasionally occur for political reasons, most notably with the Soviet Union preventing Gregory Margulis from traveling to receive his Fields Medal in 1978 [EDIT: and also Sergei Novikov in 1970]). The national origin of even the most fundamental components of mathematics, whether it be the geometry (γεωμετρία) of the ancient Greeks, the algebra (الجبر) of the Islamic world, or the Hindu-Arabic numerals, is primarily of historical interest, and has only a negligible impact on the worldwide adoption of these mathematical tools. While it is true that individual mathematicians or research teams sometimes compete with each other to be the first to solve some desired problem, and that a citizen could take pride in the mathematical achievements of researchers from their country, one did not see any significant state-sponsored “space races” in which it was deemed in the national interest that a particular result ought to be proven by “our” mathematicians and not “theirs”.
Mathematical research ability is highly non-fungible, and the value added by foreign students and faculty to a mathematics department cannot be completely replaced by an equivalent number of domestic students and faculty, no matter how large and well educated the country (though a state can certainly work at the margins to encourage and support more domestic mathematicians). It is no coincidence that all of the top mathematics departments worldwide actively recruit the best mathematicians regardless of national origin, and often retain immigration counsel to assist with situations in which these mathematicians come from a country that is currently politically disfavoured by their own.
Of course, mathematicians cannot ignore the political realities of the modern international order altogether. Anyone who has organised an international conference or program knows that there will inevitably be visa issues to resolve because the host country makes it particularly difficult for certain nationals to attend the event. I myself, like many other academics working long-term in the United States, have certainly experienced my own share of immigration bureaucracy, starting with various glitches in the renewal or application of my J-1 and O-1 visas, then to the lengthy vetting process for acquiring permanent residency (or “green card”) status, and finally to becoming naturalised as a US citizen (retaining dual citizenship with Australia). Nevertheless, while the process could be slow and frustrating, there was at least an order to it. The rules of the game were complicated, but were known in advance, and did not abruptly change in the middle of playing it (save in truly exceptional situations, such as the days after the September 11 terrorist attacks). One just had to study the relevant visa regulations (or hire an immigration lawyer to do so), fill out the paperwork and submit to the relevant background checks, and remain in good standing until the application was approved in order to study, work, or participate in a mathematical activity held in another country. On rare occasion, some senior university administrator may have had to contact a high-ranking government official to approve some particularly complicated application, but for the most part one could work through normal channels in order to ensure for instance that the majority of participants of a conference could actually be physically present at that conference, or that an excellent mathematician hired by unanimous consent by a mathematics department could in fact legally work in that department.
With the recent and highly publicised executive order on immigration, many of these fundamental assumptions have been seriously damaged, if not destroyed altogether. Even if the order were withdrawn immediately, there is no longer an assurance, even for nationals not initially impacted by that order, that some similar abrupt and major change in the rules for entry to the United States could not occur, for instance for a visitor who has already gone through the lengthy visa application process and background checks, secured the appropriate visa, and is already in flight to the country. This is already affecting upcoming or ongoing mathematical conferences or programs in the US, with many international speakers (including those from countries not directly affected by the order) now cancelling their visit, either in protest or in concern about their ability to freely enter and leave the country. Even some conferences outside the US are affected, as some mathematicians currently in the US with a valid visa or even permanent residency are uncertain if they could ever return to their place of work if they left the country to attend a meeting. In the slightly longer term, it is likely that the ability of elite US institutions to attract the best students and faculty will be seriously impacted. Again, the losses would be strongest regarding candidates who were nationals of the countries affected by the current executive order, but I fear that many other mathematicians from other countries would now be much more concerned about entering and living in the US than they would have previously.
It is still possible for this sort of long-term damage to the mathematical community (both within the US and abroad) to be reversed or at least contained, but at present there is a real risk of the damage becoming permanent. To prevent this, it seems insufficient to me for the current order to be rescinded, as desirable as that would be; some further legislative or judicial action would be needed to begin restoring enough trust in the stability of the US immigration and visa system that the international travel that is so necessary to modern mathematical research becomes “just” a bureaucratic headache again.
Of course, the impact of this executive order is far, far broader than just its effect on mathematicians and mathematical research. But there are countless other venues on the internet and elsewhere to discuss these other aspects (or politics in general). (For instance, discussion of the qualifications, or lack thereof, of the current US president can be carried out at this previous post.) I would therefore like to open this post to readers to discuss the effects or potential effects of this order on the mathematical community; I particularly encourage mathematicians who have been personally affected by this order to share their experiences. As per the rules of the blog, I request that “the discussions are kept constructive, polite, and at least tangentially relevant to the topic at hand”.
Some relevant links (please feel free to suggest more, either through comments or by email):
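One can rephrase the lonely runner conjecture as the following covering problem. Given any integer “velocity” $v$ and radius $0 < r < 1/2$, define the Bohr set $B(v,r)$ to be the subset of the unit circle ${\bf R}/{\bf Z}$ given by the formula
$$ B(v,r) := \{ t \in {\bf R}/{\bf Z}: \|vt\| \leq r \}, $$
where $\|x\|$ denotes the distance of $x$ to the nearest integer. Thus, for $v$ positive, $B(v,r)$ is simply the union of the $v$ intervals $[\frac{a-r}{v}, \frac{a+r}{v}]$ for $a = 0,\dots,v-1$, projected onto the unit circle ${\bf R}/{\bf Z}$; in the language of the usual formulation of the lonely runner conjecture, $B(v,r)$ represents those times in which a runner moving at speed $v$ returns to within $r$ of his or her starting position. For any non-zero integers $v_1,\dots,v_n$, let $\delta(v_1,\dots,v_n)$ be the smallest radius $r$ such that the $n$ Bohr sets $B(v_1,r),\dots,B(v_n,r)$ cover the unit circle:
$$ {\bf R}/{\bf Z} = \bigcup_{i=1}^n B(v_i,r). \ \ \ \ \ (1) $$
Then define $\delta_n$ to be the smallest value of $\delta(v_1,\dots,v_n)$, as $v_1,\dots,v_n$ ranges over tuples of distinct non-zero integers. The Dirichlet approximation theorem quickly gives that
$$ \delta(1,\dots,n) = \frac{1}{n+1} $$
and hence
$$ \delta_n \leq \frac{1}{n+1} $$
for any $n \geq 1$. The lonely runner conjecture is equivalent to the assertion that this bound is in fact optimal:
Conjecture 1 (Lonely runner conjecture) For any $n \geq 1$, one has $\delta_n = \frac{1}{n+1}$.
This conjecture is currently known for $n \leq 6$ (see this paper of Barajas and Serra), but remains open for higher $n$.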
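For small tuples the covering radius can be computed exactly. The following is a minimal sketch, not taken from any paper: it writes $\delta(v_1,\dots,v_n)$ for the smallest covering radius and uses the observation that the Bohr sets of radius $r$ cover the circle precisely when $\min_i \|v_i t\| \leq r$ for every $t$, so that $\delta(v_1,\dots,v_n) = \max_t \min_i \|v_i t\|$. Since $t \mapsto \min_i \|v_i t\|$ is piecewise linear with non-zero slopes, its maximum occurs either at a peak of some $\|v_i t\|$ (where $2 v_i t$ is an integer) or at a crossing $\|v_i t\| = \|v_j t\|$ (where $(v_i \pm v_j) t$ is an integer). The function names are illustrative.

```python
from fractions import Fraction
from itertools import combinations


def dist_to_nearest_int(x):
    """||x||: distance from the rational number x to the nearest integer."""
    frac = x - (x.numerator // x.denominator)
    return min(frac, 1 - frac)


def delta(velocities):
    """Exact value of max over t in R/Z of min_i ||v_i t||."""
    vs = [abs(v) for v in velocities]  # v and -v give the same Bohr set
    denominators = {2 * v for v in vs}  # peaks of a single ||v t||
    for v, w in combinations(vs, 2):  # crossings of two such functions
        denominators.add(v + w)
        if v != w:
            denominators.add(abs(v - w))
    candidates = {Fraction(k, q) for q in denominators for k in range(q)}
    return max(min(dist_to_nearest_int(v * t) for v in vs) for t in candidates)


if __name__ == "__main__":
    for n in range(1, 7):
        print(n, delta(range(1, n + 1)))
    print(delta([1, 3, 4, 7]))
```

Exact rational arithmetic is used so that equality with $\frac{1}{n+1}$ for the standard tuple $(1,\dots,n)$ can be tested directly rather than up to rounding error. The number of candidate times grows quadratically in the size of the velocities, so this is only practical for small examples.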
It is natural to try to attack the problem by establishing lower bounds on the quantity $\delta_n$. We have the following “trivial” bound, that gets within a factor of two of the conjecture:
Proposition 2 (Trivial bound) For any $n \geq 1$, one has $\delta_n \geq \frac{1}{2n}$.
Proof: It is not difficult to see that for any non-zero velocity $v$ and any $0 < r < 1/2$, the Bohr set $B(v,r)$ has Lebesgue measure $m(B(v,r)) = 2r$. In particular, by the union bound
$$ m\left( \bigcup_{i=1}^n B(v_i,r) \right) \leq \sum_{i=1}^n m(B(v_i,r)) \ \ \ \ \ (2) $$
we see that the covering (1) is only possible if $1 \leq 2nr$, giving the claim.
So, in some sense, all the difficulty is coming from the need to improve upon the trivial union bound (2) by a factor of two.
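To see the union bound in action, one can compute the measure of a union of Bohr sets exactly by merging intervals. This is a minimal sketch under the assumption that the Bohr set of velocity $v > 0$ and radius $0 < r < 1/2$ is the set of $t$ in the unit circle with $\|vt\| \leq r$, so that it consists of $v$ arcs of length $2r/v$; the helper names are illustrative and the radius is expected to be a `Fraction`.

```python
from fractions import Fraction


def bohr_intervals(v, r):
    """Pieces of the Bohr set {t : ||v t|| <= r} inside [0, 1], for v > 0."""
    pieces = []
    # a = 0 and a = v give the two halves of the arc around the origin.
    for a in range(v + 1):
        lo = max(Fraction(0), (a - r) / v)
        hi = min(Fraction(1), (a + r) / v)
        pieces.append((lo, hi))
    return pieces


def union_measure(velocities, r):
    """Exact Lebesgue measure of the union of the Bohr sets of radius r."""
    intervals = sorted(p for v in velocities for p in bohr_intervals(v, r))
    total = Fraction(0)
    cur_lo, cur_hi = intervals[0]
    for lo, hi in intervals[1:]:
        if lo > cur_hi:  # gap: close the current merged interval
            total += cur_hi - cur_lo
            cur_lo, cur_hi = lo, hi
        else:  # overlap or touching: extend the current interval
            cur_hi = max(cur_hi, hi)
    return total + (cur_hi - cur_lo)


if __name__ == "__main__":
    r = Fraction(1, 5)
    vs = [1, 2, 3]
    print("single set:", union_measure([5], Fraction(1, 7)))
    print("union:", union_measure(vs, r), "union bound:", 2 * len(vs) * r)
    print("at radius 1/4:", union_measure(vs, Fraction(1, 4)))
```

In this small example the union bound $2nr$ overshoots noticeably because all the Bohr sets share an arc around the origin; the difficulty discussed in the text is that for well-chosen velocities this overlap can be made comparatively small.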
Despite the crudeness of the union bound (2), it has proven surprisingly hard to make substantial improvements on the trivial bound . In 1994, Chen obtained the slight improvement
which was improved a little by Chen and Cusick in 1999 to
when was prime. In a recent paper of Perarnau and Serra, the bound
was obtained for arbitrary . These bounds only improve upon the trivial bound by a multiplicative factor of . Heuristically, one reason for this is as follows. The union bound (2) would of course be sharp if the Bohr sets were all disjoint. Strictly speaking, such disjointness is not possible, because all the Bohr sets have to contain the origin as an interior point. However, it is possible to come up with a large number of Bohr sets which are almost disjoint. For instance, suppose that we had velocities that were all prime numbers between and , and that was equal to (and in particular was between and ). Then each set can be split into a “kernel” interval , together with the “petal” intervals . Roughly speaking, as the prime varies, the kernel interval stays more or less fixed, but the petal intervals range over disjoint sets, and from this it is not difficult to show that
so that the union bound is within a multiplicative factor of of the truth in this case.
This does not imply that is within a multiplicative factor of of , though, because there are not enough primes between and to assign to distinct velocities; indeed, by the prime number theorem, there are only about such velocities that could be assigned to a prime. So, while the union bound could be close to tight for up to Bohr sets, the above counterexamples don’t exclude improvements to the union bound for larger collections of Bohr sets. Following this train of thought, I was able to obtain a logarithmic improvement to previous lower bounds:
Theorem 3 For sufficiently large , one has for some absolute constant .
The factors of in the denominator are for technical reasons and might perhaps be removable by a more careful argument. However it seems difficult to adapt the methods to improve the in the numerator, basically because of the obstruction provided by the near-counterexample discussed above.
Roughly speaking, the idea of the proof of this theorem is as follows. If we have the covering (1) for very close to , then the multiplicity function will then be mostly equal to , but occasionally be larger than . On the other hand, one can compute that the norm of this multiplicity function is significantly larger than (in fact it is at least ). Because of this, the norm must be very large, which means that the triple intersections must be quite large for many triples . Using some basic Fourier analysis and additive combinatorics, one can deduce from this that the velocities must have a large structured component, in the sense that there exists an arithmetic progression of length that contains of these velocities. For simplicity let us take the arithmetic progression to be , thus of the velocities lie in . In particular, from the prime number theorem, most of these velocities will not be prime, and will in fact likely have a “medium-sized” prime factor (in the precise form of the argument, “medium-sized” is defined to be “between and ”). Using these medium-sized prime factors, one can show that many of the will have quite a large overlap with many of the other , and this can be used after some elementary arguments to obtain a more noticeable improvement on the union bound (2) than was obtained previously.
A modification of the above argument also allows for the improved estimate
if one knows that all of the velocities are of size .
In my previous blog post, I showed that in order to prove the lonely runner conjecture, it suffices to do so under the additional assumption that all of the velocities are of size ; I reproduce this argument (slightly cleaned up for publication) in the current preprint. There is unfortunately a huge gap between and , so the above bound (3) does not immediately give any new bounds for . However, one could perhaps try to start attacking the lonely runner conjecture by increasing the range for which one has good results, and by decreasing the range that one can reduce to. For instance, in the current preprint I give an elementary argument (using a certain amount of case-checking) that shows that the lonely runner bound
holds if all the velocities are assumed to lie between and . This upper threshold of is only a tiny improvement over the trivial threshold of , but it seems to be an interesting sub-problem of the lonely runner conjecture to increase this threshold further. One key target would be to get up to , as there are actually a number of -tuples in this range for which (4) holds with equality. The Dirichlet approximation theorem of course gives the tuple , but there is also the double of this tuple, and furthermore there is an additional construction of Goddyn and Wong that gives some further examples such as , or more generally one can start with the standard tuple and accelerate one of the velocities to ; this turns out to work as long as shares a common factor with every integer between and . There are a few more examples of this type in the paper of Goddyn and Wong, but all of them can be placed in an arithmetic progression of length at most, so if one were very optimistic, one could perhaps envision a strategy in which the upper bound of mentioned earlier was reduced all the way to something like , and then a separate argument deployed to treat this remaining case, perhaps isolating the constructions of Goddyn and Wong (and possible variants thereof) as the only extreme cases.
[Update, Dec 22: my own notes are now on the repository.]
where is now a system of scalar fields, is a potential which is strictly positive and homogeneous of degree (and invariant under phase rotations ), and is a smooth compactly supported forcing term, needed for technical reasons.
To oversimplify somewhat, the equation (1) is known to be globally regular in the energy-subcritical case when , or when and ; global regularity is also known (but is significantly more difficult to establish) in the energy-critical case when and . (This is an oversimplification for a number of reasons, in particular in higher dimensions one only knows global well-posedness instead of global regularity. See this previous post for some exploration of this issue in the context of nonlinear wave equations.) The main result of this paper is to show that global regularity can break down in the remaining energy-supercritical case when and , at least when the target dimension is allowed to be sufficiently large depending on the spatial dimension (I did not try to achieve the optimal value of here, but the argument gives a value of that grows quadratically in ). Unfortunately, this result does not directly impact the most interesting case of the defocusing scalar NLS equation
in which ; however it does establish a rigorous barrier to any attempt to prove global regularity for the scalar NLS equation, in that such an attempt needs to crucially use some property of the scalar NLS that is not shared by the more general systems in (1). For instance, any approach that is primarily based on the conservation laws of mass, momentum, and energy (which are common to both (1) and (2)) will not be sufficient to establish global regularity of supercritical defocusing scalar NLS.
The method of proof in this paper is broadly similar to that in the previous paper for NLW, but with a number of additional technical complications. Both proofs begin by reducing matters to constructing a discretely self-similar solution. In the case of NLW, this solution lived on a forward light cone and obeyed a self-similarity
The ability to restrict to a light cone arose from the finite speed of propagation properties of NLW. For NLS, the solution will instead live on the domain
and obey a parabolic self-similarity
and solve the homogeneous version of (1). (The inhomogeneity emerges when one truncates the self-similar solution so that the initial data is compactly supported in space.) A key technical point is that has to be smooth everywhere in , including the boundary component . This unfortunately rules out many of the existing constructions of self-similar solutions, which typically will have some sort of singularity at the spatial origin.
The remaining steps of the argument can broadly be described as quantifier elimination: one systematically eliminates each of the degrees of freedom of the problem in turn by locating the necessary and sufficient conditions required of the remaining degrees of freedom in order for the constraints of a particular degree of freedom to be satisfiable. The first such degree of freedom to eliminate is the potential function . The task here is to determine what constraints must exist on a putative solution in order for there to exist a (positive, homogeneous, smooth away from origin) potential obeying the homogeneous NLS equation
Firstly, the requirement that be homogeneous implies the Euler identity
(where denotes the standard real inner product on ), while the requirement that be phase invariant similarly yields the variant identity
so if one defines the potential energy field to be , we obtain from the chain rule the equations
Conversely, it turns out (roughly speaking) that if one can locate fields and obeying the above equations (as well as some other technical regularity and non-degeneracy conditions), then one can find an with all the required properties. The first of these equations can be thought of as a definition of the potential energy field , and the other three equations are basically disguised versions of the conservation laws of mass, energy, and momentum respectively. The construction of relies on a classical extension theorem of Seeley that is a relative of the Whitney extension theorem.
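For reference, the two identities used above follow by differentiating the homogeneity and phase-invariance relations. The notation here is an assumption made for illustration: $F$ denotes the potential on ${\bf C}^m$, $u$ a point of ${\bf C}^m$, $k$ the degree of homogeneity, and $\langle , \rangle$ the real inner product on ${\bf C}^m \equiv {\bf R}^{2m}$.

```latex
% Assumed notation: F : C^m -> R is the potential, homogeneous of degree k,
% and <,> is the real inner product on C^m = R^{2m}.
\begin{align*}
F(\lambda u) = \lambda^{k} F(u) \quad (\lambda > 0)
  &\implies \frac{d}{d\lambda}\Big|_{\lambda=1} F(\lambda u)
   = \langle u, (\nabla F)(u) \rangle = k\, F(u), \\
F(e^{i\theta} u) = F(u) \quad (\theta \in \mathbb{R})
  &\implies \frac{d}{d\theta}\Big|_{\theta=0} F(e^{i\theta} u)
   = \langle i u, (\nabla F)(u) \rangle = 0.
\end{align*}
```

Evaluating these at $u = u(t,x)$ and applying the chain rule is what converts pointwise properties of the potential into differential constraints on the solution field.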
Now that the potential is eliminated, the next degree of freedom to eliminate is the solution field . One can observe that the above equations involving and can be expressed instead in terms of and the Gram-type matrix of , which is a matrix consisting of the inner products where range amongst the differential operators
To eliminate , one thus needs to answer the question of what properties are required of a matrix for it to be the Gram-type matrix of a field . Amongst some obvious necessary conditions are that needs to be symmetric and positive semi-definite; there are also additional constraints coming from identities such as
and
Ideally one would like a theorem that asserts (for large enough) that as long as obeys all of the “obvious” constraints, then there exists a suitably non-degenerate map such that . In the case of NLW, the analogous claim was basically a consequence of the Nash embedding theorem (which can be viewed as a theorem about the solvability of the system of equations for a given positive definite symmetric set of fields ). However, the presence of the complex structure in the NLS case poses some significant technical challenges (note for instance that the naive complex version of the Nash embedding theorem is false, due to obstructions such as Liouville’s theorem that prevent a compact complex manifold from being embeddable holomorphically in ). Nevertheless, by adapting the proof of the Nash embedding theorem (in particular, the simplified proof of Gunther that avoids the need to use the Nash-Moser iteration scheme) we were able to obtain a partial complex analogue of the Nash embedding theorem that sufficed for our application; it required an artificial additional “curl-free” hypothesis on the Gram-type matrix , but fortunately this hypothesis ends up being automatic in our construction. Also, this version of the Nash embedding theorem is unable to prescribe the component of the Gram-type matrix , but fortunately this component is not used in any of the conservation laws and so the loss of this component does not cause any difficulty.
After applying the above-mentioned Nash-embedding theorem, the task is now to locate a matrix obeying all the hypotheses of that theorem, as well as the conservation laws for mass, momentum, and energy (after defining the potential energy field in terms of ). This is quite a lot of fields and constraints, but one can cut down significantly on the degrees of freedom by requiring that is spherically symmetric (in a tensorial sense) and also continuously self-similar (not just discretely self-similar). Note that this hypothesis is weaker than the assertion that the original field is spherically symmetric and continuously self-similar; indeed we do not know if non-trivial solutions of this type actually exist. These symmetry hypotheses reduce the number of independent components of the matrix to just six: , which now take as their domain the -dimensional space
One now has to construct these six fields, together with a potential energy field , that obey a number of constraints, notably some positive definiteness constraints as well as the aforementioned conservation laws for mass, momentum, and energy.
The field only arises in the equation for the potential (coming from Euler’s identity) and can easily be eliminated. Similarly, the field only makes an appearance in the current of the energy conservation law, and so can also be easily eliminated so long as the total energy is conserved. But in the energy-supercritical case, the total energy is infinite, and so it is relatively easy to eliminate the field from the problem also. This leaves us with the task of constructing just five fields obeying a number of positivity conditions, symmetry conditions, regularity conditions, and conservation laws for mass and momentum.
The potential field can effectively be absorbed into the angular stress field (after placing an appropriate counterbalancing term in the radial stress field so as not to disrupt the conservation laws), so we can also eliminate this field. The angular stress field is then only constrained through the momentum conservation law and a requirement of positivity; one can then eliminate this field by converting the momentum conservation law from an equality to an inequality. Finally, the radial stress field is also only constrained through a positive definiteness constraint and the momentum conservation inequality, so it can also be eliminated from the problem after some further modification of the momentum conservation inequality.
The task then reduces to locating just two fields that obey a mass conservation law
together with an additional inequality that is the remnant of the momentum conservation law. One can solve for the mass conservation law in terms of a single scalar field using the ansatz
so the problem has finally been simplified to the task of locating a single scalar field with some scaling and homogeneity properties that obeys a certain differential inequality relating to momentum conservation. This turns out to be possible by explicitly writing down a specific scalar field using some asymptotic parameters and cutoff functions.
The Chern Medal is a relatively new prize, awarded once every four years jointly by the IMU and the Chern Medal Foundation (CMF) to an individual whose accomplishments warrant the highest level of recognition for outstanding achievements in the field of mathematics. Funded by the CMF, the Medalist receives a cash prize of US$ 250,000. In addition, each Medalist may nominate one or more organizations to receive funding totalling US$ 250,000, for the support of research, education, or other outreach programs in the field of mathematics.
Professor Chern devoted his life to mathematics, both in active research and education, and in nurturing the field whenever the opportunity arose. He obtained fundamental results in all the major aspects of modern geometry and founded the area of global differential geometry. Chern exhibited keen aesthetic tastes in his selection of problems, and the breadth of his work deepened the connections of geometry with different areas of mathematics. He was also generous during his lifetime in his personal support of the field.
Nominations should be sent to the Prize Committee Chair: Caroline Series, email: chair@chern18.mathunion.org by 31st December 2016. Further details and nomination guidelines for this and the other IMU prizes can be found at http://www.mathunion.org/general/prizes/
Conjecture 1 (Toeplitz square peg problem) Let be a simple closed curve in the plane. Is it necessarily the case that contains four vertices of a square?
See this recent survey of Matschke in the Notices of the AMS for the latest results on this problem.
The route I took to the results in this paper was somewhat convoluted. I was motivated to look at this problem after lecturing recently on the Jordan curve theorem in my class. The problem is superficially similar to the Jordan curve theorem in that the result is known (and rather easy to prove) if is sufficiently regular (e.g. if it is a polygonal path), but seems to be significantly more difficult when the curve is merely assumed to be continuous. Roughly speaking, all the known positive results on the problem have proceeded using (in some form or another) tools from homology: note for instance that one can view the conjecture as asking whether the four-dimensional subset of the eight-dimensional space necessarily intersects the four-dimensional space consisting of the quadruples traversing a square in (say) anti-clockwise order; this space is a four-dimensional linear subspace of , with a two-dimensional subspace of “degenerate” squares removed. If one ignores this degenerate subspace, one can use intersection theory to conclude (under reasonable “transversality” hypotheses) that intersects an odd number of times (up to the cyclic symmetries of the square), which is basically how Conjecture 1 is proven in the regular case. Unfortunately, if one then takes a limit and considers what happens when is just a continuous curve, the odd number of squares created by these homological arguments could conceivably all degenerate to points, thus blocking one from proving the conjecture in the general case.
Inspired by my previous work on finite time blowup for various PDEs, I first tried looking for a counterexample in the category of (locally) self-similar curves that are smooth (or piecewise linear) away from a single origin where it can oscillate infinitely often; this is basically the smoothest type of curve that was not already covered by previous results. By a rescaling and compactness argument, it is not difficult to see that such a counterexample would exist if there was a counterexample to the following periodic version of the conjecture:
Conjecture 2 (Periodic square peg problem) Let be two disjoint simple closed piecewise linear curves in the cylinder which have a winding number of one, that is to say they are homologous to the loop from to . Then the union of and contains the four vertices of a square.
In contrast to Conjecture 1, which is known for polygonal paths, Conjecture 2 is still open even under the hypothesis of polygonal paths; the homological arguments alluded to previously now show that the number of inscribed squares in the periodic setting is even rather than odd, which is not enough to conclude the conjecture. (This flipping of parity from odd to even due to an infinite amount of oscillation is reminiscent of the “Eilenberg-Mazur swindle”, discussed in this previous post.)
I therefore tried to construct counterexamples to Conjecture 2. I began perturbatively, looking at curves that were small perturbations of constant functions. After some initial Taylor expansion, I was blocked from forming such a counterexample because an inspection of the leading Taylor coefficients required one to construct a continuous periodic function of mean zero that never vanished, which of course was impossible by the intermediate value theorem. I kept expanding to higher and higher order to try to evade this obstruction (this, incidentally, was when I discovered this cute application of Lagrange reversion) but no matter how high an accuracy I went (I think I ended up expanding to sixth order in a perturbative parameter before figuring out what was going on!), this obstruction kept resurfacing again and again. I eventually figured out that this obstruction was being caused by a “conserved integral of motion” for both Conjecture 2 and Conjecture 1, which can in fact be used to largely rule out perturbative constructions. This yielded a new positive result for both conjectures:
We sketch the proof of Theorem 3(i) as follows (the proof of Theorem 3(ii) is very similar). Let be the curve , thus traverses one of the two graphs that comprise . For each time , there is a unique square with first vertex (and the other three vertices, traversed in anticlockwise order, denoted ) such that also lies in the graph of and also lies in the graph of (actually for technical reasons we have to extend by constants to all of in order for this claim to be true). To see this, we simply rotate the graph of clockwise by around , where (by the Lipschitz hypotheses) it must hit the graph of in a unique point, which is , and which then determines the other two vertices of the square. The curve has the same starting and ending point as the graph of or ; using the Lipschitz hypothesis one can show this graph is simple. If the curve ever hits the graph of other than at the endpoints, we have created an inscribed square, so we may assume for contradiction that avoids the graph of , and hence by the Jordan curve theorem the two curves enclose some non-empty bounded open region .
Now for the conserved integral of motion. If we integrate the -form on each of the four curves , we obtain the identity
This identity can be established by the following calculation: one can parameterise
for some Lipschitz functions ; thus for instance . Inserting these parameterisations and doing some canceling, one can write the above integral as
which vanishes because (which represent the sidelengths of the squares determined by ) vanish at the endpoints .
Using this conserved integral of motion, one can show that
which by Stokes’ theorem then implies that the bounded open region mentioned previously has zero area, which is absurd.
This argument hinged on the curve being simple, so that the Jordan curve theorem could apply. Once one left the perturbative regime of curves of small Lipschitz constant, it became possible for to be self-crossing, but nevertheless there still seemed to be some sort of integral obstruction. I eventually isolated the problem in the form of a strengthened version of Conjecture 2:
Conjecture 4 (Area formulation of square peg problem) Let be simple closed piecewise linear curves of winding number obeying the area identity
(note the -form is still well defined on the cylinder ; note also that the curves are allowed to cross each other.) Then there exists a (possibly degenerate) square with vertices (traversed in anticlockwise order) lying on respectively.
It is not difficult to see that Conjecture 4 implies Conjecture 2. Actually I believe that the converse implication is at least morally true, in that any counterexample to Conjecture 4 can be eventually transformed to a counterexample to Conjecture 2 and Conjecture 1. The conserved integral of motion argument can establish Conjecture 4 in many cases, for instance if are graphs of functions of Lipschitz constant less than one.
Conjecture 4 has a model special case, when one of the is assumed to just be a horizontal loop. In this case, the problem collapses to that of producing an intersection between two three-dimensional subsets of a six-dimensional space, rather than to four-dimensional subsets of an eight-dimensional space. More precisely, some elementary transformations reveal that this special case of Conjecture 4 can be formulated in the following fashion in which the geometric notion of a square is replaced by the additive notion of a triple of real numbers summing to zero:
Conjecture 5 (Special case of area formulation) Let be simple closed piecewise linear curves of winding number obeying the area identity
Then there exist and with such that for .
This conjecture is easy to establish if one of the curves, say , is the graph of some piecewise linear function , since in that case the curve and the curve enclose the same area in the sense that , and hence must intersect by the Jordan curve theorem (otherwise they would enclose a non-zero amount of area between them), giving the claim. But when none of the are graphs, the situation becomes combinatorially more complicated.
Using some elementary homological arguments (e.g. breaking up closed -cycles into closed paths) and working with a generic horizontal slice of the curves, I was able to show that Conjecture 5 was equivalent to a one-dimensional problem that was largely combinatorial in nature, revolving around the sign patterns of various triple sums with drawn from various finite sets of reals.
Conjecture 6 (Combinatorial form) Let be odd natural numbers, and for each , let be distinct real numbers; we adopt the convention that . Assume the following axioms:
- (i) For any , the sums are non-zero.
- (ii) (Non-crossing) For any and with the same parity, the pairs and are non-crossing in the sense that
- (iii) (Non-crossing sums) For any , , of the same parity, one has
Then one has
Roughly speaking, Conjecture 6 and Conjecture 5 are connected by constructing curves to connect to for by various paths, which either lie to the right of the axis (when is odd) or to the left of the axis (when is even). The axiom (ii) is asserting that the numbers are ordered according to the permutation of a meander (formed by gluing together two non-crossing perfect matchings).
Using various ad hoc arguments involving “winding numbers”, it is possible to prove this conjecture in many cases (e.g. if one of the is at most ), to the extent that I have now become confident that this conjecture is true (and have now come full circle from trying to disprove Conjecture 1 to now believing that this conjecture holds also). But it seems that there is some non-trivial combinatorial argument to be made if one is to prove this conjecture; purely homological arguments seem to partially resolve the problem, but are not sufficient by themselves.
While I was not able to resolve the square peg problem, I think these results do provide a roadmap to attacking it, first by focusing on the combinatorial conjecture in Conjecture 6 (or its equivalent form in Conjecture 5), then after that is resolved moving on to Conjecture 4, and then finally to Conjecture 1.
Here is a purely algebraic form of the problem:
Problem 1 Let be a formal function of one variable . Suppose that is the formal function defined by
where we use to denote the -fold derivative of with respect to the variable .
- (i) Show that can be formally recovered from by the formula
- (ii) There is a remarkable further formal identity relating with that does not explicitly involve any infinite summation. What is this identity?
To rigorously formulate part (i) of this problem, one could work in the commutative differential ring of formal infinite series generated by polynomial combinations of and its derivatives (with no constant term). Part (ii) is a bit trickier to formulate in this abstract ring; the identity in question is easier to state if are formal power series, or (even better) convergent power series, as it involves operations such as composition or inversion that can be more easily defined in those latter settings.
To illustrate Problem 1(i), let us compute up to third order in , using to denote any quantity involving four or more factors of and its derivatives, and similarly for other exponents than . Then we have
and hence
multiplying, we have
and
and hence after a lot of canceling
Thus Problem 1(i) holds up to errors of at least. In principle one can continue verifying Problem 1(i) to increasingly high order in , but the computations rapidly become quite lengthy, and I do not know of a direct way to ensure that one always obtains the required cancellation at the end of the computation.
Problem 1(i) can also be posed in formal power series: if
is a formal power series with no constant term with complex coefficients with , then one can verify that the series
makes sense as a formal power series with no constant term, thus
For instance it is not difficult to show that . If one further has , then it turns out that
as formal power series. Currently the only way I know how to show this is by first proving the claim for power series with a positive radius of convergence using the Cauchy integral formula, but even this is a bit tricky unless one has managed to guess the identity in (ii) first. (In fact, the way I discovered this problem was by first trying to solve (a variant of) the identity in (ii) by Taylor expansion in the course of attacking another problem, and obtaining the transform in Problem 1 as a consequence.)
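As a small numerical sanity check of the linear-coefficient computation, here is a minimal sketch. It assumes that the transform in Problem 1 has the form $G = \sum_{n \geq 1} (F^n/n!)^{(n-1)}$, with claimed inverse $F = \sum_{n \geq 1} (-1)^{n-1} (G^n/n!)^{(n-1)}$; this form is an assumption of the sketch. Under that assumption the $n$-th term contributes $a_1^n$ to the coefficient of $z$, so the linear coefficients are geometric series. The helper name `linear_coefficient` is hypothetical.

```python
def linear_coefficient(c1: float, sign: int = 1, terms: int = 200) -> float:
    """Partial sum of sum_{n>=1} sign^(n-1) * c1^n.

    Under the assumed form of the transform, this is the coefficient of z
    in the transformed series when the input series starts with c1 * z.
    """
    return sum(sign ** (n - 1) * c1 ** n for n in range(1, terms + 1))


a1 = 0.3
b1 = linear_coefficient(a1, sign=+1)            # forward transform
recovered_a1 = linear_coefficient(b1, sign=-1)  # assumed inverse transform

# The forward sum is the geometric series a1 / (1 - a1) ...
assert abs(b1 - a1 / (1 - a1)) < 1e-12
# ... and the alternating sum b1 / (1 + b1) recovers a1.
assert abs(recovered_a1 - a1) < 1e-12
```

This only tests the lowest-order coefficient; it says nothing about the higher-order cancellations discussed above.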
The transform that takes to resembles both the exponential function
and Taylor’s formula
but does not seem to be directly connected to either (this is more apparent once one knows the identity in (ii)).
Lemma 1 (Holomorphicity and harmonicity are conformal invariants) Let be a complex diffeomorphism between two Riemann surfaces .
- (i) If is a function to another Riemann surface , then is holomorphic if and only if is holomorphic.
- (ii) If are open subsets of and is a function, then is harmonic if and only if is harmonic.
Proof: Part (i) is immediate since the composition of two holomorphic functions is holomorphic. For part (ii), observe that if is harmonic then on any ball in , is the real part of some holomorphic function thanks to Exercise 62 of Notes 3. By part (i), is also holomorphic. Taking real parts we see that is harmonic on each ball in , and hence harmonic on all of , giving one direction of (ii); the other direction is proven similarly.
Exercise 2 Establish Lemma 1(ii) by direct calculation, avoiding the use of holomorphic functions. (Hint: the calculations are cleanest if one uses Wirtinger derivatives, as per Exercise 27 of Notes 1.)
Exercise 3 Let be a complex diffeomorphism between two open subsets of , let be a point in , let be a natural number, and let be holomorphic. Show that has a zero (resp. a pole) of order at if and only if has a zero (resp. a pole) of order at .
From Lemma 1(ii) we can now define the notion of a harmonic function on a Riemann surface ; such a function is harmonic if, for every coordinate chart in some atlas, the map is harmonic. Lemma 1(ii) ensures that this definition of harmonicity does not depend on the choice of atlas. Similarly, using Exercise 3 one can define what it means for a holomorphic map on a Riemann surface to have a pole or zero of a given order at a point , with the definition being independent of the choice of atlas.
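In view of Lemma 1, it is thus natural to ask which Riemann surfaces are complex diffeomorphic to each other, and more generally to understand the space of holomorphic maps from one given Riemann surface to another. We will initially focus attention on three important model Riemann surfaces:
- (i) (Elliptic model) The Riemann sphere $\mathbb{C} \cup \{\infty\}$;
- (ii) (Parabolic model) The complex plane $\mathbb{C}$; and
- (iii) (Hyperbolic model) The unit disk $D(0,1)$.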
The designation of these model Riemann surfaces as elliptic, parabolic, and hyperbolic comes from Riemannian geometry, where it is natural to endow each of these surfaces with a constant curvature Riemannian metric which is positive, zero, or negative in the elliptic, parabolic, and hyperbolic cases respectively. However, we will not discuss Riemannian geometry further here.
All three model Riemann surfaces are simply connected, but none of them are complex diffeomorphic to any other; indeed, there are no non-constant holomorphic maps from the Riemann sphere to the plane or the disk, nor are there any non-constant holomorphic maps from the plane to the disk (although there are plenty of holomorphic maps going in the opposite directions). The complex automorphisms (that is, the complex diffeomorphisms from a surface to itself) of each of the three surfaces can be classified explicitly. The automorphisms of the Riemann sphere turn out to be the Möbius transformations $z \mapsto \frac{az+b}{cz+d}$ with $ad-bc \neq 0$, also known as fractional linear transformations. The automorphisms of the complex plane are the affine transformations $z \mapsto az+b$ with $a \neq 0$, and the automorphisms of the disk are the fractional linear transformations of the form $z \mapsto e^{i\theta} \frac{z-\alpha}{1-\overline{\alpha} z}$ for $\theta \in \mathbb{R}$ and $\alpha \in D(0,1)$. Holomorphic maps $f: D(0,1) \to D(0,1)$ from the disk to itself that fix the origin obey a basic but incredibly important estimate known as the Schwarz lemma: they are “dominated” by the identity function $z \mapsto z$ in the sense that $|f(z)| \leq |z|$ for all $z \in D(0,1)$. Among other things, this lemma gives guidance to determine when a given Riemann surface is complex diffeomorphic to a disk; we shall discuss this point further below.
It is a beautiful and fundamental fact in complex analysis that these three model Riemann surfaces are in fact an exhaustive list of the simply connected Riemann surfaces, up to complex diffeomorphism. More precisely, we have the Riemann mapping theorem and the uniformisation theorem:
Theorem 4 (Riemann mapping theorem) Let $U$ be a simply connected open subset of $\mathbb{C}$ that is not all of $\mathbb{C}$. Then $U$ is complex diffeomorphic to $D(0,1)$.
Theorem 5 (Uniformisation theorem) Let $M$ be a simply connected Riemann surface. Then $M$ is complex diffeomorphic to $\mathbb{C} \cup \{\infty\}$, $\mathbb{C}$, or $D(0,1)$.
As we shall see, every connected Riemann surface can be viewed as the quotient of its simply connected universal cover by a discrete group of automorphisms known as deck transformations. This in principle gives a complete classification of Riemann surfaces up to complex diffeomorphism, although the situation is still somewhat complicated in the hyperbolic case because of the wide variety of discrete groups of automorphisms available in that case.
We will prove the Riemann mapping theorem in these notes, using the elegant argument of Koebe that is based on the Schwarz lemma and Montel’s theorem (Exercise 57 of Notes 4). The uniformisation theorem is however more difficult to establish; we discuss some components of a proof (based on the Perron method of subharmonic functions) here, but stop short of providing a complete proof.
The above theorems show that it is in principle possible to conformally map various domains into model domains such as the unit disk, but the proofs of these theorems do not readily produce explicit conformal maps for this purpose. For some domains we can just write down a suitable such map. For instance:
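Exercise 6 (Cayley transform) Let $\mathbf{H} := \{ z \in \mathbb{C} : \mathrm{Im}(z) > 0 \}$ be the upper half-plane. Show that the Cayley transform $\phi: \mathbf{H} \to D(0,1)$, defined by
$$\phi(z) := \frac{z-i}{z+i},$$
is a complex diffeomorphism from the upper half-plane $\mathbf{H}$ to the disk $D(0,1)$, with inverse map $\phi^{-1}: D(0,1) \to \mathbf{H}$ given by
$$\phi^{-1}(w) := i \frac{1+w}{1-w}.$$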
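The claims of Exercise 6 can be spot-checked numerically before attempting a proof. The sketch below assumes the standard form of the Cayley transform, $z \mapsto \frac{z-i}{z+i}$, with inverse $w \mapsto i\frac{1+w}{1-w}$; the function names are hypothetical, and a handful of sample points is of course not a proof.

```python
def cayley(z: complex) -> complex:
    """Assumed Cayley transform z -> (z - i) / (z + i), upper half-plane to unit disk."""
    return (z - 1j) / (z + 1j)


def cayley_inverse(w: complex) -> complex:
    """Assumed inverse w -> i (1 + w) / (1 - w), unit disk to upper half-plane."""
    return 1j * (1 + w) / (1 - w)


# Sample points with positive imaginary part.
samples = [1j, 2 + 0.5j, -3 + 4j, 10 + 0.001j]
for z in samples:
    w = cayley(z)
    assert abs(w) < 1                          # image lies in the open unit disk
    assert abs(cayley_inverse(w) - z) < 1e-9   # the two maps invert each other
```

Geometrically, $|z-i| < |z+i|$ says exactly that $z$ is closer to $i$ than to $-i$, which is the condition $\mathrm{Im}(z) > 0$.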
Exercise 7 Show that for any real numbers $a < b$, the strip $\{ z \in \mathbb{C} : a < \mathrm{Re}(z) < b \}$ is complex diffeomorphic to the disk $D(0,1)$. (Hint: use the complex exponential and a linear transformation to map the strip onto the half-plane $\mathbf{H}$.)
Exercise 8 Show that for any real numbers $a < b < a + 2\pi$, the sector $\{ re^{i\theta} : r > 0, a < \theta < b \}$ is complex diffeomorphic to the disk $D(0,1)$. (Hint: use a branch of either the complex logarithm, or of a complex power $z \mapsto z^\alpha$.)
We will discuss some other explicit conformal maps in this set of notes, such as the Schwarz-Christoffel maps that transform the upper half-plane to polygonal regions. Further examples of conformal mapping can be found in the text of Stein-Shakarchi.
— 1. Maps between the model Riemann surfaces —
In this section we study the various holomorphic maps, and conformal maps, between the three model Riemann surfaces , , and .
From Exercise 19 of Notes 4, we know that the only holomorphic maps from the Riemann sphere to itself (besides the constant function ) take the form of a rational function away from the zeroes of (and from ), with these singularities all being removable, and with not identically zero. We can of course reduce to lowest terms and assume that and have no common factors. In particular, if is to take values in rather than , then can have no roots (since will have a pole at these roots) and so by the fundamental theorem of algebra is constant and is a polynomial; in order for to have no pole at infinity, must then be constant. Thus the only holomorphic maps from to are the constants; in particular, the only holomorphic maps from to are the constants. In particular, is not complex diffeomorphic to or (this is also topologically obvious since the Riemann sphere is compact, and and are not).
Exercise 9 More generally, show that if is a compact Riemann surface and is a connected non-compact Riemann surface, then the only holomorphic maps from to are the constants. (Hint: use the open mapping theorem, Theorem 37 of Notes 4.)
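Now we consider complex automorphisms of the Riemann sphere $\mathbb{C} \cup \{\infty\}$ to itself. There are some obvious examples of such automorphisms:
- (i) (Translation) For any complex number $c$, the map $z \mapsto z + c$, with $\infty$ mapped to $\infty$.
- (ii) (Dilation) For any non-zero complex number $\lambda$, the map $z \mapsto \lambda z$, with $\infty$ mapped to $\infty$.
- (iii) (Inversion) The map $z \mapsto 1/z$, with $0$ mapped to $\infty$ and $\infty$ mapped to $0$.
More generally, given any complex numbers $a,b,c,d$ with $ad-bc \neq 0$, we can define the Möbius transformation (or fractional linear transformation) $z \mapsto \frac{az+b}{cz+d}$ for $z \neq \infty, -d/c$, with the convention that $-d/c$ is mapped to $\infty$ and $\infty$ is mapped to $a/c$ (where we adopt the further convention that $a/0 = \infty$ for non-zero $a$). For $c = 0$, this is an affine transformation $z \mapsto \frac{a}{d} z + \frac{b}{d}$, which is clearly a composition of a translation and dilation map; for $c \neq 0$, this is a combination of translations, dilations, and the inversion map. Thus all Möbius transformations are formed from composition of the translations, dilations, and inversions, and in particular are also automorphisms of the Riemann sphere; it is also easy to see that the Möbius transformations are closed under composition, and are thus the group generated by the translations, dilations, and inversions.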
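One can interpret the Möbius transformations as projective linear transformations as follows. Recall that the general linear group $GL_2(\mathbb{C})$ is the group of $2 \times 2$ matrices $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ with non-vanishing determinant $ad-bc$. Clearly every such matrix generates a Möbius transformation $z \mapsto \frac{az+b}{cz+d}$. However, two different elements of $GL_2(\mathbb{C})$ can generate the same Möbius transformation if they are scalar multiples of each other. If we define the projective linear group $PGL_2(\mathbb{C})$ to be the quotient group of $GL_2(\mathbb{C})$ by the group of scalar invertible matrices, then we may identify the set of Möbius transformations with $PGL_2(\mathbb{C})$. The group $GL_2(\mathbb{C})$ acts on the space $\mathbb{C}^2$ by the usual map
$$\begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} z_1 \\ z_2 \end{pmatrix} = \begin{pmatrix} a z_1 + b z_2 \\ c z_1 + d z_2 \end{pmatrix}.$$
If we let $\mathbf{CP}^1$ be the complex projective line, that is to say the space of one-dimensional subspaces of $\mathbb{C}^2$, then $GL_2(\mathbb{C})$ acts on this space also, with the action of the scalars being trivial, so we have an action of $PGL_2(\mathbb{C})$ on $\mathbf{CP}^1$. We can identify the Riemann sphere $\mathbb{C} \cup \{\infty\}$ with the complex projective line by identifying each $z \in \mathbb{C}$ with the one-dimensional subspace $\{ (tz, t) : t \in \mathbb{C} \}$ of $\mathbb{C}^2$, and identifying $\infty$ with $\{ (t, 0) : t \in \mathbb{C} \}$. With this identification, one can check that the action of $PGL_2(\mathbb{C})$ on $\mathbf{CP}^1$ has become identified with the action of the group of Möbius transformations on $\mathbb{C} \cup \{\infty\}$. (In particular, the group of Möbius transformations is isomorphic to $PGL_2(\mathbb{C})$.)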
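The correspondence between matrices and Möbius transformations can be illustrated with a short numerical sketch: composing two Möbius transformations should agree with multiplying their matrices, and rescaling a matrix should not change the map. The helper names below are hypothetical, matrices are plain nested tuples, and only finite points away from poles are used (an assumption that sidesteps the conventions for $\infty$).

```python
def mobius(m, z):
    """Apply the Möbius transformation z -> (a z + b) / (c z + d) of m = ((a, b), (c, d))."""
    (a, b), (c, d) = m
    return (a * z + b) / (c * z + d)


def matmul(m, n):
    """Product of two 2x2 matrices given as nested tuples."""
    (a, b), (c, d) = m
    (e, f), (g, h) = n
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))


def scale(m, s):
    """Multiply every entry of the matrix m by the scalar s."""
    return tuple(tuple(s * entry for entry in row) for row in m)


m = ((1, 2), (3, 4))    # determinant -2
n = ((0, 1j), (1, 1))   # determinant -i
z = 0.3 + 0.7j

# Composition of maps corresponds to matrix multiplication.
assert abs(mobius(matmul(m, n), z) - mobius(m, mobius(n, z))) < 1e-12
# Scalar multiples of a matrix give the same Möbius transformation.
assert abs(mobius(scale(m, 2 - 5j), z) - mobius(m, z)) < 1e-12
```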
There are enough Möbius transformations available that their action on the Riemann sphere is not merely transitive, but is in fact 3-transitive:
Lemma 10 (3-transitivity) Let $z_1, z_2, z_3$ be distinct elements of the Riemann sphere $\mathbb{C} \cup \{\infty\}$, and let $w_1, w_2, w_3$ also be three distinct elements of the Riemann sphere. Then there exists a unique Möbius transformation $T$ such that $T(z_j) = w_j$ for $j = 1,2,3$.
Proof: We first show existence. As the Möbius transformations form a group, it suffices to verify the claim for a single choice of $w_1, w_2, w_3$, for instance $w_1 = 0, w_2 = 1, w_3 = \infty$. If $z_3 = \infty$ then the affine transformation $z \mapsto \frac{z - z_1}{z_2 - z_1}$ will have the desired properties. If $z_3 \neq \infty$, we can use translation and inversion to find a Möbius transformation $S$ that maps $z_3$ to $\infty$; applying the previous case with $z_1, z_2, z_3$ replaced by $S(z_1), S(z_2), \infty$ and then composing with $S$, we obtain the claim.
Now we prove uniqueness. By composing on the left and right with Möbius transforms we may assume that $z_1 = w_1 = 0$, $z_2 = w_2 = 1$, $z_3 = w_3 = \infty$. A Möbius transformation $z \mapsto \frac{az+b}{cz+d}$ that fixes $0, 1, \infty$ must obey the constraints $b = 0$, $c = 0$, $a + b = c + d$ and so must be the identity, as required.
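The existence half of the argument is constructive, and can be sketched in code. The sketch below handles only finite points (an assumption made to avoid special-casing $\infty$). It uses the explicit map $z \mapsto \frac{(z-z_1)(z_2-z_3)}{(z-z_3)(z_2-z_1)}$ sending $z_1, z_2, z_3$ to $0, 1, \infty$, and inverts it via the adjugate matrix, which agrees with the inverse up to a scalar. All function names are hypothetical.

```python
def to_zero_one_inf(z1, z2, z3):
    """Matrix of the Möbius map sending finite distinct z1, z2, z3 to 0, 1, infinity."""
    return ((z2 - z3, -z1 * (z2 - z3)), (z2 - z1, -z3 * (z2 - z1)))


def adjugate(m):
    """Adjugate of a 2x2 matrix; equals the inverse up to the scalar 1/det."""
    (a, b), (c, d) = m
    return ((d, -b), (-c, a))


def matmul(m, n):
    """Product of two 2x2 matrices given as nested tuples."""
    (a, b), (c, d) = m
    (e, f), (g, h) = n
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))


def mobius(m, z):
    """Apply z -> (a z + b) / (c z + d) at a finite point z that is not a pole."""
    (a, b), (c, d) = m
    return (a * z + b) / (c * z + d)


def three_point_map(zs, ws):
    """Matrix of a Möbius map with zs[j] -> ws[j], for finite distinct triples."""
    return matmul(adjugate(to_zero_one_inf(*ws)), to_zero_one_inf(*zs))


zs = (0, 1, 2j)
ws = (1j, -1, 3)
T = three_point_map(zs, ws)
for z, w in zip(zs, ws):
    assert abs(mobius(T, z) - w) < 1e-9
```

Working with matrices rather than composing functions means the intermediate value $\infty$ never has to be represented explicitly.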
Möbius transformations are not 4-transitive, thanks to the invariant known as the cross-ratio:
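Exercise 11 Define the cross-ratio $[z_1,z_2;z_3,z_4]$ between four distinct points $z_1,z_2,z_3,z_4$ on the Riemann sphere $\mathbb{C} \cup \{\infty\}$ by the formula
$$[z_1,z_2;z_3,z_4] := \frac{(z_1-z_3)(z_2-z_4)}{(z_1-z_4)(z_2-z_3)}$$
if all of $z_1,z_2,z_3,z_4$ avoid $\infty$, and extended continuously to the case when one of the points equals $\infty$ (e.g. $[z_1,z_2;z_3,\infty] = \frac{z_1-z_3}{z_2-z_3}$).
- (i) Show that an injective map $T: \mathbb{C} \cup \{\infty\} \to \mathbb{C} \cup \{\infty\}$ is a Möbius transform if and only if it preserves the cross-ratio, that is to say that $[T(z_1),T(z_2);T(z_3),T(z_4)] = [z_1,z_2;z_3,z_4]$ for all distinct points $z_1,z_2,z_3,z_4 \in \mathbb{C} \cup \{\infty\}$. (Hint: for the “only if” part, work with the basic Möbius transforms. For the “if” part, reduce to the case when $T$ fixes three points, such as $0, 1, \infty$.)
- (ii) If $z_1,z_2,z_3,z_4$ are distinct points in $\mathbb{C} \cup \{\infty\}$, show that $z_1,z_2,z_3,z_4$ lie on a common extended line (i.e., a line in $\mathbb{C}$ together with $\infty$) or circle in $\mathbb{C}$ if and only if the cross-ratio $[z_1,z_2;z_3,z_4]$ is real. Conclude that a Möbius transform will map an extended line or circle to an extended line or circle.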
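Both parts of this exercise can be explored numerically before proving them. The sketch below assumes the convention $[z_1,z_2;z_3,z_4] = \frac{(z_1-z_3)(z_2-z_4)}{(z_1-z_4)(z_2-z_3)}$ (other orderings appear in the literature) and restricts to finite points; the function names are hypothetical.

```python
import cmath


def cross_ratio(z1, z2, z3, z4):
    """Cross-ratio of four distinct finite points, in the convention assumed above."""
    return ((z1 - z3) * (z2 - z4)) / ((z1 - z4) * (z2 - z3))


def mobius(a, b, c, d, z):
    """Apply z -> (a z + b) / (c z + d) at a finite point z that is not a pole."""
    return (a * z + b) / (c * z + d)


# (i) Invariance under a sample Möbius transformation (determinant 6 - i, non-zero).
points = (0.2 + 1j, -1.5, 3j, 2 - 2j)
images = tuple(mobius(2, 1j, 1, 3, z) for z in points)
assert abs(cross_ratio(*points) - cross_ratio(*images)) < 1e-9

# (ii) Four points on the unit circle have a real cross-ratio ...
on_circle = tuple(cmath.exp(1j * t) for t in (0.1, 1.3, 2.9, 4.4))
assert abs(cross_ratio(*on_circle).imag) < 1e-12

# ... whereas 0, 1, -1, i do not lie on a common line or circle.
assert abs(cross_ratio(0, 1, -1, 1j).imag) > 0.1
```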
As one quick application of Möbius transformations, we have
Proposition 12 is simply connected.
Proof: We have to show that any closed curve in is contractible to a point in . By deforming locally into line segments in either of the two standard coordinate charts of we may assume that is the concatenation of finitely many such line segments; in particular, cannot be a space-filling curve (as one can see from e.g. the Baire category theorem) and thus avoids at least one point in . If avoids then it lies in and can thus be contracted to a point in (and hence in ) since is convex. If avoids any other point , then we can apply a Möbius transformation to move to , contract the transformed curve to a point, and then invert the Möbius transform to contract to a point in .
Exercise 13 (Jordan curve theorem in the Riemann sphere) Let be a simple closed curve in the Riemann sphere. Show that the complement of in is the union of two disjoint simply connected open subsets of . (Hint: one first has to exclude the possibility that is space-filling. Do this by verifying that is homeomorphic to the unit circle.)
It turns out that there are no other automorphisms of the Riemann sphere than the Möbius transformations:
Proposition 14 (Automorphisms of Riemann sphere) Let $f: \mathbb{C} \cup \{\infty\} \to \mathbb{C} \cup \{\infty\}$ be a complex diffeomorphism. Then $f$ is a Möbius transformation.
Proof: By Lemma 10 and composing $f$ with a Möbius transformation, we may assume without loss of generality that $f$ fixes $0, 1, \infty$. From Exercise 19 of Notes 4 we know that $f$ is a rational function $f = P/Q$ (with all singularities removed); we may reduce terms so that $P, Q$ have no common factors. Since $f$ is bijective and fixes $\infty$, it has no poles in $\mathbb{C}$, and hence $Q$ can have no roots; by the fundamental theorem of algebra, this makes $Q$ constant. Similarly, $f$ has no zeroes other than $0$, and so $P$ must be a monomial; as $f$ also fixes $1$, it must be of the form $f(z) = z^n$ for some natural number $n$. But this is only injective if $n = 1$, in which case $f$ is clearly a Möbius transformation.
Now we look at holomorphic maps on . There are plenty of holomorphic maps from to ; indeed, these are nothing more than the entire functions, of which there are many (indeed, an entire function is nothing more than a power series with an infinite radius of convergence). There are even more holomorphic maps from to , as these are just the meromorphic functions on . For instance, any ratio of two entire functions, with not identically zero, will be meromorphic on . On the other hand, from Liouville’s theorem (Theorem 28 of Notes 3) we see that the only holomorphic maps from to are the constants. In particular, and are not complex diffeomorphic (despite the fact that they are diffeomorphic over the reals, as can be seen for instance by using the projection ).
The affine maps $z \mapsto az+b$ with $a \in \mathbb{C} \setminus \{0\}$ and $b \in \mathbb{C}$ are clearly complex automorphisms on $\mathbb{C}$. In analogy with Proposition 14, these turn out to be the only automorphisms:
Proposition 15 (Automorphisms of complex plane) Let $f: \mathbb{C} \to \mathbb{C}$ be a complex diffeomorphism. Then $f$ is an affine transformation $f(z) = az+b$ for some $a \in \mathbb{C} \setminus \{0\}$ and $b \in \mathbb{C}$.
Proof: By the open mapping theorem (Theorem 37 of Notes 4), $f(D(0,1))$ is open, and hence (as $f$ is injective) $f$ avoids the non-empty open set $f(D(0,1))$ on $\mathbb{C} \setminus D(0,1)$. By the Casorati-Weierstrass theorem (Theorem 11 of Notes 4), we conclude that $f$ does not have an essential singularity at infinity. Thus $f$ extends to a holomorphic function from $\mathbb{C} \cup \{\infty\}$ to $\mathbb{C} \cup \{\infty\}$, hence by Exercise 19 of Notes 4 is rational. As the only pole of $f$ is at infinity, $f$ is a polynomial; as $f$ is a diffeomorphism, the derivative $f'$ has no zeroes and is thus constant by the fundamental theorem of algebra. Thus $f$ must be affine, and the claim follows.
Exercise 16 Let be an injective holomorphic map. Show that is a Möbius transformation (restricted to ).
We remark that injective holomorphic maps are often referred to as univalent functions in the literature.
Finally, we consider holomorphic maps on . There are plenty of holomorphic maps from to (indeed, these are just the power series with radius of convergence at least ), and even more holomorphic maps from to (for instance, one can take the quotient of two holomorphic functions with non-zero). There are also many holomorphic maps from to , for instance one can take any bounded holomorphic function and multiply it by a small constant. However, we have the following fundamental estimate concerning such functions, the Schwarz lemma:
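Lemma 17 (Schwarz lemma) Let $f: D(0,1) \to D(0,1)$ be a holomorphic map such that $f(0) = 0$. Then we have $|f(z)| \leq |z|$ for all $z \in D(0,1)$. In particular, $|f'(0)| \leq 1$.
Furthermore, if $|f(z)| = |z|$ for some $z \in D(0,1) \setminus \{0\}$, or if $|f'(0)| = 1$, then there exists a real number $\theta$ such that $f(z) = e^{i\theta} z$ for all $z \in D(0,1)$.
Proof: By the factor theorem (Corollary 22 of Notes 3), we may write $f(z) = z g(z)$ for some holomorphic $g: D(0,1) \to \mathbb{C}$. On any circle $\{ z : |z| = 1 - \varepsilon \}$ with $0 < \varepsilon < 1$, we have $|f(z)| \leq 1$ and hence $|g(z)| \leq \frac{1}{1-\varepsilon}$; by the maximum principle we conclude that $|g(z)| \leq \frac{1}{1-\varepsilon}$ for all $z \in D(0,1-\varepsilon)$. Sending $\varepsilon$ to zero, we conclude that $|g(z)| \leq 1$ for all $z \in D(0,1)$, and hence $|f(z)| \leq |z|$ and $|f'(0)| = |g(0)| \leq 1$.
Finally, if $|f(z)| = |z|$ for some $z \in D(0,1) \setminus \{0\}$ or $|f'(0)| = 1$, then $|g(z)|$ equals $1$ for some $z \in D(0,1)$, and hence by a variant of the maximum principle (see Exercise 18 below) we see that $g$ is constant, giving the claim.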
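To see the Schwarz lemma in action, one can test it on a concrete self-map of the disk. The sketch below uses $f(z) = z \cdot \frac{z-a}{1-\overline{a}z}$ with $a = 0.5 + 0.2i$, an illustrative choice (not taken from the text) of a holomorphic map of the unit disk to itself with $f(0) = 0$. The derivative at the origin is approximated by a difference quotient.

```python
import cmath


def blaschke(a: complex, z: complex) -> complex:
    """Blaschke factor (z - a) / (1 - conj(a) z), which maps the unit disk to itself."""
    return (z - a) / (1 - a.conjugate() * z)


def f(z: complex, a: complex = 0.5 + 0.2j) -> complex:
    """z times a Blaschke factor: a holomorphic self-map of the disk fixing 0."""
    return z * blaschke(a, z)


# Polar grid of sample points in the punctured open unit disk.
samples = [r * cmath.exp(1j * t)
           for r in (0.1, 0.5, 0.9, 0.99)
           for t in (0.0, 1.0, 2.5, 4.0, 5.5)]

# Schwarz lemma: |f(z)| <= |z| on the disk.
assert all(abs(f(z)) <= abs(z) for z in samples)

# |f'(0)| <= 1, with f'(0) approximated by f(h) / h for small h (here f'(0) = -a).
h = 1e-6
assert abs(f(h) / h) <= 1
```

Since this $f$ is not a rotation, both inequalities are strict here, consistent with the equality case of the lemma.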
Exercise 18 (Variant of maximum principle) Let be a connected Riemann surface, and let be a point in .
- (i) If is a harmonic function such that for all , then for all .
- (ii) If is a holomorphic function such that for all , then for all .
(Hint: use Exercise 17 of Notes 3.)
One can think of the Schwarz lemma as follows. Let denote the collection of holomorphic functions with . Inside this collection we have the rotations for defined by . The Schwarz lemma asserts that these rotations “dominate” the remaining functions in in the sense that on , and in particular ; furthermore these inequalities are strict as long as is not one of the .
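As a first application of the Schwarz lemma, we characterise the automorphisms of the disk $D(0,1)$. For any $\alpha \in D(0,1)$, one can check that the Möbius transformation $z \mapsto \frac{z-\alpha}{1-\overline{\alpha}z}$ preserves the boundary of the disk (since $|1-\overline{\alpha}z| = |z-\alpha|$ when $|z| = 1$), and maps the point $\alpha$ to the origin, and thus maps the disk $D(0,1)$ to itself. More generally, for any $\alpha \in D(0,1)$ and $\theta \in \mathbb{R}$, the Möbius transformation $z \mapsto e^{i\theta} \frac{z-\alpha}{1-\overline{\alpha}z}$ is an automorphism of the disk $D(0,1)$. It turns out that these are the only such automorphisms:
Theorem 19 (Automorphisms of disk) Let $f: D(0,1) \to D(0,1)$ be a complex diffeomorphism. Then there exists $\alpha \in D(0,1)$ and $\theta \in \mathbb{R}$ such that $f(z) = e^{i\theta} \frac{z-\alpha}{1-\overline{\alpha}z}$ for all $z \in D(0,1)$. If furthermore $f(0) = 0$, then we can take $\alpha = 0$, thus $f(z) = e^{i\theta} z$ for $z \in D(0,1)$.
Proof: First suppose that $f(0) = 0$. By the Schwarz lemma applied to both $f$ and its inverse $f^{-1}$, we see that $|f'(0)|, |(f^{-1})'(0)| \leq 1$. But by the inverse function theorem (or the chain rule), $(f^{-1})'(0) = 1/f'(0)$, hence $|f'(0)| = 1$. Applying the Schwarz lemma again, we conclude that $f(z) = e^{i\theta} z$ for some $\theta \in \mathbb{R}$, as required.
In the general case, there exists $\alpha \in D(0,1)$ such that $f(\alpha) = 0$. If one then applies the previous analysis to $f \circ g^{-1}$, where $g: D(0,1) \to D(0,1)$ is the automorphism $g(z) := \frac{z-\alpha}{1-\overline{\alpha}z}$, we obtain the claim.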
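The basic properties of these disk automorphisms are easy to check numerically. The sketch below assumes the standard form $z \mapsto e^{i\theta}\frac{z-\alpha}{1-\overline{\alpha}z}$, and an inverse obtained by solving for $z$; the function names are hypothetical.

```python
import cmath


def disk_automorphism(alpha: complex, theta: float, z: complex) -> complex:
    """Assumed automorphism z -> e^{i theta} (z - alpha) / (1 - conj(alpha) z)."""
    return cmath.exp(1j * theta) * (z - alpha) / (1 - alpha.conjugate() * z)


def disk_automorphism_inverse(alpha: complex, theta: float, w: complex) -> complex:
    """Inverse map: undo the rotation, then apply u -> (u + alpha) / (1 + conj(alpha) u)."""
    u = cmath.exp(-1j * theta) * w
    return (u + alpha) / (1 + alpha.conjugate() * u)


alpha, theta = 0.3 + 0.4j, 1.0

# alpha is sent to the origin.
assert abs(disk_automorphism(alpha, theta, alpha)) < 1e-12

# The unit circle is preserved.
for t in (0.0, 0.7, 2.0, 3.5, 5.0):
    z = cmath.exp(1j * t)
    assert abs(abs(disk_automorphism(alpha, theta, z)) - 1) < 1e-12

# Interior points stay in the disk, and the inverse formula round-trips.
for z in (0j, 0.9j, -0.5 + 0.5j):
    w = disk_automorphism(alpha, theta, z)
    assert abs(w) < 1
    assert abs(disk_automorphism_inverse(alpha, theta, w) - z) < 1e-12
```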
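Exercise 20 (Automorphisms of half-plane) Let $f: \mathbf{H} \to \mathbf{H}$ be a complex diffeomorphism from the upper half-plane $\mathbf{H} := \{ z \in \mathbb{C} : \mathrm{Im}(z) > 0 \}$ to itself. Show that there exist real numbers $a,b,c,d$ with $ad-bc = 1$ such that $f(z) = \frac{az+b}{cz+d}$ for $z \in \mathbf{H}$. Conclude that the automorphism group of either $D(0,1)$ or $\mathbf{H}$ is isomorphic as a group to the projective special linear group $PSL_2(\mathbb{R})$ formed by starting with the special linear group $SL_2(\mathbb{R})$ of $2 \times 2$ real matrices of determinant $1$, and then quotienting out by the central subgroup $\{+1, -1\}$.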
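One direction of this exercise rests on the identity $\mathrm{Im} \frac{az+b}{cz+d} = \frac{(ad-bc)\,\mathrm{Im}(z)}{|cz+d|^2}$ for real $a,b,c,d$, which shows that real Möbius transformations of positive determinant preserve the upper half-plane. A minimal numerical sketch of that identity follows; the function name and the sample matrix are illustrative choices.

```python
def real_mobius(a: float, b: float, c: float, d: float, z: complex) -> complex:
    """Apply z -> (a z + b) / (c z + d) with real coefficients."""
    return (a * z + b) / (c * z + d)


a, b, c, d = 2.0, 1.0, 1.0, 1.0   # ad - bc = 1
for z in (1j, -3 + 0.01j, 5 + 2j):
    w = real_mobius(a, b, c, d, z)
    expected_imag = (a * d - b * c) * z.imag / abs(c * z + d) ** 2
    assert w.imag > 0                           # the upper half-plane is preserved
    assert abs(w.imag - expected_imag) < 1e-12  # imaginary-part identity
    # Negating the matrix gives the same map, matching the quotient by {+1, -1}.
    assert abs(real_mobius(-a, -b, -c, -d, z) - w) < 1e-12
```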
Remark 21 Collecting the various assertions above about the holomorphic maps between the elliptic, parabolic, and hyperbolic model Riemann surfaces , , , one arrives at the following rule of thumb: there are “many” holomorphic maps from “more hyperbolic” surfaces to “less hyperbolic” surfaces, but “very few” maps going in the other direction (and also relatively few automorphisms from one space to an “equally hyperbolic” surface). This rule of thumb also turns out to be accurate in the context of compact Riemann surfaces, where “higher genus” becomes the analogue of “more hyperbolic” (and similarly for “less hyperbolic” or “equally hyperbolic”). One can formalise this latter version of the rule of thumb using such results as the Riemann-Hurwitz formula and the de Franchis theorem, but these are beyond the scope of this course.
Exercise 22 Let be a non-constant holomorphic map between Riemann surfaces . If is compact and is connected, show that is surjective and is compact. Conclude in particular that there are no non-constant bounded holomorphic functions on a compact Riemann surface.
— 2. Quotients of the model Riemann surfaces —
The three model Riemann surfaces , , are all simply connected, and the uniformisation theorem will tell us that up to complex diffeomorphism, these are the only simply connected Riemann surfaces that exist. However, it is possible to form non-simply-connected Riemann surfaces from these model surfaces by the procedure of taking quotients, as follows. Let be a Riemann surface, and let be a group of complex automorphisms of . We assume that the action of on is free, which means that the non-identity transformations in have no fixed points (thus for all ). We also assume that the action is proper (viewing as a discrete group), which means that for any compact subset of , there are only finitely many automorphisms in for which intersects . If the action is both free and proper, then we see that every point has a small neighbourhood with the property that the images are all disjoint; by making small enough we can also find a holomorphic coordinate chart to some open subset of . We can then form the quotient manifold of orbits , using the coordinate charts for any defined by setting
for all . One can easily verify that is a Riemann surface, and that the quotient map defined by is a surjective holomorphic map. The ability to easily take quotients is one of the key advantages of the Riemann surface formalism; another is the converse ability to construct covers, such as the universal cover of a Riemann surface, defined in Theorem 25 below.
Exercise 23 Let be a Riemann surface, a group of complex automorphisms of acting in a proper and free fashion, and let be the quotient map. Let be a holomorphic map to another Riemann surface . Show that there exists a holomorphic map such that if and only if for all .
Remark 24 It is also of interest to quotient a Riemann surface by a group of complex automorphisms whose action is not free. In that case, the quotient space need not be a manifold, but is instead a more general object known as an orbifold. A typical example is the modular curve (where is the group of matrices with integer coefficients and determinant ); this is of great importance in analytic number theory. However, we will not study orbifolds in this course.
Since the continuous image of a connected space is always connected, we see that any quotient of a connected Riemann surface is again connected. In the converse direction, one can use this construction to describe a connected Riemann surface as a quotient of a simply connected Riemann surface:
Theorem 25 (Universal cover) Let be a connected Riemann surface. Then there exists a simply connected Riemann surface , and a group of complex automorphisms acting on in a proper and free fashion, such that is complex diffeomorphic to .
Proof: For the sake of brevity we leave some of the details of the construction as exercises.
We use the following abstract construction to build the Riemann surface . Fix a base point in . For any point in , we can form the space of all continuous paths from to for some interval with . We let denote the space of equivalence classes of such paths with respect to the operation of homotopy with fixed endpoints up to reparameterisation; for instance, if was simply connected, then would simply be a point. (One could also omit the reparameterisation by restricting the domain of to be a fixed interval such as .) As is connected, all the are non-empty. We then let be the (disjoint) union of all the . This defines a set together with a projection map that sends all the homotopy classes in to for each ; this is clearly a surjective map.
This defines as a set, but we want to give the structure of a Riemann surface, and thus must create an atlas of coordinate charts. For every , let be a coordinate chart that is a diffeomorphism between some neighbourhood of and the unit disk. Given a homotopy class in and a point in , we can then associate a point in by taking a path from to in the homotopy class , and concatenating it with the path that connects to via a line segment in the disk using the coordinate chart ; the homotopy class of this concatenated path does not depend on the precise choice of and will be denoted . If we let denote all the points obtained in this fashion as varies over , then it is easy to see (Exercise!) that the are disjoint and partition the set . We can then form coordinate charts for each and by setting for all . This defines both a topology on (by declaring a subset of to be open if is open for all ) and a complex structure, as the transition maps are easily verified (Exercise!) to be both continuous and holomorphic (after first shrinking the neighbourhoods and if necessary). By construction we now see that is a covering space of , with the covering map.
Let be the homotopy class of the constant curve at . It is easy to see (Exercise!) that is connected (indeed, any point in determines (more or less tautologically) a family of paths in from to ). Next, we make the stronger claim that is simply connected. It suffices to show that any closed path from to is contractible to a point. Let denote the projected curve , thus is a closed curve from to itself. From the continuity method (Exercise!) we see that for any , the restriction of to lies in the homotopy class of ; in particular, itself lies in the homotopy class of , and is thus homotopic to a point. Another application of the continuity method (Exercise!) then shows that as one continuously deforms to a point, each of the curves obtained in this deformation lifts to a closed curve in from to , in the sense that ; furthermore, varies continuously in , giving the required homotopy from to a point.
Define a deck transformation to be a holomorphic map such that (that is to say, preserves each of the “fibres” of ). Clearly the composition of two deck transformations is again a deck transformation. From Corollary 50 of Notes 4 we see that for any and , there exists a unique deck transformation that maps to . Composing that deck transformation with the deck transformation that maps to we see that all deck transformations are invertible and are thus complex automorphisms. If we let denote the collection of all deck transformations then we see that is a group that acts freely on and transitively on each fibre . For any , and the neighbourhoods as before, one can verify (Exercise!) that each deck transformation in permutes the disjoint open sets covering , and given any two of these sets there is exactly one deck transformation that maps to . From this one can check (Exercise!) that is complex diffeomorphic to as required.
Exercise 26 Write out the steps marked “Exercise!” in the above argument.
The manifold in the above theorem is called a universal cover of , and the group is (a copy of) the fundamental group of . These objects are basically uniquely determined by :
Exercise 27 Suppose one has two simply connected Riemann surfaces and two groups of automorphisms of respectively acting in proper and free fashions. Show that the following statements are equivalent:
- (i) The quotients and are complex diffeomorphic.
- (ii) There exists a complex diffeomorphism and a group isomorphism such that
for all and . In particular and are complex diffeomorphic, and the groups and are isomorphic.
(Hint: use Exercise 23 for one direction of the implication, and Corollary 50 of Notes 4 for the other implication.)
Exercise 28 Let be a connected Riemann surface, and let be a point in . Define the fundamental group based at to be the collection of equivalence classes of closed curves from to , under the relation of homotopy with fixed endpoints up to reparameterisation.
- (i) Show that is indeed a group, with the equivalence class of the constant curves as the identity element, the inverse of a homotopy class of a curve defined as , and the product of two homotopy classes of curves as .
- (ii) If are as in Theorem 25, show that is isomorphic to .
Exercise 29 Show that the fundamental group of is isomorphic to the integers (viewed as an additive group).
If we assume for now the uniformisation theorem, we conclude that every connected Riemann surface is the quotient of one of the three model surfaces , , by a group of complex automorphisms that act freely and properly; depending on which surface is used, we call these Riemann surfaces of elliptic type, parabolic type, and hyperbolic type respectively. We can then study each of the three model types in turn:
Elliptic type: By Proposition 14, the automorphisms of are the Möbius transformations. From the quadratic formula (or the fundamental theorem of algebra) we see that every Möbius transformation has at least one fixed point (for instance, the translations fix ). Thus the only group of complex automorphisms that can act freely on is the trivial group, so the only Riemann surfaces of elliptic type are those that are complex diffeomorphic to the Riemann sphere.
Parabolic type: By Proposition 14, the automorphisms of are the affine transformations . These transformations have fixed points in if , so in order to obtain a free action we must restrict to the translations . Thus we can view as an additive subgroup of , with now being the group quotient; as the action is additive, we can also write as . In order for the action to be proper, must be a discrete subgroup of (every point isolated). We can classify all such subgroups:
Exercise 30 Let be a discrete additive subgroup of . Show that takes on one of the following three forms:
- (i) (Rank zero case) the trivial group ;
- (ii) (Rank one case) a cyclic group for some ; or
- (iii) (Rank two case) a group for some with strictly complex (i.e., not real).
We conclude that every Riemann surface of parabolic type is complex diffeomorphic to a plane , a cylinder for some , or a torus for and strictly complex.
The case of the plane is self-explanatory. Using dilation maps we see that all cylinders are complex diffeomorphic to each other; for instance, they are all diffeomorphic to . The exponential map is -periodic and thus descends to a map from to ; it is easy to see that this map is a complex diffeomorphism, thus the punctured plane can be used as a model for all Riemann surface cylinders.
The case of the tori is more interesting. One can use dilations to normalise one of the to be a specific value such as , but one cannot normalise both:
Exercise 31 Let and be two tori. Show that these two tori are complex diffeomorphic if and only if there exists an element of the special linear group (thus are integers with ) such that
(Hint: lift any such diffeomorphism to a holomorphic map from to of linear growth.)
In contrast to the cylinders , which are complex diffeomorphic to a subset of the complex plane, one cannot model a torus by a subset of ; indeed, if there were a complex diffeomorphism , then would have to be non-empty, compact, and (by the open mapping theorem) open in , which is impossible since is non-compact and connected. However, it is an important fact in algebraic geometry, classical analysis and number theory that these tori can be modeled instead by elliptic curves. The theory of elliptic curves is extremely rich, but is beyond the scope of this course and will not be discussed further here (but the Weierstrass elliptic functions used to construct the complex diffeomorphism between tori and elliptic curves may be covered in subsequent quarters).
Exercise 32 Let be a connected subset of that omits at least two points of . Show that cannot be of elliptic or parabolic type. (Hint: in addition to the open mapping theorem argument given above, one can use either the great Picard theorem, Theorem 56 of Notes 4, or the simpler Casorati-Weierstrass theorem (Theorem 11 of Notes 4).) In particular, assuming the uniformisation theorem, such sets must be of hyperbolic type. (Note this is compatible with our previous intuition that “more hyperbolic” is analogous to “higher genus” or “has more holes”.)
Hyperbolic type: Here it is convenient to model the hyperbolic Riemann surface using the upper half-plane (the Poincaré half-plane model) rather than the disk (the Poincaré disk model). By Exercise 20, a Riemann surface of hyperbolic type is then isomorphic to a quotient of by some subgroup of that acts freely and properly. Properness is easily seen to be equivalent to being a discrete subgroup of (using the topology inherited from the embedding of in ); such groups are known as Fuchsian groups. Freeness can also be described explicitly:
Exercise 33 Show that a subgroup of acts freely on if and only if it avoids all (equivalence classes of) matrices in that are elliptic in the sense that they obey the trace condition .
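As a numerical sanity check on the trace condition (not a substitute for the exercise), the following Python sketch looks for a fixed point in the upper half-plane of the Möbius transformation attached to a real matrix of determinant one. The function name and the entry names a, b, c, d are illustrative choices rather than notation from these notes, and plus or minus the identity matrix is excluded by assumption. A fixed point solves the quadratic c z^2 + (d - a) z - b = 0, whose discriminant is the square of the trace minus 4.

```python
import cmath

def upper_half_plane_fixed_point(a, b, c, d):
    """Fixed point in the open upper half-plane of z -> (a z + b)/(c z + d), or None.

    Assumes a, b, c, d are real with a*d - b*c == 1, and that the matrix
    is not plus or minus the identity.
    """
    trace = a + d
    if trace * trace >= 4:
        # Parabolic or hyperbolic case: fixed points are real or at infinity.
        return None
    # |trace| < 2 forces c != 0, since c == 0 would give d = 1/a and |a + 1/a| >= 2.
    root = cmath.sqrt(trace * trace - 4)      # purely imaginary
    z = ((a - d) + root) / (2 * c)
    # The two roots are complex conjugates; return the one with positive imaginary part.
    return z if z.imag > 0 else z.conjugate()

rotation_fixed_point = upper_half_plane_fixed_point(0, -1, 1, 0)   # z -> -1/z
translation_fixed_point = upper_half_plane_fixed_point(1, 1, 0, 1) # z -> z + 1
```

Here the map z -> -1/z has trace zero and fixes the point i, while the translation z -> z + 1 has trace 2 and has no fixed point in the upper half-plane.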
It turns out that in contrast to the elliptic type and parabolic type situations, there are a very large number of possible subgroups obeying these conditions, and a complete classification of them is basically hopeless. The theory of Fuchsian groups is again very rich, being a foundational topic in hyperbolic geometry, but is again beyond the scope of this course.
Remark 34 The twice-punctured plane must be of hyperbolic type by the uniformisation theorem and Exercise 32. This gives another proof of the little Picard theorem: an entire function that omits (say) the points must then lift (by Corollary 50 of Notes 4) to a holomorphic map from to , which must then be constant by Liouville’s theorem. A more complicated argument along these lines also proves the great Picard theorem. It turns out that the covering map from to can be described explicitly using the theory of elliptic functions (and specifically the modular lambda function), but this is beyond the scope of this course.
Exercise 35 (Schottky’s theorem) Show that any annulus is of hyperbolic type, and is in fact complex diffeomorphic to for some cyclic group of dilations. (Hint: first use the complex exponential to cover the annulus by a strip, then use Exercise 7.)
Exercise 36 Let and be two annuli with and . Show that and are complex diffeomorphic if and only if . (Hint: one can either argue by lifting to the half-plane using the previous exercise, or else using the Schwarz reflection principle (adapted to circles in place of lines) repeatedly to extend a holomorphic map from to to a holomorphic map from a punctured disk to a punctured disk; one can also combine the methods by taking logarithms to lift , to strips, and then using the original Schwarz reflection principle.)
Exercise 37
- (i) Show that the punctured disk is of hyperbolic type, and is complex diffeomorphic to , where acts on by translations.
- (ii) Show that the Joukowsky transform is a complex diffeomorphism from to the slitted extended complex plane . Conclude that is also complex diffeomorphic to .
— 3. The Riemann mapping theorem —
We are now ready to prove the Riemann mapping theorem, Theorem 4, using an argument of Koebe. To motivate the argument, let us rephrase the Schwarz lemma in the following form:
Lemma 38 (Schwarz lemma, again) Let be a Riemann surface, and let be a point in . Let denote the collection of holomorphic functions with . If contains an element that is a complex diffeomorphism, then for all ; if is a subset of the complex plane , we also have . Furthermore, in either of these two inequalities, equality holds if and only if for some real number .
Proof: Apply Lemma 17 to the map .
This lemma suggests the following strategy to prove the Riemann mapping theorem: starting with the open subset of the complex plane , pick a point in that subset, and form the collection of holomorphic maps that map to , and locate an element of this collection for which the magnitude is maximal. If the Riemann mapping theorem were true, then Lemma 38 would ensure that this would be a complex diffeomorphism, and we would be done.
It turns out to be convenient to work with the somewhat smaller collection of injective holomorphic maps (also known as univalent functions from to ). We first observe that this collection is non-empty for the sets of interest:
Proposition 39 Let be a simply connected subset of that is not all of . Then there exists an injective holomorphic map .
Proof: By applying a translation to , we may assume that avoids the origin . If in fact avoided a disk , then we could use the map to map injectively into the disk . At present, need not avoid any disk (e.g. could be the complex plane with the negative axis removed). However, as is simply connected and avoids , we can argue as in Section 4 of Notes 4 to obtain a holomorphic branch of the square root function, that is to say a holomorphic map such that for all . As is injective, must also be injective; it is also clearly non-constant, so from the open mapping theorem is open and thus contains some disk . But if lies in then cannot lie in since this would make the map non-injective; thus avoids a disk , and the claim follows.
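The square root trick can be tested numerically on the example mentioned in the proof, the complex plane with the negative axis removed. In the sketch below (the function name and the sampling box are assumptions made for illustration), the principal branch of the square root takes values in the right half-plane, which avoids the disk of radius 1 centred at -1, so composing with w -> 1/(w + 1) lands in the unit disk.

```python
import cmath
import random

def into_unit_disk(z):
    """Map a point of the plane with the ray (-inf, 0] removed into the open unit disk."""
    g = cmath.sqrt(z)      # principal branch: real part > 0 away from the slit
    return 1 / (g + 1)     # g stays outside the disk D(-1, 1), so |g + 1| > 1

random.seed(0)
samples = [complex(random.uniform(-50, 50), random.uniform(-50, 50))
           for _ in range(1000)]
images = [into_unit_disk(z) for z in samples]
largest_modulus = max(abs(w) for w in images)

# Injectivity is visible from the explicit inverse w -> (1/w - 1)^2.
recovered = [(1 / w - 1) ** 2 for w in images]
worst_error = max(abs(r - z) for r, z in zip(recovered, samples))
```

Every sampled image has modulus below one, and each sample is recovered from its image, consistent with the map being injective into the disk.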
If is a point in , then the map constructed by the above proposition need not map to the origin, but this is easily fixed by composing with a suitable automorphism of . To prove the Riemann mapping theorem, it will thus suffice to show
Proposition 40 Let be a simply connected Riemann surface, let be a point in , and let be the collection of injective holomorphic maps with . If is non-empty, then is complex diffeomorphic to .
Proof: By identifying with its image under one of the elements of , we may assume without loss of generality that is itself an open subset of , with .
Define the quantity
As contains the identity map, is at least ; from the Cauchy inequalities (Corollary 27 of Notes 3) we see that is finite. Hence there exists a sequence in with converging to as . From Montel’s theorem (Exercise 57(i) of Notes 4) we know that is a normal family, so on passing to a subsequence we may assume that the converge locally uniformly to some limit . By Hurwitz’s theorem (Exercise 41 of Notes 4), the limit is holomorphic and is either injective or constant. But from the higher order Cauchy integral formula (Theorem 25 of Notes 3), converges to , hence and so cannot be constant, and is thus injective. From the maximum principle (Exercise 18), we know that takes values in , and not just .
To conclude the proposition, we need to show that is also surjective. Here we use a variant of the argument used to prove Proposition 39. Suppose for contradiction that avoids some point in . Let be an automorphism of that sends to , then avoids the origin. As is simply connected, we can thus find a holomorphic square root of , thus
Since and hence are injective, is also. Finally, if is an automorphism of that sends to , then the map lies in . The map is related to by the formula
where is the squaring map. Observe that the map is a holomorphic map from to that maps to , and is not a rotation map (since is not a Möbius transformation). Thus by the Schwarz lemma (Lemma 17), we have
and hence by the chain rule
But this contradicts the definition of , and we are done.
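To see the size of the gain in this argument, one can compute the derivative at the origin of the comparison map built from the squaring map and the two automorphisms. The sketch below uses names chosen here for illustration: a is the omitted point, phi is the automorphism of the disk sending a to 0, and psi is the automorphism sending a square root of phi(0) = -a to 0. Under these assumptions a short calculation gives the value 2 sqrt(|a|) / (1 + |a|) for the modulus of that derivative, which is strictly less than one, so the modified map has derivative larger by the reciprocal factor.

```python
import cmath
import math

def disk_automorphism(c):
    """Automorphism w -> (w - c)/(1 - conj(c) w) of the unit disk; it sends c to 0."""
    return lambda w: (w - c) / (1 - c.conjugate() * w)

def comparison_map(a):
    """The composition inverse(phi), after squaring, after inverse(psi)."""
    b = cmath.sqrt(-a)                    # a square root of phi(0) = -a
    phi_inverse = disk_automorphism(-a)   # inverse of the automorphism sending a to 0
    psi_inverse = disk_automorphism(-b)   # inverse of the automorphism sending b to 0
    return lambda w: phi_inverse(psi_inverse(w) ** 2)

def derivative_at_zero(h, step=1e-6):
    """Central difference approximation to h'(0)."""
    return (h(step) - h(-step)) / (2 * step)

a = 0.3 + 0.4j                            # sample omitted point, |a| = 0.5
big_phi = comparison_map(a)
numeric = abs(derivative_at_zero(big_phi))
closed_form = 2 * math.sqrt(abs(a)) / (1 + abs(a))
```

For this sample point the two values agree to many digits and are about 0.94, so the new map gains a factor of roughly 1.06 in the derivative at the base point.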
Exercise 41 Let be an open connected non-empty subset of . Show that the following are equivalent:
- (i) is simply connected.
- (ii) One has for every holomorphic function and every closed curve in .
- (iii) For every holomorphic function there exists a holomorphic with .
- (iv) For every holomorphic function there exists a holomorphic with .
- (v) The complement of in the Riemann sphere is connected.
(Hint: to relate (v) to the other claims, use Exercise 43 from Notes 4.)
— 4. Schwarz-Christoffel mappings —
The Riemann mapping theorem guarantees the existence of complex diffeomorphisms for any simply connected subset of the complex plane that is not all of ; in particular, such diffeomorphisms exist if is a polygon, by which we mean the interior region of a simple closed anticlockwise polygonal path . However, the proof of the Riemann mapping theorem does not supply an easy way to compute what this map is. Nevertheless, in the case of polygons a reasonably explicit formula for (or more precisely, for the derivative of the inverse of ) may be found. Our arguments here are based on those in the text of Ahlfors.
We set up some notation. Let be a simple closed anticlockwise polygonal path (in particular, the are all distinct), let be the polygon enclosed by this path, and let be a complex diffeomorphism, the existence of which is guaranteed by the Riemann mapping theorem. (The map is only unique up to composition by an automorphism of , but this will not concern us for the present analysis.) We adopt the convention that and , and for , we let denote the counterclockwise angle subtended by the polygon at (normalised by a factor of ), in the sense that
for some real . (Note that cannot attain the values or as this would cause the polygonal path to be non-simple.) It is also convenient to introduce the normalised exterior angle by (thus is positive at a convex angle of the polygon, zero at a straight angle, and negative at a concave angle), so that
Telescoping this identity, we conclude that must be an even integer. Indeed, from Euclidean geometry we know that the exterior angles of a polygon add up to , so that
we will give an analytic proof of this fact presently.
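The exterior angle identity is easy to test on a concrete polygon. The following sketch (the function name and the L-shaped example are illustrative assumptions) computes each exterior angle, normalised by a factor of pi, as the argument of the ratio of the outgoing edge to the incoming edge, and sums the results.

```python
import cmath
import math

def normalised_exterior_angles(vertices):
    """Exterior angles (divided by pi) of a simple closed anticlockwise polygonal path."""
    n = len(vertices)
    angles = []
    for i in range(n):
        incoming = vertices[i] - vertices[i - 1]
        outgoing = vertices[(i + 1) % n] - vertices[i]
        # The turning angle lies strictly between -pi and pi for a simple path.
        angles.append(cmath.phase(outgoing / incoming) / math.pi)
    return angles

# An L-shaped hexagon with one concave vertex at 1 + 1j.
l_shape = [0j, 2 + 0j, 2 + 1j, 1 + 1j, 1 + 2j, 2j]
angles = normalised_exterior_angles(l_shape)
total = sum(angles)
```

Five of the vertices are convex with normalised exterior angle 1/2, the concave vertex contributes -1/2, and the total is 2, in agreement with the Euclidean count.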
From the Alexander numbering rule (Exercise 55 of Notes 3) we see that always lies to the left of the polygonal path . We can formalise this statement as follows. First suppose that is a non-vertex boundary point of , thus for some and . Then we can form the affine map by the formula
and the numbering rule tells us that for small enough, the half-disk
is mapped holomorphically by into . If is instead a vertex of , the situation is a little trickier; we now define the map by the formula
where we choose the branch of with branch cut at the negative imaginary axis and to be positive real on the positive real axis. Then again will map holomorphically into for small enough. (The reader is encouraged to draw a picture to understand these maps.)
Now we perform some local analysis near the boundary. We first need a version of the Schwarz reflection principle (Exercise 37 of Notes 3) for harmonic functions.
Exercise 42 (Dirichlet problem) Let be a continuous function. Show that there exists a unique function that is continuous on the closed disk , harmonic on the open disk , and equal to on the boundary . Furthermore, show that is given by the formula
for , where is the Poisson kernel
(compare with Exercise 17 of Notes 3).
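For a quick numerical illustration of the Poisson formula, one can integrate boundary data against the kernel and compare with a known harmonic function. The sketch below assumes the standard closed form (1 - r^2) / (1 - 2 r cos(theta) + r^2) for the Poisson kernel and the normalisation by 1/(2 pi); the function names and the test data cos(2 theta), whose harmonic extension is r^2 cos(2 alpha), are choices made here.

```python
import math

def poisson_kernel(r, theta):
    """Standard Poisson kernel for the unit disk, 0 <= r < 1."""
    return (1 - r * r) / (1 - 2 * r * math.cos(theta) + r * r)

def poisson_integral(boundary_values, r, alpha, n=2000):
    """Approximate (1/2pi) * integral of P_r(alpha - theta) f(theta) over [0, 2pi]."""
    total = 0.0
    for k in range(n):
        theta = 2 * math.pi * k / n
        total += poisson_kernel(r, alpha - theta) * boundary_values(theta)
    # The equally spaced rule has weight 2pi/n, which cancels the 1/(2pi).
    return total / n

r, alpha = 0.5, 0.3
approximation = poisson_integral(lambda theta: math.cos(2 * theta), r, alpha)
exact = r ** 2 * math.cos(2 * alpha)      # real part of z^2 at z = r e^{i alpha}
```

The equally spaced rule converges very quickly here because the integrand is smooth and periodic.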
Lemma 43 (Schwarz reflection for harmonic functions) Let be an open subset of symmetric around the real axis, and let be a continuous function on the region that vanishes on and is harmonic in . Let be the antisymmetric extension of , defined by setting and for . Then is harmonic.
Proof: Morally speaking, this lemma follows from the analogous reflection principle for holomorphic functions, but there is a difficulty because we do not have enough regularity on the real axis to easily build a harmonic conjugate that is continuous all the way to the real axis. Instead we shall rely on the maximum principle as follows.
It is clear that is continuous and harmonic away from the real axis, so it suffices to show for any and any small that is harmonic on .
Using Exercise 42, we can find a continuous function which agrees with on the boundary and is harmonic on the interior. From the antisymmetry of and uniqueness (or the Poisson kernel formula) we see that is also antisymmetric and thus vanishes on the real axis. The difference is then harmonic on the half-disks and and vanishes on the boundary of these half-disks, so by the maximum principle they vanish everywhere in . Thus agrees with on and is therefore harmonic on this disk as required.
Proposition 44 Let be a boundary point of (which may or may not be a vertex). Then for small enough, the maps extend holomorphically to a map from to which maps the origin to a point on the unit circle. Furthermore, this map is injective for small enough.
Proof: For any , the preimage of the closed disk is a compact subset of and thus stays a positive distance away from the boundary of . In particular, for sufficiently close to the boundary of , must exceed . We conclude that the function extends continuously to a map from to , by declaring the map to equal on the boundary. In particular, for small enough, the map also extends continuously to , and equals on the real boundary of . For small enough, avoids zero on this region, and so the function will extend continuously to , and vanish on the real portion of the boundary. By taking local branches of we see that this function is also harmonic. By Lemma 43, extends harmonically to , and on taking harmonic conjugates we conclude that extends holomorphically to . Taking exponentials, we obtain a holomorphic extension of to , with . To prove injectivity, it suffices (shrinking as necessary) to show that the derivative of at is non-zero. But if this were not the case, then would have a zero of order at least two, which by the factor theorem implies that would not map to a half-plane bordering the origin, and in particular cannot map to , a contradiction.
As a corollary, we see that extends to a continuous map that maps to , and around every point in the boundary of , maps a small neighbourhood of in to a small neighbourhood of in . As is injective on , this implies that is also injective on the boundary of . The image is compact in and contains , hence is in fact a bijective continuous map between compact Hausdorff spaces and is thus a homeomorphism. Thus we can form an inverse map , which maps holomorphically to . (This latter claim in fact works if one replaces the polygonal path by an arbitrary simple closed curve; this is a theorem of Carathéodory.)
Consider the function on the line segment from to . By Proposition 44, is smooth on this line segment, has non-zero derivative, and takes values in ; setting , we see that must traverse a simple curve from to in . As is orientation preserving, lies to the left of the line segment , and the disk lies to the left of traversed anticlockwise, we see that must traverse the anticlockwise arc from to . Following all around , we see that must be arranged anticlockwise in the unit circle in the sense that we have for all for some
Inverting, we see that for any , smoothly maps the anticlockwise arc from to to the line segment from to , with derivative nonvanishing. Thus on taking arguments
Next, we study near (and near ) for some . From Proposition 44 we see that in a sufficiently small neighbourhood of in , one has for some injective holomorphic map from a neighbourhood of in to a neighbourhood of in that maps to zero. Since maps the arc from to to the line segment from to , must map the portion of the arc from to near to a portion of the positive real axis; in particular, by the chain rule, is a positive real, call it . If we factor
noting that the third factor is close to one and the second factor lies in the upper half-plane, we have
and hence from we have the factorisation
for near in , for some that is holomorphic and non-zero in a neighbourhood of in . Differentiating using , we conclude that
for near in , for some that is also holomorphic and non-zero in a neighbourhood of in .
The function is holomorphic and non-vanishing; as is simply connected, we must therefore have for some holomorphic (by Exercise 46 of Notes 4). For any between and , we see from the previous discussion that extends holomorphically to a neighbourhood of , with non-vanishing at , so extends also. From (3) we see that the argument of is constant on the interval , and hence
is also constant on this interval. Meanwhile, from (4) we see that for near in , we have
for some holomorphic in a neighbourhood of in , where is a branch of the complex logarithm with branch cut at . From this we see that the function has a jump discontinuity with jump as crosses . As this function clearly increases by when increases by , we conclude the geometric identity (2).
Now consider the modified function defined by
Then is holomorphic on , and by the above analysis it extends continuously to . We consider the imaginary part at ,
where is a branch of the argument function with branch cut at . Writing , we see that is constant as long as is not an integer multiple of . From this, (5), and (2), we see that the function is constant on each arc . Thus the function is harmonic on , continuous on , and constant on the boundary , so by the maximum principle it is constant, which from the Cauchy-Riemann equations makes constant also. Thus we have
on for some complex constant , which on exponentiating gives
on for some non-zero complex constant . Applying the fundamental theorem of calculus, we obtain the Schwarz-Christoffel formula:
Theorem 45 (Schwarz-Christoffel for the disk) Let be a closed simple anticlockwise polygonal path, and define the exterior angles as above. Let be the polygon enclosed by this path, and let be a complex diffeomorphism. Then there exist phases , for some , a non-zero complex number , and a complex number such that
for all , where the integral is over an arbitrary curve from to , and one selects a branch of with branch cut on the negative imaginary axis . Furthermore, converges to as approaches for every .
Note that one can change the branches of here, and also modify the normalising factors , by adjusting the constant in a suitable fashion, as long as one does not move the branch cut for into the disk ; one can similarly change the initial point of the curve to any other point in by adjusting . By taking log-derivatives in (6), we can also express the Schwarz-Christoffel formula equivalently as a partial fractions decomposition of :
The Schwarz-Christoffel formula does not completely describe the conformal mappings from to the disk, because it does not specify exactly what the phases and the complex constants are. As the group of automorphisms of has three degrees of freedom (one real parameter and one complex parameter ), one can for instance fix three of the phases , but in general there are no simple formulae to then reconstruct the remaining parameters in the Schwarz-Christoffel formula, although numerical algorithms exist to compute them approximately. (In the case when the polygon is a rectangle, though, the Schwarz-Christoffel formula essentially produces an elliptic integral, and the complex diffeomorphisms from the rectangle to the disk or half-space are closely tied to elliptic functions; see Section 4.5 of Stein-Shakarchi for more discussion.)
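In the most symmetric case, a square, the parameters can be guessed by symmetry and the formula checked numerically. The sketch below is a sanity check under stated assumptions rather than a general solver: it takes the four phases to be the fourth roots of unity, each normalised exterior angle to be 1/2, and the two constants to be 1 and 0, so that (up to the choice of constants) the integrand becomes (1 - w^4)^(-1/2). The function name, the radial integration path, and the substitution t = 1 - s^2 used to tame the endpoint singularity are choices made here.

```python
import cmath
import math

def disk_to_square(z, n=20000):
    """Approximate the integral from 0 to z of (1 - w^4)^(-1/2) dw along a radius.

    Writes w = z*t with 0 <= t <= 1, then substitutes t = 1 - s^2 so that the
    integrand stays bounded even when z is a fourth root of unity.
    """
    total = 0j
    for k in range(n):
        s = (k + 0.5) / n                  # midpoint rule in the variable s
        t = 1 - s * s
        # 1 - (z t)^4 has non-negative real part, so the principal branch is continuous.
        total += 2 * s * (1 - (z * t) ** 4) ** -0.5
    return z * total / n

corner = disk_to_square(1 + 0j)            # image of the boundary point 1
next_corner = disk_to_square(1j)           # image of the boundary point i
side_point = disk_to_square(cmath.exp(1j * math.pi / 4))
```

The images of 1 and i differ by a rotation through a right angle, and the image of the boundary point halfway between them is the midpoint of the segment joining the two corners, as expected if the unit circle is mapped onto the boundary of a square. The corner value is approximately 1.3110, half the lemniscate constant, in line with the remark that such integrals are elliptic integrals.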
Exercise 46 (Schwarz-Christoffel in a half-space) Let be a closed simple anticlockwise polygonal path, and define the exterior angles as above. Let be the polygon enclosed by this path, and let be a complex diffeomorphism from the upper half-plane to .
- (i) Show that extends to a homeomorphism from the closure of the upper half-plane in the Riemann sphere to , and that all lie on .
- (ii) If all of the are finite, show that after a cyclic permutation one has , and that there exists a non-zero complex number , and a complex number such that
for all , where the integral is over any curve from to .
- (iii) If one of the is infinite, show after a cyclic permutation that one has and , and there exists a non-zero complex number , and a complex number such that
for all .
Remark 47 One could try to apply the Schwarz-Christoffel formula to a closed polygonal path that is not simple. In such cases (and after choosing the parameters correctly), what tends to happen is that the map still maps the circle to the closed path, but fails to be injective.
Exercise 48 Let be a complex diffeomorphism from the half-strip to the upper half-plane , which extends to a continuous map to the closures of , in the Riemann sphere. Suppose that maps to respectively. Show that , where we take the branch of the square root that is positive on the real axis and has a branch cut at . (Hint: is not quite a polygon, so one cannot directly apply the Schwarz-Christoffel formula; however the proof of that formula will still apply.)
— 5. The uniformisation theorem (optional) —
Now we discuss a proof of the uniformisation theorem, Theorem 5, following the approach in these notes of Marshall. Unfortunately the argument is rather complicated, and we will only give a portion of the proof here. One of the many difficulties in trying to prove this theorem is the fact that the conclusion is a disjunction of three alternatives, each with a rather different complex geometry; it would be easier if there were only one target geometry that one was trying to impose on the Riemann surface . To begin separating the three geometries from each other, recall from Liouville’s theorem that there are no non-constant bounded holomorphic functions on or , but plenty of non-constant bounded holomorphic functions on . By Lemma 1, the same claims hold for Riemann surfaces that are complex diffeomorphic to or or to respectively. Note that without loss of generality we may normalise “bounded” by replacing it with “mapping into ”. From this we see that the uniformisation theorem can be broken up into two simpler pieces:
Theorem 49 (Uniformisation theorem, hyperbolic case) Let be a simply connected Riemann surface that admits a non-constant holomorphic map from to . Then is complex diffeomorphic to .
Theorem 50 (Uniformisation theorem, non-hyperbolic case) Let be a simply connected Riemann surface that does not admit a non-constant holomorphic map from to . Then is complex diffeomorphic to or .
Let us now focus on the hyperbolic case of the uniformisation theorem, Theorem 49. Now we do not have the disjunction problem as there is only one target geometry to impose on ; we will be able to give a complete proof of this theorem here (in contrast to Theorem 50, where we will only give part of the proof). Let be a point in , and recall that denotes the collection of holomorphic maps that map to . By hypothesis (and applying a suitable automorphism of ), contains at least one non-constant map. If Theorem 49 were true, then from Lemma 38 we see that would contain a “maximal” element which would exhibit the desired complex diffeomorphism between and .
It turns out that the converse statement is true: if we can locate “maximal” elements of with certain properties, then we can prove Theorem 49. More precisely, Theorem 49 can be readily deduced from the following claim.
Theorem 51 (Maximal maps into ) Let be a simply connected Riemann surface, let be a point in , and let be the collection of holomorphic maps from to that map to . Suppose that contains a non-constant map. Then contains a map with the property that for all , with equality only if for some real number . Furthermore has a simple zero at , and no other zeroes.
We have seen how Theorem 49 implies Theorem 51. Let us now demonstrate the converse implication, assuming Theorem 51 for the moment and deriving Theorem 49. Let be a simply connected Riemann surface that admits non-constant holomorphic maps from to , and pick a point in . By applying a suitable automorphism of we see that has a non-constant map, so by Theorem 51 this collection contains an element with the stated properties. If were injective, then we could apply Proposition 40 to conclude that and were complex diffeomorphic, so suppose for contradiction that was not injective. Since has a zero only at , we thus have for some distinct . Let be the automorphism
that maps to and to , then the function lies in and also has a zero at . From Theorem 51, we thus have
since vanishes, we thus have from the definition of that
Swapping the roles of and gives the reverse inequality, thus we in fact have
Applying Theorem 51 again, we conclude that
for some . But has a zero at while cannot have any zeroes other than at , a contradiction.
Remark 52 We only established that was injective in the above argument, but by inspecting the proof of Proposition 40 and using the maximality properties of we see that is also surjective, and thus supplies the required complex diffeomorphism between and . In a similar vein, the arguments in the preceding section show that under the hypotheses of Theorem 49, there exists a surjective map from to , but one needs something like Theorem 51 to obtain the crucial additional property of injectivity (which was automatic in the preceding section, since one already started with an injection in hand).
To finish off the hyperbolic case of the uniformisation theorem, it remains to prove Theorem 51. It is convenient to work with harmonic functions instead of holomorphic functions. Observe that if were holomorphic with a simple zero at but no other zeroes, then we have local holomorphic branches of on small neighbourhoods of any point in . Taking real parts, we conclude that the function is harmonic on the punctured surface ; it is also positive since takes values in . Furthermore, the function has a logarithmic singularity at in the following sense: if was any coordinate chart on some neighbourhood of that mapped to , then as had a simple zero at , the function , defined on , stays bounded as one approaches .
Conversely, one can reconstruct from (up to a harmless phase ) by the following lemma.
Lemma 53 (Reconstructing a holomorphic function from its magnitude) Let be a simply connected Riemann surface, let be a point in , and let be harmonic. Suppose that has a logarithmic singularity at in the sense that is bounded near for some coordinate chart on a neighbourhood of that maps to . Then there exists a holomorphic function with a simple zero at and no other zeroes, such that on .
Proof: Let be as above. Call a function on an open subset of good if it is holomorphic with on (in particular this forces to be non-zero away from ), and has a simple zero at if lies in . Clearly it will suffice to find a good function on all of .
We first solve the local problem, showing that for any there exists a neighbourhood of that supports a good function . If , we can work in a chart avoiding which is diffeomorphic to a disk . If we identify with then restricted to can be viewed as a harmonic function on . As this disk is simply connected, will have a harmonic conjugate and is thus the real part of a holomorphic function on this disk. Taking to be we obtain the required good function. Now suppose instead that . Using the coordinate chart to identify with , we now have a harmonic function with bounded near zero. Applying Exercise 59 of Notes 4, we conclude that extends to a holomorphic function on , which is then the real part of a holomorphic function ; taking then gives a good function on .
Next, we make the following compatibility observation: if and are both good functions, then is constant on every connected component of (after removing any singularity at ). Indeed, by construction is holomorphic and of magnitude one, so locally there are holomorphic branches of that have vanishing real part, hence locally constant imaginary part by the Cauchy-Riemann equations. Hence is locally constant as claimed.
Now we need to glue together the local good functions into a global good function. This is a “monodromy problem”, which can be solved using analytic continuation and the simply connected nature of by the following “monodromy theorem” argument. Let us pick a good function on some neighbourhood of . Given any other point in , we can form a path from to . We claim that for any , we can find a finite sequence and good functions for such that each contains , and such that and agree on a neighbourhood of for each , and and also agree on a neighbourhood of . The set of such is easily seen to be an open non-empty subset of . Now we claim that it is closed. Suppose that converges to a limit as . If any of the are greater than or equal to it is easy to see that , so suppose instead that the are all less than . We take a good function supported on some neighbourhood of . By continuity, will contain for some sufficiently large . We would like to append and to the sequence of good functions that one obtains from the hypothesis , but there is the issue that need not agree with at the endpoint . However, they only differ by a constant of magnitude one near this endpoint, so after multiplying by an appropriate constant of magnitude one, we can conclude that as claimed.
By the continuity method, is all of , and in particular contains . Thus we can find and good functions for such that each contains , and such that and agree on a neighbourhood of for each , and and also agree on a neighbourhood of . Consider the final value obtained by the last good function at the endpoint of the curve . From analytic continuation and a continuity argument we see that if we perform a homotopy of with fixed endpoints, this final value does not change (even if the number of good functions may vary). Thus we can define a function by setting whenever is a path from to and is the final good function constructed by the above procedure. From construction we see that is locally equal to a good function at every point in , and is thus itself a good function, as required.
Exercise 54 (Monodromy theorem) Let be a simply connected Riemann surface, let be another Riemann surface, let be a point in , let be an open neighbourhood, and let be holomorphic. Prove that the following statements are equivalent.
- (i) has a holomorphic extension to ; that is to say, there is a holomorphic function whose restriction to is equal to .
- (ii) For every curve starting at , we can find and holomorphic functions for with , such that and agree on a neighbourhood for each .
Furthermore, if (i) holds, show that the holomorphic extension is unique. Give a counterexample that shows that the monodromy theorem fails if is only assumed to be connected rather than simply connected.
We remark that while the condition (ii) in the monodromy theorem looks somewhat complicated, it becomes more geometrically natural if one adopts the language of sheaves, which we will not do here.
In view of Lemma 53, we may reduce the task of establishing Theorem 51 to that of establishing the existence of a special type of harmonic function on (with one point removed), namely a Green’s function:
Definition 55 (Green’s function) Let be a connected Riemann surface, and let be a point in . A Green’s function for at is a function with the following properties:
- (i) is harmonic on .
- (ii) is non-negative on .
- (iii) has a logarithmic singularity at in the sense that is bounded near for some coordinate chart that maps to .
- (iv) is minimal with respect to the properties (i)-(iii), in the sense that for any other obeying (i)-(iii), we have pointwise in .
Clearly if a Green’s function for at exists, it is unique by property (iv), so we can talk about the Green’s function for at , if it exists. In the case of the disk , a Green’s function may be explicitly computed:
Exercise 56 If , show that the function defined by is a Green’s function for at .
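As a numerical sanity check of Exercise 56, here is a minimal Python sketch. The explicit formula is not reproduced in the text above, so the sketch assumes the classical expression g(z) = log|1 - conj(z0) z| - log|z - z0| for the unit disk; the names `green_disk` and `discrete_laplacian` and the sample point z0 = 0.3 + 0.4i are hypothetical choices made for illustration. It only tests properties (i)-(iii) of Definition 55 at a few sample points and says nothing about the minimality property (iv).

```python
import cmath
import math

def green_disk(z, z0):
    """Assumed Green's function of the unit disk at z0 (classical formula)."""
    return math.log(abs(1 - z0.conjugate() * z)) - math.log(abs(z - z0))

def discrete_laplacian(f, z, h=1e-3):
    """Five-point finite-difference approximation to the Laplacian of f at z."""
    return (f(z + h) + f(z - h) + f(z + 1j * h) + f(z - 1j * h) - 4 * f(z)) / h**2

z0 = 0.3 + 0.4j                      # hypothetical base point inside the disk
g = lambda z: green_disk(z, z0)

# (i) harmonicity away from z0: the discrete Laplacian should be close to zero.
lap = discrete_laplacian(g, -0.2 + 0.1j)

# (ii) non-negativity: g vanishes on the unit circle and is positive inside.
boundary_values = [g(cmath.exp(2j * math.pi * k / 12)) for k in range(12)]
interior_values = [g(0.9 * cmath.exp(2j * math.pi * k / 12)) for k in range(12)]

# (iii) logarithmic singularity: g(z) + log|z - z0| stays bounded as z -> z0.
near = [g(z0 + 10.0 ** (-k)) + math.log(10.0 ** (-k)) for k in range(1, 8)]
```

Under the assumed formula the values in `near` approach log(1 - |z0|^2), which is one way to see that the singularity at z0 is exactly logarithmic.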
Theorem 51 may now be deduced from the following claim.
Proposition 57 (Existence of Green’s function) Let be a connected Riemann surface, let be a point in , and suppose that the collection of holomorphic maps that map to contains at least one non-constant map. Then the Green’s function for at exists. Furthermore, for any , one has for any .
(Note that in this proposition we no longer need to be simply connected.) Indeed, suppose that Proposition 57 held. Let be a simply connected Riemann surface, and let with containing a non-constant map. By hypothesis, the Green’s function is non-negative on . Noting that remains connected if we remove a small disk around , and from (iii) that will be strictly positive on the boundary of that disk, we observe from the maximum principle (Exercise 18) and (ii) that is in fact strictly positive on . By Lemma 53 we can find a holomorphic function with a simple zero at and no other zeroes, such that on . As is strictly positive, takes values in and is thus in . From Proposition 57 we see that for all . If equality occurs anywhere, then the quotient (after removing the singularity) is a function taking values in the closed unit disk , which has magnitude at ; by the maximum principle we then have for some real . Thus obeys all the properties required for Theorem 51.
It remains to obtain the existence of the Green’s function . To do this, we use a powerful technique for constructing harmonic functions, known as Perron’s method of subharmonic functions. The basic idea is to build a harmonic function by taking a suitable large family of subharmonic functions and then forming their supremum. We first give a definition of subharmonic function.
Definition 58 (Subharmonic function) Let be a Riemann surface. A subharmonic function on is an upper semi-continuous function obeying the following upper maximum principle: for any compact set in and any function that is continuous on and harmonic on the interior of , if for all , then for all .
A superharmonic function is similarly defined as a lower semi-continuous function such that for any compact and any function continuous on and harmonic on the interior of , the bound for implies that for all .
Clearly subharmonicity and superharmonicity are conformal invariants in the sense that the analogue of Lemma 1 holds for these concepts. We have the following elementary properties of subharmonic functions and superharmonic functions:
Exercise 59 Let be a Riemann surface.
- (i) Show that a function is subharmonic if and only if is superharmonic.
- (ii) Show that a function is harmonic if and only if it is both subharmonic and superharmonic.
- (iii) If are subharmonic, show that is also.
- (iv) Let , and let be an open subset of . Show that the restriction of to is subharmonic.
- (v) (Subharmonicity is a local property) Conversely, let , and suppose that for each there is a neighbourhood of such that the restriction of to is subharmonic. Show that is itself subharmonic. (Hint: If is continuous on a compact set and harmonic on the interior, and attains a maximum at an interior point of , show that is constant in some neighbourhood of that point.)
- (vi) (Maximum principle) Let be subharmonic, let be superharmonic, and let be a compact subset of such that for all . Show that for all . (This is a similar argument to (v).)
- (vii) Show that the sum of two subharmonic functions is again subharmonic (using the usual conventions on adding to itself or to another real number).
- (viii) (Harmonic patching) Let be subharmonic, let be compact, and let be a continuous function on that is harmonic on the interior of and agrees with on the boundary of . Show that the function , defined to equal on and on , is subharmonic.
- (ix) Let be a holomorphic function. Show that is subharmonic, with the convention that . (Hint: first use the maximum principle and harmonic conjugates to show that if contains a copy of a closed disk , and on the boundary of this disk for some continuous that is harmonic in the interior of the disk, then in the interior of the disk also.)
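Part (ix) can be probed numerically. The sketch below does not use the upper maximum principle of Definition 58 directly; instead it tests the sub-mean-value inequality on circles, which is a standard consequence of subharmonicity. The polynomial `f`, the helper names and the chosen circles are hypothetical and serve only as an illustration.

```python
import cmath
import math

def circle_mean(u, center, radius, samples=4096):
    """Average of u over the circle |z - center| = radius (equally spaced samples)."""
    total = 0.0
    for k in range(samples):
        theta = 2 * math.pi * (k + 0.5) / samples
        total += u(center + radius * cmath.exp(1j * theta))
    return total / samples

def log_abs(f):
    """Return the function z -> log|f(z)|, with the convention log 0 = -infinity."""
    def u(z):
        m = abs(f(z))
        return math.log(m) if m > 0 else -math.inf
    return u

f = lambda z: (z - 0.5) * (z + 0.25j)    # hypothetical holomorphic example
u = log_abs(f)

# Both zeroes of f lie inside the unit circle, so the circle average of log|f| is 0,
# which is strictly larger than the centre value log|f(0)| = log(0.125).
center_value = u(0)
mean_value = circle_mean(u, 0, 1.0)

# At a zero of f the centre value is -infinity, so the inequality holds trivially.
value_at_zero = u(0.5)
mean_around_zero = circle_mean(u, 0.5, 0.1)
```

The strict gap between `center_value` and `mean_value` reflects the zeroes of f inside the circle; for a zero-free f the function log|f| is harmonic and the two quantities would agree.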
For smooth functions on an open subset of , one can express the property of subharmonicity quite explicitly:
Exercise 60 Let be an open subset of , and let be continuously twice (Fréchet) differentiable. Show that the following are equivalent:
- (i) is subharmonic.
- (ii) For all closed disks in , one has
- (iii) One has for all .
Show that the equivalence of (i) and (ii) in fact holds even if is only assumed to be continuous rather than continuously twice differentiable.
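To make Exercise 60 concrete, here is a small Python check. The displayed conditions are not reproduced in the text above, so the sketch assumes that (ii) is the sub-mean-value inequality on circles and that (iii) is non-negativity of the Laplacian; the test function, sample point and helper names are hypothetical.

```python
import cmath
import math

def circle_mean(u, center, radius, samples=720):
    """Average of u over the circle |z - center| = radius (equally spaced samples)."""
    total = 0.0
    for k in range(samples):
        total += u(center + radius * cmath.exp(2j * math.pi * k / samples))
    return total / samples

def laplacian(u, z, h=1e-3):
    """Five-point finite-difference approximation to the Laplacian of u at z."""
    return (u(z + h) + u(z - h) + u(z + 1j * h) + u(z - 1j * h) - 4 * u(z)) / h**2

# Hypothetical smooth example u(x + iy) = x^2 - y^2 / 2, whose Laplacian is 2 - 1 = 1.
u = lambda z: z.real**2 - 0.5 * z.imag**2
z0, r = 0.3 - 0.2j, 0.5

lap = laplacian(u, z0)                   # assumed condition (iii): Laplacian >= 0
gap = circle_mean(u, z0, r) - u(z0)      # assumed condition (ii): circle mean >= centre value
```

For a quadratic polynomial the gap equals r^2/4 times the Laplacian exactly, so here it is 0.0625. Replacing u by -u flips both signs, and the sub-mean-value inequality then fails.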
However, we will not use the above exercise in our analysis here as it will not be convenient to impose a hypothesis of continuous twice differentiability on our subharmonic functions.
The Perron method is based on the observation that under certain conditions, the supremum of a family of subharmonic functions is not just subharmonic (as per Exercise 59(iii)), but is in fact harmonic. A key concept here is that of a Perron family:
Definition 61 Let be a Riemann surface. A continuous Perron family on is a family of continuous subharmonic functions with the following properties:
- (i) If , then .
- (ii) (Harmonic patching) If , is a compact subset of , and is a continuous function that is harmonic in the interior of and equals on the boundary of , then the function defined to equal on and outside of also lies in .
One can also consider more general Perron families of subharmonic functions that are merely upper semi-continuous rather than continuous, but for the current application continuous Perron families will suffice.
The fundamental theorem that powers the Perron method is then
Theorem 62 (Perron method) Let be a continuous Perron family on a connected Riemann surface , and set to be the function (note that cannot equal thanks to axiom (iii) of a Perron family). Then one of the following two statements holds:
- (i) for all .
- (ii) is a harmonic function on .
Proof: Let us first work locally in some open subset of that is complex diffeomorphic to a disk ; to simplify the discussion we abuse notation by identifying with in the following discussion.
Assume for the moment that is not identically equal to on . Let be an arbitrary point in (viewed as a subset of ). Then we can find a sequence such that as .
We can use Exercise 42 to find a continuous function that equals on the boundary of this disk (viewed as a subset of ) and is harmonic and at least as large as in the interior; if we then let be the function defined to equal on and outside of this disk, then is larger than and also lies in thanks to axiom (ii). Thus, by replacing with , we may assume that is harmonic on . Next, by replacing with and using axiom (i), we may assume that pointwise; replacing with a harmonic function on as before we may assume that is harmonic on . Continuing in this fashion we may assume that and that are harmonic on . Form the function , then we have pointwise with . By the Harnack principle (Exercise 58 of Notes 4), we thus see that is either harmonic on , or equal to on . The latter cannot occur since we are assuming not identically equal to , thus is harmonic.
Now let be another point in . We can find another sequence with . As before we may assume that the are increasing and are harmonic on ; we may also assume that pointwise. Setting , we conclude that is harmonic with on . In particular . The harmonic function is non-negative on and vanishes at , hence is identically zero on by the maximum principle. Since , we conclude that and agree at . Since was an arbitrary point on , we conclude that is harmonic at .
Putting all this together, we see that for any point in there is a neighbourhood (corresponding to the disk in the above arguments) with the property that is either equal to on , or is harmonic on . By a continuity argument we conclude that one of the two options (i), (ii) of the theorem must hold.
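The mechanism of the proof (repeated harmonic patching producing an increasing sequence whose limit is harmonic) has a simple discrete analogue, sketched below in Python. This is not the construction used in the proof: it assumes a square grid, interprets “harmonic” as “equal to the average of the four neighbours”, and patches one grid point at a time. The grid size, the target function and the starting function are hypothetical choices.

```python
def harmonic_patch_sweep(u, n):
    """One sweep of single-node harmonic patching on an (n+1) x (n+1) grid.

    Each interior value is replaced in turn by the average of its four neighbours,
    a discrete stand-in for replacing u on a small disk by the harmonic function
    with the same boundary values.
    """
    for i in range(1, n):
        for j in range(1, n):
            u[i][j] = 0.25 * (u[i - 1][j] + u[i + 1][j] + u[i][j - 1] + u[i][j + 1])

n = 8
# The discretely harmonic function h(x, y) = x^2 - y^2 supplies the boundary data.
h = [[float(i * i - j * j) for j in range(n + 1)] for i in range(n + 1)]
# Start from h - x(n - x) y(n - y): discretely subharmonic, equal to h on the boundary.
u = [[h[i][j] - i * (n - i) * j * (n - j) for j in range(n + 1)] for i in range(n + 1)]

monotone = True
for _ in range(300):
    before = [row[:] for row in u]
    harmonic_patch_sweep(u, n)
    # Patching a subharmonic function should never decrease it (up to rounding).
    monotone = monotone and all(
        u[i][j] >= before[i][j] - 1e-9 for i in range(n + 1) for j in range(n + 1)
    )

# The increasing limit should be the discretely harmonic function h itself.
max_error = max(abs(u[i][j] - h[i][j]) for i in range(n + 1) for j in range(n + 1))
```

In this toy setting the supremum of the patched functions is finite because everything is bounded above by h, which plays the role of the second alternative in Theorem 62.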
Now we can conclude the proof of Proposition 57, and hence the hyperbolic case of the uniformisation theorem, by applying the above theorem to a well-chosen Perron family. Let be a connected Riemann surface, and let be the collection of all continuous subharmonic functions that vanish outside of a compact subset of , and which have a logarithmic singularity at in the sense that is bounded near for some coordinate chart that takes to (note that the precise choice of chart here is irrelevant). This collection is non-empty, for it contains the function that equals (say) on , and zero elsewhere (this follows from the observation that is harmonic away from the origin, and is harmonic everywhere, as well as the various properties in Exercise 59). From Exercise 59 we see that is a Perron family; thus, by Theorem 62, the function is either harmonic on , or is infinite everywhere. Using the element of used above we see that is non-negative.
Let be an arbitrary element of . By Exercise 59(ix), is subharmonic, hence is superharmonic and also non-negative since takes values in ; as vanishes at , has at least a logarithmic singularity at in the sense that is bounded from below near . If , then vanishes outside of a compact set , hence outside of for any . As has a logarithmic singularity at we also have in a sufficiently small neighbourhood of . Applying the maximum principle (Exercise 59(vi)) we conclude that on all of ; sending to zero and then taking suprema in we conclude that
or equivalently
pointwise on . In particular, since contains at least one non-constant map, cannot be infinite everywhere and must therefore be harmonic.
Similarly, if is a function obeying the properties (i)-(iii) of a Green’s function, and , then another application of the maximum principle shows that on for any ; sending and taking suprema in we see that pointwise.
The only remaining task is to show that has a logarithmic singularity at . Certainly it has at least this much of a singularity, in that is bounded from below near , as can be seen by comparing to any element of . To get the upper bound, observe that for any and , the function is subharmonic on and diverges to at , and is hence in fact subharmonic on all of . In particular, for in the disk , we have from the maximum principle that
and hence on taking suprema in and limits in
The right-hand side is finite, and this gives the required upper bound to complete the proof that has a logarithmic singularity at . This concludes the proof of Proposition 57 and hence Theorem 49.
Before we turn to the non-hyperbolic case of the uniformisation theorem, we record a symmetry property of the Green’s functions that is used to establish that case:
Proposition 63 (Symmetry of Green’s functions) Let be a connected Riemann surface, and suppose that the Green’s functions exist for all . Then for all distinct , we have .
When is simply connected, this symmetry can be deduced from (7). For that are not simply connected, the argument is trickier, requiring one to pass to a universal cover of , establish the existence of Green’s functions on , and find an identity relating the Green’s functions on with the Green’s functions on . For details see Marshall’s notes.
Now we can turn to the non-hyperbolic case of the uniformisation theorem, Theorem 50. Here we do not have any Green’s functions, or any non-constant bounded holomorphic functions. However, note that all three of the model Riemann surfaces , and still have plenty of meromorphic functions: in particular, for any two distinct points in , one can find a holomorphic function that has a simple zero at , a simple pole at , and no other zeroes and poles, namely ; one can think of this function with a zero-pole pair as a “dipole”. Similarly if one works on the domain or rather than . From this we see that Theorem 50 would imply the following claim:
Theorem 64 (Existence of dipoles) Let be a simply connected Riemann surface. Let be distinct points in . Then there exists a holomorphic map that has a simple zero at , a simple pole at , and no other zeroes and poles. Furthermore, outside of a compact set containing , the function can be chosen to be bounded away from both and (that is, there exists such that for all ).
In the converse direction, we can use Theorem 64 to recover Theorem 50 in a manner analogous to how Theorem 51 implies Theorem 49. Indeed, let be a simply connected Riemann surface without non-constant holomorphic maps from to . Given any three distinct points in , we consider the dipoles and . The function
has removable singularities at and at , no poles, and is also bounded outside of a compact set. Thus this function extends to a bounded holomorphic function on . Since does not have any non-constant bounded holomorphic functions, the function (8) must be constant, thus for some complex numbers ; as is non-constant, must be non-zero. Since vanishes only at , we conclude that for any . Since also has its only zero at and its only pole at , we conclude that is injective. By Exercise 40 of Notes 4, is thus a complex diffeomorphism from to an open subset of , which of course is simply connected since is. If is all of then we are in the elliptic case and we are done. If omits at least one point in then by applying a Möbius transform is complex diffeomorphic to a simply connected open subset of ; by the Riemann mapping theorem, we conclude that is either complex diffeomorphic to or to . The latter case cannot occur by hypothesis, and we are done.
It remains to prove Theorem 64. As before, we convert the problem to one of finding a specific harmonic function. More precisely, one can derive Theorem 64 from
Theorem 65 (Existence of dipole Green’s functions) Let be a connected Riemann surface. Let be distinct points in , and let and be coordinate charts on disjoint neighbourhoods of respectively, which map and respectively to . Then there exists a harmonic function such that is bounded near , and is bounded near . Furthermore, is bounded outside of a compact subset of .
In the case , one can take the dipole Green’s function to be the function for an arbitrary constant .
Exercise 66 Adapt the proof of Lemma 53 to show that Theorem 65 implies Theorem 64 (and hence Theorem 50).
We still need to prove Theorem 65. If admitted Green’s functions for every point , we could simply take to be the difference . Unfortunately, as we are in the non-hyperbolic case, is not expected to have Green’s functions, and it does not appear possible to construct the dipole Green’s functions directly from Perron’s method due to the indefinite sign of these functions. However, it turns out that if one removes a small disk from of some small radius in a given coordinate chart, then the resulting Riemann surface will admit Green’s functions , and by considering limits of the sequence as using a version of Montel’s theorem one will be able to obtain the required dipole Green’s function, after first making heavy use of the maximum principle (and an important variant of that principle known as Harnack’s inequality, see Exercise 68 below) to obtain some locally uniform control on the difference in . To obtain this locally uniform control, the symmetry property in Proposition 63 is key, as it allows one to write
so that the main challenge is to show that the differences and are bounded uniformly in , which can be done from the maximum principle and the Harnack inequality. The details are unfortunately a little complicated, and we refer the reader to Marshall’s notes for the complete argument.
To close this section we give a quick corollary to the uniformisation theorem, namely Rado’s theorem on the topology of Riemann surfaces:
Corollary 67 (Rado’s theorem) Every connected Riemann surface is second countable and separable.
Proof: By passing to the universal cover, it suffices to verify this claim for simply connected Riemann surfaces. But the three model surfaces , , are clearly second countable and separable, so the claim follows from the uniformisation theorem.
It is remarkably difficult to prove this theorem directly, without going through the uniformisation theorem. (As just one indication of the difficulty of this theorem, the analogue of Rado’s theorem for complex manifolds in two and higher dimensions is known to be false.)
Exercise 68 (Harnack inequality) Let be a non-negative continuous function on a closed disk that is harmonic on the interior of the disk. Show that for every and , one has
(Hint: use Exercise 42.)
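For readers who want to experiment with Exercise 68, the following Python sketch checks the inequality numerically. The displayed inequality is not reproduced in the text above, so the sketch assumes the classical form (R - r)/(R + r) u(z0) <= u(z0 + r e^{iθ}) <= (R + r)/(R - r) u(z0) for 0 <= r < R; the test function and the helper name `harnack_bounds` are hypothetical.

```python
import cmath
import math

def u(z, a=1.05):
    """Hypothetical test function Re((a + z)/(a - z)): harmonic and positive for |z| < a."""
    return ((a + z) / (a - z)).real

def harnack_bounds(center_value, r, R):
    """Lower and upper bounds in the assumed classical form of Harnack's inequality."""
    return (R - r) / (R + r) * center_value, (R + r) / (R - r) * center_value

R = 1.0                     # closed unit disk centred at the origin
violations = 0
for r in [0.1 * k for k in range(10)]:
    lower, upper = harnack_bounds(u(0), r, R)
    for k in range(360):
        value = u(r * cmath.exp(2j * math.pi * k / 360))
        if not (lower - 1e-12 <= value <= upper + 1e-12):
            violations += 1
```

The test function has its pole at a = 1.05, just outside the closed disk, so the values along the positive real axis come fairly close to the upper bound without exceeding it.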
has very rapidly decaying coefficients (of order ), leading to an infinite radius of convergence; also, as the series converges to , the series decays very rapidly as approaches . The problem is whether this is essentially the only example of this type. More precisely:
Problem 1 Let be a bounded sequence of real numbers, and suppose that the power series
(which has an infinite radius of convergence) decays like as , in the sense that the function remains bounded as . Must the sequence be of the form for some constant ?
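To get a feel for Problem 1, one can experiment numerically. The formulas are not reproduced in the text above, so the Python sketch below rests on assumptions: that the power series has the form sum a_n x^n / n!, that the model example is a_n = (-1)^n (so that the series sums to e^{-x}), and that the decay condition is boundedness of e^x times the series as x grows. The function name `exp_series` and the range of x are hypothetical. The sketch compares the model sequence with one other bounded sequence and of course does not answer the problem.

```python
import math

def exp_series(a, x, terms=200):
    """Partial sum of sum_n a(n) x^n / n!, accumulated with math.fsum."""
    term = 1.0                      # holds x^n / n!, updated iteratively
    values = []
    for n in range(terms):
        values.append(a(n) * term)
        term *= x / (n + 1)
    return math.fsum(values)

alternating = lambda n: (-1) ** n   # assumed model sequence a_n = (-1)^n
constant = lambda n: 1              # another bounded sequence, for contrast

xs = [0.5 * k for k in range(21)]   # sample points 0 <= x <= 10
ratio_alt = [math.exp(x) * exp_series(alternating, x) for x in xs]
ratio_const = [math.exp(x) * exp_series(constant, x) for x in xs]
```

In this experiment `ratio_alt` stays close to 1, while `ratio_const` grows like e^{2x}. The range of x is kept small because the alternating sum suffers from cancellation in floating point for large x.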
As it turns out, the problem has a very nice solution using complex analysis methods, which by coincidence I happen to be teaching right now. I am therefore posing as a challenge to my complex analysis students and to other readers of this blog to answer the above problem by complex methods; feel free to post solutions in the comments below (and in particular, if you don’t want to be spoiled, you should probably refrain from reading the comments). In fact, the only way I know how to solve this problem currently is by complex methods; I would be interested in seeing a purely real-variable solution that is not simply a thinly disguised version of a complex-variable argument.
(To be fair to my students, the complex variable argument does require one additional tool that is not directly covered in my notes. That tool can be found here.)