You are currently browsing the category archive for the ‘Mathematics’ category.

The self-chosen remit of my blog is “Updates on my research and expository papers, discussion of open problems, and other maths-related topics”.  Of the 774 posts on this blog, I estimate that about 99% of the posts indeed relate to mathematics, mathematicians, or the administration of this mathematical blog, and only about 1% are not related to mathematics or the community of mathematicians in any significant fashion.

This is not one of the 1%.

Mathematical research is clearly an international activity.  But actually a stronger claim is true: mathematical research is a transnational activity, in that the specific nationality of individual members of a research team or research community are (or should be) of no appreciable significance for the purpose of advancing mathematics.  For instance, even during the height of the Cold War, there was no movement in (say) the United States to boycott Soviet mathematicians or theorems, or to only use results from Western literature (though the latter did sometimes happen by default, due to the limited avenues of information exchange between East and West, and former did occasionally occur for political reasons, most notably with the Soviet Union preventing Gregory Margulis from traveling to receive his Fields Medal in 1978 EDIT: and also Sergei Novikov in 1970).    The national origin of even the most fundamental components of mathematics, whether it be the geometry (γεωμετρία) of the ancient Greeks, the algebra (الجبر) of the Islamic world, or the Hindu-Arabic numerals 0,1,\dots,9, are primarily of historical interest, and have only a negligible impact on the worldwide adoption of these mathematical tools. While it is true that individual mathematicians or research teams sometimes compete with each other to be the first to solve some desired problem, and that a citizen could take pride in the mathematical achievements of researchers from their country, one did not see any significant state-sponsored “space races” in which it was deemed in the national interest that a particular result ought to be proven by “our” mathematicians and not “theirs”.   Mathematical research ability is highly non-fungible, and the value added by foreign students and faculty to a mathematics department cannot be completely replaced by an equivalent amount of domestic students and faculty, no matter how large and well educated the country (though a state can certainly work at the margins to encourage and support more domestic mathematicians).  It is no coincidence that all of the top mathematics department worldwide actively recruit the best mathematicians regardless of national origin, and often retain immigration counsel to assist with situations in which these mathematicians come from a country that is currently politically disfavoured by their own.

Of course, mathematicians cannot ignore the political realities of the modern international order altogether.  Anyone who has organised an international conference or program knows that there will inevitably be visa issues to resolve because the host country makes it particularly difficult for certain nationals to attend the event.  I myself, like many other academics working long-term in the United States, have certainly experienced my own share of immigration bureaucracy, starting with various glitches in the renewal or application of my J-1 and O-1 visas, then to the lengthy vetting process for acquiring permanent residency (or “green card”) status, and finally to becoming naturalised as a US citizen (retaining dual citizenship with Australia).  Nevertheless, while the process could be slow and frustrating, there was at least an order to it.  The rules of the game were complicated, but were known in advance, and did not abruptly change in the middle of playing it (save in truly exceptional situations, such as the days after the September 11 terrorist attacks).  One just had to study the relevant visa regulations (or hire an immigration lawyer to do so), fill out the paperwork and submit to the relevant background checks, and remain in good standing until the application was approved in order to study, work, or participate in a mathematical activity held in another country.  On rare occasion, some senior university administrator may have had to contact a high-ranking government official to approve some particularly complicated application, but for the most part one could work through normal channels in order to ensure for instance that the majority of participants of a conference could actually be physically present at that conference, or that an excellent mathematician hired by unanimous consent by a mathematics department could in fact legally work in that department.

With the recent and highly publicised executive order on immigration, many of these fundamental assumptions have been seriously damaged, if not destroyed altogether.  Even if the order was withdrawn immediately, there is no longer an assurance, even for nationals not initially impacted by that order, that some similar abrupt and major change in the rules for entry to the United States could not occur, for instance for a visitor who has already gone through the lengthy visa application process and background checks, secured the appropriate visa, and is already in flight to the country.  This is already affecting upcoming or ongoing mathematical conferences or programs in the US, with many international speakers (including those from countries not directly affected by the order) now cancelling their visit, either in protest or in concern about their ability to freely enter and leave the country.  Even some conferences outside the US are affected, as some mathematicians currently in the US with a valid visa or even permanent residency are uncertain if they could ever return back to their place of work if they left the country to attend a meeting.  In the slightly longer term, it is likely that the ability of elite US institutions to attract the best students and faculty will be seriously impacted.  Again, the losses would be strongest regarding candidates that were nationals of the countries affected by the current executive order, but I fear that many other mathematicians from other countries would now be much more concerned about entering and living in the US than they would have previously.

It is still possible for this sort of long-term damage to the mathematical community (both within the US and abroad) to be reversed or at least contained, but at present there is a real risk of the damage becoming permanent.  To prevent this, it seems insufficient for me for the current order to be rescinded, as desirable as that would be; some further legislative or judicial action would be needed to begin restoring enough trust in the stability of the US immigration and visa system that the international travel that is so necessary to modern mathematical research becomes “just” a bureaucratic headache again.

Of course, the impact of this executive order is far, far broader than just its effect on mathematicians and mathematical research.  But there are countless other venues on the internet and elsewhere to discuss these other aspects (or politics in general).  (For instance, discussion of the qualifications, or lack thereof, of the current US president can be carried out at this previous post.) I would therefore like to open this post to readers to discuss the effects or potential effects of this order on the mathematical community; I particularly encourage mathematicians who have been personally affected by this order to share their experiences.  As per the rules of the blog, I request that “the discussions are kept constructive, polite, and at least tangentially relevant to the topic at hand”.

Some relevant links (please feel free to suggest more, either through comments or by email):

I’ve just uploaded to the arXiv my paper “Some remarks on the lonely runner conjecture“, submitted to Contributions to discrete mathematics. I had blogged about the lonely runner conjecture in this previous blog post, and I returned to the problem recently to see if I could obtain anything further. The results obtained were more modest than I had hoped, but they did at least seem to indicate a potential strategy to make further progress on the problem, and also highlight some of the difficulties of the problem.

One can rephrase the lonely runner conjecture as the following covering problem. Given any integer “velocity” {v} and radius {0 < \delta < 1/2}, define the Bohr set {B(v,\delta)} to be the subset of the unit circle {{\bf R}/{\bf Z}} given by the formula

\displaystyle B(v,\delta) := \{ t \in {\bf R}/{\bf Z}: \|vt\| \leq \delta \},

where {\|x\|} denotes the distance of {x} to the nearest integer. Thus, for {v} positive, {B(v,\delta)} is simply the union of the {v} intervals {[\frac{a-\delta}{v}, \frac{a+\delta}{v}]} for {a=0,\dots,v-1}, projected onto the unit circle {{\bf R}/{\bf Z}}; in the language of the usual formulation of the lonely runner conjecture, {B(v,\delta)} represents those times in which a runner moving at speed {v} returns to within {\delta} of his or her starting position. For any non-zero integers {v_1,\dots,v_n}, let {\delta(v_1,\dots,v_n)} be the smallest radius {\delta} such that the {n} Bohr sets {B(v_1,\delta),\dots,B(v_n,\delta)} cover the unit circle:

\displaystyle {\bf R}/{\bf Z} = \bigcup_{i=1}^n B(v_i,\delta). \ \ \ \ \ (1)

 

Then define {\delta_n} to be the smallest value of {\delta(v_1,\dots,v_n)}, as {v_1,\dots,v_n} ranges over tuples of distinct non-zero integers. The Dirichlet approximation theorem quickly gives that

\displaystyle \delta(1,\dots,n) = \frac{1}{n+1}

and hence

\displaystyle \delta_n \leq \frac{1}{n+1}

for any {n \geq 1}. The lonely runner conjecture is equivalent to the assertion that this bound is in fact optimal:

Conjecture 1 (Lonely runner conjecture) For any {n \geq 1}, one has {\delta_n = \frac{1}{n+1}}.

This conjecture is currently known for {n \leq 6} (see this paper of Barajas and Serra), but remains open for higher {n}.

It is natural to try to attack the problem by establishing lower bounds on the quantity {\delta_n}. We have the following “trivial” bound, that gets within a factor of two of the conjecture:

Proposition 2 (Trivial bound) For any {n \geq 1}, one has {\delta_n \geq \frac{1}{2n}}.

Proof: It is not difficult to see that for any non-zero velocity {v} and any {0 < \delta < 1/2}, the Bohr set {B(v,\delta)} has Lebesgue measure {m(B(v,\delta)) = 2\delta}. In particular, by the union bound

\displaystyle m(\bigcup_{i=1}^n B(v_i,\delta)) \leq \sum_{i=1}^n m(B(v_i,\delta)) \ \ \ \ \ (2)

 

we see that the covering (1) is only possible if {1 \leq 2 n \delta}, giving the claim. \Box

So, in some sense, all the difficulty is coming from the need to improve upon the trivial union bound (2) by a factor of two.

Despite the crudeness of the union bound (2), it has proven surprisingly hard to make substantial improvements on the trivial bound {\delta_n \geq \frac{1}{2n}}. In 1994, Chen obtained the slight improvement

\displaystyle \delta_n \geq \frac{1}{2n - 1 + \frac{1}{2n-3}}

which was improved a little by Chen and Cusick in 1999 to

\displaystyle \delta_n \geq \frac{1}{2n-3}

when {2n-3} was prime. In a recent paper of Perarnau and Serra, the bound

\displaystyle \delta_n \geq \frac{1}{2n-2+o(1)}

was obtained for arbitrary {n}. These bounds only improve upon the trivial bound by a multiplicative factor of {1+O(1/n)}. Heuristically, one reason for this is as follows. The union bound (2) would of course be sharp if the Bohr sets {B(v_i,\delta)} were all disjoint. Strictly speaking, such disjointness is not possible, because all the Bohr sets {B(v_i,\delta)} have to contain the origin as an interior point. However, it is possible to come up with a large number of Bohr sets {B(v_i,\delta)} which are almost disjoint. For instance, suppose that we had velocities {v_1,\dots,v_s} that were all prime numbers between {n/4} and {n/2}, and that {\delta} was equal to {\delta_n} (and in particular was between {1/2n} and {1/(n+1)}. Then each set {B(v_i,\delta)} can be split into a “kernel” interval {[-\frac{\delta}{v_i}, \frac{\delta}{v_i}]}, together with the “petal” intervals {\bigcup_{a=1}^{v_i-1} [\frac{a-\delta}{v_i}, \frac{a+\delta}{v_i}]}. Roughly speaking, as the prime {v_i} varies, the kernel interval stays more or less fixed, but the petal intervals range over disjoint sets, and from this it is not difficult to show that

\displaystyle m(\bigcup_{i=1}^s B(v_i,\delta)) = (1-O(\frac{1}{n})) \sum_{i=1}^s m(B(v_i,\delta)),

so that the union bound is within a multiplicative factor of {1+O(\frac{1}{n})} of the truth in this case.

This does not imply that {\delta_n} is within a multiplicative factor of {1+O(1/n)} of {\frac{1}{2n}}, though, because there are not enough primes between {n/4} and {n/2} to assign to {n} distinct velocities; indeed, by the prime number theorem, there are only about {\frac{n}{4\log n}} such velocities that could be assigned to a prime. So, while the union bound could be close to tight for up to {\asymp n/\log n} Bohr sets, the above counterexamples don’t exclude improvements to the union bound for larger collections of Bohr sets. Following this train of thought, I was able to obtain a logarithmic improvement to previous lower bounds:

Theorem 3 For sufficiently large {n}, one has {\delta_n \geq \frac{1}{2n} + \frac{c \log n}{n^2 (\log\log n)^2}} for some absolute constant {c>0}.

The factors of {\log\log n} in the denominator are for technical reasons and might perhaps be removable by a more careful argument. However it seems difficult to adapt the methods to improve the {\log n} in the numerator, basically because of the obstruction provided by the near-counterexample discussed above.

Roughly speaking, the idea of the proof of this theorem is as follows. If we have the covering (1) for {\delta} very close to {1/2n}, then the multiplicity function {\sum_{i=1}^n 1_{B(v_i,\delta)}} will then be mostly equal to {1}, but occasionally be larger than {1}. On the other hand, one can compute that the {L^2} norm of this multiplicity function is significantly larger than {1} (in fact it is at least {(3/2-o(1))^{1/2}}). Because of this, the {L^3} norm must be very large, which means that the triple intersections {B(v_i,\delta) \cap B(v_j,\delta) \cap B(v_k,\delta)} must be quite large for many triples {(i,j,k)}. Using some basic Fourier analysis and additive combinatorics, one can deduce from this that the velocities {v_1,\dots,v_n} must have a large structured component, in the sense that there exists an arithmetic progression of length {\asymp n} that contains {\asymp n} of these velocities. For simplicity let us take the arithmetic progression to be {\{1,\dots,n\}}, thus {\asymp n} of the velocities {v_1,\dots,v_n} lie in {\{1,\dots,n\}}. In particular, from the prime number theorem, most of these velocities will not be prime, and will in fact likely have a “medium-sized” prime factor (in the precise form of the argument, “medium-sized” is defined to be “between {\log^{10} n} and {n^{1/10}}“). Using these medium-sized prime factors, one can show that many of the {B(v_i,\delta)} will have quite a large overlap with many of the other {B(v_j,\delta)}, and this can be used after some elementary arguments to obtain a more noticeable improvement on the union bound (2) than was obtained previously.

A modification of the above argument also allows for the improved estimate

\displaystyle \delta(v_1,\dots,v_n) \geq \frac{1+c-o(1)}{2n} \ \ \ \ \ (3)

 

if one knows that all of the velocities {v_1,\dots,v_n} are of size {O(n)}.

In my previous blog post, I showed that in order to prove the lonely runner conjecture, it suffices to do so under the additional assumption that all of the velocities {v_1,\dots,v_n} are of size {O(n^{O(n^2)})}; I reproduce this argument (slightly cleaned up for publication) in the current preprint. There is unfortunately a huge gap between {O(n)} and {O(n^{O(n^2)})}, so the above bound (3) does not immediately give any new bounds for {\delta_n}. However, one could perhaps try to start attacking the lonely runner conjecture by increasing the range {O(n)} for which one has good results, and by decreasing the range {O(n^{O(n^2)})} that one can reduce to. For instance, in the current preprint I give an elementary argument (using a certain amount of case-checking) that shows that the lonely runner bound

\displaystyle \delta(v_1,\dots,v_n) \geq \frac{1}{n+1} \ \ \ \ \ (4)

 

holds if all the velocities {v_1,\dots,v_n} are assumed to lie between {1} and {1.2 n}. This upper threshold of {1.2 n} is only a tiny improvement over the trivial threshold of {n}, but it seems to be an interesting sub-problem of the lonely runner conjecture to increase this threshold further. One key target would be to get up to {2n}, as there are actually a number of {n}-tuples {(v_1,\dots,v_n)} in this range for which (4) holds with equality. The Dirichlet approximation theorem of course gives the tuple {(1,2,\dots,n)}, but there is also the double {(2,4,\dots,2n)} of this tuple, and furthermore there is an additional construction of Goddyn and Wong that gives some further examples such as {(1,2,3,4,5,7,12)}, or more generally one can start with the standard tuple {(1,\dots,n)} and accelerate one of the velocities {v} to {2v}; this turns out to work as long as {v} shares a common factor with every integer between {n-v+1} and {2n-2v+1}. There are a few more examples of this type in the paper of Goddyn and Wong, but all of them can be placed in an arithmetic progression of length {O(n \log n)} at most, so if one were very optimistic, one could perhaps envision a strategy in which the upper bound of {O(n^{O(n^2)})} mentioned earlier was reduced all the way to something like {O( n \log n )}, and then a separate argument deployed to treat this remaining case, perhaps isolating the constructions of Goddyn and Wong (and possible variants thereof) as the only extreme cases.

I just learned (from Emmanuel Kowalski’s blog) that the AMS has just started a repository of open-access mathematics lecture notes.  There are only a few such sets of notes there at present, but hopefully it will grow in the future; I just submitted some old lecture notes of mine from an undergraduate linear algebra course I taught in 2002 (with some updating of format and fixing of various typos).

 

[Update, Dec 22: my own notes are now on the repository.]

I’ve just uploaded to the arXiv my paper “An integration approach to the Toeplitz square peg problem“, submitted to Forum of Mathematics, Sigma. This paper resulted from my attempts recently to solve the Toeplitz square peg problem (also known as the inscribed square problem):

Conjecture 1 (Toeplitz square peg problem) Let {\gamma} be a simple closed curve in the plane. Is it necessarily the case that {\gamma} contains four vertices of a square?

See this recent survey of Matschke in the Notices of the AMS for the latest results on this problem.

The route I took to the results in this paper was somewhat convoluted. I was motivated to look at this problem after lecturing recently on the Jordan curve theorem in my class. The problem is superficially similar to the Jordan curve theorem in that the result is known (and rather easy to prove) if {\gamma} is sufficiently regular (e.g. if it is a polygonal path), but seems to be significantly more difficult when the curve is merely assumed to be continuous. Roughly speaking, all the known positive results on the problem have proceeded using (in some form or another) tools from homology: note for instance that one can view the conjecture as asking whether the four-dimensional subset {\gamma^4} of the eight-dimensional space {({\bf R}^2)^4} necessarily intersects the four-dimensional space {\mathtt{Squares} \subset ({\bf R}^2)^4} consisting of the quadruples {(v_1,v_2,v_3,v_4)} traversing a square in (say) anti-clockwise order; this space is a four-dimensional linear subspace of {({\bf R}^2)^4}, with a two-dimensional subspace of “degenerate” squares {(v,v,v,v)} removed. If one ignores this degenerate subspace, one can use intersection theory to conclude (under reasonable “transversality” hypotheses) that {\gamma^4} intersects {\mathtt{Squares}} an odd number of times (up to the cyclic symmetries of the square), which is basically how Conjecture 1 is proven in the regular case. Unfortunately, if one then takes a limit and considers what happens when {\gamma} is just a continuous curve, the odd number of squares created by these homological arguments could conceivably all degenerate to points, thus blocking one from proving the conjecture in the general case.

Inspired by my previous work on finite time blowup for various PDEs, I first tried looking for a counterexample in the category of (locally) self-similar curves that are smooth (or piecewise linear) away from a single origin where it can oscillate infinitely often; this is basically the smoothest type of curve that was not already covered by previous results. By a rescaling and compactness argument, it is not difficult to see that such a counterexample would exist if there was a counterexample to the following periodic version of the conjecture:

Conjecture 2 (Periodic square peg problem) Let {\gamma_1, \gamma_2} be two disjoint simple closed piecewise linear curves in the cylinder {({\bf R}/{\bf Z}) \times {\bf R}} which have a winding number of one, that is to say they are homologous to the loop {x \mapsto (x,0)} from {{\bf R}/{\bf Z}} to {({\bf R}/{\bf Z}) \times {\bf R}}. Then the union of {\gamma_1} and {\gamma_2} contains the four vertices of a square.

In contrast to Conjecture 1, which is known for polygonal paths, Conjecture 2 is still open even under the hypothesis of polygonal paths; the homological arguments alluded to previously now show that the number of inscribed squares in the periodic setting is even rather than odd, which is not enough to conclude the conjecture. (This flipping of parity from odd to even due to an infinite amount of oscillation is reminiscent of the “Eilenberg-Mazur swindle“, discussed in this previous post.)

I therefore tried to construct counterexamples to Conjecture 2. I began perturbatively, looking at curves {\gamma_1, \gamma_2} that were small perturbations of constant functions. After some initial Taylor expansion, I was blocked from forming such a counterexample because an inspection of the leading Taylor coefficients required one to construct a continuous periodic function of mean zero that never vanished, which of course was impossible by the intermediate value theorem. I kept expanding to higher and higher order to try to evade this obstruction (this, incidentally, was when I discovered this cute application of Lagrange reversion) but no matter how high an accuracy I went (I think I ended up expanding to sixth order in a perturbative parameter {\varepsilon} before figuring out what was going on!), this obstruction kept resurfacing again and again. I eventually figured out that this obstruction was being caused by a “conserved integral of motion” for both Conjecture 2 and Conjecture 1, which can in fact be used to largely rule out perturbative constructions. This yielded a new positive result for both conjectures:

Theorem 3

  • (i) Conjecture 1 holds when {\gamma} is the union {\{ (t,f(t)): t \in [t_0,t_1]\} \cup \{ (t,g(t)): t \in [t_0,t_1]\}} of the graphs of two Lipschitz functions {f,g: [t_0,t_1] \rightarrow {\bf R}} of Lipschitz constant less than one that agree at the endpoints.
  • (ii) Conjecture 2 holds when {\gamma_1, \gamma_2} are graphs of Lipschitz functions {f: {\bf R}/{\bf Z} \rightarrow {\bf R}, g: {\bf R}/{\bf Z} \rightarrow {\bf R}} of Lipschitz constant less than one.

We sketch the proof of Theorem 3(i) as follows (the proof of Theorem 3(ii) is very similar). Let {\gamma_1: [t_0, t_1] \rightarrow {\bf R}} be the curve {\gamma_1(t) := (t,f(t))}, thus {\gamma_1} traverses one of the two graphs that comprise {\gamma}. For each time {t \in [t_0,t_1]}, there is a unique square with first vertex {\gamma_1(t)} (and the other three vertices, traversed in anticlockwise order, denoted {\gamma_2(t), \gamma_3(t), \gamma_4(t)}) such that {\gamma_2(t)} also lies in the graph of {f} and {\gamma_4(t)} also lies in the graph of {g} (actually for technical reasons we have to extend {f,g} by constants to all of {{\bf R}} in order for this claim to be true). To see this, we simply rotate the graph of {g} clockwise by {\frac{\pi}{2}} around {\gamma_1(t)}, where (by the Lipschitz hypotheses) it must hit the graph of {f} in a unique point, which is {\gamma_2(t)}, and which then determines the other two vertices {\gamma_3(t), \gamma_4(t)} of the square. The curve {\gamma_3(t)} has the same starting and ending point as the graph of {f} or {g}; using the Lipschitz hypothesis one can show this graph is simple. If the curve ever hits the graph of {g} other than at the endpoints, we have created an inscribed square, so we may assume for contradiction that {\gamma_3(t)} avoids the graph of {g}, and hence by the Jordan curve theorem the two curves enclose some non-empty bounded open region {\Omega}.

Now for the conserved integral of motion. If we integrate the {1}-form {y\ dx} on each of the four curves {\gamma_1, \gamma_2, \gamma_3, \gamma_4}, we obtain the identity

\displaystyle  \int_{\gamma_1} y\ dx - \int_{\gamma_2} y\ dx + \int_{\gamma_3} y\ dx - \int_{\gamma_4} y\ dx = 0.

This identity can be established by the following calculation: one can parameterise

\displaystyle  \gamma_1(t) = (x(t), y(t))

\displaystyle  \gamma_2(t) = (x(t)+a(t), y(t)+b(t))

\displaystyle  \gamma_3(t) = (x(t)+a(t)-b(t), y(t)+a(t)+b(t))

\displaystyle  \gamma_4(t) = (x(t)-b(t), y(t)+a(t))

for some Lipschitz functions {x,y,a,b: [t_0,t_1] \rightarrow {\bf R}}; thus for instance {\int_{\gamma_1} y\ dx = \int_{t_0}^{t_1} y(t)\ dx(t)}. Inserting these parameterisations and doing some canceling, one can write the above integral as

\displaystyle  \int_{t_0}^{t_1} d \frac{a(t)^2-b(t)^2}{2}

which vanishes because {a(t), b(t)} (which represent the sidelengths of the squares determined by {\gamma_1(t), \gamma_2(t), \gamma_3(t), \gamma_4(t)} vanish at the endpoints {t=t_0,t_1}.

Using this conserved integral of motion, one can show that

\displaystyle  \int_{\gamma_3} y\ dx = \int_{t_0}^{t_1} g(t)\ dt

which by Stokes’ theorem then implies that the bounded open region {\Omega} mentioned previously has zero area, which is absurd.

This argument hinged on the curve {\gamma_3} being simple, so that the Jordan curve theorem could apply. Once one left the perturbative regime of curves of small Lipschitz constant, it became possible for {\gamma_3} to be self-crossing, but nevertheless there still seemed to be some sort of integral obstruction. I eventually isolated the problem in the form of a strengthened version of Conjecture 2:

Conjecture 4 (Area formulation of square peg problem) Let {\gamma_1, \gamma_2, \gamma_3, \gamma_4: {\bf R}/{\bf Z} \rightarrow ({\bf R}/{\bf Z}) \times {\bf R}} be simple closed piecewise linear curves of winding number {1} obeying the area identity

\displaystyle  \int_{\gamma_1} y\ dx - \int_{\gamma_2} y\ dx + \int_{\gamma_3} y\ dx - \int_{\gamma_4} y\ dx = 0

(note the {1}-form {y\ dx} is still well defined on the cylinder {({\bf R}/{\bf Z}) \times {\bf R}}; note also that the curves {\gamma_1,\gamma_2,\gamma_3,\gamma_4} are allowed to cross each other.) Then there exists a (possibly degenerate) square with vertices (traversed in anticlockwise order) lying on {\gamma_1, \gamma_2, \gamma_3, \gamma_4} respectively.

It is not difficult to see that Conjecture 4 implies Conjecture 2. Actually I believe that the converse implication is at least morally true, in that any counterexample to Conjecture 4 can be eventually transformed to a counterexample to Conjecture 2 and Conjecture 1. The conserved integral of motion argument can establish Conjecture 4 in many cases, for instance if {\gamma_2,\gamma_4} are graphs of functions of Lipschitz constant less than one.

Conjecture 4 has a model special case, when one of the {\gamma_i} is assumed to just be a horizontal loop. In this case, the problem collapses to that of producing an intersection between two three-dimensional subsets of a six-dimensional space, rather than to four-dimensional subsets of an eight-dimensional space. More precisely, some elementary transformations reveal that this special case of Conjecture 4 can be formulated in the following fashion in which the geometric notion of a square is replaced by the additive notion of a triple of real numbers summing to zero:

Conjecture 5 (Special case of area formulation) Let {\gamma_1, \gamma_2, \gamma_3: {\bf R}/{\bf Z} \rightarrow ({\bf R}/{\bf Z}) \times {\bf R}} be simple closed piecewise linear curves of winding number {1} obeying the area identity

\displaystyle  \int_{\gamma_1} y\ dx + \int_{\gamma_2} y\ dx + \int_{\gamma_3} y\ dx = 0.

Then there exist {x \in {\bf R}/{\bf Z}} and {y_1,y_2,y_3 \in {\bf R}} with {y_1+y_2+y_3=0} such that {(x,y_i) \in \gamma_i} for {i=1,2,3}.

This conjecture is easy to establish if one of the curves, say {\gamma_3}, is the graph {\{ (t,f(t)): t \in {\bf R}/{\bf Z}\}} of some piecewise linear function {f: {\bf R}/{\bf Z} \rightarrow {\bf R}}, since in that case the curve {\gamma_1} and the curve {\tilde \gamma_2 := \{ (x, -y-f(x)): (x,y) \in \gamma_2 \}} enclose the same area in the sense that {\int_{\gamma_1} y\ dx = \int_{\tilde \gamma_2} y\ dx}, and hence must intersect by the Jordan curve theorem (otherwise they would enclose a non-zero amount of area between them), giving the claim. But when none of the {\gamma_1,\gamma_2,\gamma_3} are graphs, the situation becomes combinatorially more complicated.

Using some elementary homological arguments (e.g. breaking up closed {1}-cycles into closed paths) and working with a generic horizontal slice of the curves, I was able to show that Conjecture 5 was equivalent to a one-dimensional problem that was largely combinatorial in nature, revolving around the sign patterns of various triple sums {y_{1,a} + y_{2,b} + y_{3,c}} with {y_{1,a}, y_{2,b}, y_{3,c}} drawn from various finite sets of reals.

Conjecture 6 (Combinatorial form) Let {k_1,k_2,k_3} be odd natural numbers, and for each {i=1,2,3}, let {y_{i,1},\dots,y_{i,k_i}} be distinct real numbers; we adopt the convention that {y_{i,0}=y_{i,k_i+1}=-\infty}. Assume the following axioms:

  • (i) For any {1 \leq p \leq k_1, 1 \leq q \leq k_2, 1 \leq r \leq k_3}, the sums {y_{1,p} + y_{2,q} + y_{3,r}} are non-zero.
  • (ii) (Non-crossing) For any {i=1,2,3} and {0 \leq p < q \leq k_i} with the same parity, the pairs {\{ y_{i,p}, y_{i,p+1}\}} and {\{y_{i,q}, y_{i,q+1}\}} are non-crossing in the sense that

    \displaystyle  \sum_{a \in \{p,p+1\}} \sum_{b \in \{q,q+1\}} (-1)^{a+b} \mathrm{sgn}( y_{i,a} - y_{i,b} ) = 0.

  • (iii) (Non-crossing sums) For any {0 \leq p \leq k_1}, {0 \leq q \leq k_2}, {0 \leq r \leq k_3} of the same parity, one has

    \displaystyle  \sum_{a \in \{p,p+1\}} \sum_{b \in \{q,q+1\}} \sum_{c \in \{r,r+1\}} (-1)^{a+b+c} \mathrm{sgn}( y_{1,a} + y_{2,b} + y_{3,c} ) = 0.

Then one has

\displaystyle  \sum_{i=1}^3 \sum_{p=1}^{k_i} (-1)^{p-1} y_{i,p} < 0.

Roughly speaking, Conjecture 6 and Conjecture 5 are connected by constructing curves {\gamma_i} to connect {(0, y_{i,p})} to {(0,y_{i,p+1})} for {0 \leq p \leq k+1} by various paths, which either lie to the right of the {y} axis (when {p} is odd) or to the left of the {y} axis (when {p} is even). The axiom (ii) is asserting that the numbers {-\infty, y_{i,1},\dots,y_{i,k_i}} are ordered according to the permutation of a meander (formed by gluing together two non-crossing perfect matchings).

Using various ad hoc arguments involving “winding numbers”, it is possible to prove this conjecture in many cases (e.g. if one of the {k_i} is at most {3}), to the extent that I have now become confident that this conjecture is true (and have now come full circle from trying to disprove Conjecture 1 to now believing that this conjecture holds also). But it seems that there is some non-trivial combinatorial argument to be made if one is to prove this conjecture; purely homological arguments seem to partially resolve the problem, but are not sufficient by themselves.

While I was not able to resolve the square peg problem, I think these results do provide a roadmap to attacking it, first by focusing on the combinatorial conjecture in Conjecture 6 (or its equivalent form in Conjecture 5), then after that is resolved moving on to Conjecture 4, and then finally to Conjecture 1.

By an odd coincidence, I stumbled upon a second question in as many weeks about power series, and once again the only way I know how to prove the result is by complex methods; once again, I am leaving it here as a challenge to any interested readers, and I would be particularly interested in knowing of a proof that was not based on complex analysis (or thinly disguised versions thereof), or for a reference to previous literature where something like this identity has occured. (I suspect for instance that something like this may have shown up before in free probability, based on the answer to part (ii) of the problem.)

Here is a purely algebraic form of the problem:

Problem 1 Let {F = F(z)} be a formal function of one variable {z}. Suppose that {G = G(z)} is the formal function defined by

\displaystyle G := \sum_{n=1}^\infty \left( \frac{F^n}{n!} \right)^{(n-1)}

\displaystyle = F + \left(\frac{F^2}{2}\right)' + \left(\frac{F^3}{6}\right)'' + \dots

\displaystyle = F + FF' + (F (F')^2 + \frac{1}{2} F^2 F'') + \dots,

where we use {f^{(k)}} to denote the {k}-fold derivative of {f} with respect to the variable {z}.

  • (i) Show that {F} can be formally recovered from {G} by the formula

    \displaystyle F = \sum_{n=1}^\infty (-1)^{n-1} \left( \frac{G^n}{n!} \right)^{(n-1)}

    \displaystyle = G - \left(\frac{G^2}{2}\right)' + \left(\frac{G^3}{6}\right)'' - \dots

    \displaystyle = G - GG' + (G (G')^2 + \frac{1}{2} G^2 G'') - \dots.

  • (ii) There is a remarkable further formal identity relating {F(z)} with {G(z)} that does not explicitly involve any infinite summation. What is this identity?

To rigorously formulate part (i) of this problem, one could work in the commutative differential ring of formal infinite series generated by polynomial combinations of {F} and its derivatives (with no constant term). Part (ii) is a bit trickier to formulate in this abstract ring; the identity in question is easier to state if {F, G} are formal power series, or (even better) convergent power series, as it involves operations such as composition or inversion that can be more easily defined in those latter settings.

To illustrate Problem 1(i), let us compute up to third order in {F}, using {{\mathcal O}(F^4)} to denote any quantity involving four or more factors of {F} and its derivatives, and similarly for other exponents than {4}. Then we have

\displaystyle G = F + FF' + (F (F')^2 + \frac{1}{2} F^2 F'') + {\mathcal O}(F^4)

and hence

\displaystyle G' = F' + (F')^2 + FF'' + {\mathcal O}(F^3)

\displaystyle G'' = F'' + {\mathcal O}(F^2);

multiplying, we have

\displaystyle GG' = FF' + F (F')^2 + F^2 F'' + F (F')^2 + {\mathcal O}(F^4)

and

\displaystyle G (G')^2 + \frac{1}{2} G^2 G'' = F (F')^2 + \frac{1}{2} F^2 F'' + {\mathcal O}(F^4)

and hence after a lot of canceling

\displaystyle G - GG' + (G (G')^2 + \frac{1}{2} G^2 G'') = F + {\mathcal O}(F^4).

Thus Problem 1(i) holds up to errors of {{\mathcal O}(F^4)} at least. In principle one can continue verifying Problem 1(i) to increasingly high order in {F}, but the computations rapidly become quite lengthy, and I do not know of a direct way to ensure that one always obtains the required cancellation at the end of the computation.

Problem 1(i) can also be posed in formal power series: if

\displaystyle F(z) = a_1 z + a_2 z^2 + a_3 z^3 + \dots

is a formal power series with no constant term with complex coefficients {a_1, a_2, \dots} with {|a_1|<1}, then one can verify that the series

\displaystyle G := \sum_{n=1}^\infty \left( \frac{F^n}{n!} \right)^{(n-1)}

makes sense as a formal power series with no constant term, thus

\displaystyle G(z) = b_1 z + b_2 z^2 + b_3 z^3 + \dots.

For instance it is not difficult to show that {b_1 = \frac{a_1}{1-a_1}}. If one further has {|b_1| < 1}, then it turns out that

\displaystyle F = \sum_{n=1}^\infty (-1)^{n-1} \left( \frac{G^n}{n!} \right)^{(n-1)}

as formal power series. Currently the only way I know how to show this is by first proving the claim for power series with a positive radius of convergence using the Cauchy integral formula, but even this is a bit tricky unless one has managed to guess the identity in (ii) first. (In fact, the way I discovered this problem was by first trying to solve (a variant of) the identity in (ii) by Taylor expansion in the course of attacking another problem, and obtaining the transform in Problem 1 as a consequence.)

The transform that takes {F} to {G} resembles both the exponential function

\displaystyle \exp(F) = \sum_{n=0}^\infty \frac{F^n}{n!}

and Taylor’s formula

\displaystyle F(z) = \sum_{n=0}^\infty \frac{F^{(n)}(0)}{n!} z^n

but does not seem to be directly connected to either (this is more apparent once one knows the identity in (ii)).

In the previous set of notes we introduced the notion of a complex diffeomorphism {f: U \rightarrow V} between two open subsets {U,V} of the complex plane {{\bf C}} (or more generally, two Riemann surfaces): an invertible holomorphic map whose inverse was also holomorphic. (Actually, the last part is automatic, thanks to Exercise 40 of Notes 4.) Such maps are also known as biholomorphic maps or conformal maps (although in some literature the notion of “conformal map” is expanded to permit maps such as the complex conjugation map {z \mapsto \overline{z}} that are angle-preserving but not orientation-preserving, as well as maps such as the exponential map {z \mapsto \exp(z)} from {{\bf C}} to {{\bf C} \backslash \{0\}} that are only locally injective rather than globally injective). Such complex diffeomorphisms can be used in complex analysis (or in the analysis of harmonic functions) to change the underlying domain {U} to a domain that may be more convenient for calculations, thanks to the following basic lemma:

Lemma 1 (Holomorphicity and harmonicity are conformal invariants) Let {\phi: U \rightarrow V} be a complex diffeomorphism between two Riemann surfaces {U,V}.

  • (i) If {f: V \rightarrow W} is a function to another Riemann surface {W}, then {f} is holomorphic if and only if {f \circ \phi: U \rightarrow W} is holomorphic.
  • (ii) If {U,V} are open subsets of {{\bf C}} and {u: V \rightarrow {\bf R}} is a function, then {u} is harmonic if and only if {u \circ \phi: U \rightarrow {\bf R}} is harmonic.

Proof: Part (i) is immediate since the composition of two holomorphic functions is holomorphic. For part (ii), observe that if {u: V \rightarrow {\bf R}} is harmonic then on any ball {B(z_0,r)} in {V}, {u} is the real part of some holomorphic function {f: B(z_0,r) \rightarrow {\bf C}} thanks to Exercise 62 of Notes 3. By part (i), {f \circ \phi: B(z_0,r) \rightarrow {\bf C}} is also holomorphic. Taking real parts we see that {u \circ \phi} is harmonic on each ball {B(z_0,r)} in {V}, and hence harmonic on all of {V}, giving one direction of (ii); the other direction is proven similarly. \Box

Exercise 2 Establish Lemma 1(ii) by direct calculation, avoiding the use of holomorphic functions. (Hint: the calculations are cleanest if one uses Wirtinger derivatives, as per Exercise 27 of Notes 1.)

Exercise 3 Let {\phi: U \rightarrow V} be a complex diffeomorphism between two open subsets {U,V} of {{\bf C}}, let {z_0} be a point in {U}, let {m} be a natural number, and let {f: V \rightarrow {\bf C} \cup \{\infty\}} be holomorphic. Show that {f: V \rightarrow {\bf C} \cup \{\infty\}} has a zero (resp. a pole) of order {m} at {\phi(z_0)} if and only if {f \circ \phi: U \rightarrow {\bf C} \cup \{\infty\}} has a zero (resp. a pole) of order {m} at {z_0}.

From Lemma 1(ii) we can now define the notion of a harmonic function {u: M \rightarrow {\bf R}} on a Riemann surface {M}; such a function {u} is harmonic if, for every coordinate chart {\phi_\alpha: U_\alpha \rightarrow V_\alpha} in some atlas, the map {u \circ \phi_\alpha^{-1}: V_\alpha \rightarrow {\bf R}} is harmonic. Lemma 1(ii) ensures that this definition of harmonicity does not depend on the choice of atlas. Similarly, using Exercise 3 one can define what it means for a holomorphic map {f: M \rightarrow {\bf C} \cup \{\infty\}} on a Riemann surface {M} to have a pole or zero of a given order at a point {p_0 \in M}, with the definition being independent of the choice of atlas.

In view of Lemma 1, it is thus natural to ask which Riemann surfaces are complex diffeomorphic to each other, and more generally to understand the space of holomorphic maps from one given Riemann surface to another. We will initially focus attention on three important model Riemann surfaces:

  • (i) (Elliptic model) The Riemann sphere {{\bf C} \cup \{\infty\}};
  • (ii) (Parabolic model) The complex plane {{\bf C}}; and
  • (iii) (Hyperbolic model) The unit disk {D(0,1)}.

The designation of these model Riemann surfaces as elliptic, parabolic, and hyperbolic comes from Riemannian geometry, where it is natural to endow each of these surfaces with a constant curvature Riemannian metric which is positive, zero, or negative in the elliptic, parabolic, and hyperbolic cases respectively. However, we will not discuss Riemannian geometry further here.

All three model Riemann surfaces are simply connected, but none of them are complex diffeomorphic to any other; indeed, there are no non-constant holomorphic maps from the Riemann sphere to the plane or the disk, nor are there any non-constant holomorphic maps from the plane to the disk (although there are plenty of holomorphic maps going in the opposite directions). The complex automorphisms (that is, the complex diffeomorphisms from a surface to itself) of each of the three surfaces can be classified explicitly. The automorphisms of the Riemann sphere turn out to be the Möbius transformations {z \mapsto \frac{az+b}{cz+d}} with {ad-bc \neq 0}, also known as fractional linear transformations. The automorphisms of the complex plane are the linear transformations {z \mapsto az+b} with {a \neq 0}, and the automorphisms of the disk are the fractional linear transformations of the form {z \mapsto e^{i\theta} \frac{\alpha - z}{1 - \overline{\alpha} z}} for {\theta \in {\bf R}} and {\alpha \in D(0,1)}. Holomorphic maps {f: D(0,1) \rightarrow D(0,1)} from the disk {D(0,1)} to itself that fix the origin obey a basic but incredibly important estimate known as the Schwarz lemma: they are “dominated” by the identity function {z \mapsto z} in the sense that {|f(z)| \leq |z|} for all {z \in D(0,1)}. Among other things, this lemma gives guidance to determine when a given Riemann surface is complex diffeomorphic to a disk; we shall discuss this point further below.

It is a beautiful and fundamental fact in complex analysis that these three model Riemann surfaces are in fact an exhaustive list of the simply connected Riemann surfaces, up to complex diffeomorphism. More precisely, we have the Riemann mapping theorem and the uniformisation theorem:

Theorem 4 (Riemann mapping theorem) Let {U} be a simply connected open subset of {{\bf C}} that is not all of {{\bf C}}. Then {U} is complex diffeomorphic to {D(0,1)}.

Theorem 5 (Uniformisation theorem) Let {M} be a simply connected Riemann surface. Then {M} is complex diffeomorphic to {{\bf C} \cup \{\infty\}}, {{\bf C}}, or {D(0,1)}.

As we shall see, every connected Riemann surface can be viewed as the quotient of its simply connected universal cover by a discrete group of automorphisms known as deck transformations. This in principle gives a complete classification of Riemann surfaces up to complex diffeomorphism, although the situation is still somewhat complicated in the hyperbolic case because of the wide variety of discrete groups of automorphisms available in that case.

We will prove the Riemann mapping theorem in these notes, using the elegant argument of Koebe that is based on the Schwarz lemma and Montel’s theorem (Exercise 57 of Notes 4). The uniformisation theorem is however more difficult to establish; we discuss some components of a proof (based on the Perron method of subharmonic functions) here, but stop short of providing a complete proof.

The above theorems show that it is in principle possible to conformally map various domains into model domains such as the unit disk, but the proofs of these theorems do not readily produce explicit conformal maps for this purpose. For some domains we can just write down a suitable such map. For instance:

Exercise 6 (Cayley transform) Let {{\bf H} := \{ z \in {\bf C}: \mathrm{Im} z > 0 \}} be the upper half-plane. Show that the Cayley transform {\phi: {\bf H} \rightarrow D(0,1)}, defined by

\displaystyle  \phi(z) := \frac{z-i}{z+i},

is a complex diffeomorphism from the upper half-plane {{\bf H}} to the disk {D(0,1)}, with inverse map {\phi^{-1}: D(0,1) \rightarrow {\bf H}} given by

\displaystyle  \phi^{-1}(w) := i \frac{1+w}{1-w}.

Exercise 7 Show that for any real numbers {a<b}, the strip {\{ z \in {\bf C}: a < \mathrm{Re}(z) < b \}} is complex diffeomorphic to the disk {D(0,1)}. (Hint: use the complex exponential and a linear transformation to map the strip onto the half-plane {{\bf H}}.)

Exercise 8 Show that for any real numbers {a<b<a+2\pi}, the strip {\{ re^{i\theta}: r>0, a < \theta < b \}} is complex diffeomorphic to the disk {D(0,1)}. (Hint: use a branch of either the complex logarithm, or of a complex power {z \mapsto z^\alpha}.)

We will discuss some other explicit conformal maps in this set of notes, such as the Schwarz-Christoffel maps that transform the upper half-plane {{\bf H}} to polygonal regions. Further examples of conformal mapping can be found in the text of Stein-Shakarchi.

Read the rest of this entry »

My colleague Tom Liggett recently posed to me the following problem about power series in one real variable {x}. Observe that the power series

\displaystyle  \sum_{n=0}^\infty (-1)^n\frac{x^n}{n!}

has very rapidly decaying coefficients (of order {O(1/n!)}), leading to an infinite radius of convergence; also, as the series converges to {e^{-x}}, the series decays very rapidly as {x} approaches {+\infty}. The problem is whether this is essentially the only example of this type. More precisely:

Problem 1 Let {a_0, a_1, \dots} be a bounded sequence of real numbers, and suppose that the power series

\displaystyle  f(x) := \sum_{n=0}^\infty a_n\frac{x^n}{n!}

(which has an infinite radius of convergence) decays like {O(e^{-x})} as {x \rightarrow +\infty}, in the sense that the function {e^x f(x)} remains bounded as {x \rightarrow +\infty}. Must the sequence {a_n} be of the form {a_n = C (-1)^n} for some constant {C}?

As it turns out, the problem has a very nice solution using complex analysis methods, which by coincidence I happen to be teaching right now. I am therefore posing as a challenge to my complex analysis students and to other readers of this blog to answer the above problem by complex methods; feel free to post solutions in the comments below (and in particular, if you don’t want to be spoiled, you should probably refrain from reading the comments). In fact, the only way I know how to solve this problem currently is by complex methods; I would be interested in seeing a purely real-variable solution that is not simply a thinly disguised version of a complex-variable argument.

(To be fair to my students, the complex variable argument does require one additional tool that is not directly covered in my notes. That tool can be found here.)

In the previous set of notes we saw that functions {f: U \rightarrow {\bf C}} that were holomorphic on an open set {U} enjoyed a large number of useful properties, particularly if the domain {U} was simply connected. In many situations, though, we need to consider functions {f} that are only holomorphic (or even well-defined) on most of a domain {U}, thus they are actually functions {f: U \backslash S \rightarrow {\bf C}} outside of some small singular set {S} inside {U}. (In this set of notes we only consider interior singularities; one can also discuss singular behaviour at the boundary of {U}, but this is a whole separate topic and will not be pursued here.) Since we have only defined the notion of holomorphicity on open sets, we will require the singular sets {S} to be closed, so that the domain {U \backslash S} on which {f} remains holomorphic is still open. A typical class of examples are the functions of the form {\frac{f(z)}{z-z_0}} that were already encountered in the Cauchy integral formula; if {f: U \rightarrow {\bf C}} is holomorphic and {z_0 \in U}, such a function would be holomorphic save for a singularity at {z_0}. Another basic class of examples are the rational functions {P(z)/Q(z)}, which are holomorphic outside of the zeroes of the denominator {Q}.

Singularities come in varying levels of “badness” in complex analysis. The least harmful type of singularity is the removable singularity – a point {z_0} which is an isolated singularity (i.e., an isolated point of the singular set {S}) where the function {f} is undefined, but for which one can extend the function across the singularity in such a fashion that the function becomes holomorphic in a neighbourhood of the singularity. A typical example is that of the complex sinc function {\frac{\sin(z)}{z}}, which has a removable singularity at the origin {0}, which can be removed by declaring the sinc function to equal {1} at {0}. The detection of isolated removable singularities can be accomplished by Riemann’s theorem on removable singularities (Exercise 35 from Notes 3): if a holomorphic function {f: U \backslash S \rightarrow {\bf C}} is bounded near an isolated singularity {z_0 \in S}, then the singularity at {z_0} may be removed.

After removable singularities, the mildest form of singularity one can encounter is that of a pole – an isolated singularity {z_0} such that {f(z)} can be factored as {f(z) = \frac{g(z)}{(z-z_0)^m}} for some {m \geq 1} (known as the order of the pole), where {g} has a removable singularity at {z_0} (and is non-zero at {z_0} once the singularity is removed). Such functions have already made a frequent appearance in previous notes, particularly the case of simple poles when {m=1}. The behaviour near {z_0} of function {f} with a pole of order {m} is well understood: for instance, {|f(z)|} goes to infinity as {z} approaches {z_0} (at a rate comparable to {|z-z_0|^{-m}}). These singularities are not, strictly speaking, removable; but if one compactifies the range {{\bf C}} of the holomorphic function {f: U \backslash S \rightarrow {\bf C}} to a slightly larger space {{\bf C} \cup \{\infty\}} known as the Riemann sphere, then the singularity can be removed. In particular, functions {f: U \backslash S \rightarrow {\bf C}} which only have isolated singularities that are either poles or removable can be extended to holomorphic functions {f: U \rightarrow {\bf C} \cup \{\infty\}} to the Riemann sphere. Such functions are known as meromorphic functions, and are nearly as well-behaved as holomorphic functions in many ways. In fact, in one key respect, the family of meromorphic functions is better: the meromorphic functions on {U} turn out to form a field, in particular the quotient of two meromorphic functions is again meromorphic (if the denominator is not identically zero).

Unfortunately, there are isolated singularities that are neither removable or poles, and are known as essential singularities. A typical example is the function {f(z) = e^{1/z}}, which turns out to have an essential singularity at {z=0}. The behaviour of such essential singularities is quite wild; we will show here the Casorati-Weierstrass theorem, which shows that the image of {f} near the essential singularity is dense in the complex plane, as well as the more difficult great Picard theorem which asserts that in fact the image can omit at most one point in the complex plane. Nevertheless, around any isolated singularity (even the essential ones) {z_0}, it is possible to expand {f} as a variant of a Taylor series known as a Laurent series {\sum_{n=-\infty}^\infty a_n (z-z_0)^n}. The {\frac{1}{z-z_0}} coefficient {a_{-1}} of this series is particularly important for contour integration purposes, and is known as the residue of {f} at the isolated singularity {z_0}. These residues play a central role in a common generalisation of Cauchy’s theorem and the Cauchy integral formula known as the residue theorem, which is a particularly useful tool for computing (or at least transforming) contour integrals of meromorphic functions, and has proven to be a particularly popular technique to use in analytic number theory. Within complex analysis, one important consequence of the residue theorem is the argument principle, which gives a topological (and analytical) way to control the zeroes and poles of a meromorphic function.

Finally, there are the non-isolated singularities. Little can be said about these singularities in general (for instance, the residue theorem does not directly apply in the presence of such singularities), but certain types of non-isolated singularities are still relatively easy to understand. One particularly common example of such non-isolated singularity arises when trying to invert a non-injective function, such as the complex exponential {z \mapsto \exp(z)} or a power function {z \mapsto z^n}, leading to branches of multivalued functions such as the complex logarithm {z \mapsto \log(z)} or the {n^{th}} root function {z \mapsto z^{1/n}} respectively. Such branches will typically have a non-isolated singularity along a branch cut; this branch cut can be moved around the complex domain by switching from one branch to another, but usually cannot be eliminated entirely, unless one is willing to lift up the domain {U} to a more general type of domain known as a Riemann surface. As such, one can view branch cuts as being an “artificial” form of singularity, being an artefact of a choice of local coordinates of a Riemann surface, rather than reflecting any intrinsic singularity of the function itself. The further study of Riemann surfaces is an important topic in complex analysis (as well as the related fields of complex geometry and algebraic geometry), but unfortunately this topic will probably be postponed to the next course in this sequence (which I will not be teaching).

Read the rest of this entry »

TWe now come to perhaps the most central theorem in complex analysis (save possibly for the fundamental theorem of calculus), namely Cauchy’s theorem, which allows one to compute (or at least transform) a large number of contour integrals {\int_\gamma f(z)\ dz} even without knowing any explicit antiderivative of {f}. There are many forms and variants of Cauchy’s theorem. To give one such version, we need the basic topological notion of a homotopy:

Definition 1 (Homotopy) Let {U} be an open subset of {{\bf C}}, and let {\gamma_0: [a,b] \rightarrow U}, {\gamma_1: [a,b] \rightarrow U} be two curves in {U}.

  • (i) If {\gamma_0, \gamma_1} have the same initial point {z_0} and final point {z_1}, we say that {\gamma_0} and {\gamma_1} are homotopic with fixed endpoints in {U} if there exists a continuous map {\gamma: [0,1] \times [a,b] \rightarrow U} such that {\gamma(0,t) = \gamma_0(t)} and {\gamma(1,t) = \gamma_1(t)} for all {t \in [a,b]}, and such that {\gamma(s,a) = z_0} and {\gamma(s,b) = z_1} for all {s \in [0,1]}.
  • (ii) If {\gamma_0, \gamma_1} are closed (but possibly with different initial points), we say that {\gamma_0} and {\gamma_1} are homotopic as closed curves in {U} if there exists a continuous map {\gamma: [0,1] \times [a,b] \rightarrow U} such that {\gamma(0,t) = \gamma_0(t)} and {\gamma(1,t) = \gamma_1(t)} for all {t \in [a,b]}, and such that {\gamma(s,a) = \gamma(s,b)} for all {s \in [0,1]}.
  • (iii) If {\gamma_2: [c,d] \rightarrow U} and {\gamma_3: [e,f] \rightarrow U} are curves with the same initial point and same final point, we say that {\gamma_2} and {\gamma_3} are homotopic with fixed endpoints up to reparameterisation in {U} if there is a reparameterisation {\tilde \gamma_2: [a,b] \rightarrow U} of {\gamma_2} which is homotopic with fixed endpoints in {U} to a reparameterisation {\tilde \gamma_3: [a,b] \rightarrow U} of {\gamma_3}.
  • (iv) If {\gamma_2: [c,d] \rightarrow U} and {\gamma_3: [e,f] \rightarrow U} are closed curves, we say that {\gamma_2} and {\gamma_3} are homotopic as closed curves up to reparameterisation in {U} if there is a reparameterisation {\tilde \gamma_2: [a,b] \rightarrow U} of {\gamma_2} which is homotopic as closed curves in {U} to a reparameterisation {\tilde \gamma_3: [a,b] \rightarrow U} of {\gamma_3}.

In the first two cases, the map {\gamma} will be referred to as a homotopy from {\gamma_0} to {\gamma_1}, and we will also say that {\gamma_0} can be continously deformed to {\gamma_1} (either with fixed endpoints, or as closed curves).

Example 2 If {U} is a convex set, that is to say that {(1-s) z_0 + s z_1 \in U} whenever {z_0,z_1 \in U} and {0 \leq s \leq 1}, then any two curves {\gamma_0, \gamma_1: [0,1] \rightarrow U} from one point {z_0} to another {z_1} are homotopic, by using the homotopy

\displaystyle \gamma(s,t) := (1-s) \gamma_0(t) + s \gamma_1(t).

For a similar reason, in a convex open set {U}, any two closed curves will be homotopic to each other as closed curves.

Exercise 3 Let {U} be an open subset of {{\bf C}}.

  • (i) Prove that the property of being homotopic with fixed endpoints in {U} is an equivalence relation.
  • (ii) Prove that the property of being homotopic as closed curves in {U} is an equivalence relation.
  • (iii) If {\gamma_0, \gamma_1: [a,b] \rightarrow U} are closed curves with the same initial point, show that {\gamma_0} is homotopic to {\gamma_1} as closed curves if and only if {\gamma_0} is homotopic to {\gamma_2 + \gamma_1 + (-\gamma_2)} with fixed endpoints for some closed curve {\gamma_2} with the same initial point as {\gamma_0} or {\gamma_1}.
  • (iv) Define a point in {U} to be a curve {\gamma_1: [a,b] \rightarrow U} of the form {\gamma_1(t) = z_0} for some {z_0 \in U} and all {t \in [a,b]}. Let {\gamma_0: [a,b] \rightarrow U} be a closed curve in {U}. Show that {\gamma_0} is homotopic with fixed endpoints to a point in {U} if and only if {\gamma_0} is homotopic as a closed curve to a point in {U}. (In either case, we will call {\gamma_0} homotopic to a point, null-homotopic, or contractible to a point in {U}.)
  • (v) If {\gamma_0, \gamma_1: [a,b] \rightarrow U} are curves with the same initial point and the same terminal point, show that {\gamma_0} is homotopic to {\gamma_1} with fixed endpoints in {U} if and only if {\gamma_0 + (-\gamma_1)} is homotopic to a point in {U}.
  • (vi) If {U} is connected, and {\gamma_0, \gamma_1: [a,b] \rightarrow U} are any two curves in {U}, show that there exists a continuous map {\gamma: [0,1] \times [a,b] \rightarrow U} such that {\gamma(0,t) = \gamma_0(t)} and {\gamma(1,t) = \gamma_1(t)} for all {t \in [a,b]}. Thus the notion of homotopy becomes rather trivial if one does not fix the endpoints or require the curve to be closed.
  • (vii) Show that if {\gamma_1: [a,b] \rightarrow U} is a reparameterisation of {\gamma_0: [a,b] \rightarrow U}, then {\gamma_0} and {\gamma_1} are homotopic with fixed endpoints in U.
  • (viii) Prove that the property of being homotopic with fixed endpoints in {U} up to reparameterisation is an equivalence relation.
  • (ix) Prove that the property of being homotopic as closed curves in {U} up to reparameterisation is an equivalence relation.

We can then phrase Cauchy’s theorem as an assertion that contour integration on holomorphic functions is a homotopy invariant. More precisely:

Theorem 4 (Cauchy’s theorem) Let {U} be an open subset of {{\bf C}}, and let {f: U \rightarrow {\bf C}} be holomorphic.

  • (i) If {\gamma_0: [a,b] \rightarrow U} and {\gamma_1: [c,d] \rightarrow U} are rectifiable curves that are homotopic in {U} with fixed endpoints up to reparameterisation, then

    \displaystyle \int_{\gamma_0} f(z)\ dz = \int_{\gamma_1} f(z)\ dz.

  • (ii) If {\gamma_0: [a,b] \rightarrow U} and {\gamma_1: [c,d] \rightarrow U} are closed rectifiable curves that are homotopic in {U} as closed curves up to reparameterisation, then

    \displaystyle \int_{\gamma_0} f(z)\ dz = \int_{\gamma_1} f(z)\ dz.

This version of Cauchy’s theorem is particularly useful for applications, as it explicitly brings into play the powerful technique of contour shifting, which allows one to compute a contour integral by replacing the contour with a homotopic contour on which the integral is easier to either compute or integrate. This formulation of Cauchy’s theorem also highlights the close relationship between contour integrals and the algebraic topology of the complex plane (and open subsets {U} thereof). Setting {\gamma_1} to be a point, we obtain an important special case of Cauchy’s theorem (which is in fact equivalent to the full theorem):

Corollary 5 (Cauchy’s theorem, again) Let {U} be an open subset of {{\bf C}}, and let {f: U \rightarrow {\bf C}} be holomorphic. Then for any closed rectifiable curve {\gamma} in {U} that is contractible in {U} to a point, one has {\int_\gamma f(z)\ dz = 0}.

Exercise 6 Show that Theorem 4 and Corollary 5 are logically equivalent.

An important feature to note about Cauchy’s theorem is the global nature of its hypothesis on {f}. The conclusion of Cauchy’s theorem only involves the values of a function {f} on the images of the two curves {\gamma_0, \gamma_1}. However, in order for the hypotheses of Cauchy’s theorem to apply, the function {f} must be holomorphic not only on the images on {\gamma_0, \gamma_1}, but on an open set {U} that is large enough (and sufficiently free of “holes”) to support a homotopy between the two curves. This point can be emphasised through the following fundamental near-counterexample to Cauchy’s theorem:

Example 7 Let {U := {\bf C} \backslash \{0\}}, and let {f: U \rightarrow {\bf C}} be the holomorphic function {f(z) := \frac{1}{z}}. Let {\gamma_{0,1,\circlearrowleft}: [0,2\pi] \rightarrow {\bf C}} be the closed unit circle contour {\gamma_{0,1,\circlearrowleft}(t) := e^{it}}. Direct calculation shows that

\displaystyle \int_{\gamma_{0,1,\circlearrowleft}} f(z)\ dz = 2\pi i \neq 0.

As a consequence of this and Cauchy’s theorem, we conclude that the contour {\gamma_{0,1,\circlearrowleft}} is not contractible to a point in {U}; note that this does not contradict Example 2 because {U} is not convex. Thus we see that the lack of holomorphicity (or singularity) of {f} at the origin can be “blamed” for the non-vanishing of the integral of {f} on the closed contour {\gamma_{0,1,\circlearrowleft}}, even though this contour does not come anywhere near the origin. Thus we see that the global behaviour of {f}, not just the behaviour in the local neighbourhood of {\gamma_{0,1,\circlearrowleft}}, has an impact on the contour integral.

One can of course rewrite this example to involve non-closed contours instead of closed ones. For instance, if we let {\gamma_0, \gamma_1: [0,\pi] \rightarrow U} denote the half-circle contours {\gamma_0(t) := e^{it}} and {\gamma_1(t) := e^{-it}}, then {\gamma_0,\gamma_1} are both contours in {U} from {+1} to {-1}, but one has

\displaystyle \int_{\gamma_0} f(z)\ dz = +\pi i

whereas

\displaystyle \int_{\gamma_1} f(z)\ dz = -\pi i.

In order for this to be consistent with Cauchy’s theorem, we conclude that {\gamma_0} and {\gamma_1} are not homotopic in {U} (even after reparameterisation).

In the specific case of functions of the form {\frac{1}{z}}, or more generally {\frac{f(z)}{z-z_0}} for some point {z_0} and some {f} that is holomorphic in some neighbourhood of {z_0}, we can quantify the precise failure of Cauchy’s theorem through the Cauchy integral formula, and through the concept of a winding number. These turn out to be extremely powerful tools for understanding both the nature of holomorphic functions and the topology of open subsets of the complex plane, as we shall see in this and later notes.

Read the rest of this entry »

Having discussed differentiation of complex mappings in the preceding notes, we now turn to the integration of complex maps. We first briefly review the situation of integration of (suitably regular) real functions {f: [a,b] \rightarrow {\bf R}} of one variable. Actually there are three closely related concepts of integration that arise in this setting:

  • (i) The signed definite integral {\int_a^b f(x)\ dx}, which is usually interpreted as the Riemann integral (or equivalently, the Darboux integral), which can be defined as the limit (if it exists) of the Riemann sums

    \displaystyle  \sum_{j=1}^n f(x_j^*) (x_j - x_{j-1}) \ \ \ \ \ (1)

    where {a = x_0 < x_1 < \dots < x_n = b} is some partition of {[a,b]}, {x_j^*} is an element of the interval {[x_{j-1},x_j]}, and the limit is taken as the maximum mesh size {\max_{1 \leq j \leq n} |x_j - x_{j-1}|} goes to zero. It is convenient to adopt the convention that {\int_b^a f(x)\ dx := - \int_a^b f(x)\ dx} for {a < b}; alternatively one can interpret {\int_b^a f(x)\ dx} as the limit of the Riemann sums (1), where now the (reversed) partition {b = x_0 > x_1 > \dots > x_n = a} goes leftwards from {b} to {a}, rather than rightwards from {a} to {b}.

  • (ii) The unsigned definite integral {\int_{[a,b]} f(x)\ dx}, usually interpreted as the Lebesgue integral. The precise definition of this integral is a little complicated (see e.g. this previous post), but roughly speaking the idea is to approximate {f} by simple functions {\sum_{i=1}^n c_i 1_{E_i}} for some coefficients {c_i \in {\bf R}} and sets {E_i \subset [a,b]}, and then approximate the integral {\int_{[a,b]} f(x)\ dx} by the quantities {\sum_{i=1}^n c_i m(E_i)}, where {E_i} is the Lebesgue measure of {E_i}. In contrast to the signed definite integral, no orientation is imposed or used on the underlying domain of integration, which is viewed as an “undirected” set {[a,b]}.
  • (iii) The indefinite integral or antiderivative {\int f(x)\ dx}, defined as any function {F: [a,b] \rightarrow {\bf R}} whose derivative {F'} exists and is equal to {f} on {[a,b]}. Famously, the antiderivative is only defined up to the addition of an arbitrary constant {C}, thus for instance {\int x\ dx = \frac{1}{2} x^2 + C}.

There are some other variants of the above integrals (e.g. the Henstock-Kurzweil integral, discussed for instance in this previous post), which can handle slightly different classes of functions and have slightly different properties than the standard integrals listed here, but we will not need to discuss such alternative integrals in this course (with the exception of some improper and principal value integrals, which we will encounter in later notes).

The above three notions of integration are closely related to each other. For instance, if {f: [a,b] \rightarrow {\bf R}} is a Riemann integrable function, then the signed definite integral and unsigned definite integral coincide (when the former is oriented correctly), thus

\displaystyle  \int_a^b f(x)\ dx = \int_{[a,b]} f(x)\ dx

and

\displaystyle  \int_b^a f(x)\ dx = -\int_{[a,b]} f(x)\ dx

If {f: [a,b] \rightarrow {\bf R}} is continuous, then by the fundamental theorem of calculus, it possesses an antiderivative {F = \int f(x)\ dx}, which is well defined up to an additive constant {C}, and

\displaystyle  \int_c^d f(x)\ dx = F(d) - F(c)

for any {c,d \in [a,b]}, thus for instance {\int_a^b F(x)\ dx = F(b) - F(a)} and {\int_b^a F(x)\ dx = F(a) - F(b)}.

All three of the above integration concepts have analogues in complex analysis. By far the most important notion will be the complex analogue of the signed definite integral, namely the contour integral {\int_\gamma f(z)\ dz}, in which the directed line segment from one real number {a} to another {b} is now replaced by a type of curve in the complex plane known as a contour. The contour integral can be viewed as the special case of the more general line integral {\int_\gamma f(z) dx + g(z) dy}, that is of particular relevance in complex analysis. There are also analogues of the Lebesgue integral, namely the arclength measure integrals {\int_\gamma f(z)\ |dz|} and the area integrals {\int_\Omega f(x+iy)\ dx dy}, but these play only an auxiliary role in the subject. Finally, we still have the notion of an antiderivative {F(z)} (also known as a primitive) of a complex function {f(z)}.

As it turns out, the fundamental theorem of calculus continues to hold in the complex plane: under suitable regularity assumptions on a complex function {f} and a primitive {F} of that function, one has

\displaystyle  \int_\gamma f(z)\ dz = F(z_1) - F(z_0)

whenever {\gamma} is a contour from {z_0} to {z_1} that lies in the domain of {f}. In particular, functions {f} that possess a primitive must be conservative in the sense that {\int_\gamma f(z)\ dz = 0} for any closed contour. This property of being conservative is not typical, in that “most” functions {f} will not be conservative. However, there is a remarkable and far-reaching theorem, the Cauchy integral theorem (also known as the Cauchy-Goursat theorem), which asserts that any holomorphic function is conservative, so long as the domain is simply connected (or if one restricts attention to contractible closed contours). We will explore this theorem and several of its consequences the next set of notes.

Read the rest of this entry »

Archives