You are currently browsing the monthly archive for November 2020.

Consider a disk {D(z_0,r) := \{ z: |z-z_0| < r \}} in the complex plane. If one applies an affine-linear map {f(z) = az+b} to this disk, one obtains

\displaystyle  f(D(z_0,r)) = D(f(z_0), |f'(z_0)| r).

For maps that are merely holomorphic instead of affine-linear, one has some variants of this assertion, which I am recording here mostly for my own reference:

Theorem 1 (Holomorphic images of disks) Let {D(z_0,r)} be a disk in the complex plane, and {f: D(z_0,r) \rightarrow {\bf C}} be a holomorphic function with {f'(z_0) \neq 0}.
  • (i) (Open mapping theorem or inverse function theorem) {f(D(z_0,r))} contains a disk {D(f(z_0),\varepsilon)} for some {\varepsilon>0}. (In fact there is even a holomorphic right inverse of {f} from {D(f(z_0), \varepsilon)} to {D(z_0,r)}.)
  • (ii) (Bloch theorem) {f(D(z_0,r))} contains a disk {D(w, c |f'(z_0)| r)} for some absolute constant {c>0} and some {w \in {\bf C}}. (In fact there is even a holomorphic right inverse of {f} from {D(w, c |f'(z_0)| r)} to {D(z_0,r)}.)
  • (iii) (Koebe quarter theorem) If {f} is injective, then {f(D(z_0,r))} contains the disk {D(f(z_0), \frac{1}{4} |f'(z_0)| r)}.
  • (iv) If {f} is a polynomial of degree {n}, then {f(D(z_0,r))} contains the disk {D(f(z_0), \frac{1}{n} |f'(z_0)| r)}.
  • (v) If one has a bound of the form {|f'(z)| \leq A |f'(z_0)|} for all {z \in D(z_0,r)} and some {A>1}, then {f(D(z_0,r))} contains the disk {D(f(z_0), \frac{c}{A} |f'(z_0)| r)} for some absolute constant {c>0}. (In fact there is even a holomorphic right inverse of {f} from {D(f(z_0), \frac{c}{A} |f'(z_0)| r)} to {D(z_0,r)}.)

Parts (i), (ii), (iii) of this theorem are standard, as indicated by the given links. I found part (iv) as (a consequence of) Theorem 2 of this paper of Degot, who remarks that it “seems not already known in spite of its simplicity”; an equivalent form of this result also appears in Lemma 4 of this paper of Miller. The proof is simple:

Proof: (Proof of (iv)) Let {w \in D(f(z_0), \frac{1}{n} |f'(z_0)| r)}, then we have a lower bound for the log-derivative of {f(z)-w} at {z_0}:

\displaystyle  \frac{|f'(z_0)|}{|f(z_0)-w|} > \frac{n}{r}

(with the convention that the left-hand side is infinite when {f(z_0)=w}). But by the fundamental theorem of algebra we have

\displaystyle  \frac{f'(z_0)}{f(z_0)-w} = \sum_{j=1}^n \frac{1}{z_0-\zeta_j}

where {\zeta_1,\dots,\zeta_n} are the roots of the polynomial {f(z)-w} (counting multiplicity). By the pigeonhole principle, there must therefore exist a root {\zeta_j} of {f(z) - w} such that

\displaystyle  \frac{1}{|z_0-\zeta_j|} > \frac{1}{r}

and hence {\zeta_j \in D(z_0,r)}. Thus {f(D(z_0,r))} contains {w}, and the claim follows. \Box
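As a numerical sanity check of the two displayed steps, here is a minimal sketch that builds {f(z)-w} from prescribed roots and compares the log-derivative at {z_0} with the sum over roots. The roots, the base point and the helper names (`poly_from_roots`, `log_derivative_check`) are hypothetical choices made for illustration; they do not come from the papers cited above.

```python
def poly_from_roots(roots):
    """Return callables p, dp for p(z) = prod_j (z - zeta_j) and its derivative."""
    def p(z):
        out = 1 + 0j
        for zeta in roots:
            out *= (z - zeta)
        return out

    def dp(z):
        # Product rule: sum over j of prod_{i != j} (z - zeta_i).
        total = 0j
        for j in range(len(roots)):
            term = 1 + 0j
            for i, zeta in enumerate(roots):
                if i != j:
                    term *= (z - zeta)
            total += term
        return total

    return p, dp


def log_derivative_check(roots, z0):
    """Return (p'(z0)/p(z0), sum_j 1/(z0 - zeta_j), distance to the nearest root)."""
    p, dp = poly_from_roots(roots)
    lhs = dp(z0) / p(z0)
    rhs = sum(1 / (z0 - zeta) for zeta in roots)
    nearest = min(abs(z0 - zeta) for zeta in roots)
    return lhs, rhs, nearest


# Hypothetical roots of f(z) - w and a hypothetical base point z0 (not a root).
roots = [1 + 2j, -0.5 + 0.25j, 3 - 1j, -2 - 2j]
z0 = 0.2 + 0.1j
n = len(roots)

lhs, rhs, nearest = log_derivative_check(roots, z0)
# Pigeonhole step: |lhs| <= n / nearest, so the nearest root is within n / |lhs| of z0.
bound = n / abs(lhs)
```

In particular, whenever {|f'(z_0)|/|f(z_0)-w| > n/r}, the quantity `bound` is less than {r}, which is the conclusion of the pigeonhole step.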

The constant {\frac{1}{n}} in (iv) is completely sharp: if {f(z) = z^n} and {z_0} is non-zero then {f(D(z_0,|z_0|))} contains the disk

\displaystyle D(f(z_0), \frac{1}{n} |f'(z_0)| r) = D( z_0^n, |z_0|^n)

but avoids the origin, thus does not contain any disk of the form {D( z_0^n, |z_0|^n+\varepsilon)}. This example also shows that despite parts (ii), (iii) of the theorem, one cannot hope for a general inclusion of the form

\displaystyle  f(D(z_0,r)) \supset D(f(z_0), c |f'(z_0)| r )

for an absolute constant {c>0}.
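The arithmetic behind this example is easy to confirm numerically. The sketch below (with a hypothetical base point and degree, and a hypothetical helper name `predicted_radius`) checks that the radius {\frac{1}{n} |f'(z_0)| r} with {r = |z_0|} coincides with the distance from {f(z_0) = z_0^n} to the omitted value {0}.

```python
def predicted_radius(n, z0):
    """Radius (1/n) |f'(z0)| r for f(z) = z**n with r = |z0|."""
    r = abs(z0)
    fprime = n * z0 ** (n - 1)
    return abs(fprime) * r / n


z0 = 0.6 - 0.8j   # hypothetical base point
n = 5             # hypothetical degree

radius = predicted_radius(n, z0)
# Distance from f(z0) to the origin, which f omits on D(z0, |z0|).
dist_to_origin = abs(z0 ** n)
```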

Part (v) is implicit in the standard proof of Bloch’s theorem (part (ii)), and is easy to establish:

Proof: (Proof of (v)) From the Cauchy inequalities one has {f''(z) = O(\frac{A}{r} |f'(z_0)|)} for {z \in D(z_0,r/2)}, hence by Taylor’s theorem with remainder {f(z) = f(z_0) + f'(z_0) (z-z_0) (1 + O( A \frac{|z-z_0|}{r} ) )} for {z \in D(z_0, r/2)}. By Rouché’s theorem, this implies that the function {f(z)-w} has a unique zero in {D(z_0, 2cr/A)} for any {w \in D(f(z_0), cr|f'(z_0)|/A)}, if {c>0} is a sufficiently small absolute constant. The claim follows. \Box

Note that part (v) implies part (i). A standard point picking argument also lets one deduce part (ii) from part (v):

Proof: (Proof of (ii)) By shrinking {r} slightly if necessary we may assume that {f} extends analytically to the closure of the disk {D(z_0,r)}. Let {c} be the constant in (v) with {A=2}; we will prove (ii) with {c} replaced by {c/4}. If we have {|f'(z)| \leq 2 |f'(z_0)|} for all {z \in D(z_0,r/2)} then we are done by (v) (applied on the disk {D(z_0,r/2)}), so we may assume without loss of generality that there is {z_1 \in D(z_0,r/2)} such that {|f'(z_1)| > 2 |f'(z_0)|}. If {|f'(z)| \leq 2 |f'(z_1)|} for all {z \in D(z_1,r/4)} then by (v) we have

\displaystyle  f( D(z_0, r) ) \supset f( D(z_1,r/4) ) \supset D( f(z_1), \frac{c}{2} |f'(z_1)| \frac{r}{4} )

\displaystyle \supset D( f(z_1), \frac{c}{4} |f'(z_0)| r )

and we are again done. Hence we may assume without loss of generality that there is {z_2 \in D(z_1,r/4)} such that {|f'(z_2)| > 2 |f'(z_1)|}. Iterating this procedure in the obvious fashion we either are done, or obtain a Cauchy sequence {z_0, z_1, \dots} in {D(z_0,r)} such that {f'(z_j)} goes to infinity as {j \rightarrow \infty}, which contradicts the analytic nature of {f} (and hence continuous nature of {f'}) on the closure of {D(z_0,r)}. This gives the claim. \Box

Here is another classical result stated by Alexander (and then proven by Kakeya and by Szegő, but also implied by a classical theorem of Grace and Heawood) that is broadly compatible with parts (iii), (iv) of the above theorem:

Proposition 2 Let {D(z_0,r)} be a disk in the complex plane, and {f: D(z_0,r) \rightarrow {\bf C}} be a polynomial of degree {n \geq 1} with {f'(z) \neq 0} for all {z \in D(z_0,r)}. Then {f} is injective on {D(z_0, r\sin\frac{\pi}{n})}.

The radius {\sin \frac{\pi}{n}} is best possible, for the polynomial {f(z) = z^n} has {f'} non-vanishing on {D(1,1)}, but one has {f(\cos(\pi/n) e^{i \pi/n}) = f(\cos(\pi/n) e^{-i\pi/n})}, and {\cos(\pi/n) e^{i \pi/n}, \cos(\pi/n) e^{-i\pi/n}} lie on the boundary of {D(1,\sin \frac{\pi}{n})}.
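The two claims about the points {\cos(\pi/n) e^{\pm i \pi/n}} can be checked numerically. The following sketch uses a hypothetical choice of {n} and a hypothetical helper name `extremal_pair`; it confirms that both points lie at distance {\sin \frac{\pi}{n}} from {1} and have the same {n}-th power.

```python
import cmath
import math


def extremal_pair(n):
    """Return the two points cos(pi/n) * exp(+-i pi/n)."""
    theta = math.pi / n
    z_plus = math.cos(theta) * cmath.exp(1j * theta)
    z_minus = math.cos(theta) * cmath.exp(-1j * theta)
    return z_plus, z_minus


n = 7  # hypothetical degree
z_plus, z_minus = extremal_pair(n)

# Distances from the centre 1 of the disk D(1, sin(pi/n)).
dist_plus = abs(z_plus - 1)
dist_minus = abs(z_minus - 1)

# Difference of the two values of f(z) = z**n.
gap = abs(z_plus ** n - z_minus ** n)
```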

If one narrows {\sin \frac{\pi}{n}} slightly to {\sin \frac{\pi}{2n}} then one can quickly prove this proposition as follows. Suppose for contradiction that there exist distinct {z_1, z_2 \in D(z_0, r\sin\frac{\pi}{2n})} with {f(z_1)=f(z_2)}, thus if we let {\gamma} be the line segment contour from {z_1} to {z_2} then {\int_\gamma f'(z)\ dz = 0}. However, by assumption we may factor {f'(z) = c (z-\zeta_1) \dots (z-\zeta_{n-1})} where all the {\zeta_j} lie outside of {D(z_0,r)}. Elementary trigonometry then tells us that the argument of {z-\zeta_j} only varies by less than {\frac{\pi}{n}} as {z} traverses {\gamma}, hence the argument of {f'(z)} only varies by less than {\pi}. Thus {f'(z)} takes values in an open half-plane avoiding the origin and so it is not possible for {\int_\gamma f'(z)\ dz} to vanish.

To recover the best constant of {\sin \frac{\pi}{n}} requires some effort. By taking contrapositives and applying an affine rescaling and some trigonometry, the proposition can be deduced from the following result, known variously as the Grace-Heawood theorem or the complex Rolle theorem.

Proposition 3 (Grace-Heawood theorem) Let {f: {\bf C} \rightarrow {\bf C}} be a polynomial of degree {n \geq 1} such that {f(1)=f(-1)}. Then {f'} has a zero in the closure of {D( 0, \cot \frac{\pi}{n} )}.

This is in turn implied by a remarkable and powerful theorem of Grace (which we shall prove shortly). Given two polynomials {f,g} of degree at most {n}, define the apolar form {(f,g)_n} by

\displaystyle  (f,g)_n := \sum_{k=0}^n (-1)^k f^{(k)}(0) g^{(n-k)}(0). \ \ \ \ \ (1)

Theorem 4 (Grace’s theorem) Let {C} be a circle or line in {{\bf C}}, dividing {{\bf C} \backslash C} into two open connected regions {\Omega_1, \Omega_2}. Let {f,g} be two polynomials of degree at most {n \geq 1}, with all the zeroes of {f} lying in {\Omega_1} and all the zeroes of {g} lying in {\Omega_2}. Then {(f,g)_n \neq 0}.

(Contrapositively: if {(f,g)_n=0}, then the zeroes of {f} cannot be separated from the zeroes of {g} by a circle or line.)
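To make the apolar form concrete, here is a small exact-arithmetic sketch. The test polynomials, the separating circle {|z|=1} and the helper names (`apolar`, `from_roots`) are hypothetical choices for illustration. The first value is the apolar form of two cubics whose zeroes are separated by the unit circle, which Grace's theorem predicts to be non-zero; the second is {(f,f)_3}, which vanishes because the form is antisymmetric when {n} is odd, consistent with the fact that the zeroes of {f} cannot be separated from themselves.

```python
from fractions import Fraction
from math import factorial


def apolar(f, g, n):
    """Apolar form (f,g)_n = sum_k (-1)^k f^(k)(0) g^(n-k)(0).

    f and g are coefficient lists [a_0, a_1, ...] of polynomials of degree
    at most n, so that f^(k)(0) = k! * a_k.
    """
    a = list(f) + [0] * (n + 1 - len(f))
    b = list(g) + [0] * (n + 1 - len(g))
    return sum((-1) ** k * factorial(k) * a[k] * factorial(n - k) * b[n - k]
               for k in range(n + 1))


def from_roots(roots):
    """Coefficients [a_0, ..., a_m] of the monic polynomial prod (z - root)."""
    coeffs = [Fraction(1)]
    for root in roots:
        shifted = [Fraction(0)] + coeffs                       # z * p(z)
        scaled = [root * c for c in coeffs] + [Fraction(0)]    # root * p(z)
        coeffs = [s - t for s, t in zip(shifted, scaled)]
    return coeffs


# Hypothetical example: zeroes of f inside |z| < 1, zeroes of g in |z| > 1.
f = from_roots([Fraction(0), Fraction(1, 2), Fraction(-1, 2)])
g = from_roots([Fraction(2), Fraction(3), Fraction(-4)])

separated_value = apolar(f, g, 3)   # non-zero, as Grace's theorem predicts
self_value = apolar(f, f, 3)        # zero by antisymmetry for odd n
```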

Indeed, a brief calculation reveals the identity

\displaystyle  f(1) - f(-1) = (f', g)_{n-1}

where {g} is the degree {n-1} polynomial

\displaystyle  g(z) := \frac{1}{n!} ((z+1)^n - (z-1)^n).

The zeroes of {g} are {i \cot \frac{\pi j}{n}} for {j=1,\dots,n-1}, so the Grace-Heawood theorem follows by applying Grace’s theorem with {C} equal to the boundary of {D(0, \cot \frac{\pi}{n})}.
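The "brief calculation" and the location of the zeroes of {g} can both be checked by machine. The sketch below uses a hypothetical test polynomial {f} and hypothetical helper names; the identity is verified in exact rational arithmetic, while the zeroes {i \cot \frac{\pi j}{n}} are checked in floating point.

```python
import math
from fractions import Fraction
from math import comb, factorial


def apolar(f, g, n):
    """(f,g)_n = sum_k (-1)^k f^(k)(0) g^(n-k)(0) for coefficient lists f, g."""
    a = list(f) + [0] * (n + 1 - len(f))
    b = list(g) + [0] * (n + 1 - len(g))
    return sum((-1) ** k * factorial(k) * a[k] * factorial(n - k) * b[n - k]
               for k in range(n + 1))


def derivative(f):
    """Coefficient list of f'."""
    return [k * f[k] for k in range(1, len(f))]


def evaluate(f, z):
    """Evaluate the polynomial with coefficient list f at z."""
    return sum(c * z ** k for k, c in enumerate(f))


def g_poly(n):
    """Coefficients of ((z+1)^n - (z-1)^n) / n!, a polynomial of degree n-1."""
    return [Fraction(comb(n, j) * (1 - (-1) ** (n - j)), factorial(n))
            for j in range(n)]


f = [3, -1, 4, 1, -5, 9]   # hypothetical degree-5 test polynomial
n = len(f) - 1

lhs = evaluate(f, 1) - evaluate(f, -1)          # f(1) - f(-1)
rhs = apolar(derivative(f), g_poly(n), n - 1)   # (f', g)_{n-1}

# Floating-point check that i*cot(pi*j/n), j = 1..n-1, are zeroes of g.
g_float = [float(c) for c in g_poly(n)]
zeros = [1j / math.tan(math.pi * j / n) for j in range(1, n)]
residuals = [abs(evaluate(g_float, z)) for z in zeros]
```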

The same method of proof gives the following nice consequence:

Theorem 5 (Perpendicular bisector theorem) Let {f: {\bf C} \rightarrow {\bf C}} be a polynomial such that {f(z_1)=f(z_2)} for some distinct {z_1,z_2}. Then the zeroes of {f'} cannot all lie on one side of the perpendicular bisector of {z_1,z_2}. For instance, if {f(1)=f(-1)}, then the zeroes of {f'} cannot all lie in the halfplane {\{ z: \mathrm{Re} z > 0 \}} or the halfplane {\{ z: \mathrm{Re} z < 0 \}}.

I’d be interested in seeing a proof of this latter theorem that did not proceed via Grace’s theorem.

Now we give a proof of Grace’s theorem. The case {n=1} can be established by direct computation, so suppose inductively that {n>1} and that the claim has already been established for {n-1}. Given the involvement of circles and lines it is natural to suspect that a Möbius transformation symmetry is involved. This is indeed the case and can be made precise as follows. Let {V_n} denote the vector space of polynomials {f} of degree at most {n}, then the apolar form is a bilinear form {(,)_n: V_n \times V_n \rightarrow {\bf C}}. Each translation {z \mapsto z+a} on the complex plane induces a corresponding map on {V_n}, mapping each polynomial {f} to its shift {\tau_a f(z) := f(z-a)}. We claim that the apolar form is invariant with respect to these translations:

\displaystyle  ( \tau_a f, \tau_a g )_n = (f,g)_n.

Taking derivatives in {a}, it suffices to establish the skew-adjointness relation

\displaystyle  (f', g)_n + (f,g')_n = 0

but this is clear from the alternating form of (1).

Next, we see that the inversion map {z \mapsto 1/z} also induces a corresponding map on {V_n}, mapping each polynomial {f \in V_n} to its inversion {\iota f(z) := z^n f(1/z)}. From (1) we see that this map also (projectively) preserves the apolar form:

\displaystyle  (\iota f, \iota g)_n = (-1)^n (f,g)_n.
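Both symmetries of the apolar form (translation invariance and the sign rule under inversion) can be tested in exact arithmetic. The sketch below does so for a hypothetical pair of polynomials in {V_4} and a hypothetical rational shift {a}; the helper names `shift` and `invert` are likewise assumptions made for this illustration.

```python
from fractions import Fraction
from math import comb, factorial


def apolar(f, g, n):
    """(f,g)_n = sum_k (-1)^k f^(k)(0) g^(n-k)(0) for coefficient lists f, g."""
    a = list(f) + [0] * (n + 1 - len(f))
    b = list(g) + [0] * (n + 1 - len(g))
    return sum((-1) ** k * factorial(k) * a[k] * factorial(n - k) * b[n - k]
               for k in range(n + 1))


def shift(f, a, n):
    """Coefficients of tau_a f(z) = f(z - a), padded to length n + 1."""
    out = [Fraction(0)] * (n + 1)
    for k, c in enumerate(f):
        # (z - a)^k = sum_j C(k, j) z^j (-a)^(k - j)
        for j in range(k + 1):
            out[j] += c * comb(k, j) * (-a) ** (k - j)
    return out


def invert(f, n):
    """Coefficients of iota f(z) = z^n f(1/z): reverse the padded list."""
    padded = list(f) + [0] * (n + 1 - len(f))
    return padded[::-1]


n = 4
f = [2, 0, -3, 1]       # hypothetical polynomial of degree 3, viewed in V_4
g = [1, 5, 0, -2, 7]    # hypothetical polynomial of degree 4
a = Fraction(3, 2)      # hypothetical shift

base = apolar(f, g, n)
shifted = apolar(shift(f, a, n), shift(g, a, n), n)
inverted = apolar(invert(f, n), invert(g, n), n)
```

With these inputs `base` is non-zero, so the comparisons with `shifted` and `inverted` are not vacuous.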

More generally, the group of Möbius transformations on the Riemann sphere acts projectively on {V_n}, with each Möbius transformation {T: {\bf C} \rightarrow {\bf C}} mapping each {f \in V_n} to {Tf(z) := g_T(z) f(T^{-1} z)}, where {g_T} is the unique (up to constants) rational function that makes this a map from {V_n} to {V_n} (its divisor is {n(T \infty) - n(\infty)}). Since the Möbius transformations are generated by translations and inversion, we see that the action of Möbius transformations projectively preserves the apolar form; also, we see this action of {T} on {V_n} also moves the zeroes of each {f \in V_n} by {T} (viewing polynomials of degree less than {n} in {V_n} as having zeroes at infinity). In particular, the hypotheses and conclusions of Grace’s theorem are preserved by this Möbius action. We can then apply such a transformation to move one of the zeroes of {f} to infinity (thus making {f} a polynomial of degree {n-1}), so that {C} must now be a circle, with the zeroes of {g} inside the circle and the remaining zeroes of {f} outside the circle. But then

\displaystyle  (f,g)_n = (f, g')_{n-1}.

By the Gauss-Lucas theorem, the zeroes of {g'} are also inside {C}. The claim now follows from the induction hypothesis.

Ben Green and I have updated our paper “An arithmetic regularity lemma, an associated counting lemma, and applications” to account for a somewhat serious issue with the paper that was pointed out to us recently by Daniel Altman. This paper contains two core theorems:

  • An “arithmetic regularity lemma” that, roughly speaking, decomposes an arbitrary bounded sequence {f(n)} on an interval {\{1,\dots,N\}} as an “irrational nilsequence” {F(g(n) \Gamma)} of controlled complexity, plus some “negligible” errors (where one uses the Gowers uniformity norm as the main norm to control the negligibility of the error); and
  • An “arithmetic counting lemma” that gives an asymptotic formula for counting various averages {{\mathbb E}_{{\bf n} \in {\bf Z}^d \cap P} f(\psi_1({\bf n})) \dots f(\psi_t({\bf n}))} for various affine-linear forms {\psi_1,\dots,\psi_t} when the functions {f} are given by irrational nilsequences.

The combination of the two theorems is then used to address various questions in additive combinatorics.

There are no direct issues with the arithmetic regularity lemma. However, it turns out that the arithmetic counting lemma is only true if one imposes an additional property (which we call the “flag property”) on the affine-linear forms {\psi_1,\dots,\psi_t}. Without this property, there does not appear to be a clean asymptotic formula for these averages if the only hypothesis one places on the underlying nilsequences is irrationality. Thus when trying to understand the asymptotics of averages involving linear forms that do not obey the flag property, the paradigm of understanding these averages via a combination of the regularity lemma and a counting lemma seems to require some significant revision (in particular, one would probably have to replace the existing regularity lemma with some variant, despite the fact that the lemma is still technically true in this setting). Fortunately, for most applications studied to date (including the important subclass of translation-invariant affine forms), the flag property holds; however our claim in the paper to have resolved a conjecture of Gowers and Wolf on the true complexity of systems of affine forms must now be narrowed, as our methods only verify this conjecture under the assumption of the flag property.

In a bit more detail: the asymptotic formula for our counting lemma involved some finite-dimensional vector spaces {\Psi^{[i]}} for various natural numbers {i}, defined as the linear span of the vectors {(\psi^i_1({\bf n}), \dots, \psi^i_t({\bf n}))} as {{\bf n}} ranges over the parameter space {{\bf Z}^d}. Roughly speaking, these spaces encode some constraints one would expect to see amongst the forms {\psi^i_1({\bf n}), \dots, \psi^i_t({\bf n})}. For instance, in the case of length four arithmetic progressions when {d=2}, {{\bf n} = (n,r)}, and

\displaystyle  \psi_i({\bf n}) = n + (i-1)r

for {i=1,2,3,4}, then {\Psi^{[1]}} is spanned by the vectors {(1,1,1,1)} and {(1,2,3,4)} and can thus be described as the two-dimensional linear space

\displaystyle  \Psi^{[1]} = \{ (a,b,c,d): a-2b+c = b-2c+d = 0\} \ \ \ \ \ (1)

while {\Psi^{[2]}} is spanned by the vectors {(1,1,1,1)}, {(1,2,3,4)}, {(1^2,2^2,3^2,4^2)} and can be described as the hyperplane

\displaystyle  \Psi^{[2]} = \{ (a,b,c,d): a-3b+3c-d = 0 \}. \ \ \ \ \ (2)

As a special case of the counting lemma, we can check that if {f} takes the form {f(n) = F( \alpha n, \beta n^2 + \gamma n)} for some irrational {\alpha,\beta \in {\bf R}/{\bf Z}}, some arbitrary {\gamma \in {\bf R}/{\bf Z}}, and some smooth {F: {\bf R}/{\bf Z} \times {\bf R}/{\bf Z} \rightarrow {\bf C}}, then the limiting value of the average

\displaystyle  {\bf E}_{n, r \in [N]} f(n) f(n+r) f(n+2r) f(n+3r)

as {N \rightarrow \infty} is equal to

\displaystyle  \int_{a_1,b_1,c_1,d_1 \in {\bf R}/{\bf Z}: a_1-2b_1+c_1=b_1-2c_1+d_1=0} \int_{a_2,b_2,c_2,d_2 \in {\bf R}/{\bf Z}: a_2-3b_2+3c_2-d_2=0}

\displaystyle  F(a_1,a_2) F(b_1,b_2) F(c_1,c_2) F(d_1,d_2)

which reflects the constraints

\displaystyle  \alpha n - 2 \alpha(n+r) + \alpha(n+2r) = \alpha(n+r) - 2\alpha(n+2r)+\alpha(n+3r)=0


\displaystyle  (\beta n^2 + \gamma n) - 3 (\beta(n+r)^2+\gamma(n+r))

\displaystyle + 3 (\beta(n+2r)^2 +\gamma(n+2r)) - (\beta(n+3r)^2+\gamma(n+3r))=0.

These constraints follow from the descriptions (1), (2), using the containment {\Psi^{[1]} \subset \Psi^{[2]}} to dispense with the lower order term {\gamma n} (which then plays no further role in the analysis).
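The linear-algebra facts used in this example are easy to confirm by brute force. The sketch below (hypothetical helper names, and a hypothetical finite sample of parameters {(n,r)}) checks that the vectors {(\psi_1,\dots,\psi_4)} satisfy the relations (1), that their coordinatewise squares satisfy (2), and that the linear vectors also satisfy (2), which is the containment {\Psi^{[1]} \subset \Psi^{[2]}}.

```python
def forms(n, r):
    """The four forms psi_i(n, r) = n + (i-1) r of a length-four progression."""
    return [n + i * r for i in range(4)]


def in_psi1(v):
    """Relations (1): a - 2b + c = b - 2c + d = 0."""
    a, b, c, d = v
    return a - 2 * b + c == 0 and b - 2 * c + d == 0


def in_psi2(v):
    """Relation (2): a - 3b + 3c - d = 0."""
    a, b, c, d = v
    return a - 3 * b + 3 * c - d == 0


# Hypothetical finite sample of integer parameters (n, r).
samples = [(n, r) for n in range(-3, 4) for r in range(-3, 4)]

linear_ok = all(in_psi1(forms(n, r)) for n, r in samples)
squares_ok = all(in_psi2([x * x for x in forms(n, r)]) for n, r in samples)
# Flag property in this example: the linear vectors also obey relation (2).
flag_ok = all(in_psi2(forms(n, r)) for n, r in samples)
```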

The arguments in our paper turn out to be perfectly correct under the assumption of the “flag property” that {\Psi^{[i]} \subset \Psi^{[i+1]}} for all {i}. The problem is that the flag property turns out to not always hold. A counterexample, provided by Daniel Altman, involves the four linear forms

\displaystyle  \psi_1(n,r) = r; \psi_2(n,r) = 2n+2r; \psi_3(n,r) = n+3r; \psi_4(n,r) = n.

Here it turns out that

\displaystyle  \Psi^{[1]} = \{ (a,b,c,d): c-d=3a; b-2a=2d\}


\displaystyle  \Psi^{[2]} = \{ (a,b,c,d): 24a+3b-4c-8d=0 \}

and {\Psi^{[1]}} is no longer contained in {\Psi^{[2]}}. The analogue of the asymptotic formula given previously for {f(n) = F( \alpha n, \beta n^2 + \gamma n)} is then valid when {\gamma} vanishes, but not when {\gamma} is non-zero, because the identity

\displaystyle  24 (\beta \psi_1(n,r)^2 + \gamma \psi_1(n,r)) + 3 (\beta \psi_2(n,r)^2 + \gamma \psi_2(n,r))

\displaystyle - 4 (\beta \psi_3(n,r)^2 + \gamma \psi_3(n,r)) - 8 (\beta \psi_4(n,r)^2 + \gamma \psi_4(n,r)) = 0

holds in the former case but not the latter. Thus the output of any purported arithmetic regularity lemma in this case is now sensitive to the lower order terms of the nilsequence and cannot be described in a uniform fashion for all “irrational” sequences. There should still be some sort of formula for the asymptotics from the general equidistribution theory of nilsequences, but it could be considerably more complicated than what is presented in this paper.
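The failure of the flag property for these four forms can be seen by direct computation. In the sketch below (hypothetical helper names and a hypothetical finite sample of parameters), the linear functional {24a+3b-4c-8d} defining {\Psi^{[2]}} is applied first to the squares of the forms, where it vanishes, and then to the forms themselves, where it does not; this is exactly why the {\gamma} term breaks the identity.

```python
def altman_forms(n, r):
    """The four forms psi_1, ..., psi_4 of the counterexample."""
    return [r, 2 * n + 2 * r, n + 3 * r, n]


def psi2_functional(v):
    """The linear functional 24a + 3b - 4c - 8d cutting out Psi^[2]."""
    a, b, c, d = v
    return 24 * a + 3 * b - 4 * c - 8 * d


# Hypothetical finite sample of integer parameters (n, r).
samples = [(n, r) for n in range(-3, 4) for r in range(-3, 4)]

# Quadratic part (the beta terms): the relation holds identically.
quadratic_vanishes = all(
    psi2_functional([x * x for x in altman_forms(n, r)]) == 0
    for n, r in samples
)

# Linear part (the gamma terms): the same relation fails in general.
linear_values = {(n, r): psi2_functional(altman_forms(n, r)) for n, r in samples}
linear_vanishes = all(value == 0 for value in linear_values.values())
```

On this sample the linear values equal {18r - 6n}, which vanishes only along {n = 3r}.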

Fortunately, the flag property does hold in several key cases, most notably the translation invariant case when {\Psi^{[1]}} contains {(1,\dots,1)}, as well as “complexity one” cases. Nevertheless non-flag property systems of affine forms do exist, thus limiting the range of applicability of the techniques in this paper. In particular, the conjecture of Gowers and Wolf (Theorem 1.13 in the paper) is now open again in the non-flag property case.