You are currently browsing the monthly archive for December 2019.

Let {u: {\bf R}^3 \rightarrow {\bf R}^3} be a divergence-free vector field, thus {\nabla \cdot u = 0}, which we interpret as a velocity field. In this post we will proceed formally, largely ignoring the analytic issues of whether the fields in question have sufficient regularity and decay to justify the calculations. The vorticity field {\omega: {\bf R}^3 \rightarrow {\bf R}^3} is then defined as the curl of the velocity:

\displaystyle  \omega = \nabla \times u.

(From a differential geometry viewpoint, it would be more accurate (especially in other dimensions than three) to define the vorticity as the exterior derivative {\omega = d(g \cdot u)} of the musical isomorphism {g \cdot u} of the Euclidean metric {g} applied to the velocity field {u}; see these previous lecture notes. However, we will not need this geometric formalism in this post.)

Assuming suitable regularity and decay hypotheses of the velocity field {u}, it is possible to recover the velocity from the vorticity as follows. From the general vector identity {\nabla \times \nabla \times X = \nabla(\nabla \cdot X) - \Delta X} applied to the velocity field {u}, we see that

\displaystyle  \nabla \times \omega = -\Delta u

and thus (by the commutativity of all the differential operators involved)

\displaystyle  u = - \nabla \times \Delta^{-1} \omega.

Using the Newton potential formula

\displaystyle  -\Delta^{-1} \omega(x) := \frac{1}{4\pi} \int_{{\bf R}^3} \frac{\omega(y)}{|x-y|}\ dy

and formally differentiating under the integral sign, we obtain the Biot-Savart law

\displaystyle  u(x) = \frac{1}{4\pi} \int_{{\bf R}^3} \frac{\omega(y) \times (x-y)}{|x-y|^3}\ dy. \ \ \ \ \ (1)

This law is of fundamental importance in the study of incompressible fluid equations, such as the Euler equations

\displaystyle  \partial_t u + (u \cdot \nabla) u = -\nabla p; \quad \nabla \cdot u = 0

since on applying the curl operator one obtains the vorticity equation

\displaystyle  \partial_t \omega + (u \cdot \nabla) \omega = (\omega \cdot \nabla) u \ \ \ \ \ (2)

and then by substituting (1) one gets an autonomous equation for the vorticity field {\omega}. Unfortunately, this equation is non-local, due to the integration present in (1).

In a recent work, it was observed by Elgindi that in a certain regime, the Biot-Savart law can be approximated by a more “low rank” law, which makes the non-local effects significantly simpler in nature. This simplification was carried out in spherical coordinates, and hinged on a study of the invertibility properties of a certain second order linear differential operator in the latitude variable {\theta}; however in this post I would like to observe that the approximation can also be seen directly in Cartesian coordinates from the classical Biot-Savart law (1). As a consequence one can also begin Elgindi’s analysis in constructing somewhat regular solutions to the Euler equations that exhibit self-similar blowup in finite time, though I have not attempted to execute the entirety of the analysis in this setting.

Elgindi’s approximation applies under the following hypotheses:

  • (i) (Axial symmetry without swirl) The velocity field {u} is assumed to take the form

    \displaystyle  u(x_1,x_2,x_3) = ( u_r(r,x_3) \frac{x_1}{r}, u_r(r,x_3) \frac{x_2}{r}, u_3(r,x_3) ) \ \ \ \ \ (3)

    for some functions {u_r, u_3: [0,+\infty) \times {\bf R} \rightarrow {\bf R}} of the cylindrical radial variable {r := \sqrt{x_1^2+x_2^2}} and the vertical coordinate {x_3}. As a consequence, the vorticity field {\omega} takes the form

    \displaystyle  \omega(x_1,x_2,x_3) = (\omega_{r3}(r,x_3) \frac{x_2}{r}, \omega_{r3}(r,x_3) \frac{-x_1}{r}, 0) \ \ \ \ \ (4)

    where {\omega_{r3}: [0,+\infty) \times {\bf R} \rightarrow {\bf R}} is the field

    \displaystyle  \omega_{r3} = \partial_r u_3 - \partial_3 u_r.

  • (ii) (Odd symmetry) We assume that {u_3(r,-x_3) = -u_3(r,x_3)} and {u_r(r,-x_3)=u_r(r,x_3)}, so that {\omega_{r3}(r,-x_3)=-\omega_{r3}(r,x_3)}.

A model example of a divergence-free vector field obeying these properties (but without good decay at infinity) is the linear vector field

\displaystyle  X(x) = (x_1, x_2, -2x_3) \ \ \ \ \ (5)

which is of the form (3) with {u_r(r,x_3) = r} and {u_3(r,x_3) = -2x_3}. The associated vorticity {\omega} vanishes.

We can now give an illustration of Elgindi’s approximation:

Proposition 1 (Elgindi’s approximation) Under the above hypotheses (and assuming suitable regularity and decay), we have the pointwise bounds

\displaystyle  u(x) = \frac{1}{2} {\mathcal L}_{12}(\omega)(|x|) X(x) + O( |x| \|\omega\|_{L^\infty({\bf R}^3)} )

for any {x \in {\bf R}^3}, where {X} is the vector field (5), and {{\mathcal L}_{12}(\omega): {\bf R}^+ \rightarrow {\bf R}} is the scalar function

\displaystyle  {\mathcal L}_{12}(\omega)(\rho) := \frac{3}{4\pi} \int_{|y| \geq \rho} \frac{r y_3}{|y|^5} \omega_{r3}(r,y_3)\ dy.

Thus under the hypotheses (i), (ii), and assuming that {\omega} is slowly varying, we expect {u} to behave like the linear vector field {X} modulated by a radial scalar function. In applications one needs to control the error in various function spaces instead of pointwise, and with {\omega} similarly controlled in other function space norms than the {L^\infty} norm, but this proposition already gives a flavour of the approximation. If one uses spherical coordinates

\displaystyle  \omega_{r3}( \rho \cos \theta, \rho \sin \theta ) = \Omega( \rho, \theta )

then we have (using the spherical change of variables formula {dy = \rho^2 \cos \theta d\rho d\theta d\phi} and the odd nature of {\Omega})

\displaystyle  {\mathcal L}_{12}(\omega) = L_{12}(\Omega),

where

\displaystyle L_{12}(\Omega)(\rho) = 3 \int_\rho^\infty \int_0^{\pi/2} \frac{\Omega(r, \theta) \sin(\theta) \cos^2(\theta)}{r}\ d\theta dr

is the operator introduced in Elgindi’s paper.

Proof: By a limiting argument we may assume that {x} is non-zero, and we may normalise {\|\omega\|_{L^\infty({\bf R}^3)}=1}. From the triangle inequality we have

\displaystyle  \left| \int_{|y| \leq 10|x|} \frac{\omega(y) \times (x-y)}{|x-y|^3}\ dy \right| \leq \int_{|y| \leq 10|x|} \frac{1}{|x-y|^2}\ dy

\displaystyle  \leq \int_{|z| \leq 11 |x|} \frac{1}{|z|^2}\ dz

\displaystyle  = O( |x| )

and hence by (1)

\displaystyle  u(x) = \frac{1}{4\pi} \int_{|y| > 10|x|} \frac{\omega(y) \times (x-y)}{|x-y|^3}\ dy + O(|x|).

In the regime {|y| > 2|x|} we may perform the Taylor expansion

\displaystyle  \frac{x-y}{|x-y|^3} = \frac{x-y}{|y|^3} (1 - \frac{2 x \cdot y}{|y|^2} + \frac{|x|^2}{|y|^2})^{-3/2}

\displaystyle  = \frac{x-y}{|y|^3} (1 + \frac{3 x \cdot y}{|y|^2} + O( \frac{|x|^2}{|y|^2} ) )

\displaystyle  = -\frac{y}{|y|^3} + \frac{x}{|y|^3} - \frac{3 (x \cdot y) y}{|y|^5} + O( \frac{|x|^2}{|y|^4} ).
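The quadratic size of the error in this expansion can be sanity-checked numerically. The following is only an illustrative sketch in plain Python: the helper names, the sampling ranges and the fixed random seed are all assumptions made for the experiment, not part of the argument. It samples pairs with {|y| \geq 2|x|} and records the error rescaled by {|y|^4/|x|^2}, which should stay bounded.

```python
import math
import random

def kernel(x, y):
    """Exact kernel factor (x - y) / |x - y|^3."""
    d = [a - b for a, b in zip(x, y)]
    n = math.sqrt(sum(c * c for c in d))
    return [c / n**3 for c in d]

def kernel_linearised(x, y):
    """First-order expansion -y/|y|^3 + x/|y|^3 - 3 (x.y) y / |y|^5."""
    ny = math.sqrt(sum(c * c for c in y))
    xy = sum(a * b for a, b in zip(x, y))
    return [-b / ny**3 + a / ny**3 - 3 * xy * b / ny**5 for a, b in zip(x, y)]

def scaled_error(x, y):
    """|exact - linearised| * |y|^4 / |x|^2."""
    exact = kernel(x, y)
    approx = kernel_linearised(x, y)
    err = math.sqrt(sum((a - b) ** 2 for a, b in zip(exact, approx)))
    nx2 = sum(c * c for c in x)
    ny4 = sum(c * c for c in y) ** 2
    return err * ny4 / nx2

def random_direction(rng):
    """Random unit vector in R^3 (normalised Gaussian sample)."""
    v = [rng.gauss(0, 1) for _ in range(3)]
    n = math.sqrt(sum(c * c for c in v))
    return [c / n for c in v]

rng = random.Random(0)  # fixed seed, purely for reproducibility
worst = 0.0
for _ in range(2000):
    x = [rng.uniform(0.01, 1.0) * c for c in random_direction(rng)]
    nx = math.sqrt(sum(c * c for c in x))
    # choose |y| between 2|x| and 50|x|, in an independent random direction
    y = [rng.uniform(2.0, 50.0) * nx * c for c in random_direction(rng)]
    worst = max(worst, scaled_error(x, y))

print("largest rescaled error over the samples:", worst)
```

A second-order Taylor remainder estimate (using {|sx-y| \geq |y|/2} for {0 \leq s \leq 1}) bounds the rescaled error by {48}, so the printed value should lie below that.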

Since

\displaystyle  \int_{|y| > 10|x|} \frac{|x|^2}{|y|^4}\ dy = O(|x|)

we see from the triangle inequality that the error term contributes {O(|x|)} to {u(x)}. We thus have

\displaystyle  u(x) = \frac{1}{4\pi} ( -A_0(x) + A_1(x) - 3A'_1(x) ) + O(|x|)

where {A_0} is the constant term

\displaystyle  A_0 := \int_{|y| > 10|x|} \frac{\omega(y) \times y}{|y|^3}\ dy,

and {A_1, A'_1} are the linear terms

\displaystyle  A_1 := \int_{|y| > 10|x|} \frac{\omega(y) \times x}{|y|^3}\ dy,

\displaystyle  A'_1 := \int_{|y| > 10|x|} (x \cdot y) \frac{\omega(y) \times y}{|y|^5}\ dy.

By the hypotheses (i), (ii), we have the symmetries

\displaystyle  \omega(y_1,y_2,-y_3) = - \omega(y_1,y_2,y_3) \ \ \ \ \ (6)

and

\displaystyle  \omega(-y_1,-y_2,y_3) = - \omega(y_1,y_2,y_3) \ \ \ \ \ (7)

and hence also

\displaystyle  \omega(-y_1,-y_2,-y_3) = \omega(y_1,y_2,y_3). \ \ \ \ \ (8)

The even symmetry (8) ensures that the integrand in {A_0} is odd, so {A_0} vanishes. The symmetry (6) or (7) similarly ensures that {\int_{|y| > 10|x|} \frac{\omega(y)}{|y|^3}\ dy = 0}, so {A_1} vanishes. Since {\int_{|x| \leq |y| \leq 10|x|} \frac{|x \cdot y| |y|}{|y|^5}\ dy = O( |x| )}, we conclude that

\displaystyle  u(x) = -\frac{3}{4\pi} \int_{|y| \geq |x|} (x \cdot y) \frac{\omega(y) \times y}{|y|^5}\ dy + O(|x|).

Using (4), the right-hand side is

\displaystyle  -\frac{3}{4\pi} \int_{|y| \geq |x|} (x_1 y_1 + x_2 y_2 + x_3 y_3) \frac{\omega_{r3}(r,y_3) (-y_1 y_3, -y_2 y_3, y_1^2+y_2^2)}{r|y|^5}\ dy

\displaystyle + O(|x|)

where {r := \sqrt{y_1^2+y_2^2}}. Because of the odd nature of {\omega_{r3}}, only those terms with one factor of {y_3} give a non-vanishing contribution to the integral. Using the rotation symmetry {(y_1,y_2,y_3) \mapsto (-y_2,y_1,y_3)} we also see that any term with a factor of {y_1 y_2} vanishes. We can thus simplify the above expression as

\displaystyle  -\frac{3}{4\pi} \int_{|y| \geq |x|} \frac{\omega_{r3}(r,y_3) (-x_1 y_1^2 y_3, -x_2 y_2^2 y_3, x_3 (y_1^2+y_2^2) y_3)}{r|y|^5}\ dy + O(|x|).

Using the rotation symmetry {(y_1,y_2,y_3) \mapsto (-y_2,y_1,y_3)} again, we see that the term {y_1^2} in the first component can be replaced by {y_2^2} or by {\frac{1}{2} (y_1^2+y_2^2) = \frac{r^2}{2}}, and similarly for the {y_2^2} term in the second component. Thus the above expression is

\displaystyle  \frac{3}{8\pi} \int_{|y| \geq |x|} \frac{\omega_{r3}(r,y_3) (x_1 , x_2, -2x_3) r y_3}{|y|^5}\ dy + O(|x|)

giving the claim. \Box

Example 2 Consider the divergence-free vector field {u := \nabla \times \psi}, where the vector potential {\psi} takes the form

\displaystyle  \psi(x_1,x_2,x_3) := (x_2 x_3, -x_1 x_3, 0) \eta(|x|)

for some bump function {\eta: {\bf R} \rightarrow {\bf R}} supported in {(0,+\infty)}. We can then calculate

\displaystyle  u(x_1,x_2,x_3) = X(x) \eta(|x|) + (x_1 x_3, x_2 x_3, -x_1^2-x_2^2) \frac{\eta'(|x|) x_3}{|x|}.

and

\displaystyle  \omega(x_1,x_2,x_3) = (-6x_2 x_3, 6x_1 x_3, 0) \frac{\eta'(|x|)}{|x|} + (-x_2 x_3, x_1 x_3, 0) \eta''(|x|).

In particular the hypotheses (i), (ii) are satisfied with

\displaystyle  \omega_{r3}(r,x_3) = - 6 \eta'(|x|) \frac{x_3 r}{|x|} - \eta''(|x|) x_3 r.

One can then calculate

\displaystyle  {\mathcal L}_{12}(\omega)(\rho) = -\frac{3}{4\pi} \int_{|y| \geq \rho} (6\frac{\eta'(|y|)}{|y|^6} + \frac{\eta''(|y|)}{|y|^5}) r^2 y_3^2\ dy

\displaystyle  = -\frac{2}{5} \int_\rho^\infty 6\eta'(s) + s\eta''(s)\ ds

\displaystyle  = 2\eta(\rho) + \frac{2}{5} \rho \eta'(\rho).

If we take the specific choice

\displaystyle  \eta(\rho) = \varphi( \rho^\alpha )

where {\varphi} is a fixed bump function supported in some interval {[c,C] \subset (0,+\infty)} and {\alpha>0} is a small parameter (so that {\eta} is spread out over the range {\rho \in [c^{1/\alpha},C^{1/\alpha}]}), then we see that

\displaystyle  \| \omega \|_{L^\infty} = O( \alpha )

(with implied constants allowed to depend on {\varphi}),

\displaystyle  {\mathcal L}_{12}(\omega)(\rho) = 2\eta(\rho) + O(\alpha),

and

\displaystyle  u = X(x) \eta(|x|) + O( \alpha |x| ),

which is completely consistent with Proposition 1.
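The explicit formulas in Example 2 can also be sanity-checked numerically. The sketch below is purely illustrative and makes some assumptions of its own: it replaces the bump function {\eta} by the hypothetical profile {\eta(s) = e^{-(s-2)^2}} (smooth and rapidly decaying, though not compactly supported in {(0,+\infty)}; the identities being tested are formal and only use decay at infinity), it approximates the curl by central finite differences, and it evaluates the radial integral by Simpson's rule. All function names are made up for the illustration.

```python
import math

def eta(s):
    # hypothetical stand-in for the bump function eta
    return math.exp(-(s - 2.0) ** 2)

def d_eta(s):
    return -2.0 * (s - 2.0) * eta(s)

def d2_eta(s):
    return (4.0 * (s - 2.0) ** 2 - 2.0) * eta(s)

def psi(x):
    """Vector potential (x2 x3, -x1 x3, 0) eta(|x|)."""
    x1, x2, x3 = x
    e = eta(math.sqrt(x1 * x1 + x2 * x2 + x3 * x3))
    return (x2 * x3 * e, -x1 * x3 * e, 0.0)

def curl_fd(field, x, h=1e-5):
    """Curl of `field` at x, via central finite differences."""
    def partial(i, j):  # d(field_i)/d(x_j)
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        return (field(xp)[i] - field(xm)[i]) / (2 * h)
    return (partial(2, 1) - partial(1, 2),
            partial(0, 2) - partial(2, 0),
            partial(1, 0) - partial(0, 1))

def u_formula(x):
    """X(x) eta(|x|) + (x1 x3, x2 x3, -x1^2 - x2^2) eta'(|x|) x3 / |x|."""
    x1, x2, x3 = x
    s = math.sqrt(x1 * x1 + x2 * x2 + x3 * x3)
    c = d_eta(s) * x3 / s
    return (x1 * eta(s) + x1 * x3 * c,
            x2 * eta(s) + x2 * x3 * c,
            -2 * x3 * eta(s) - (x1 * x1 + x2 * x2) * c)

def L12_by_quadrature(rho, upper=30.0, n=20000):
    """Simpson's rule for -(2/5) * int_rho^upper (6 eta'(s) + s eta''(s)) ds."""
    f = lambda s: 6 * d_eta(s) + s * d2_eta(s)
    h = (upper - rho) / n
    total = f(rho) + f(upper)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(rho + k * h)
    return -0.4 * total * h / 3

def L12_closed_form(rho):
    """2 eta(rho) + (2/5) rho eta'(rho)."""
    return 2 * eta(rho) + 0.4 * rho * d_eta(rho)

point = (0.7, -0.4, 1.1)
print("curl of psi (finite differences):", curl_fd(psi, point))
print("formula for u:                   ", u_formula(point))
print("L12 by quadrature / closed form: ", L12_by_quadrature(1.5), L12_closed_form(1.5))
```

The two printed vectors should agree up to the finite-difference error, and the two values of {{\mathcal L}_{12}(\omega)(\rho)} up to the quadrature error.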

One can use this approximation to extract a plausible ansatz for a self-similar blowup to the Euler equations. We let {\alpha>0} be a small parameter and let {\omega_{rx_3}} be a time-dependent vorticity field obeying (i), (ii) of the form

\displaystyle  \omega_{rx_3}(t,r,x_3) \approx \alpha \Omega( t, R ) \mathrm{sgn}(x_3)

where {R := |x|^\alpha = (r^2+x_3^2)^{\alpha/2}} and {\Omega: {\bf R} \times [0,+\infty) \rightarrow {\bf R}} is a smooth field to be chosen later. Admittedly the signum function {\mathrm{sgn}} is not smooth at {x_3=0}, but let us ignore this issue for now (to rigorously make an ansatz one will have to smooth out this function a little bit; Elgindi uses the choice {(|\sin \theta| \cos^2 \theta)^{\alpha/3} \mathrm{sgn}(x_3)}, where {\theta := \mathrm{arctan}(x_3/r)}). With this ansatz one may compute

\displaystyle  {\mathcal L}_{12}(\omega(t))(\rho) \approx \frac{3\alpha}{2\pi} \int_{|y| \geq \rho; y_3 \geq 0} \Omega(t,R) \frac{r y_3}{|y|^5}\ dy

\displaystyle  = \alpha \int_\rho^\infty \Omega(t, s^\alpha) \frac{ds}{s}

\displaystyle  = \int_{\rho^\alpha}^\infty \Omega(t,s) \frac{ds}{s}.

By Proposition 1, we thus expect to have the approximation

\displaystyle  u(t,x) \approx \frac{1}{2} \int_{|x|^\alpha}^\infty \Omega(t,s) \frac{ds}{s} X(x).

We insert this into the vorticity equation (2). The transport term {(u \cdot \nabla) \omega} will be expected to be negligible because {R}, and hence {\omega_{rx_3}}, is slowly varying (the discontinuity of {\mathrm{sgn}(x_3)} will not be encountered because the vector field {X} is parallel to this singularity). The modulating function {\frac{1}{2} \int_{|x|^\alpha}^\infty \Omega(t,s) \frac{ds}{s}} is similarly slowly varying, so derivatives falling on this function should be lower order. Neglecting such terms, we arrive at the approximation

\displaystyle  (\omega \cdot \nabla) u \approx \frac{1}{2} \int_{|x|^\alpha}^\infty \Omega(t,s) \frac{ds}{s} \omega

and so in the limit {\alpha \rightarrow 0} we expect to obtain a simple model equation for the evolution of the vorticity envelope {\Omega}:

\displaystyle  \partial_t \Omega(t,R) = \frac{1}{2} \int_R^\infty \Omega(t,S) \frac{dS}{S} \Omega(t,R).

If we write {L(t,R) := \int_R^\infty \Omega(t,S)\frac{dS}{S}} for the logarithmic primitive of {\Omega}, then we have {\Omega = - R \partial_R L} and hence

\displaystyle  \partial_t (R \partial_R L) = \frac{1}{2} L (R \partial_R L)

which integrates to the Riccati equation

\displaystyle  \partial_t L = \frac{1}{4} L^2

which can be explicitly solved as

\displaystyle  L(t,R) = \frac{2}{f(R) - t/2}

where {f(R)} is any function of {R} that one pleases. (In Elgindi’s work a time dilation is used to remove the unsightly factor of {1/2} appearing here in the denominator.) If for instance we set {f(R) = 1+R}, we obtain the self-similar solution

\displaystyle  L(t,R) = \frac{2}{1+R-t/2}

and then on applying {-R \partial_R}

\displaystyle  \Omega(t,R) = \frac{2R}{(1+R-t/2)^2}.
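As a quick check of the algebra, one can verify numerically that this pair {(L,\Omega)} obeys the Riccati equation, the relation {\Omega = -R \partial_R L}, and the model equation {\partial_t \Omega = \frac{1}{2} L \Omega}. The following is a minimal sketch using central finite differences; the sample points and step size are arbitrary choices made for the illustration.

```python
def L(t, R):
    """Logarithmic primitive L(t,R) = 2 / (1 + R - t/2)."""
    return 2.0 / (1.0 + R - t / 2.0)

def Omega(t, R):
    """Vorticity envelope Omega(t,R) = 2R / (1 + R - t/2)^2."""
    return 2.0 * R / (1.0 + R - t / 2.0) ** 2

def ddt(F, t, R, h=1e-5):
    """Central finite-difference approximation of dF/dt."""
    return (F(t + h, R) - F(t - h, R)) / (2 * h)

def ddR(F, t, R, h=1e-5):
    """Central finite-difference approximation of dF/dR."""
    return (F(t, R + h) - F(t, R - h)) / (2 * h)

# arbitrary sample points (t, R) before the blowup time t = 2
samples = [(0.0, 0.5), (0.7, 1.3), (1.5, 0.2), (1.9, 4.0)]
residuals = []
for t, R in samples:
    riccati = ddt(L, t, R) - 0.25 * L(t, R) ** 2               # d_t L - L^2/4
    primitive = Omega(t, R) + R * ddR(L, t, R)                 # Omega + R d_R L
    model = ddt(Omega, t, R) - 0.5 * L(t, R) * Omega(t, R)     # d_t Omega - L Omega / 2
    residuals.append(max(abs(riccati), abs(primitive), abs(model)))

print("largest residual:", max(residuals))
```

All three residuals should vanish up to the finite-difference error.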

Thus, we expect to be able to construct a self-similar blowup to the Euler equations with a vorticity field approximately behaving like

\displaystyle  \omega(t,x) \approx \alpha \frac{2R}{(1+R-t/2)^2} \mathrm{sgn}(x_3) (\frac{x_2}{r}, -\frac{x_1}{r}, 0)

and velocity field behaving like

\displaystyle  u(t,x) \approx \frac{1}{1+R-t/2} X(x).

In particular, {u} would be expected to be of regularity {C^{1,\alpha}} (and smooth away from the origin), and blows up in (say) {L^\infty} norm at time {t/2 = 1}, and one has the self-similarity

\displaystyle  u(t,x) = (1-t/2)^{\frac{1}{\alpha}-1} u( 0, \frac{x}{(1-t/2)^{1/\alpha}} )

and

\displaystyle  \omega(t,x) = (1-t/2)^{-1} \omega( 0, \frac{x}{(1-t/2)^{1/\alpha}} ).
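These two self-similarity relations are exact for the approximate profiles written above, and this can be confirmed numerically. The sketch below is illustrative only: the value of {\alpha}, the sample points and the helper names are assumptions made for the experiment, and the fields are the approximate ansatz from this post rather than genuine Euler solutions.

```python
import math

ALPHA = 0.25  # illustrative value of the small parameter alpha

def X(x):
    return (x[0], x[1], -2.0 * x[2])

def u(t, x):
    """Approximate velocity X(x) / (1 + R - t/2) with R = |x|^alpha."""
    R = math.sqrt(sum(c * c for c in x)) ** ALPHA
    s = 1.0 / (1.0 + R - t / 2.0)
    return tuple(s * c for c in X(x))

def omega(t, x):
    """Approximate vorticity alpha * 2R/(1+R-t/2)^2 * sgn(x3) * (x2/r, -x1/r, 0)."""
    r = math.hypot(x[0], x[1])
    R = math.sqrt(sum(c * c for c in x)) ** ALPHA
    amp = ALPHA * 2.0 * R / (1.0 + R - t / 2.0) ** 2 * math.copysign(1.0, x[2])
    return (amp * x[1] / r, -amp * x[0] / r, 0.0)

def rescaled(field, power, t, x):
    """(1 - t/2)^power * field(0, x / (1 - t/2)^(1/alpha))."""
    lam = 1.0 - t / 2.0
    y = tuple(c / lam ** (1.0 / ALPHA) for c in x)
    return tuple(lam ** power * c for c in field(0.0, y))

def self_similarity_defect(t, x):
    """Largest componentwise mismatch in the two self-similarity relations."""
    du = max(abs(a - b) for a, b in zip(u(t, x), rescaled(u, 1.0 / ALPHA - 1.0, t, x)))
    dw = max(abs(a - b) for a, b in zip(omega(t, x), rescaled(omega, -1.0, t, x)))
    return max(du, dw)

for t in (0.5, 1.0, 1.6):
    print(t, self_similarity_defect(t, (0.3, -0.2, 0.5)))
```

The printed defects should be at the level of floating-point rounding.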

A self-similar solution of this approximate shape is in fact constructed rigorously in Elgindi’s paper (using spherical coordinates instead of the Cartesian approach adopted here), using a nonlinear stability analysis of the above ansatz. It seems plausible that one could also carry out this stability analysis using this Cartesian coordinate approach, although I have not tried to do this in detail.

Let us call an arithmetic function {f: {\bf N} \rightarrow {\bf C}} {1}-bounded if we have {|f(n)| \leq 1} for all {n \in {\bf N}}. In this section we focus on the asymptotic behaviour of {1}-bounded multiplicative functions. Some key examples of such functions include:

  • The Möbius function {\mu};
  • The Liouville function {\lambda};
  • “Archimedean” characters {n \mapsto n^{it}} (which I call Archimedean because they are pullbacks of a Fourier character {x \mapsto x^{it}} on the multiplicative group {{\bf R}^+}, which has the Archimedean property);
  • Dirichlet characters (or “non-Archimedean” characters) {\chi} (which are essentially pullbacks of Fourier characters on a multiplicative cyclic group {({\bf Z}/q{\bf Z})^\times} with the discrete (non-Archimedean) metric);
  • Hybrid characters {n \mapsto \chi(n) n^{it}}.

The space of {1}-bounded multiplicative functions is also closed under multiplication and complex conjugation.
Given a multiplicative function {f}, we are often interested in the asymptotics of long averages such as

\displaystyle  \frac{1}{X} \sum_{n \leq X} f(n)

for large values of {X}, as well as short sums

\displaystyle  \frac{1}{H} \sum_{x \leq n \leq x+H} f(n)

where {H} and {x} are both large, but {H} is significantly smaller than {x}. (Throughout these notes we will try to normalise most of the sums and integrals appearing here as averages that are trivially bounded by {O(1)}; note that other normalisations are preferred in some of the literature cited here.) For instance, as we established in Theorem 58 of Notes 1, the prime number theorem is equivalent to the assertion that

\displaystyle  \frac{1}{X} \sum_{n \leq X} \mu(n) = o(1) \ \ \ \ \ (1)

as {X \rightarrow \infty}. The Liouville function behaves almost identically to the Möbius function, in that estimates for one function almost always imply analogous estimates for the other:

Exercise 1 Without using the prime number theorem, show that (1) is also equivalent to

\displaystyle  \frac{1}{X} \sum_{n \leq X} \lambda(n) = o(1) \ \ \ \ \ (2)

as {X \rightarrow \infty}. (Hint: use the identities {\lambda(n) = \sum_{d^2|n} \mu(n/d^2)} and {\mu(n) = \sum_{d^2|n} \mu(d) \lambda(n/d^2)}.)
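The two identities in the hint, and the smallness of the averages in (1) and (2), are easy to test numerically for small ranges. The following sketch is only a finite experiment and of course proves nothing about the asymptotics; the function names and the cutoffs are choices made for the illustration.

```python
def mobius_and_liouville(N):
    """Return lists mu, lam with mu[n], lam[n] for 1 <= n <= N (index 0 unused)."""
    spf = list(range(N + 1))  # smallest prime factor of each n
    for p in range(2, int(N ** 0.5) + 1):
        if spf[p] == p:
            for m in range(p * p, N + 1, p):
                if spf[m] == m:
                    spf[m] = p
    mu = [0] * (N + 1)
    lam = [0] * (N + 1)
    mu[1] = lam[1] = 1
    for n in range(2, N + 1):
        p = spf[n]
        m = n // p
        lam[n] = -lam[m]                      # lambda is completely multiplicative
        mu[n] = 0 if m % p == 0 else -mu[m]   # mu vanishes when p^2 divides n
    return mu, lam

def check_identities(N):
    """Check both convolution identities from the hint for all n <= N."""
    mu, lam = mobius_and_liouville(N)
    for n in range(1, N + 1):
        s1 = s2 = 0
        d = 1
        while d * d <= n:
            if n % (d * d) == 0:
                s1 += mu[n // (d * d)]            # sum of mu(n/d^2)
                s2 += mu[d] * lam[n // (d * d)]   # sum of mu(d) lambda(n/d^2)
            d += 1
        if lam[n] != s1 or mu[n] != s2:
            return False
    return True

N = 100_000
mu, lam = mobius_and_liouville(N)
mean_mu = sum(mu[1:]) / N
mean_lam = sum(lam[1:]) / N
print("identities hold up to 2000:", check_identities(2000))
print("mean of mu, mean of lambda up to", N, ":", mean_mu, mean_lam)
```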

Henceforth we shall focus our discussion more on the Liouville function, and turn our attention to averages on shorter intervals. From (2) one has

\displaystyle  \frac{1}{H} \sum_{x \leq n \leq x+H} \lambda(n) = o(1) \ \ \ \ \ (3)

as {x \rightarrow \infty} if {H = H(x)} is such that {H \geq \varepsilon x} for some fixed {\varepsilon>0}. However it is significantly more difficult to understand what happens when {H} grows much slower than this. By using the techniques based on zero density estimates discussed in Notes 6, it was shown by Motohashi and that one can also establish (3). On the Riemann Hypothesis Maier and Montgomery lowered the threshold to {H \geq x^{1/2} \log^C x} for an absolute constant {C} (the bound {H \geq x^{1/2+\varepsilon}} is more classical, following from Exercise 33 of Notes 2). On the other hand, the randomness heuristics from Supplement 4 suggest that {H} should be able to be taken as small as {x^\varepsilon}, and perhaps even {\log^{1+\varepsilon} x} if one is particularly optimistic about the accuracy of these probabilistic models. In the opposite direction, the Chowla conjecture (mentioned for instance in Supplement 4) predicts that {H} cannot be taken arbitrarily slowly growing in {x}, due to the conjectured existence of arbitrarily long strings of consecutive numbers where the Liouville function does not change sign (and in fact one can already show from the known partial results towards the Chowla conjecture that (3) fails for some sequence {x \rightarrow \infty} and some sufficiently slowly growing {H = H(x)}, by modifying the arguments in these papers of mine).
The situation is better when one asks to understand the mean value on almost all short intervals, rather than all intervals. There are several equivalent ways to formulate this question:

Exercise 2 Let {H = H(X)} be a function of {X} such that {H \rightarrow \infty} and {H = o(X)} as {X \rightarrow \infty}. Let {f: {\bf N} \rightarrow {\bf C}} be a {1}-bounded function. Show that the following assertions are equivalent:

  • (i) One has

    \displaystyle  \frac{1}{H} \sum_{x \leq n \leq x+H} f(n) = o(1)

    as {X \rightarrow \infty}, uniformly for all {x \in [X,2X]} outside of a set of measure {o(X)}.

  • (ii) One has

    \displaystyle  \frac{1}{X} \int_X^{2X} |\frac{1}{H} \sum_{x \leq n \leq x+H} f(n)|\ dx = o(1)

    as {X \rightarrow \infty}.

  • (iii) One has

    \displaystyle  \frac{1}{X} \int_X^{2X} |\frac{1}{H} \sum_{x \leq n \leq x+H} f(n)|^2\ dx = o(1) \ \ \ \ \ (4)

    as {X \rightarrow \infty}.

As it turns out the second moment formulation in (iii) will be the most convenient for us to work with in this set of notes, as it is well suited to Fourier-analytic techniques (and in particular the Plancherel theorem).
Using zero density methods, for instance, it was shown by Ramachandra that

\displaystyle  \frac{1}{X} \int_X^{2X} |\frac{1}{H} \sum_{x \leq n \leq x+H} \lambda(n)|^2\ dx \ll_{A,\varepsilon} \log^{-A} X

whenever {X^{1/6+\varepsilon} \leq H \leq X} and {\varepsilon>0}. With this quality of bound (saving arbitrary powers of {\log X} over the trivial bound of {O(1)}), this is still the lowest value of {H} one can reach unconditionally. However, in a striking recent breakthrough, it was shown by Matomaki and Radziwill that as long as one is willing to settle for weaker bounds (saving a small power of {\log X} or {\log H}, or just a qualitative decay of {o(1)}), one can obtain non-trivial estimates on far shorter intervals. For instance, they show

Theorem 3 (Matomaki-Radziwill theorem for Liouville) For any {2 \leq H \leq X}, one has

\displaystyle  \frac{1}{X} \int_X^{2X} |\frac{1}{H} \sum_{x \leq n \leq x+H} \lambda(n)|^2\ dx \ll \log^{-c} H

for some absolute constant {c>0}.

In fact they prove a slightly more precise result: see Theorem 1 of that paper. In particular, they obtain the asymptotic (4) for any function {H = H(X)} that goes to infinity as {X \rightarrow \infty}, no matter how slowly! This ability to let {H} grow slowly with {X} is important for several applications; for instance, in order to combine this type of result with the entropy decrement methods from Notes 9, it is essential that {H} be allowed to grow more slowly than {\log X}. See also this survey of Soundararajan for further discussion.
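To get a feeling for the second moment appearing in Theorem 3, one can compute a discretised version of it for modest values of {X} and {H}. The sketch below is only a numerical experiment under stated assumptions: the integral over {x \in [X,2X]} is replaced by a sum over integers {x}, the choices {X = 20000} and {H = 50} are arbitrary, and the function names are invented for the illustration. For comparison it also evaluates the same quantity for the constant function {1}, where the trivial bound is attained.

```python
def liouville(N):
    """Return a list lam with lam[n] = lambda(n) for 1 <= n <= N (index 0 unused)."""
    spf = list(range(N + 1))  # smallest prime factor sieve
    for p in range(2, int(N ** 0.5) + 1):
        if spf[p] == p:
            for m in range(p * p, N + 1, p):
                if spf[m] == m:
                    spf[m] = p
    lam = [0] * (N + 1)
    lam[1] = 1
    for n in range(2, N + 1):
        lam[n] = -lam[n // spf[n]]  # complete multiplicativity
    return lam

def short_sum_second_moment(f, X, H):
    """Discretised (1/X) sum_{x=X}^{2X-1} |(1/H) sum_{x <= n <= x+H} f(n)|^2.

    `f` is a list with f[n] defined for 1 <= n <= 2X + H.
    """
    top = 2 * X + H
    prefix = [0] * (top + 1)
    for n in range(1, top + 1):
        prefix[n] = prefix[n - 1] + f[n]
    total = 0.0
    for x in range(X, 2 * X):
        s = prefix[x + H] - prefix[x - 1]
        total += abs(s / H) ** 2
    return total / X

X, H = 20000, 50
lam = liouville(2 * X + H)
ones = [1] * (2 * X + H + 1)
moment_lambda = short_sum_second_moment(lam, X, H)
moment_one = short_sum_second_moment(ones, X, H)
print("second moment for lambda:", moment_lambda)
print("second moment for 1:     ", moment_one)
```

If {\lambda} behaved like a sequence of random signs one would expect the first number to be of size about {1/H}; the experiment can be repeated with other values of {H} to see how the second moment decays.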

Exercise 4 In this exercise you may use Theorem 3 freely.

  • (i) Establish the lower bound

    \displaystyle  \frac{1}{X} \sum_{n \leq X} \lambda(n)\lambda(n+1) > -1+c

    for some absolute constant {c>0} and all sufficiently large {X}. (Hint: if this bound failed, then {\lambda(n)=\lambda(n+1)} would hold for almost all {n}; use this to create many intervals {[x,x+H]} for which {\frac{1}{H} \sum_{x \leq n \leq x+H} \lambda(n)} is extremely large.)

  • (ii) Show that Theorem 3 also holds with {\lambda(n)} replaced by {\chi_2 \lambda(n)}, where {\chi_2} is the principal character of period {2}. (Use the fact that {\lambda(2n)=-\lambda(n)} for all {n}.) Use this to establish the corresponding upper bound

    \displaystyle  \frac{1}{X} \sum_{n \leq X} \lambda(n)\lambda(n+1) < 1-c

    to (i).

(There is a curious asymmetry to the difficulty level of these bounds; the upper bound in (ii) was established much earlier by Harman, Pintz, and Wolke, but the lower bound in (i) was only established in the Matomaki-Radziwill paper.)

The techniques discussed previously were highly complex-analytic in nature, relying in particular on the fact that functions such as {\mu} or {\lambda} have Dirichlet series {{\mathcal D} \mu(s) = \frac{1}{\zeta(s)}}, {{\mathcal D} \lambda(s) = \frac{\zeta(2s)}{\zeta(s)}} that extend meromorphically into the critical strip. In contrast, the Matomaki-Radziwill theorem does not rely on such meromorphic continuations, and in fact holds for more general classes of {1}-bounded multiplicative functions {f}, for which one typically does not expect any meromorphic continuation into the strip. Instead, one can view the Matomaki-Radziwill theory as following the philosophy of a slightly different approach to multiplicative number theory, namely the pretentious multiplicative number theory of Granville and Soundararajan (as presented for instance in their draft monograph). A basic notion here is the pretentious distance between two {1}-bounded multiplicative functions {f,g} (at a given scale {X}), which informally measures the extent to which {f} “pretends” to be like {g} (or vice versa). The precise definition is

Definition 5 (Pretentious distance) Given two {1}-bounded multiplicative functions {f,g}, and a threshold {X>0}, the pretentious distance {\mathbb{D}(f,g;X)} between {f} and {g} up to scale {X} is given by the formula

\displaystyle  \mathbb{D}(f,g;X) := \left( \sum_{p \leq X} \frac{1 - \mathrm{Re}(f(p) \overline{g(p)})}{p} \right)^{1/2}

Note that one can also define an infinite version {\mathbb{D}(f,g;\infty)} of this distance by removing the constraint {p \leq X}, though in such cases the pretentious distance may then be infinite. The pretentious distance is not quite a metric (because {\mathbb{D}(f,f;X)} can be non-zero, and furthermore {\mathbb{D}(f,g;X)} can vanish without {f,g} being equal), but it is still quite close to behaving like a metric, in particular it obeys the triangle inequality; see Exercise 16 below. The philosophy of pretentious multiplicative number theory is that two {1}-bounded multiplicative functions {f,g} will exhibit similar behaviour at scale {X} if their pretentious distance {\mathbb{D}(f,g;X)} is bounded, but will become uncorrelated from each other if this distance becomes large. A simple example of this philosophy is given by the following “weak Halasz theorem”, proven in Section 2:
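Since the pretentious distance only involves the values of {f,g} at primes, it is straightforward to compute for small {X}. The following is a minimal sketch, in which functions are represented (as an assumption of this illustration) by Python callables giving their values at primes, and the example functions and the scale {X = 10^4} are arbitrary. By Mertens' theorem one expects {\mathbb{D}(\mu,1;X)^2 = \sum_{p \leq X} \frac{2}{p}} to be close to {2 \log\log X + 0.523}.

```python
import cmath
import math

def primes_up_to(X):
    """List of primes p <= X, by the sieve of Eratosthenes."""
    is_prime = [True] * (X + 1)
    is_prime[0:2] = [False, False]
    for p in range(2, int(X ** 0.5) + 1):
        if is_prime[p]:
            for m in range(p * p, X + 1, p):
                is_prime[m] = False
    return [p for p in range(2, X + 1) if is_prime[p]]

def pretentious_distance(f, g, X):
    """D(f,g;X) = ( sum_{p <= X} (1 - Re f(p) conj(g(p))) / p )^(1/2)."""
    total = 0.0
    for p in primes_up_to(X):
        total += (1.0 - (complex(f(p)) * complex(g(p)).conjugate()).real) / p
    return math.sqrt(max(total, 0.0))

# values at primes of some 1-bounded multiplicative functions
one = lambda p: 1                 # the constant function 1
mobius = lambda p: -1             # mu(p) = lambda(p) = -1
half = lambda p: 0.5              # a function with |f(p)| < 1

def archimedean(t):
    """Values p^{it} of the Archimedean character n -> n^{it}."""
    return lambda p: cmath.exp(1j * t * math.log(p))

X = 10**4
print("D(1,1)      =", pretentious_distance(one, one, X))
print("D(mu,1)^2   =", pretentious_distance(mobius, one, X) ** 2)
print("D(f,f), f(p)=1/2:", pretentious_distance(half, half, X))
print("D(1,n^i)    =", pretentious_distance(one, archimedean(1.0), X))
```

The third line illustrates that {\mathbb{D}(f,f;X)} can be non-zero when {|f(p)| < 1}.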

Proposition 6 (Logarithmically averaged version of Halasz) Let {X} be sufficiently large. Then for any {1}-bounded multiplicative functions {f,g}, one has

\displaystyle  \frac{1}{\log X} \sum_{n \leq X} \frac{f(n) \overline{g(n)}}{n} \ll \exp( - c \mathbb{D}(f, g;X)^2 )

for an absolute constant {c>0}.

In particular, if {f} does not pretend to be {1}, then the logarithmic average {\frac{1}{\log X} \sum_{n \leq X} \frac{f(n)}{n}} will be small. This condition is basically necessary, since of course {\frac{1}{\log X} \sum_{n \leq X} \frac{1}{n} = 1 + o(1)}.
If one works with non-logarithmic averages {\frac{1}{X} \sum_{n \leq X} f(n)}, then not pretending to be {1} is insufficient to establish decay, as was already observed in Exercise 11 of Notes 1: if {f} is an Archimedean character {f(n) = n^{it}} for some non-zero real {t}, then {\frac{1}{\log X} \sum_{n \leq X} \frac{f(n)}{n}} goes to zero as {X \rightarrow \infty} (which is consistent with Proposition 6), but {\frac{1}{X} \sum_{n \leq X} f(n)} does not go to zero. However, this is in some sense the “only” obstruction to these averages decaying to zero, as quantified by the following basic result:

Theorem 7 (Halasz’s theorem) Let {X} be sufficiently large. Then for any {1}-bounded multiplicative function {f}, one has

\displaystyle  \frac{1}{X} \sum_{n \leq X} f(n) \ll \exp( - c \min_{|t| \leq T} \mathbb{D}(f, n \mapsto n^{it};X)^2 ) + \frac{1}{T}

for an absolute constant {c>0} and any {T > 0}.
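The different behaviour of the two types of averages for an Archimedean character, which is the reason for the minimum over {t} in Theorem 7, is easy to observe numerically. The sketch below (with the arbitrary choice {t=1} and a few arbitrary values of {X}) compares the plain average with the logarithmic average; by comparison with {\int_1^X x^{it}\ dx = \frac{X^{1+it}-1}{1+it}}, the plain average should have magnitude close to {\frac{1}{|1+it|}}, while the logarithmic average should decay like {\frac{1}{\log X}}.

```python
import cmath
import math

def plain_average(t, X):
    """(1/X) * sum_{n <= X} n^{it}."""
    return sum(cmath.exp(1j * t * math.log(n)) for n in range(1, X + 1)) / X

def log_average(t, X):
    """(1/log X) * sum_{n <= X} n^{it} / n."""
    return sum(cmath.exp(1j * t * math.log(n)) / n for n in range(1, X + 1)) / math.log(X)

t = 1.0
for X in (10**3, 10**4, 10**5):
    print(X, abs(plain_average(t, X)), abs(log_average(t, X)))
```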

Informally, we refer to a {1}-bounded multiplicative function as “pretentious” if it pretends to be a character such as {n^{it}}, and “non-pretentious” otherwise. The precise distinction is rather malleable, as the precise class of characters that one views as “obstructions” varies from situation to situation. For instance, in Proposition 6 it is just the trivial character {1} which needs to be considered, but in Theorem 7 it is the characters {n \mapsto n^{it}} with {|t| \leq T}. In other contexts one may also need to add Dirichlet characters {\chi(n)} or hybrid characters such as {\chi(n) n^{it}} to the list of characters that one might pretend to be. The division into pretentious and non-pretentious functions in multiplicative number theory is faintly analogous to the division into major and minor arcs in the circle method applied to additive number theory problems; see Notes 8. The Möbius and Liouville functions are model examples of non-pretentious functions; see Exercise 24.

In the contrapositive, Halasz’s theorem can be formulated as the assertion that if one has a large mean

\displaystyle  |\frac{1}{X} \sum_{n \leq X} f(n)| \geq \eta

for some {\eta > 0}, then one has the pretentious property

\displaystyle  \mathbb{D}( f, n \mapsto n^{it}; X ) \ll \sqrt{\log(1/\eta)}

for some {t \ll \eta^{-1}}. This has the flavour of an “inverse theorem”, of the type often found in arithmetic combinatorics.
Among other things, Halasz’s theorem gives yet another proof of the prime number theorem (1); see Section 2.
We now give a version of the Matomaki-Radziwill theorem for general (non-pretentious) multiplicative functions that is formulated in a similar contrapositive (or “inverse theorem”) fashion, though to simplify the presentation we only state a qualitative version that does not give explicit bounds.

Theorem 8 ((Qualitative) Matomaki-Radziwill theorem) Let {\eta>0}, and let {1 \leq H \leq X}, with {H} sufficiently large depending on {\eta}. Suppose that {f} is a {1}-bounded multiplicative function such that

\displaystyle  \frac{1}{X} \int_X^{2X} |\frac{1}{H} \sum_{x \leq n \leq x+H} f(n)|^2\ dx \geq \eta^2.

Then one has

\displaystyle  \mathbb{D}(f, n \mapsto n^{it};X) \ll_\eta 1

for some {t \ll_\eta \frac{X}{H}}.

The condition {t \ll_\eta \frac{X}{H}} is basically optimal, as the following example shows:

Exercise 9 Let {\varepsilon>0} be a sufficiently small constant, and let {1 \leq H \leq X} be such that {\frac{1}{\varepsilon} \leq H \leq \varepsilon X}. Let {f} be the Archimedean character {f(n) = n^{it}} for some {|t| \leq \varepsilon \frac{X}{H}}. Show that

\displaystyle  \frac{1}{X} \int_X^{2X} |\frac{1}{H} \sum_{x \leq n \leq x+H} f(n)|^2\ dx \asymp 1.
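The phenomenon in Exercise 9 can be observed numerically: when {|t|} is small compared to {X/H}, the phase of {n^{it}} barely changes across an interval of length {H}, so the short averages have magnitude close to {1}. The following sketch uses a discretised version of the integral (a sum over integer {x}) and the arbitrary illustrative parameters {X=5000}, {H=50}, {\varepsilon = 0.01}; none of these choices come from the exercise itself.

```python
import cmath
import math

def second_moment(t, X, H):
    """Discretised (1/X) sum_{x=X}^{2X-1} |(1/H) sum_{x <= n <= x+H} n^{it}|^2."""
    top = 2 * X + H
    prefix = [0j] * (top + 1)
    for n in range(1, top + 1):
        prefix[n] = prefix[n - 1] + cmath.exp(1j * t * math.log(n))
    total = 0.0
    for x in range(X, 2 * X):
        total += abs((prefix[x + H] - prefix[x - 1]) / H) ** 2
    return total / X

X, H = 5000, 50
eps = 0.01
t = eps * X / H  # the largest |t| allowed for this (illustrative) epsilon
moment = second_moment(t, X, H)
print("t =", t, " second moment =", moment)
```

Note that each short sum has {H+1} terms but is normalised by {H}, so values slightly above {1} are to be expected.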

Combining Theorem 8 with standard non-pretentiousness facts about the Liouville function (see Exercise 24), we recover Theorem 3 (but with a decay rate of only {o(1)} rather than {\log^{-c} H}). We refer the reader to the original paper of Matomaki-Radziwill (as well as this followup paper with myself) for the quantitative version of Theorem 8 that is strong enough to recover the full version of Theorem 3, and which can also handle real-valued pretentious functions.
With our current state of knowledge, the only arguments that can establish the full strength of the Halasz and Matomaki-Radziwill theorems are Fourier analytic in nature, relating sums involving an arithmetic function {f} with its Dirichlet series

\displaystyle  {\mathcal D} f(s) := \sum_{n=1}^\infty \frac{f(n)}{n^s}

which one can view as a discrete Fourier transform of {f} (or more precisely of the measure {\sum_{n=1}^\infty \frac{f(n)}{n} \delta_{\log n}}, if one evaluates the Dirichlet series on the right edge {\{ 1+it: t \in {\bf R} \}} of the critical strip). In this aspect, the techniques resemble the complex-analytic methods from Notes 2, but with the key difference that no analytic or meromorphic continuation into the strip is assumed. The key identity that allows us to pass to Dirichlet series is the following variant of Proposition 7 of Notes 2:

Proposition 10 (Parseval type identity) Let {f,g: {\bf N} \rightarrow {\bf C}} be finitely supported arithmetic functions, and let {\psi: {\bf R} \rightarrow {\bf R}} be a Schwartz function. Then

\displaystyle  \sum_{n=1}^\infty \sum_{m=1}^\infty \frac{f(n)}{n} \frac{\overline{g(m)}}{m} \psi(\log n - \log m) = \frac{1}{2\pi} \int_{\bf R} {\mathcal D} f(1+it) \overline{{\mathcal D} g(1+it)} \hat \psi(t)\ dt

where {\hat \psi(t) := \int_{\bf R} \psi(u) e^{itu}\ du} is the Fourier transform of {\psi}. (Note that the finite support of {f,g} and the Schwartz nature of {\psi,\hat \psi} ensure that both sides of the identity are absolutely convergent.)

The restriction that {f,g} be finitely supported will be slightly annoying in places, since most multiplicative functions will fail to be finitely supported, but this technicality can usually be overcome by suitably truncating the multiplicative function, and taking limits if necessary.
Proof: By expanding out the Dirichlet series, it suffices to show that

\displaystyle  \psi(\log n - \log m) = \frac{1}{2\pi} \int_{\bf R} \frac{1}{n^{it}} \frac{1}{m^{-it}} \hat \psi(t)\ dt

for any natural numbers {n,m}. But this follows from the Fourier inversion formula {\psi(u) = \frac{1}{2\pi} \int_{\bf R} e^{-itu} \hat \psi(t)\ dt} applied at {u = \log n - \log m}. \Box
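Proposition 10 can be tested numerically for specific choices of {f,g,\psi}. In the sketch below, all the concrete choices are assumptions made for the illustration: {\psi} is taken to be the Gaussian {\psi(u) = e^{-u^2/2}}, whose Fourier transform (with the normalisation above) is {\hat \psi(t) = \sqrt{2\pi} e^{-t^2/2}}; the finitely supported functions {f,g} are arbitrary and stored as dictionaries; and the integral over {t} is truncated to {[-12,12]} and approximated by the trapezoid rule.

```python
import cmath
import math

def dirichlet_series(f, s):
    """D f(s) = sum_n f(n) / n^s for f given as a dict {n: f(n)} of finite support."""
    return sum(v * cmath.exp(-s * math.log(n)) for n, v in f.items())

def psi(u):
    return math.exp(-u * u / 2)

def psi_hat(t):
    # Fourier transform of the Gaussian psi, with the convention int psi(u) e^{itu} du
    return math.sqrt(2 * math.pi) * math.exp(-t * t / 2)

def parseval_lhs(f, g):
    """sum_n sum_m f(n)/n * conj(g(m))/m * psi(log n - log m)."""
    return sum(fv / n * complex(gv).conjugate() / m * psi(math.log(n) - math.log(m))
               for n, fv in f.items() for m, gv in g.items())

def parseval_rhs(f, g, T=12.0, steps=4800):
    """(1/2pi) int_{-T}^{T} Df(1+it) conj(Dg(1+it)) psi_hat(t) dt by the trapezoid rule."""
    h = 2 * T / steps
    total = 0j
    for k in range(steps + 1):
        t = -T + k * h
        w = 0.5 if k in (0, steps) else 1.0  # trapezoid end-point weights
        s = complex(1.0, t)
        total += w * dirichlet_series(f, s) * dirichlet_series(g, s).conjugate() * psi_hat(t)
    return total * h / (2 * math.pi)

# arbitrary finitely supported test functions
f = {1: 1 + 0j, 2: -0.5j, 5: 2.0, 7: -1 + 1j}
g = {1: 0.3, 3: 1j, 4: -2.0, 7: 0.5 - 0.5j}
print("left-hand side: ", parseval_lhs(f, g))
print("right-hand side:", parseval_rhs(f, g))
```

Because the integrand is smooth and has Gaussian decay, the trapezoid rule is very accurate here, and the two sides should agree to many decimal places.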
For applications to Halasz type theorems, one sets {g(n)} equal to the Kronecker delta {\delta_{n=1}}, producing weighted integrals of {{\mathcal D} f(1+it)} of “{L^1}” type. For applications to Matomaki-Radziwill theorems, one instead sets {f=g}, and more precisely uses the following corollary of the above proposition, to obtain weighted integrals of {|{\mathcal D} f(1+it)|^2} of “{L^2}” type:

Exercise 11 (Plancherel type identity) If {f: {\bf N} \rightarrow {\bf C}} is finitely supported, and {\varphi: {\bf R} \rightarrow {\bf R}} is a Schwartz function, establish the identity

\displaystyle  \int_0^\infty |\sum_{n=1}^\infty \frac{f(n)}{n} \varphi(\log n - \log y)|^2 \frac{dy}{y} = \frac{1}{2\pi} \int_{\bf R} |{\mathcal D} f(1+it)|^2 |\hat \varphi(t)|^2\ dt.

In contrast, information about the non-pretentious nature of a multiplicative function {f} will give “pointwise” or “{L^\infty}” type control on the Dirichlet series {{\mathcal D} f(1+it)}, as is suggested from the Euler product factorisation of {{\mathcal D} f}.
It will be convenient to formalise the notion of {L^1}, {L^2}, and {L^\infty} control of the Dirichlet series {{\mathcal D} f}, which as previously mentioned can be viewed as a sort of “Fourier transform” of {f}:

Definition 12 (Fourier norms) Let {f: {\bf N} \rightarrow {\bf C}} be finitely supported, and let {\Omega \subset {\bf R}} be a bounded measurable set. We define the Fourier {L^\infty} norm

\displaystyle  \| f\|_{FL^\infty(\Omega)} := \sup_{t \in \Omega} |{\mathcal D} f(1+it)|,

the Fourier {L^2} norm

\displaystyle  \| f\|_{FL^2(\Omega)} := \left(\int_\Omega |{\mathcal D} f(1+it)|^2\ dt\right)^{1/2},

and the Fourier {L^1} norm

\displaystyle  \| f\|_{FL^1(\Omega)} := \int_\Omega |{\mathcal D} f(1+it)|\ dt.

One could more generally define {FL^p} norms for other exponents {p}, but we will only need the exponents {p=1,2,\infty} in this current set of notes. It is clear that all the above norms are in fact (semi-)norms on the space of finitely supported arithmetic functions.
As mentioned above, Halasz’s theorem gives good control on the Fourier {L^\infty} norm for restrictions of non-pretentious functions to intervals:

Exercise 13 (Fourier {L^\infty} control via Halasz) Let {f: {\bf N} \rightarrow {\bf C}} be a {1}-bounded multiplicative function, let {I} be an interval in {[C^{-1} X, CX]} for some {X \geq C \geq 1}, let {R \geq 1}, and let {\Omega \subset {\bf R}} be a bounded measurable set. Show that

\displaystyle  \| f 1_I \|_{FL^\infty(\Omega)} \ll_C \exp( - c \min_{t: \mathrm{dist}(t,\Omega) \leq R} \mathbb{D}(f, n \mapsto n^{it};X)^2 ) + \frac{1}{R}.

(Hint: you will need to use summation by parts (or an equivalent device) to deal with a {\frac{1}{n}} weight.)

Meanwhile, the Plancherel identity in Exercise 11 gives good control on the Fourier {L^2} norm for functions on long intervals (compare with Exercise 2 from Notes 6):

Exercise 14 ({L^2} mean value theorem) Let {T \geq 1}, and let {f: {\bf N} \rightarrow {\bf C}} be finitely supported. Show that

\displaystyle  \| f \|_{FL^2([-T,T])}^2 \ll \sum_n \frac{1}{n} (\frac{T}{n} \sum_{m: |n-m| \leq n/T} |f(m)|)^2.

Conclude in particular that if {f} is supported in {[C^{-1} N, C N]} for some {C \geq 1} and {N \gg T}, then

\displaystyle  \| f \|_{FL^2([-T,T])}^2 \ll C^{O(1)} \frac{1}{N} \sum_n |f(n)|^2.

In the simplest case of the logarithmically averaged Halasz theorem (Proposition 6), Fourier {L^\infty} estimates are already sufficient to obtain decent control on the (weighted) Fourier {L^1} type expressions that show up. However, these estimates are not enough by themselves to establish the full Halasz theorem or the Matomaki-Radziwill theorem. To get from Fourier {L^\infty} control to Fourier {L^1} or {L^2} control more efficiently, the key trick is to use Hölder’s inequality, which when combined with the basic Dirichlet series identity

\displaystyle  {\mathcal D}(f*g) = ({\mathcal D} f) ({\mathcal D} g)

gives the inequalities

\displaystyle  \| f*g \|_{FL^1(\Omega)} \leq \|f\|_{FL^2(\Omega)} \|g\|_{FL^2(\Omega)} \ \ \ \ \ (5)

and

\displaystyle  \| f*g \|_{FL^2(\Omega)} \leq \|f\|_{FL^2(\Omega)} \|g\|_{FL^\infty(\Omega)} \ \ \ \ \ (6)

The strategy is then to factor (or approximately factor) the original function {f} as a Dirichlet convolution (or average of convolutions) of various components, each of which enjoys reasonably good Fourier {L^2} or {L^\infty} estimates on various regions {\Omega}, and then combine them using the Hölder inequalities (5), (6) and the triangle inequality. For instance, to prove Halasz’s theorem, we will split {f} into the Dirichlet convolution of three factors, one of which will be estimated in {FL^\infty} using the non-pretentiousness hypothesis, and the other two being estimated in {FL^2} using Exercise 14. For the Matomaki-Radziwill theorem, one uses a significantly more complicated decomposition of {f} into a variety of Dirichlet convolutions of factors, and also splits up the Fourier domain {[-T,T]} into several subregions depending on whether the Dirichlet series associated to some of these components are large or small. In each region and for each component of these decompositions, all but one of the factors will be estimated in {FL^\infty}, and the other in {FL^2}; but the precise way in which this is done will vary from component to component. For instance, in some regions a key factor will be small in {FL^\infty} by construction of the region; in other places, the {FL^\infty} control will come from Exercise 13. Similarly, in some regions, satisfactory {FL^2} control is provided by Exercise 14, but in other regions one must instead use “large value” theorems (in the spirit of Proposition 9 from Notes 6), or amplify the power of the standard {L^2} mean value theorems by combining the Dirichlet series with other Dirichlet series that are known to be large in this region.
There are several ways to achieve the desired factorisation. In the case of Halasz’s theorem, we can simply work with a crude version of the Euler product factorisation, dividing the primes into three categories (“small”, “medium”, and “large” primes) and expressing {f} as a triple Dirichlet convolution accordingly. For the Matomaki-Radziwill theorem, one instead exploits the Turan-Kubilius phenomenon (Section 5 of Notes 1, or Lemma 2 of Notes 9) that for various moderately wide ranges {[P,Q]} of primes, the number of prime divisors of a large number {n} in the range {[P,Q]} is almost always close to {\log\log Q - \log\log P}. Thus, if we introduce the arithmetic functions

\displaystyle  w_{[P,Q]}(n) = \frac{1}{\log\log Q - \log\log P} \sum_{P \leq p \leq Q} 1_{n=p} \ \ \ \ \ (7)

then we have

\displaystyle  1 \approx 1 * w_{[P,Q]}

and more generally we have a twisted approximation

\displaystyle  f \approx f * fw_{[P,Q]}

for multiplicative functions {f}. (Actually, for technical reasons it will be convenient to work with a smoothed out version of these functions; see Section 3.) Informally, these formulas suggest that the “{FL^2} energy” of a multiplicative function {f} is concentrated in those regions where {f w_{[P,Q]}} is extremely large in a {FL^\infty} sense. Iterations of this formula (or variants of this formula, such as an identity due to Ramaré) will then give the desired (approximate) factorisation of {{\mathcal D} f}.
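The approximation {1 \approx 1 * w_{[P,Q]}} can be explored numerically. Unpacking (7), the value {(1 * w_{[P,Q]})(n)} is the number of primes in {[P,Q]} dividing {n}, divided by {\log\log Q - \log\log P}. The Python sketch below computes this for {N \leq n < 2N}, using the unsmoothed weight (7) rather than the smoothed version used later; the function names and the sample parameters {P=100}, {Q=10^4}, {N=10^5} are arbitrary illustrative choices. One should expect the average to be close to {1}, but the individual values to fluctuate noticeably at such small scales, since {\log\log Q - \log\log P} grows extremely slowly.

```python
import numpy as np

def primes_in_range(P, Q):
    """Primes p with P <= p <= Q, via a simple sieve of Eratosthenes."""
    sieve = np.ones(Q + 1, dtype=bool)
    sieve[:2] = False
    for p in range(2, int(Q ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = False
    return [int(p) for p in np.flatnonzero(sieve) if p >= P]

def one_star_w(P, Q, N):
    """Values of (1 * w_[P,Q])(n) for N <= n < 2N, with w_[P,Q] as in (7)."""
    counts = np.zeros(N, dtype=float)  # counts[i] corresponds to n = N + i
    for p in primes_in_range(P, Q):
        first = (-N) % p  # smallest i >= 0 such that p divides N + i
        counts[first::p] += 1
    return counts / (np.log(np.log(Q)) - np.log(np.log(P)))

if __name__ == "__main__":
    values = one_star_w(100, 10_000, 100_000)
    print("mean of 1 * w:", values.mean())
    print("fraction of n with no prime factor in [P,Q]:", np.mean(values == 0))
```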

Just a short post to announce that nominations are now open for the Maryam Mirzakhani New Frontiers Prize, a newly announced annual $50,000 award from the Breakthrough Prize Foundation that is presented to early-career women mathematicians who have completed their PhDs within the past two years, and that recognizes outstanding research achievement. (I will be serving on the prize committee.) Nominations for this (and other breakthrough prizes) can be made at this page.

Peter Denton, Stephen Parke, Xining Zhang, and I have just uploaded to the arXiv a completely rewritten version of our previous paper, now titled “Eigenvectors from Eigenvalues: a survey of a basic identity in linear algebra”. This paper is now a survey of the various literature surrounding the following basic identity in linear algebra, which we propose to call the eigenvector-eigenvalue identity:

Theorem 1 (Eigenvector-eigenvalue identity) Let {A} be an {n \times n} Hermitian matrix, with eigenvalues {\lambda_1(A),\dots,\lambda_n(A)}. Let {v_i} be a unit eigenvector corresponding to the eigenvalue {\lambda_i(A)}, and let {v_{i,j}} be the {j^{th}} component of {v_i}. Then

\displaystyle |v_{i,j}|^2 \prod_{k=1; k \neq i}^n (\lambda_i(A) - \lambda_k(A)) = \prod_{k=1}^{n-1} (\lambda_i(A) - \lambda_k(M_j))

where {M_j} is the {n-1 \times n-1} Hermitian matrix formed by deleting the {j^{th}} row and column from {A}.
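The identity is easy to test numerically. Here is a minimal NumPy sketch (the function name and the random test matrix are illustrative assumptions, and indices are 0-based rather than 1-based as in Theorem 1) that computes both sides for a given Hermitian matrix {A} and indices {i,j}.

```python
import numpy as np

def identity_sides(A, i, j):
    """Return (lhs, rhs) of the eigenvector-eigenvalue identity for Hermitian A.

    Indices i, j are 0-based; eigenvalues are taken in ascending order.
    """
    eigenvalues, eigenvectors = np.linalg.eigh(A)
    v_i = eigenvectors[:, i]  # unit eigenvector for eigenvalues[i]
    others = np.delete(eigenvalues, i)
    lhs = abs(v_i[j]) ** 2 * np.prod(eigenvalues[i] - others)
    # M_j: delete the j-th row and column of A.
    M_j = np.delete(np.delete(A, j, axis=0), j, axis=1)
    rhs = np.prod(eigenvalues[i] - np.linalg.eigvalsh(M_j))
    return lhs, rhs

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    B = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))
    A = (B + B.conj().T) / 2  # random Hermitian test matrix
    for i in range(5):
        for j in range(5):
            lhs, rhs = identity_sides(A, i, j)
            print(i, j, lhs, rhs)
```

Note that only the magnitude {|v_{i,j}|} enters, so the phase ambiguity in the eigenvectors returned by the solver does not affect the comparison.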

When we posted the first version of this paper, we were unaware of previous appearances of this identity in the literature; a related identity had been used by Erdos-Schlein-Yau and by myself and Van Vu for applications to random matrix theory, but to our knowledge this specific identity appeared to be new. Even two months after our preprint first appeared on the arXiv in August, we had only learned of one other place in the literature where the identity showed up (by Forrester and Zhang, who also cite an earlier paper of Baryshnikov).

The situation changed rather dramatically with the publication of a popular science article in Quanta on this identity in November, which gave this result significantly more exposure. Within a few weeks we became informed (through private communication, online discussion, and exploration of the citation tree around the references we were alerted to) of over three dozen places where the identity, or some other closely related identity, had previously appeared in the literature, in such areas as numerical linear algebra, various aspects of graph theory (graph reconstruction, chemical graph theory, and walks on graphs), inverse eigenvalue problems, random matrix theory, and neutrino physics. As a consequence, we have decided to completely rewrite our article in order to collate this crowdsourced information, and survey the history of this identity, all the known proofs (we collect seven distinct ways to prove the identity (or generalisations thereof)), and all the applications of it that we are currently aware of. The citation graph of the literature that this ad hoc crowdsourcing effort produced is only very weakly connected, which we found surprising.

The earliest explicit appearance of the eigenvector-eigenvalue identity we are now aware of is in a 1966 paper of Thompson, although this paper is only cited (directly or indirectly) by a fraction of the known literature, and also there is a precursor identity of Löwner from 1934 that can be shown to imply the identity as a limiting case. At the end of the paper we speculate on some possible reasons why this identity only achieved a modest amount of recognition and dissemination prior to the November 2019 Quanta article.
