You are currently browsing the category archive for the ‘Mathematics’ category.

Now that Google Plus is closing, the brief announcements that I used to post over there will now be migrated over to this blog.  (Some people have suggested other platforms for this also, such as Twitter, but I think for now I can use my existing blog to accommodate these sorts of short posts.)

  1. The NSF-CBMS regional research conferences are now requesting proposals for the 2020 conference series.  (I was the principal lecturer for one of these conferences back in 2005; it was a very intensive experience, but quite enjoyable, and I am quite pleased with the book that resulted from it.)
  2. The awardees for the Sloan Fellowships for 2019 have now been announced.  (I was on the committee for the mathematics awards.  For the usual reasons involving the confidentiality of letters of reference and other sensitive information, I will be unfortunately be unable to answer any specific questions about our committee deliberations.)

I have just uploaded to the arXiv my paper “On the universality of the incompressible Euler equation on compact manifolds, II. Non-rigidity of Euler flows“, submitted to Pure and Applied Functional Analysis. This paper continues my attempts to establish “universality” properties of the Euler equations on Riemannian manifolds {(M,g)}, as I conjecture that the freedom to set the metric {g} ought to allow one to “program” such Euler flows to exhibit a wide range of behaviour, and in particular to achieve finite time blowup (if the dimension is sufficiently large, at least).

In coordinates, the Euler equations read

\displaystyle \partial_t u^k + u^j \nabla_j u^k = - \nabla^k p \ \ \ \ \ (1)


\displaystyle \nabla_k u^k = 0

where {p: [0,T] \rightarrow C^\infty(M)} is the pressure field and {u: [0,T] \rightarrow \Gamma(TM)} is the velocity field, and {\nabla} denotes the Levi-Civita connection with the usual Penrose abstract index notation conventions; we restrict attention here to the case where {u,p} are smooth and {M} is compact, smooth, orientable, connected, and without boundary. Let’s call {u} an Euler flow on {M} (for the time interval {[0,T]}) if it solves the above system of equations for some pressure {p}, and an incompressible flow if it just obeys the divergence-free relation {\nabla_k u^k=0}. Thus every Euler flow is an incompressible flow, but the converse is certainly not true; for instance the various conservation laws of the Euler equation, such as conservation of energy, will already block most incompressible flows from being an Euler flow, or even being approximated in a reasonably strong topology by such Euler flows.

However, one can ask if an incompressible flow can be extended to an Euler flow by adding some additional dimensions to {M}. In my paper, I formalise this by considering warped products {\tilde M} of {M} which (as a smooth manifold) are products {\tilde M = M \times ({\bf R}/{\bf Z})^m} of {M} with a torus, with a metric {\tilde g} given by

\displaystyle d \tilde g^2 = g_{ij}(x) dx^i dx^j + \sum_{s=1}^m \tilde g_{ss}(x) (d\theta^s)^2

for {(x,\theta) \in \tilde M}, where {\theta^1,\dots,\theta^m} are the coordinates of the torus {({\bf R}/{\bf Z})^m}, and {\tilde g_{ss}: M \rightarrow {\bf R}^+} are smooth positive coefficients for {s=1,\dots,m}; in order to preserve the incompressibility condition, we also require the volume preservation property

\displaystyle \prod_{s=1}^m \tilde g_{ss}(x) = 1 \ \ \ \ \ (2)

though in practice we can quickly dispose of this condition by adding one further “dummy” dimension to the torus {({\bf R}/{\bf Z})^m}. We say that an incompressible flow {u} is extendible to an Euler flow if there exists a warped product {\tilde M} extending {M}, and an Euler flow {\tilde u} on {\tilde M} of the form

\displaystyle \tilde u(t,(x,\theta)) = u^i(t,x) \frac{d}{dx^i} + \sum_{s=1}^m \tilde u^s(t,x) \frac{d}{d\theta^s}

for some “swirl” fields {\tilde u^s: [0,T] \times M \rightarrow {\bf R}}. The situation here is motivated by the familiar situation of studying axisymmetric Euler flows {\tilde u} on {{\bf R}^3}, which in cylindrical coordinates take the form

\displaystyle \tilde u(t,(r,z,\theta)) = u^r(t,r,z) \frac{d}{dr} + u^z(t,r,z) \frac{d}{dz} + \tilde u^\theta(t,r,z) \frac{d}{d\theta}.

The base component

\displaystyle u^r(t,r,z) \frac{d}{dr} + u^z(t,r,z) \frac{d}{dz}

of this flow is then a flow on the two-dimensional {r,z} plane which is not quite incompressible (due to the failure of the volume preservation condition (2) in this case) but still satisfies a system of equations (coupled with a passive scalar field {\rho} that is basically the square of the swirl {\tilde u^\rho}) that is reminiscent of the Boussinesq equations.

On a fixed {d}-dimensional manifold {(M,g)}, let {{\mathcal F}} denote the space of incompressible flows {u: [0,T] \rightarrow \Gamma(TM)}, equipped with the smooth topology (in spacetime), and let {{\mathcal E} \subset {\mathcal F}} denote the space of such flows that are extendible to Euler flows. Our main theorem is

Theorem 1

  • (i) (Generic inextendibility) Assume {d \geq 3}. Then {{\mathcal E}} is of the first category in {{\mathcal F}} (the countable union of nowhere dense sets in {{\mathcal F}}).
  • (ii) (Non-rigidity) Assume {M = ({\bf R}/{\bf Z})^d} (with an arbitrary metric {g}). Then {{\mathcal E}} is somewhere dense in {{\mathcal F}} (that is, the closure of {{\mathcal E}} has non-empty interior).

More informally, starting with an incompressible flow {u}, one usually cannot extend it to an Euler flow just by extending the manifold, warping the metric, and adding swirl coefficients, even if one is allowed to select the dimension of the extension, as well as the metric and coefficients, arbitrarily. However, many such flows can be perturbed to be extendible in such a manner (though different perturbations will require different extensions, in particular the dimension of the extension will not be fixed). Among other things, this means that conservation laws such as energy (or momentum, helicity, or circulation) no longer present an obstruction when one is allowed to perform an extension (basically this is because the swirl components of the extension can exchange energy (or momentum, etc.) with the base components in a basically arbitrary fashion.

These results fall short of my hopes to use the ability to extend the manifold to create universal behaviour in Euler flows, because of the fact that each flow requires a different extension in order to achieve the desired dynamics. Still it does seem to provide a little bit of support to the idea that high-dimensional Euler flows are quite “flexible” in their behaviour, though not completely so due to the generic inextendibility phenomenon. This flexibility reminds me a little bit of the flexibility of weak solutions to equations such as the Euler equations provided by the “{h}-principle” of Gromov and its variants (as discussed in these recent notes), although in this case the flexibility comes from adding additional dimensions, rather than by repeatedly adding high-frequency corrections to the solution.

The proof of part (i) of the theorem basically proceeds by a dimension counting argument (similar to that in the proof of Proposition 9 of these recent lecture notes of mine). Heuristically, the point is that an arbitrary incompressible flow {u} is essentially determined by {d-1} independent functions of space and time, whereas the warping factors {\tilde g_{ss}} are functions of space only, the pressure field is one function of space and time, and the swirl fields {u^s} are technically functions of both space and time, but have the same number of degrees of freedom as a function just of space, because they solve an evolution equation. When {d>2}, this means that there are fewer unknown functions of space and time than prescribed functions of space and time, which is the source of the generic inextendibility. This simple argument breaks down when {d=2}, but we do not know whether the claim is actually false in this case.

The proof of part (ii) proceeds by direct calculation of the effect of the warping factors and swirl velocities, which effectively create a forcing term (of Boussinesq type) in the first equation of (1) that is a combination of functions of the Eulerian spatial coordinates {x^i} (coming from the warping factors) and the Lagrangian spatial coordinates {a^\beta} (which arise from the swirl velocities, which are passively transported by the flow). In a non-empty open subset of {{\mathcal F}}, the combination of these coordinates becomes a non-degenerate set of coordinates for spacetime, and one can then use the Stone-Weierstrass theorem to conclude. The requirement that {M} be topologically a torus is a technical hypothesis in order to avoid topological obstructions such as the hairy ball theorem, but it may be that the hypothesis can be dropped (and it may in fact be true, in the {M = ({\bf R}/{\bf Z})^d} case at least, that {{\mathcal E}} is dense in all of {{\mathcal F}}, not just in a non-empty open subset).


[This post is collectively authored by the ICM structure committee, whose membership includes myself, and is listed in full in the post below – T.]

The International Congress of Mathematicians (ICM) is widely considered to be the premier conference for mathematicians.  It is held every four years; for instance, the 2018 ICM was held in Rio de Janeiro, Brazil, and the 2022 ICM is to be held in Saint Petersburg, Russia.  The most high-profile event at the ICM is the awarding of the 10 or so prizes of the International Mathematical Union (IMU) such as the Fields Medal, and the lectures by the prize laureates; but there are also approximately twenty plenary lectures from leading experts across all mathematical disciplines, several public lectures of a less technical nature, about 180 more specialised invited lectures divided into about twenty section panels, each corresponding to a mathematical field (or range of fields), as well as various outreach and social activities, exhibits and satellite programs, and meetings of the IMU General Assembly; see for instance the program for the 2018 ICM for a sample schedule.  In addition to these official events, the ICM also provides more informal networking opportunities, in particular allowing mathematicians at all stages of career, and from all backgrounds and nationalities, to interact with each other.

For each Congress, a Program Committee (together with subcommittees for each section) is entrusted with the task of selecting who will give the lectures of the ICM (excluding the lectures by prize laureates, which are selected by separate prize committees); they also have decided how to appropriately subdivide the entire field of mathematics into sections.   Given the prestigious nature of invitations from the ICM to present a lecture, this has been an important and challenging task, but one for which past Program Committees have managed to fulfill in a largely satisfactory fashion.

Nevertheless, in the last few years there has been substantial discussion regarding ways in which the process for structuring the ICM and inviting lecturers could be further improved, for instance to reflect the fact that the distribution of mathematics across various fields has evolved over time.   At the 2018 ICM General Assembly meeting in Rio de Janeiro, a resolution was adopted to create a new Structure Committee to take on some of the responsibilities previously delegated to the Program Committee, focusing specifically on the structure of the scientific program.  On the other hand, the Structure Committee is not involved with the format for prize lectures, the selection of prize laureates, or the selection of plenary and sectional lecturers; these tasks are instead the responsibilities of other committees (the local Organizing Committee, the prize committees, and the Program Committee respectively).

The first Structure Committee was constituted on 1 Jan 2019, with the following members:

As one of our first actions, we on the committee are using this blog post to solicit input from the mathematical community regarding the topics within our remit.  Among the specific questions (in no particular order) for which we seek comments are the following:

  1. Are there suggestions to change the format of the ICM that would increase its value to the mathematical community?
  2. Are there suggestions to change the format of the ICM that would encourage greater participation and interest in attending, particularly with regards to junior researchers and mathematicians from developing countries?
  3. What is the correct balance between research and exposition in the lectures?  For instance, how strongly should one emphasize the importance of good exposition when selecting plenary and sectional speakers?  Should there be “Bourbaki style” expository talks presenting work not necessarily authored by the speaker?
  4. Is the balance between plenary talks, sectional talks, and public talks at an optimal level?  There is only a finite amount of space in the calendar, so any increase in the number or length of one of these types of talks will come at the expense of another.
  5. The ICM is generally perceived to be more important to pure mathematics than to applied mathematics.  In what ways can the ICM be made more relevant and attractive to applied mathematicians, or should one not try to do so?
  6. Are there structural barriers that cause certain areas or styles of mathematics (such as applied or interdisciplinary mathematics) or certain groups of mathematicians to be under-represented at the ICM?  What, if anything, can be done to mitigate these barriers?

Of course, we do not expect these complex and difficult questions to be resolved within this blog post, and debating these and other issues would likely be a major component of our internal committee discussions.  Nevertheless, we would value constructive comments towards the above questions (or on other topics within the scope of our committee) to help inform these subsequent discussions.  We therefore welcome and invite such commentary, either as responses to this blog post, or sent privately to one of the members of our committee.  We would also be interested in having readers share their personal experiences at past congresses, and how it compares with other major conferences of this type.   (But in order to keep the discussion focused and constructive, we request that comments here refrain from discussing topics that are out of the scope of this committee, such as suggesting specific potential speakers for the next congress, which is a task instead for the 2022 ICM Program Committee.)

While talking mathematics with a postdoc here at UCLA (March Boedihardjo) we came across the following matrix problem which we managed to solve, but the proof was cute and the process of discovering it was fun, so I thought I would present the problem here as a puzzle without revealing the solution for now.

The problem involves word maps on a matrix group, which for sake of discussion we will take to be the special orthogonal group SO(3) of real 3 \times 3 matrices (one of the smallest matrix groups that contains a copy of the free group, which incidentally is the key observation powering the Banach-Tarski paradox).  Given any abstract word w of two generators x,y and their inverses (i.e., an element of the free group {\bf F}_2), one can define the word map w: SO(3) \times SO(3) \to SO(3) simply by substituting a pair of matrices in SO(3) into these generators.  For instance, if one has the word w = x y x^{-2} y^2 x, then the corresponding word map w: SO(3) \times SO(3) \to SO(3) is given by

\displaystyle w(A,B) := ABA^{-2} B^2 A

for A,B \in SO(3).  Because SO(3) contains a copy of the free group, we see the word map is non-trivial (not equal to the identity) if and only if the word itself is nontrivial.

Anyway, here is the problem:

Problem. Does there exist a sequence w_1, w_2, \dots of non-trivial word maps w_n: SO(3) \times SO(3) \to SO(3) that converge uniformly to the identity map?

To put it another way, given any \varepsilon > 0, does there exist a non-trivial word w such that \|w(A,B) - 1 \| \leq \varepsilon for all A,B \in SO(3), where \| \| denotes (say) the operator norm, and 1 denotes the identity matrix in SO(3)?

As I said, I don’t want to spoil the fun of working out this problem, so I will leave it as a challenge. Readers are welcome to share their thoughts, partial solutions, or full solutions in the comments below.

We consider the incompressible Euler equations on the (Eulerian) torus {\mathbf{T}_E := ({\bf R}/{\bf Z})^d}, which we write in divergence form as

\displaystyle  \partial_t u^i + \partial_j(u^j u^i) = - \eta^{ij} \partial_j p \ \ \ \ \ (1)

\displaystyle  \partial_i u^i = 0, \ \ \ \ \ (2)

where {\eta^{ij}} is the (inverse) Euclidean metric. Here we use the summation conventions for indices such as {i,j,l} (reserving the symbol {k} for other purposes), and are retaining the convention from Notes 1 of denoting vector fields using superscripted indices rather than subscripted indices, as we will eventually need to change variables to Lagrangian coordinates at some point. In principle, much of the discussion in this set of notes (particularly regarding the positive direction of Onsager’s conjecture) could also be modified to also treat non-periodic solutions that decay at infinity if desired, but some non-trivial technical issues do arise non-periodic settings for the negative direction.

As noted previously, the kinetic energy

\displaystyle  \frac{1}{2} \int_{\mathbf{T}_E} |u(t,x)|^2\ dx = \frac{1}{2} \int_{\mathbf{T}_E} \eta_{ij} u^i(t,x) u^j(t,x)\ dx

is formally conserved by the flow, where {\eta_{ij}} is the Euclidean metric. Indeed, if one assumes that {u,p} are continuously differentiable in both space and time on {[0,T] \times \mathbf{T}}, then one can multiply the equation (1) by {u^l} and contract against {\eta_{il}} to obtain

\displaystyle  \eta_{il} u^l \partial_t u^i + \eta_{il} u^l \partial_j (u^j u^i) = - \eta_{il} u^l \eta^{ij} \partial_j p = 0

which rearranges using (2) and the product rule to

\displaystyle  \partial_t (\frac{1}{2} \eta_{ij} u^i u^j) + \partial_j( \frac{1}{2} \eta_{il} u^i u^j u^l ) + \partial_j (u^j p)

and then if one integrates this identity on {[0,T] \times \mathbf{T}_E} and uses Stokes’ theorem, one obtains the required energy conservation law

\displaystyle  \frac{1}{2} \int_{\mathbf{T}_E} \eta_{ij} u^i(T,x) u^j(T,x)\ dx = \frac{1}{2} \int_{\mathbf{T}_E} \eta_{ij} u^i(0,x) u^j(0,x)\ dx. \ \ \ \ \ (3)

It is then natural to ask whether the energy conservation law continues to hold for lower regularity solutions, in particular weak solutions that only obey (1), (2) in a distributional sense. The above argument no longer works as stated, because {u^i} is not a test function and so one cannot immediately integrate (1) against {u^i}. And indeed, as we shall soon see, it is now known that once the regularity of {u} is low enough, energy can “escape to frequency infinity”, leading to failure of the energy conservation law, a phenomenon known in physics as anomalous energy dissipation.

But what is the precise level of regularity needed in order to for this anomalous energy dissipation to occur? To make this question precise, we need a quantitative notion of regularity. One such measure is given by the Hölder space {C^{0,\alpha}(\mathbf{T}_E \rightarrow {\bf R})} for {0 < \alpha < 1}, defined as the space of continuous functions {f: \mathbf{T}_E \rightarrow {\bf R}} whose norm

\displaystyle  \| f \|_{C^{0,\alpha}(\mathbf{T}_E \rightarrow {\bf R})} := \sup_{x \in \mathbf{T}_E} |f(x)| + \sup_{x,y \in \mathbf{T}_E: x \neq y} \frac{|f(x)-f(y)|}{|x-y|^\alpha}

is finite. The space {C^{0,\alpha}} lies between the space {C^0} of continuous functions and the space {C^1} of continuously differentiable functions, and informally describes a space of functions that is “{\alpha} times differentiable” in some sense. The above derivation of the energy conservation law involved the integral

\displaystyle  \int_{\mathbf{T}_E} \eta_{ik} u^k \partial_j (u^j u^i)\ dx

that roughly speaking measures the fluctuation in energy. Informally, if we could take the derivative in this integrand and somehow “integrate by parts” to split the derivative “equally” amongst the three factors, one would morally arrive at an expression that resembles

\displaystyle  \int_{\mathbf{T}} \nabla^{1/3} u \nabla^{1/3} u \nabla^{1/3} u\ dx

which suggests that the integral can be made sense of for {u \in C^0_t C^{0,\alpha}_x} once {\alpha > 1/3}. More precisely, one can make

Conjecture 1 (Onsager’s conjecture) Let {0 < \alpha < 1} and {d \geq 2}, and let {0 < T < \infty}.

  • (i) If {\alpha > 1/3}, then any weak solution {u \in C^0_t C^{0,\alpha}([0,T] \times \mathbf{T} \rightarrow {\bf R})} to the Euler equations (in the Leray form {\partial_t u + \partial_j {\mathbb P} (u^j u) = u_0(x) \delta_0(t)}) obeys the energy conservation law (3).
  • (ii) If {\alpha \leq 1/3}, then there exist weak solutions {u \in C^0_t C^{0,\alpha}([0,T] \times \mathbf{T} \rightarrow {\bf R})} to the Euler equations (in Leray form) which do not obey energy conservation.

This conjecture was originally arrived at by Onsager by a somewhat different heuristic derivation; see Remark 7. The numerology is also compatible with that arising from the Kolmogorov theory of turbulence (discussed in this previous post), but we will not discuss this interesting connection further here.

The positive part (i) of Onsager conjecture was established by Constantin, E, and Titi, building upon earlier partial results by Eyink; the proof is a relatively straightforward application of Littlewood-Paley theory, and they were also able to work in larger function spaces than {C^0_t C^{0,\alpha}_x} (using {L^3_x}-based Besov spaces instead of Hölder spaces, see Exercise 3 below). The negative part (ii) is harder. Discontinuous weak solutions to the Euler equations that did not conserve energy were first constructed by Sheffer, with an alternate construction later given by Shnirelman. De Lellis and Szekelyhidi noticed the resemblance of this problem to that of the Nash-Kuiper theorem in the isometric embedding problem, and began adapting the convex integration technique used in that theorem to construct weak solutions of the Euler equations. This began a long series of papers in which increasingly regular weak solutions that failed to conserve energy were constructed, culminating in a recent paper of Isett establishing part (ii) of the Onsager conjecture in the non-endpoint case {\alpha < 1/3} in three and higher dimensions {d \geq 3}; the endpoint {\alpha = 1/3} remains open. (In two dimensions it may be the case that the positive results extend to a larger range than Onsager’s conjecture predicts; see this paper of Cheskidov, Lopes Filho, Nussenzveig Lopes, and Shvydkoy for more discussion.) Further work continues into several variations of the Onsager conjecture, in which one looks at other differential equations, other function spaces, or other criteria for bad behavior than breakdown of energy conservation. See this recent survey of de Lellis and Szekelyhidi for more discussion.

In these notes we will first establish (i), then discuss the convex integration method in the original context of the Nash-Kuiper embedding theorem. Before tackling the Onsager conjecture (ii) directly, we discuss a related construction of high-dimensional weak solutions in the Sobolev space {L^2_t H^s_x} for {s} close to {1/2}, which is slightly easier to establish, though still rather intricate. Finally, we discuss the modifications of that construction needed to establish (ii), though we shall stop short of a full proof of that part of the conjecture.

We thank Phil Isett for some comments and corrections.

Read the rest of this entry »

This is the eleventh research thread of the Polymath15 project to upper bound the de Bruijn-Newman constant {\Lambda}, continuing this post. Discussion of the project of a non-research nature can continue for now in the existing proposal thread. Progress will be summarised at this Polymath wiki page.

There are currently two strands of activity.  One is writing up the paper describing the combination of theoretical and numerical results needed to obtain the new bound \Lambda \leq 0.22.  The latest version of the writeup may be found here, in this directory.  The theoretical side of things have mostly been written up; the main remaining tasks to do right now are

  1. giving a more detailed description and illustration of the two major numerical verifications, namely the barrier verification that establishes a zero-free region for H_t(x+iy)=0 for 0 \leq t \leq 0.2, 0.2 \leq y \leq 1, |x - 6 \times 10^{10} - 83952| \leq 0.5, and the Dirichlet series bound that establishes a zero-free region for t = 0.2, 0.2 \leq y \leq 1, x \geq 6 \times 10^{10} + 83952; and
  2. giving more detail on the conditional results assuming more numerical verification of RH.

Meanwhile, several of us have been exploring the behaviour of the zeroes of H_t for negative t; this does not directly lead to any new progress on bounding \Lambda (though there is a good chance that it may simplify the proof of \Lambda \geq 0), but there have been some interesting numerical phenomena uncovered, as summarised in this set of slides.  One phenomenon is that for large negative t, many of the complex zeroes begin to organise themselves near the curves

\displaystyle y = -\frac{t}{2} \log \frac{x}{4\pi n(n+1)} - 1.

(An example of the agreement between the zeroes and these curves may be found here.)  We now have a (heuristic) theoretical explanation for this; we should have an approximation

\displaystyle H_t(x+iy) \approx B_t(x+iy) \sum_{n=1}^\infty \frac{b_n^t}{n^{s_*}}

in this region (where B_t, b_n^t, n^{s_*} are defined in equations (11), (15), (17) of the writeup, and the above curves arise from (an approximation of) those locations where two adjacent terms \frac{b_n^t}{n^{s_*}}, \frac{b_{n+1}^t}{(n+1)^{s_*}} in this series have equal magnitude (with the other terms being of lower order).

However, we only have a partial explanation at present of the interesting behaviour of the real zeroes at negative t, for instance the surviving zeroes at extremely negative values of t appear to lie on the curve where the quantity N is close to a half-integer, where

\displaystyle \tilde x := x + \frac{\pi t}{4}

\displaystyle N := \sqrt{\frac{\tilde x}{4\pi}}

The remaining zeroes exhibit a pattern in (N,u) coordinates that is approximately 1-periodic in N, where

\displaystyle u := \frac{4\pi |t|}{\tilde x}.

A plot of the zeroes in these coordinates (somewhat truncated due to the numerical range) may be found here.

We do not yet have a total explanation of the phenomena seen in this picture.  It appears that we have an approximation

\displaystyle H_t(x) \approx A_t(x) \sum_{n=1}^\infty \exp( -\frac{|t| \log^2(n/N)}{4(1-\frac{iu}{8\pi})} - \frac{1+i\tilde x}{2} \log(n/N) )

where A_t(x) is the non-zero multiplier

\displaystyle A_t(x) := e^{\pi^2 t/64} M_0(\frac{1+i\tilde x}{2}) N^{-\frac{1+i\tilde x}{2}} \sqrt{\frac{\pi}{1-\frac{iu}{8\pi}}}


\displaystyle M_0(s) := \frac{1}{8}\frac{s(s-1)}{2}\pi^{-s/2} \sqrt{2\pi} \exp( (\frac{s}{2}-\frac{1}{2}) \log \frac{s}{2} - \frac{s}{2} )

The derivation of this formula may be found in this wiki page.  However our initial attempts to simplify the above approximation further have proven to be somewhat inaccurate numerically (in particular giving an incorrect prediction for the location of zeroes, as seen in this picture).  We are in the process of using numerics to try to resolve the discrepancies (see this page for some code and discussion).


These lecture notes are a continuation of the 254A lecture notes from the previous quarter.

We consider the Euler equations for incompressible fluid flow on a Euclidean space {{\bf R}^d}; we will label {{\bf R}^d} as the “Eulerian space” {{\bf R}^d_E} (or “Euclidean space”, or “physical space”) to distinguish it from the “Lagrangian space” {{\bf R}^d_L} (or “labels space”) that we will introduce shortly (but the reader is free to also ignore the {E} or {L} subscripts if he or she wishes). Elements of Eulerian space {{\bf R}^d_E} will be referred to by symbols such as {x}, we use {dx} to denote Lebesgue measure on {{\bf R}^d_E} and we will use {x^1,\dots,x^d} for the {d} coordinates of {x}, and use indices such as {i,j,k} to index these coordinates (with the usual summation conventions), for instance {\partial_i} denotes partial differentiation along the {x^i} coordinate. (We use superscripts for coordinates {x^i} instead of subscripts {x_i} to be compatible with some differential geometry notation that we will use shortly; in particular, when using the summation notation, we will now be matching subscripts with superscripts for the pair of indices being summed.)

In Eulerian coordinates, the Euler equations read

\displaystyle  \partial_t u + u \cdot \nabla u = - \nabla p \ \ \ \ \ (1)

\displaystyle  \nabla \cdot u = 0

where {u: [0,T) \times {\bf R}^d_E \rightarrow {\bf R}^d_E} is the velocity field and {p: [0,T) \times {\bf R}^d_E \rightarrow {\bf R}} is the pressure field. These are functions of time {t \in [0,T)} and on the spatial location variable {x \in {\bf R}^d_E}. We will refer to the coordinates {(t,x) = (t,x^1,\dots,x^d)} as Eulerian coordinates. However, if one reviews the physical derivation of the Euler equations from 254A Notes 0, before one takes the continuum limit, the fundamental unknowns were not the velocity field {u} or the pressure field {p}, but rather the trajectories {(x^{(a)}(t))_{a \in A}}, which can be thought of as a single function {x: [0,T) \times A \rightarrow {\bf R}^d_E} from the coordinates {(t,a)} (where {t} is a time and {a} is an element of the label set {A}) to {{\bf R}^d}. The relationship between the trajectories {x^{(a)}(t) = x(t,a)} and the velocity field was given by the informal relationship

\displaystyle  \partial_t x(t,a) \approx u( t, x(t,a) ). \ \ \ \ \ (2)

We will refer to the coordinates {(t,a)} as (discrete) Lagrangian coordinates for describing the fluid.

In view of this, it is natural to ask whether there is an alternate way to formulate the continuum limit of incompressible inviscid fluids, by using a continuous version {(t,a)} of the Lagrangian coordinates, rather than Eulerian coordinates. This is indeed the case. Suppose for instance one has a smooth solution {u, p} to the Euler equations on a spacetime slab {[0,T) \times {\bf R}^d_E} in Eulerian coordinates; assume furthermore that the velocity field {u} is uniformly bounded. We introduce another copy {{\bf R}^d_L} of {{\bf R}^d}, which we call Lagrangian space or labels space; we use symbols such as {a} to refer to elements of this space, {da} to denote Lebesgue measure on {{\bf R}^d_L}, and {a^1,\dots,a^d} to refer to the {d} coordinates of {a}. We use indices such as {\alpha,\beta,\gamma} to index these coordinates, thus for instance {\partial_\alpha} denotes partial differentiation along the {a^\alpha} coordinate. We will use summation conventions for both the Eulerian coordinates {i,j,k} and the Lagrangian coordinates {\alpha,\beta,\gamma}, with an index being summed if it appears as both a subscript and a superscript in the same term. While {{\bf R}^d_L} and {{\bf R}^d_E} are of course isomorphic, we will try to refrain from identifying them, except perhaps at the initial time {t=0} in order to fix the initialisation of Lagrangian coordinates.

Given a smooth and bounded velocity field {u: [0,T) \times {\bf R}^d_E \rightarrow {\bf R}^d_E}, define a trajectory map for this velocity to be any smooth map {X: [0,T) \times {\bf R}^d_L \rightarrow {\bf R}^d_E} that obeys the ODE

\displaystyle  \partial_t X(t,a) = u( t, X(t,a) ); \ \ \ \ \ (3)

in view of (2), this describes the trajectory (in {{\bf R}^d_E}) of a particle labeled by an element {a} of {{\bf R}^d_L}. From the Picard existence theorem and the hypothesis that {u} is smooth and bounded, such a map exists and is unique as long as one specifies the initial location {X(0,a)} assigned to each label {a}. Traditionally, one chooses the initial condition

\displaystyle  X(0,a) = a \ \ \ \ \ (4)

for {a \in {\bf R}^d_L}, so that we label each particle by its initial location at time {t=0}; we are also free to specify other initial conditions for the trajectory map if we please. Indeed, we have the freedom to “permute” the labels {a \in {\bf R}^d_L} by an arbitrary diffeomorphism: if {X: [0,T) \times {\bf R}^d_L \rightarrow {\bf R}^d_E} is a trajectory map, and {\pi: {\bf R}^d_L \rightarrow{\bf R}^d_L} is any diffeomorphism (a smooth map whose inverse exists and is also smooth), then the map {X \circ \pi: [0,T) \times {\bf R}^d_L \rightarrow {\bf R}^d_E} is also a trajectory map, albeit one with different initial conditions {X(0,a)}.

Despite the popularity of the initial condition (4), we will try to keep conceptually separate the Eulerian space {{\bf R}^d_E} from the Lagrangian space {{\bf R}^d_L}, as they play different physical roles in the interpretation of the fluid; for instance, while the Euclidean metric {d\eta^2 = dx^1 dx^1 + \dots + dx^d dx^d} is an important feature of Eulerian space {{\bf R}^d_E}, it is not a geometrically natural structure to use in Lagrangian space {{\bf R}^d_L}. We have the following more general version of Exercise 8 from 254A Notes 2:

Exercise 1 Let {u: [0,T) \times {\bf R}^d_E \rightarrow {\bf R}^d_E} be smooth and bounded.

  • If {X_0: {\bf R}^d_L \rightarrow {\bf R}^d_E} is a smooth map, show that there exists a unique smooth trajectory map {X: [0,T) \times {\bf R}^d_L \rightarrow {\bf R}^d_E} with initial condition {X(0,a) = X_0(a)} for all {a \in {\bf R}^d_L}.
  • Show that if {X_0} is a diffeomorphism and {t \in [0,T)}, then the map {X(t): a \mapsto X(t,a)} is also a diffeomorphism.

Remark 2 The first of the Euler equations (1) can now be written in the form

\displaystyle  \frac{d^2}{dt^2} X(t,a) = - (\nabla p)( t, X(t,a) ) \ \ \ \ \ (5)

which can be viewed as a continuous limit of Newton’s first law {m^{(a)} \frac{d^2}{dt^2} x^{(a)}(t) = F^{(a)}(t)}.

Call a diffeomorphism {Y: {\bf R}^d_L \rightarrow {\bf R}^d_E} (oriented) volume preserving if one has the equation

\displaystyle  \mathrm{det}( \nabla Y )(a) = 1 \ \ \ \ \ (6)

for all {a \in {\bf R}^d_L}, where the total differential {\nabla Y} is the {d \times d} matrix with entries {\partial_\alpha Y^i} for {\alpha = 1,\dots,d} and {i=1,\dots,d}, where {Y^1,\dots,Y^d:{\bf R}^d_L \rightarrow {\bf R}} are the components of {Y}. (If one wishes, one can also view {\nabla Y} as a linear transformation from the tangent space {T_a {\bf R}^d_L} of Lagrangian space at {a} to the tangent space {T_{Y(a)} {\bf R}^d_E} of Eulerian space at {Y(a)}.) Equivalently, {Y} is orientation preserving and one has a Jacobian-free change of variables formula

\displaystyle  \int_{{\bf R}^d_F} f( Y(a) )\ da = \int_{{\bf R}^d_E} f(x)\ dx

for all {f \in C_c({\bf R}^d_E \rightarrow {\bf R})}, which is in turn equivalent to {Y(E) \subset {\bf R}^d_E} having the same Lebesgue measure as {E} for any measurable set {E \subset {\bf R}^d_L}.

The divergence-free condition {\nabla \cdot u = 0} then can be nicely expressed in terms of volume-preserving properties of the trajectory maps {X}, in a manner which confirms the interpretation of this condition as an incompressibility condition on the fluid:

Lemma 3 Let {u: [0,T) \times {\bf R}^d_E \rightarrow {\bf R}^d_E} be smooth and bounded, let {X_0: {\bf R}^d_L \rightarrow {\bf R}^d_E} be a volume-preserving diffeomorphism, and let {X: [0,T) \times {\bf R}^d_L \rightarrow {\bf R}^d_E} be the trajectory map. Then the following are equivalent:

  • {\nabla \cdot u = 0} on {[0,T) \times {\bf R}^d_E}.
  • {X(t): {\bf R}^d_L \rightarrow {\bf R}^d_E} is volume-preserving for all {t \in [0,T)}.

Proof: Since {X_0} is orientation-preserving, we see from continuity that {X(t)} is also orientation-preserving. Suppose that {X(t)} is also volume-preserving, then for any {f \in C^\infty_c({\bf R}^d_E \rightarrow {\bf R})} we have the conservation law

\displaystyle  \int_{{\bf R}^d_L} f( X(t,a) )\ da = \int_{{\bf R}^d_E} f(x)\ dx

for all {t \in [0,T)}. Differentiating in time using the chain rule and (3) we conclude that

\displaystyle  \int_{{\bf R}^d_L} (u(t) \cdot \nabla f)( X(t,a)) \ da = 0

for all {t \in [0,T)}, and hence by change of variables

\displaystyle  \int_{{\bf R}^d_E} (u(t) \cdot \nabla f)(x) \ dx = 0

which by integration by parts gives

\displaystyle  \int_{{\bf R}^d_E} (\nabla \cdot u(t,x)) f(x)\ dx = 0

for all {f \in C^\infty_c({\bf R}^d_E \rightarrow {\bf R})} and {t \in [0,T)}, so {u} is divergence-free.

To prove the converse implication, it is convenient to introduce the labels map {A:[0,T) \times {\bf R}^d_E \rightarrow {\bf R}^d_L}, defined by setting {A(t): {\bf R}^d_E \rightarrow {\bf R}^d_L} to be the inverse of the diffeomorphism {X(t): {\bf R}^d_L \rightarrow {\bf R}^d_E}, thus

\displaystyle A(t, X(t,a)) = a

for all {(t,a) \in [0,T) \times {\bf R}^d_L}. By the implicit function theorem, {A} is smooth, and by differentiating the above equation in time using (3) we see that

\displaystyle  D_t A(t,x) = 0

where {D_t} is the usual material derivative

\displaystyle  D_t := \partial_t + u \cdot \nabla \ \ \ \ \ (7)

acting on functions on {[0,T) \times {\bf R}^d_E}. If {u} is divergence-free, we have from integration by parts that

\displaystyle  \partial_t \int_{{\bf R}^d_E} \phi(t,x)\ dx = \int_{{\bf R}^d_E} D_t \phi(t,x)\ dx

for any test function {\phi: [0,T) \times {\bf R}^d_E \rightarrow {\bf R}}. In particular, for any {g \in C^\infty_c({\bf R}^d_L \rightarrow {\bf R})}, we can calculate

\displaystyle \partial_t \int_{{\bf R}^d_E} g( A(t,x) )\ dx = \int_{{\bf R}^d_E} D_t (g(A(t,x)))\ dx

\displaystyle  = \int_{{\bf R}^d_E} 0\ dx

and hence

\displaystyle  \int_{{\bf R}^d_E} g(A(t,x))\ dx = \int_{{\bf R}^d_E} g(A(0,x))\ dx

for any {t \in [0,T)}. Since {X_0} is volume-preserving, so is {A(0)}, thus

\displaystyle  \int_{{\bf R}^d_E} g \circ A(t)\ dx = \int_{{\bf R}^d_L} g\ da.

Thus {A(t)} is volume-preserving, and hence {X(t)} is also. \Box

Exercise 4 Let {M: [0,T) \rightarrow \mathrm{GL}_d({\bf R})} be a continuously differentiable map from the time interval {[0,T)} to the general linear group {\mathrm{GL}_d({\bf R})} of invertible {d \times d} matrices. Establish Jacobi’s formula

\displaystyle  \partial_t \det(M(t)) = \det(M(t)) \mathrm{tr}( M(t)^{-1} \partial_t M(t) )

and use this and (6) to give an alternate proof of Lemma 3 that does not involve any integration in space.

Remark 5 One can view the use of Lagrangian coordinates as an extension of the method of characteristics. Indeed, from the chain rule we see that for any smooth function {f: [0,T) \times {\bf R}^d_E \rightarrow {\bf R}} of Eulerian spacetime, one has

\displaystyle  \frac{d}{dt} f(t,X(t,a)) = (D_t f)(t,X(t,a))

and hence any transport equation that in Eulerian coordinates takes the form

\displaystyle  D_t f = g

for smooth functions {f,g: [0,T) \times {\bf R}^d_E \rightarrow {\bf R}} of Eulerian spacetime is equivalent to the ODE

\displaystyle  \frac{d}{dt} F = G

where {F,G: [0,T) \times {\bf R}^d_L \rightarrow {\bf R}} are the smooth functions of Lagrangian spacetime defined by

\displaystyle  F(t,a) := f(t,X(t,a)); \quad G(t,a) := g(t,X(t,a)).

In this set of notes we recall some basic differential geometry notation, particularly with regards to pullbacks and Lie derivatives of differential forms and other tensor fields on manifolds such as {{\bf R}^d_E} and {{\bf R}^d_L}, and explore how the Euler equations look in this notation. Our discussion will be entirely formal in nature; we will assume that all functions have enough smoothness and decay at infinity to justify the relevant calculations. (It is possible to work rigorously in Lagrangian coordinates – see for instance the work of Ebin and Marsden – but we will not do so here.) As a general rule, Lagrangian coordinates tend to be somewhat less convenient to use than Eulerian coordinates for establishing the basic analytic properties of the Euler equations, such as local existence, uniqueness, and continuous dependence on the data; however, they are quite good at clarifying the more algebraic properties of these equations, such as conservation laws and the variational nature of the equations. It may well be that in the future we will be able to use the Lagrangian formalism more effectively on the analytic side of the subject also.

Remark 6 One can also write the Navier-Stokes equations in Lagrangian coordinates, but the equations are not expressed in a favourable form in these coordinates, as the Laplacian {\Delta} appearing in the viscosity term becomes replaced with a time-varying Laplace-Beltrami operator. As such, we will not discuss the Lagrangian coordinate formulation of Navier-Stokes here.

Read the rest of this entry »

Note: this post is not required reading for this course, or for the sequel course in the winter quarter.

In a Notes 2, we reviewed the classical construction of Leray of global weak solutions to the Navier-Stokes equations. We did not quite follow Leray’s original proof, in that the notes relied more heavily on the machinery of Littlewood-Paley projections, which have become increasingly common tools in modern PDE. On the other hand, we did use the same “exploiting compactness to pass to weakly convergent subsequence” strategy that is the standard one in the PDE literature used to construct weak solutions.

As I discussed in a previous post, the manipulation of sequences and their limits is analogous to a “cheap” version of nonstandard analysis in which one uses the Fréchet filter rather than an ultrafilter to construct the nonstandard universe. (The manipulation of generalised functions of Columbeau-type can also be comfortably interpreted within this sort of cheap nonstandard analysis.) Augmenting the manipulation of sequences with the right to pass to subsequences whenever convenient is then analogous to a sort of “lazy” nonstandard analysis, in which the implied ultrafilter is never actually constructed as a “completed object“, but is instead lazily evaluated, in the sense that whenever membership of a given subsequence of the natural numbers in the ultrafilter needs to be determined, one either passes to that subsequence (thus placing it in the ultrafilter) or the complement of the sequence (placing it out of the ultrafilter). This process can be viewed as the initial portion of the transfinite induction that one usually uses to construct ultrafilters (as discussed using a voting metaphor in this post), except that there is generally no need in any given application to perform the induction for any uncountable ordinal (or indeed for most of the countable ordinals also).

On the other hand, it is also possible to work directly in the orthodox framework of nonstandard analysis when constructing weak solutions. This leads to an approach to the subject which is largely equivalent to the usual subsequence-based approach, though there are some minor technical differences (for instance, the subsequence approach occasionally requires one to work with separable function spaces, whereas in the ultrafilter approach the reliance on separability is largely eliminated, particularly if one imposes a strong notion of saturation on the nonstandard universe). The subject acquires a more “algebraic” flavour, as the quintessential analysis operation of taking a limit is replaced with the “standard part” operation, which is an algebra homomorphism. The notion of a sequence is replaced by the distinction between standard and nonstandard objects, and the need to pass to subsequences disappears entirely. Also, the distinction between “bounded sequences” and “convergent sequences” is largely eradicated, particularly when the space that the sequences ranged in enjoys some compactness properties on bounded sets. Also, in this framework, the notorious non-uniqueness features of weak solutions can be “blamed” on the non-uniqueness of the nonstandard extension of the standard universe (as well as on the multiple possible ways to construct nonstandard mollifications of the original standard PDE). However, many of these changes are largely cosmetic; switching from a subsequence-based theory to a nonstandard analysis-based theory does not seem to bring one significantly closer for instance to the global regularity problem for Navier-Stokes, but it could have been an alternate path for the historical development and presentation of the subject.

In any case, I would like to present below the fold this nonstandard analysis perspective, quickly translating the relevant components of real analysis, functional analysis, and distributional theory that we need to this perspective, and then use it to re-prove Leray’s theorem on existence of global weak solutions to Navier-Stokes.

Read the rest of this entry »

Kaisa Matomäki, Maksym Radziwill, and I just uploaded to the arXiv our paper “Fourier uniformity of bounded multiplicative functions in short intervals on average“. This paper is the outcome of our attempts during the MSRI program in analytic number theory last year to attack the local Fourier uniformity conjecture for the Liouville function {\lambda}. This conjecture generalises a landmark result of Matomäki and Radziwill, who show (among other things) that one has the asymptotic

\displaystyle  \int_X^{2X} |\sum_{x \leq n \leq x+H} \lambda(n)|\ dx = o(HX) \ \ \ \ \ (1)

whenever {X \rightarrow \infty} and {H = H(X)} goes to infinity as {X \rightarrow \infty}. Informally, this says that the Liouville function has small mean for almost all short intervals {[x,x+H]}. The remarkable thing about this theorem is that there is no lower bound on how {H} goes to infinity with {X}; one can take for instance {H = \log\log\log X}. This lack of lower bound was crucial when I applied this result (or more precisely, a generalisation of this result to arbitrary non-pretentious bounded multiplicative functions) a few years ago to solve the Erdös discrepancy problem, as well as a logarithmically averaged two-point Chowla conjecture, for instance it implies that

\displaystyle  \sum_{n \leq X} \frac{\lambda(n) \lambda(n+1)}{n} = o(\log X).

The local Fourier uniformity conjecture asserts the stronger asymptotic

\displaystyle  \int_X^{2X} \sup_{\alpha \in {\bf R}} |\sum_{x \leq n \leq x+H} \lambda(n) e(-\alpha n)|\ dx = o(HX) \ \ \ \ \ (2)

under the same hypotheses on {H} and {X}. As I worked out in a previous paper, this conjecture would imply a logarithmically averaged three-point Chowla conjecture, implying for instance that

\displaystyle  \sum_{n \leq X} \frac{\lambda(n) \lambda(n+1) \lambda(n+2)}{n} = o(\log X).

This particular bound also follows from some slightly different arguments of Joni Teräväinen and myself, but the implication would also work for other non-pretentious bounded multiplicative functions, whereas the arguments of Joni and myself rely more heavily on the specific properties of the Liouville function (in particular that {\lambda(p)=-1} for all primes {p}).

There is also a higher order version of the local Fourier uniformity conjecture in which the linear phase {{}e(-\alpha n)} is replaced with a polynomial phase such as {e(-\alpha_d n^d - \dots - \alpha_1 n - \alpha_0)}, or more generally a nilsequence {\overline{F(g(n) \Gamma)}}; as shown in my previous paper, this conjecture implies (and is in fact equivalent to, after logarithmic averaging) a logarithmically averaged version of the full Chowla conjecture (not just the two-point or three-point versions), as well as a logarithmically averaged version of the Sarnak conjecture.

The main result of the current paper is to obtain some cases of the local Fourier uniformity conjecture:

Theorem 1 The asymptotic (2) is true when {H = X^\theta} for a fixed {\theta > 0}.

Previously this was known for {\theta > 5/8} by the work of Zhan (who in fact proved the stronger pointwise assertion {\sup_{\alpha \in {\bf R}} |\sum_{x \leq n \leq x+H} \lambda(n) e(-\alpha n)|= o(H)} for {X \leq x \leq 2X} in this case). In a previous paper with Kaisa and Maksym, we also proved a weak version

\displaystyle  \sup_{\alpha \in {\bf R}} \int_X^{2X} |\sum_{x \leq n \leq x+H} \lambda(n) e(-\alpha n)|\ dx = o(HX) \ \ \ \ \ (3)

of (2) for any {H} growing arbitrarily slowly with {X}; this is stronger than (1) (and is in fact proven by a variant of the method) but significantly weaker than (2), because in the latter the worst-case {\alpha} is permitted to depend on the {x} parameter, whereas in (3) {\alpha} must remain independent of {x}.

Unfortunately, the restriction {H = X^\theta} is not strong enough to give applications to Chowla-type conjectures (one would need something more like {H = \log^\theta X} for this). However, it can still be used to control some sums that had not previously been manageable. For instance, a quick application of the circle method lets one use the above theorem to derive the asymptotic

\displaystyle  \sum_{h \leq H} \sum_{n \leq X} \lambda(n) \Lambda(n+h) \Lambda(n+2h) = o( H X )

whenever {H = X^\theta} for a fixed {\theta > 0}, where {\Lambda} is the von Mangoldt function. Amusingly, the seemingly simpler question of establishing the expected asymptotic for

\displaystyle  \sum_{h \leq H} \sum_{n \leq X} \Lambda(n+h) \Lambda(n+2h)

is only known in the range {\theta \geq 1/6} (from the work of Zaccagnini). Thus we have a rare example of a number theory sum that becomes easier to control when one inserts a Liouville function!

We now give an informal description of the strategy of proof of the theorem (though for numerous technical reasons, the actual proof deviates in some respects from the description given here). If (2) failed, then for many values of {x \in [X,2X]} we would have the lower bound

\displaystyle  |\sum_{x \leq n \leq x+H} \lambda(n) e(-\alpha_x n)| \gg 1

for some frequency {\alpha_x \in{\bf R}}. We informally describe this correlation between {\lambda(n)} and {e(\alpha_x n)} by writing

\displaystyle  \lambda(n) \approx e(\alpha_x n) \ \ \ \ \ (4)

for {n \in [x,x+H]} (informally, one should view this as asserting that {\lambda(n)} “behaves like” a constant multiple of {e(\alpha_x n)}). For sake of discussion, suppose we have this relationship for all {x \in [X,2X]}, not just many.

As mentioned before, the main difficulty here is to understand how {\alpha_x} varies with {x}. As it turns out, the multiplicativity properties of the Liouville function place a significant constraint on this dependence. Indeed, if we let {p} be a fairly small prime (e.g. of size {H^\varepsilon} for some {\varepsilon>0}), and use the identity {\lambda(np) = \lambda(n) \lambda(p) = - \lambda(n)} for the Liouville function to conclude (at least heuristically) from (4) that

\displaystyle  \lambda(n) \approx e(\alpha_x n p)

for {n \in [x/p, x/p + H/p]}. (In practice, we will have this sort of claim for many primes {p} rather than all primes {p}, after using tools such as the Turán-Kubilius inequality, but we ignore this distinction for this informal argument.)

Now let {x, y \in [X,2X]} and {p,q \sim P} be primes comparable to some fixed range {P = H^\varepsilon} such that

\displaystyle  x/p = y/q + O( H/P). \ \ \ \ \ (5)

Then we have both

\displaystyle  \lambda(n) \approx e(\alpha_x n p)


\displaystyle  \lambda(n) \approx e(\alpha_y n q)

on essentially the same range of {n} (two nearby intervals of length {\sim H/P}). This suggests that the frequencies {p \alpha_x} and {q \alpha_y} should be close to each other modulo {1}, in particular one should expect the relationship

\displaystyle  p \alpha_x = q \alpha_y + O( \frac{P}{H} ) \hbox{ mod } 1. \ \ \ \ \ (6)

Comparing this with (5) one is led to the expectation that {\alpha_x} should depend inversely on {x} in some sense (for instance one can check that

\displaystyle  \alpha_x = T/x \ \ \ \ \ (7)

would solve (6) if {T = O( X / H^2 )}; by Taylor expansion, this would correspond to a global approximation of the form {\lambda(n) \approx n^{iT}}). One now has a problem of an additive combinatorial flavour (or of a “local to global” flavour), namely to leverage the relation (6) to obtain global control on {\alpha_x} that resembles (7).

A key obstacle in solving (6) efficiently is the fact that one only knows that {p \alpha_x} and {q \alpha_y} are close modulo {1}, rather than close on the real line. One can start resolving this problem by the Chinese remainder theorem, using the fact that we have the freedom to shift (say) {\alpha_y} by an arbitrary integer. After doing so, one can arrange matters so that one in fact has the relationship

\displaystyle  p \alpha_x = q \alpha_y + O( \frac{P}{H} ) \hbox{ mod } p \ \ \ \ \ (8)

whenever {x,y \in [X,2X]} and {p,q \sim P} obey (5). (This may force {\alpha_q} to become extremely large, on the order of {\prod_{p \sim P} p}, but this will not concern us.)

Now suppose that we have {y,y' \in [X,2X]} and primes {q,q' \sim P} such that

\displaystyle  y/q = y'/q' + O(H/P). \ \ \ \ \ (9)

For every prime {p \sim P}, we can find an {x} such that {x/p} is within {O(H/P)} of both {y/q} and {y'/q'}. Applying (8) twice we obtain

\displaystyle  p \alpha_x = q \alpha_y + O( \frac{P}{H} ) \hbox{ mod } p


\displaystyle  p \alpha_x = q' \alpha_{y'} + O( \frac{P}{H} ) \hbox{ mod } p

and thus by the triangle inequality we have

\displaystyle  q \alpha_y = q' \alpha_{y'} + O( \frac{P}{H} ) \hbox{ mod } p

for all {p \sim P}; hence by the Chinese remainder theorem

\displaystyle  q \alpha_y = q' \alpha_{y'} + O( \frac{P}{H} ) \hbox{ mod } \prod_{p \sim P} p.

In practice, in the regime {H = X^\theta} that we are considering, the modulus {\prod_{p \sim P} p} is so huge we can effectively ignore it (in the spirit of the Lefschetz principle); so let us pretend that we in fact have

\displaystyle  q \alpha_y = q' \alpha_{y'} + O( \frac{P}{H} ) \ \ \ \ \ (10)

whenever {y,y' \in [X,2X]} and {q,q' \sim P} obey (9).

Now let {k} be an integer to be chosen later, and suppose we have primes {p_1,\dots,p_k,q_1,\dots,q_k \sim P} such that the difference

\displaystyle  q = |p_1 \dots p_k - q_1 \dots q_k|

is small but non-zero. If {k} is chosen so that

\displaystyle  P^k \approx \frac{X}{H}

(where one is somewhat loose about what {\approx} means) then one can then find real numbers {x_1,\dots,x_k \sim X} such that

\displaystyle  \frac{x_j}{p_j} = \frac{x_{j+1}}{q_j} + O( \frac{H}{P} )

for {j=1,\dots,k}, with the convention that {x_{k+1} = x_1}. We then have

\displaystyle  p_j \alpha_{x_j} = q_j \alpha_{x_{j+1}} + O( \frac{P}{H} )

which telescopes to

\displaystyle  p_1 \dots p_k \alpha_{x_1} = q_1 \dots q_k \alpha_{x_1} + O( \frac{P^k}{H} )

and thus

\displaystyle  q \alpha_{x_1} = O( \frac{P^k}{H} )

and hence

\displaystyle  \alpha_{x_1} = O( \frac{P^k}{H} ) \approx O( \frac{X}{H^2} ).

In particular, for each {x \sim X}, we expect to be able to write

\displaystyle  \alpha_x = \frac{T_x}{x} + O( \frac{1}{H} )

for some {T_x = O( \frac{X^2}{H^2} )}. This quantity {T_x} can vary with {x}; but from (10) and a short calculation we see that

\displaystyle  T_y = T_{y'} + O( \frac{X}{H} )

whenever {y, y' \in [X,2X]} obey (9) for some {q,q' \sim P}.

Now imagine a “graph” in which the vertices are elements {y} of {[X,2X]}, and two elements {y,y'} are joined by an edge if (9) holds for some {q,q' \sim P}. Because of exponential sum estimates on {\sum_{q \sim P} q^{it}}, this graph turns out to essentially be an “expander” in the sense that any two vertices {y,y' \in [X,2X]} can be connected (in multiple ways) by fairly short paths in this graph (if one allows one to modify one of {y} or {y'} by {O(H)}). As a consequence, we can assume that this quantity {T_y} is essentially constant in {y} (cf. the application of the ergodic theorem in this previous blog post), thus we now have

\displaystyle  \alpha_x = \frac{T}{x} + O(\frac{1}{H} )

for most {x \in [X,2X]} and some {T = O(X^2/H^2)}. By Taylor expansion, this implies that

\displaystyle  \lambda(n) \approx n^{iT}

on {[x,x+H]} for most {x}, thus

\displaystyle  \int_X^{2X} |\sum_{x \leq n \leq x+H} \lambda(n) n^{-iT}|\ dx \gg HX.

But this can be shown to contradict the Matomäki-Radziwill theorem (because the multiplicative function {n \mapsto \lambda(n) n^{-iT}} is known to be non-pretentious).

I’ve just uploaded to the arXiv my paper “Embedding the Heisenberg group into a bounded dimensional Euclidean space with optimal distortion“, submitted to Revista Matematica Iberoamericana. This paper concerns the extent to which one can accurately embed the metric structure of the Heisenberg group

\displaystyle H := \begin{pmatrix} 1 & {\bf R} & {\bf R} \\ 0 & 1 & {\bf R} \\ 0 & 0 & 1 \end{pmatrix}

into Euclidean space, which we can write as {\{ [x,y,z]: x,y,z \in {\bf R} \}} with the notation

\displaystyle [x,y,z] := \begin{pmatrix} 1 & x & z \\ 0 & 1 & y \\ 0 & 0 & 1 \end{pmatrix}.

Here we give {H} the right-invariant Carnot-Carathéodory metric {d} coming from the right-invariant vector fields

\displaystyle X := \frac{\partial}{\partial x} + y \frac{\partial}{\partial z}; \quad Y := \frac{\partial}{\partial y}

but not from the commutator vector field

\displaystyle Z := [Y,X] = \frac{\partial}{\partial z}.

This gives {H} the geometry of a Carnot group. As observed by Semmes, it follows from the Carnot group differentiation theory of Pansu that there is no bilipschitz map from {(H,d)} to any Euclidean space {{\bf R}^D} or even to {\ell^2}, since such a map must be differentiable almost everywhere in the sense of Carnot groups, which in particular shows that the derivative map annihilate {Z} almost everywhere, which is incompatible with being bilipschitz.

On the other hand, if one snowflakes the Heisenberg group by replacing the metric {d} with {d^{1-\varepsilon}} for some {0 < \varepsilon < 1}, then it follows from the general theory of Assouad on embedding snowflaked metrics of doubling spaces that {(H,d^{1-\varepsilon})} may be embedded in a bilipschitz fashion into {\ell^2}, or even to {{\bf R}^{D_\varepsilon}} for some {D_\varepsilon} depending on {\varepsilon}.

Of course, the distortion of this bilipschitz embedding must degenerate in the limit {\varepsilon \rightarrow 0}. From the work of Austin-Naor-Tessera and Naor-Neiman it follows that {(H,d^{1-\varepsilon})} may be embedded into {\ell^2} with a distortion of {O( \varepsilon^{-1/2} )}, but no better. The Naor-Neiman paper also embeds {(H,d^{1-\varepsilon})} into a finite-dimensional space {{\bf R}^D} with {D} independent of {\varepsilon}, but at the cost of worsening the distortion to {O(\varepsilon^{-1})}. They then posed the question of whether this worsening of the distortion is necessary.

The main result of this paper answers this question in the negative:

Theorem 1 There exists an absolute constant {D} such that {(H,d^{1-\varepsilon})} may be embedded into {{\bf R}^D} in a bilipschitz fashion with distortion {O(\varepsilon^{-1/2})} for any {0 < \varepsilon \leq 1/2}.

To motivate the proof of this theorem, let us first present a bilipschitz map {\Phi: {\bf R} \rightarrow \ell^2} from the snowflaked line {({\bf R},d_{\bf R}^{1-\varepsilon})} (with {d_{\bf R}} being the usual metric on {{\bf R}}) into complex Hilbert space {\ell^2({\bf C})}. The map is given explicitly as a Weierstrass type function

\displaystyle \Phi(x) := \sum_{k \in {\bf Z}} 2^{-\varepsilon k} (\phi_k(x) - \phi_k(0))

where for each {k}, {\phi_k: {\bf R} \rightarrow \ell^2} is the function

\displaystyle \phi_k(x) := 2^k e^{2\pi i x / 2^k} e_k.

and {(e_k)_{k \in {\bf Z}}} are an orthonormal basis for {\ell^2({\bf C})}. The subtracting of the constant {\phi_k(0)} is purely in order to make the sum convergent as {k \rightarrow \infty}. If {x,y \in {\bf R}} are such that {2^{k_0-2} \leq d_{\bf R}(x,y) \leq 2^{k_0-1}} for some integer {k_0}, one can easily check the bounds

\displaystyle |\phi_k(x) - \phi_k(y)| \lesssim d_{\bf R}(x,y)^{(1-\varepsilon)} \min( 2^{-(1-\varepsilon) (k_0-k)}, 2^{-\varepsilon (k-k_0)} )

with the lower bound

\displaystyle |\phi_{k_0}(x) - \phi_{k_0}(y)| \gtrsim d_{\bf R}(x,y)^{(1-\varepsilon)}

at which point one finds that

\displaystyle d_{\bf R}(x,y)^{1-\varepsilon} \lesssim |\Phi(x) - \Phi(y)| \lesssim \varepsilon^{-1/2} d_{\bf R}(x,y)^{1-\varepsilon}

as desired.

The key here was that each function {\phi_k} oscillated at a different spatial scale {2^k}, and the functions were all orthogonal to each other (so that the upper bound involved a factor of {\varepsilon^{-1/2}} rather than {\varepsilon^{-1}}). One can replicate this example for the Heisenberg group without much difficulty. Indeed, if we let {\Gamma := \{ [a,b,c]: a,b,c \in {\bf Z} \}} be the discrete Heisenberg group, then the nilmanifold {H/\Gamma} is a three-dimensional smooth compact manifold; thus, by the Whitney embedding theorem, it smoothly embeds into {{\bf R}^6}. This gives a smooth immersion {\phi: H \rightarrow {\bf R}^6} which is {\Gamma}-automorphic in the sense that {\phi(p\gamma) = \phi(p)} for all {p \in H} and {\gamma \in \Gamma}. If one then defines {\phi_k: H \rightarrow \ell^2 \otimes {\bf R}^6} to be the function

\displaystyle \phi_k(p) := 2^k \phi( \delta_{2^{-k}}(p) ) \otimes e_k

where {\delta_\lambda: H \rightarrow H} is the scaling map

\displaystyle \delta_\lambda([x,y,z]) := [\lambda x, \lambda y, \lambda^2 z],

then one can repeat the previous arguments to obtain the required bilipschitz bounds

\displaystyle d(p,q)^{1-\varepsilon} \lesssim |\Phi(p) - \Phi(q) \lesssim \varepsilon^{-1/2} d(p,q)^{1-\varepsilon}

for the function

\displaystyle \Phi(p) :=\sum_{k \in {\bf Z}} 2^{-\varepsilon k} (\phi_k(p) - \phi_k(0)).

To adapt this construction to bounded dimension, the main obstruction was the requirement that the {\phi_k} took values in orthogonal subspaces. But if one works things out carefully, it is enough to require the weaker orthogonality requirement

\displaystyle B( \phi_{k_0}, \sum_{k>k_0} 2^{-\varepsilon(k-k_0)} \phi_k ) = 0

for all {k_0 \in {\bf Z}}, where {B(\phi, \psi): H \rightarrow {\bf R}^2} is the bilinear form

\displaystyle B(\phi,\psi) := (X \phi \cdot X \psi, Y \phi \cdot Y \psi ).

One can then try to construct the {\phi_k: H \rightarrow {\bf R}^D} for bounded dimension {D} by an iterative argument. After some standard reductions, the problem becomes this (roughly speaking): given a smooth, slowly varying function {\psi: H \rightarrow {\bf R}^{D}} whose derivatives obey certain quantitative upper and lower bounds, construct a smooth oscillating function {\phi: H \rightarrow {\bf R}^{D}}, whose derivatives also obey certain quantitative upper and lower bounds, which obey the equation

\displaystyle B(\phi,\psi) = 0. \ \ \ \ \ (1)


We view this as an underdetermined system of differential equations for {\phi} (two equations in {D} unknowns; after some reductions, our {D} can be taken to be the explicit value {36}). The trivial solution {\phi=0} to this equation will be inadmissible for our purposes due to the lower bounds we will require on {\phi} (in order to obtain the quantitative immersion property mentioned previously, as well as for a stronger “freeness” property that is needed to close the iteration). Because this construction will need to be iterated, it will be essential that the regularity control on {\phi} is the same as that on {\psi}; one cannot afford to “lose derivatives” when passing from {\psi} to {\phi}.

This problem has some formal similarities with the isometric embedding problem (discussed for instance in this previous post), which can be viewed as the problem of solving an equation of the form {Q(\phi,\phi) = g}, where {(M,g)} is a Riemannian manifold and {Q} is the bilinear form

\displaystyle Q(\phi,\psi)_{ij} = \partial_i \phi \cdot \partial_j \psi.

The isometric embedding problem also has the key obstacle that naive attempts to solve the equation {Q(\phi,\phi)=g} iteratively can lead to an undesirable “loss of derivatives” that prevents one from iterating indefinitely. This obstacle was famously resolved by the Nash-Moser iteration scheme in which one alternates between perturbatively adjusting an approximate solution to improve the residual error term, and mollifying the resulting perturbation to counteract the loss of derivatives. The current equation (1) differs in some key respects from the isometric embedding equation {Q(\phi,\phi)=g}, in particular being linear in the unknown field {\phi} rather than quadratic; nevertheless the key obstacle is the same, namely that naive attempts to solve either equation lose derivatives. Our approach to solving (1) was inspired by the Nash-Moser scheme; in retrospect, I also found similarities with Uchiyama’s constructive proof of the Fefferman-Stein decomposition theorem, discussed in this previous post (and in this recent one).

To motivate this iteration, we first express {B(\phi,\psi)} using the product rule in a form that does not place derivatives directly on the unknown {\phi}:

\displaystyle B(\phi,\psi) = \left( W(\phi \cdot W \psi) - \phi \cdot WW \psi\right)_{W = X,Y} \ \ \ \ \ (2)


This reveals that one can construct solutions {\phi} to (1) by solving the system of equations

\displaystyle \phi \cdot W \psi = \phi \cdot WW \psi = 0 \ \ \ \ \ (3)


for {W \in \{X, Y \}}. Because this system is zeroth order in {\phi}, this can easily be done by linear algebra (even in the presence of a forcing term {B(\phi,\psi)=F}) if one imposes a “freeness” condition (analogous to the notion of a free embedding in the isometric embedding problem) that {X \psi(p), Y \psi(p), XX \psi(p), YY \psi(p)} are linearly independent at each point {p}, which (together with some other technical conditions of a similar nature) one then adds to the list of upper and lower bounds required on {\psi} (with a related bound then imposed on {\phi}, in order to close the iteration). However, as mentioned previously, there is a “loss of derivatives” problem with this construction: due to the presence of the differential operators {W} in (3), a solution {\phi} constructed by this method can only be expected to have two degrees less regularity than {\psi} at best, which makes this construction unsuitable for iteration.

To get around this obstacle (which also prominently appears when solving (linearisations of) the isometric embedding equation {Q(\phi,\phi)=g}), we instead first construct a smooth, low-frequency solution {\phi_{\leq N_0} \colon H \rightarrow {\bf R}^{D}} to a low-frequency equation

\displaystyle B( \phi_{\leq N_0}, P_{\leq N_0} \psi ) = 0 \ \ \ \ \ (4)


where {P_{\leq N_0} \psi} is a mollification of {\psi} (of Littlewood-Paley type) applied at a small spatial scale {1/N_0} for some {N_0}, and then gradually relax the frequency cutoff {P_{\leq N_0}} to deform this low frequency solution {\phi_{\leq N_0}} to a solution {\phi} of the actual equation (1).

We will construct the low-frequency solution {\phi_{\leq N_0}} rather explicitly, using the Whitney embedding theorem to construct an initial oscillating map {f} into a very low dimensional space {{\bf R}^6}, composing it with a Veronese type embedding into a slightly larger dimensional space {{\bf R}^{27}} to obtain a required “freeness” property, and then composing further with a slowly varying isometry {U(p) \colon {\bf R}^{27} \rightarrow {\bf R}^{36}} depending on {P_{\leq N_0}} and constructed by a quantitative topological lemma (relying ultimately on the vanishing of the first few homotopy groups of high-dimensional spheres), in order to obtain the required orthogonality (4). (This sort of “quantitative null-homotopy” was first proposed by Gromov, with some recent progress on optimal bounds by Chambers-Manin-Weinberger and by Chambers-Dotterer-Manin-Weinberger, but we will not need these more advanced results here, as one can rely on the classical qualitative vanishing {\pi^k(S^d)=0} for {k < d} together with a compactness argument to obtain (ineffective) quantitative bounds, which suffice for this application).

To perform the deformation of {\phi_{\leq N_0}} into {\phi}, we must solve what is essentially the linearised equation

\displaystyle B( \dot \phi, \psi ) + B( \phi, \dot \psi ) = 0 \ \ \ \ \ (5)


of (1) when {\phi}, {\psi} (viewed as low frequency functions) are both being deformed at some rates {\dot \phi, \dot \psi} (which should be viewed as high frequency functions). To avoid losing derivatives, the magnitude of the deformation {\dot \phi} in {\phi} should not be significantly greater than the magnitude of the deformation {\dot \psi} in {\psi}, when measured in the same function space norms.

As before, if one directly solves the difference equation (5) using a naive application of (2) with {B(\phi,\dot \psi)} treated as a forcing term, one will lose at least one derivative of regularity when passing from {\dot \psi} to {\dot \phi}. However, observe that (2) (and the symmetry {B(\phi, \dot \psi) = B(\dot \psi,\phi)}) can be used to obtain the identity

\displaystyle B( \dot \phi, \psi ) + B( \phi, \dot \psi ) = \left( W(\dot \phi \cdot W \psi + \dot \psi \cdot W \phi) - (\dot \phi \cdot WW \psi + \dot \psi \cdot WW \phi)\right)_{W = X,Y} \ \ \ \ \ (6)


and then one can solve (5) by solving the system of equations

\displaystyle \dot \phi \cdot W \psi = - \dot \psi \cdot W \phi

for {W \in \{X,XX,Y,YY\}}. The key point here is that this system is zeroth order in both {\dot \phi} and {\dot \psi}, so one can solve this system without losing any derivatives when passing from {\dot \psi} to {\dot \phi}; compare this situation with that of the superficially similar system

\displaystyle \dot \phi \cdot W \psi = - \phi \cdot W \dot \psi

that one would obtain from naively linearising (3) without exploiting the symmetry of {B}. There is still however one residual “loss of derivatives” problem arising from the presence of a differential operator {W} on the {\phi} term, which prevents one from directly evolving this iteration scheme in time without losing regularity in {\phi}. It is here that we borrow the final key idea of the Nash-Moser scheme, which is to replace {\phi} by a mollified version {P_{\leq N} \phi} of itself (where the projection {P_{\leq N}} depends on the time parameter). This creates an error term in (5), but it turns out that this error term is quite small and smooth (being a “high-high paraproduct” of {\nabla \phi} and {\nabla\psi}, it ends up being far more regular than either {\phi} or {\psi}, even with the presence of the derivatives) and can be iterated away provided that the initial frequency cutoff {N_0} is large and the function {\psi} has a fairly high (but finite) amount of regularity (we will eventually use the Hölder space {C^{20,\alpha}} on the Heisenberg group to measure this).