You are currently browsing the category archive for the ‘Mathematics’ category.

As in the previous post, all computations here are at the formal level only.

In the previous blog post, the Euler equations for inviscid incompressible fluid flow were interpreted in a Lagrangian fashion, and then Noether’s theorem invoked to derive the known conservation laws for these equations. In a bit more detail: starting with Lagrangian space ${{\cal L} = ({\bf R}^n, \hbox{vol})}$ and Eulerian space ${{\cal E} = ({\bf R}^n, \eta, \hbox{vol})}$, we let ${M}$ be the space of volume-preserving, orientation-preserving maps ${\Phi: {\cal L} \rightarrow {\cal E}}$ from Lagrangian space to Eulerian space. Given a curve ${\Phi: {\bf R} \rightarrow M}$, we can define the Lagrangian velocity field ${\dot \Phi: {\bf R} \times {\cal L} \rightarrow T{\cal E}}$ as the time derivative of ${\Phi}$, and the Eulerian velocity field ${u := \dot \Phi \circ \Phi^{-1}: {\bf R} \times {\cal E} \rightarrow T{\cal E}}$. The volume-preserving nature of ${\Phi}$ ensures that ${u}$ is a divergence-free vector field:

$\displaystyle \nabla \cdot u = 0. \ \ \ \ \ (1)$

If we formally define the functional

$\displaystyle J[\Phi] := \frac{1}{2} \int_{\bf R} \int_{{\cal E}} |u(t,x)|^2\ dx dt = \frac{1}{2} \int_R \int_{{\cal L}} |\dot \Phi(t,x)|^2\ dx dt$

then one can show that the critical points of this functional (with appropriate boundary conditions) obey the Euler equations

$\displaystyle [\partial_t + u \cdot \nabla] u = - \nabla p$

$\displaystyle \nabla \cdot u = 0$

for some pressure field ${p: {\bf R} \times {\cal E} \rightarrow {\bf R}}$. As discussed in the previous post, the time translation symmetry of this functional yields conservation of the Hamiltonian

$\displaystyle \frac{1}{2} \int_{{\cal E}} |u(t,x)|^2\ dx = \frac{1}{2} \int_{{\cal L}} |\dot \Phi(t,x)|^2\ dx;$

the rigid motion symmetries of Eulerian space give conservation of the total momentum

$\displaystyle \int_{{\cal E}} u(t,x)\ dx$

and total angular momentum

$\displaystyle \int_{{\cal E}} x \wedge u(t,x)\ dx;$

and the diffeomorphism symmetries of Lagrangian space give conservation of circulation

$\displaystyle \int_{\Phi(\gamma)} u^*$

for any closed loop ${\gamma}$ in ${{\cal L}}$, or equivalently pointwise conservation of the Lagrangian vorticity ${\Phi^* \omega = \Phi^* du^*}$, where ${u^*}$ is the ${1}$-form associated with the vector field ${u}$ using the Euclidean metric ${\eta}$ on ${{\cal E}}$, with ${\Phi^*}$ denoting pullback by ${\Phi}$.

It turns out that one can generalise the above calculations. Given any self-adjoint operator ${A}$ on divergence-free vector fields ${u: {\cal E} \rightarrow {\bf R}}$, we can define the functional

$\displaystyle J_A[\Phi] := \frac{1}{2} \int_{\bf R} \int_{{\cal E}} u(t,x) \cdot A u(t,x)\ dx dt;$

as we shall see below the fold, critical points of this functional (with appropriate boundary conditions) obey the generalised Euler equations

$\displaystyle [\partial_t + u \cdot \nabla] Au + (\nabla u) \cdot Au= - \nabla \tilde p \ \ \ \ \ (2)$

$\displaystyle \nabla \cdot u = 0$

for some pressure field ${\tilde p: {\bf R} \times {\cal E} \rightarrow {\bf R}}$, where ${(\nabla u) \cdot Au}$ in coordinates is ${\partial_i u_j Au_j}$ with the usual summation conventions. (When ${A=1}$, ${(\nabla u) \cdot Au = \nabla(\frac{1}{2} |u|^2)}$, and this term can be absorbed into the pressure ${\tilde p}$, and we recover the usual Euler equations.) Time translation symmetry then gives conservation of the Hamiltonian

$\displaystyle \frac{1}{2} \int_{{\cal E}} u(t,x) \cdot A u(t,x)\ dx.$

If the operator ${A}$ commutes with rigid motions on ${{\cal E}}$, then we have conservation of total momentum

$\displaystyle \int_{{\cal E}} Au(t,x)\ dx$

and total angular momentum

$\displaystyle \int_{{\cal E}} x \wedge Au(t,x)\ dx,$

and the diffeomorphism symmetries of Lagrangian space give conservation of circulation

$\displaystyle \int_{\Phi(\gamma)} (Au)^*$

or pointwise conservation of the Lagrangian vorticity ${\Phi^* \theta := \Phi^* d(Au)^*}$. These applications of Noether’s theorem proceed exactly as the previous post; we leave the details to the interested reader.

One particular special case of interest arises in two dimensions ${n=2}$, when ${A}$ is the inverse derivative ${A = |\nabla|^{-1} = (-\Delta)^{-1/2}}$. The vorticity ${\theta = d(Au)^*}$ is a ${2}$-form, which in the two-dimensional setting may be identified with a scalar. In coordinates, if we write ${u = (u_1,u_2)}$, then

$\displaystyle \theta = \partial_{x_1} |\nabla|^{-1} u_2 - \partial_{x_2} |\nabla|^{-1} u_1.$

Since ${u}$ is also divergence-free, we may therefore write

$\displaystyle u = (- \partial_{x_2} \psi, \partial_{x_1} \psi )$

where the stream function ${\psi}$ is given by the formula

$\displaystyle \psi = |\nabla|^{-1} \theta.$

If we take the curl of the generalised Euler equation (2), we obtain (after some computation) the surface quasi-geostrophic equation

$\displaystyle [\partial_t + u \cdot \nabla] \theta = 0 \ \ \ \ \ (3)$

$\displaystyle u = (-\partial_{x_2} |\nabla|^{-1} \theta, \partial_{x_1} |\nabla|^{-1} \theta).$

This equation has strong analogies with the three-dimensional incompressible Euler equations, and can be viewed as a simplified model for that system; see this paper of Constantin, Majda, and Tabak for details.

Now we can specialise the general conservation laws derived previously to this setting. The conserved Hamiltonian is

$\displaystyle \frac{1}{2} \int_{{\bf R}^2} u\cdot |\nabla|^{-1} u\ dx = \frac{1}{2} \int_{{\bf R}^2} \theta \psi\ dx = \frac{1}{2} \int_{{\bf R}^2} \theta |\nabla|^{-1} \theta\ dx$

(a law previously observed for this equation in the abovementioned paper of Constantin, Majda, and Tabak). As ${A}$ commutes with rigid motions, we also have (formally, at least) conservation of momentum

$\displaystyle \int_{{\bf R}^2} Au\ dx$

(which up to trivial transformations is also expressible in impulse form as ${\int_{{\bf R}^2} \theta x\ dx}$, after integration by parts), and conservation of angular momentum

$\displaystyle \int_{{\bf R}^2} x \wedge Au\ dx$

(which up to trivial transformations is ${\int_{{\bf R}^2} \theta |x|^2\ dx}$). Finally, diffeomorphism invariance gives pointwise conservation of Lagrangian vorticity ${\Phi^* \theta}$, thus ${\theta}$ is transported by the flow (which is also evident from (3). In particular, all integrals of the form ${\int F(\theta)\ dx}$ for a fixed function ${F}$ are conserved by the flow.

Throughout this post, we will work only at the formal level of analysis, ignoring issues of convergence of integrals, justifying differentiation under the integral sign, and so forth. (Rigorous justification of the conservation laws and other identities arising from the formal manipulations below can usually be established in an a posteriori fashion once the identities are in hand, without the need to rigorously justify the manipulations used to come up with these identities).

It is a remarkable fact in the theory of differential equations that many of the ordinary and partial differential equations that are of interest (particularly in geometric PDE, or PDE arising from mathematical physics) admit a variational formulation; thus, a collection ${\Phi: \Omega \rightarrow M}$ of one or more fields on a domain ${\Omega}$ taking values in a space ${M}$ will solve the differential equation of interest if and only if ${\Phi}$ is a critical point to the functional

$\displaystyle J[\Phi] := \int_\Omega L( x, \Phi(x), D\Phi(x) )\ dx \ \ \ \ \ (1)$

involving the fields ${\Phi}$ and their first derivatives ${D\Phi}$, where the Lagrangian ${L: \Sigma \rightarrow {\bf R}}$ is a function on the vector bundle ${\Sigma}$ over ${\Omega \times M}$ consisting of triples ${(x, q, \dot q)}$ with ${x \in \Omega}$, ${q \in M}$, and ${\dot q: T_x \Omega \rightarrow T_q M}$ a linear transformation; we also usually keep the boundary data of ${\Phi}$ fixed in case ${\Omega}$ has a non-trivial boundary, although we will ignore these issues here. (We also ignore the possibility of having additional constraints imposed on ${\Phi}$ and ${D\Phi}$, which require the machinery of Lagrange multipliers to deal with, but which will only serve as a distraction for the current discussion.) It is common to use local coordinates to parameterise ${\Omega}$ as ${{\bf R}^d}$ and ${M}$ as ${{\bf R}^n}$, in which case ${\Sigma}$ can be viewed locally as a function on ${{\bf R}^d \times {\bf R}^n \times {\bf R}^{dn}}$.

Example 1 (Geodesic flow) Take ${\Omega = [0,1]}$ and ${M = (M,g)}$ to be a Riemannian manifold, which we will write locally in coordinates as ${{\bf R}^n}$ with metric ${g_{ij}(q)}$ for ${i,j=1,\dots,n}$. A geodesic ${\gamma: [0,1] \rightarrow M}$ is then a critical point (keeping ${\gamma(0),\gamma(1)}$ fixed) of the energy functional

$\displaystyle J[\gamma] := \frac{1}{2} \int_0^1 g_{\gamma(t)}( D\gamma(t), D\gamma(t) )\ dt$

or in coordinates (ignoring coordinate patch issues, and using the usual summation conventions)

$\displaystyle J[\gamma] = \frac{1}{2} \int_0^1 g_{ij}(\gamma(t)) \dot \gamma^i(t) \dot \gamma^j(t)\ dt.$

As discussed in this previous post, both the Euler equations for rigid body motion, and the Euler equations for incompressible inviscid flow, can be interpreted as geodesic flow (though in the latter case, one has to work really formally, as the manifold ${M}$ is now infinite dimensional).

More generally, if ${\Omega = (\Omega,h)}$ is itself a Riemannian manifold, which we write locally in coordinates as ${{\bf R}^d}$ with metric ${h_{ab}(x)}$ for ${a,b=1,\dots,d}$, then a harmonic map ${\Phi: \Omega \rightarrow M}$ is a critical point of the energy functional

$\displaystyle J[\Phi] := \frac{1}{2} \int_\Omega h(x) \otimes g_{\gamma(x)}( D\gamma(x), D\gamma(x) )\ dh(x)$

or in coordinates (again ignoring coordinate patch issues)

$\displaystyle J[\Phi] = \frac{1}{2} \int_{{\bf R}^d} h_{ab}(x) g_{ij}(\Phi(x)) (\partial_a \Phi^i(x)) (\partial_b \Phi^j(x))\ \sqrt{\det(h(x))}\ dx.$

If we replace the Riemannian manifold ${\Omega}$ by a Lorentzian manifold, such as Minkowski space ${{\bf R}^{1+3}}$, then the notion of a harmonic map is replaced by that of a wave map, which generalises the scalar wave equation (which corresponds to the case ${M={\bf R}}$).

Example 2 (${N}$-particle interactions) Take ${\Omega = {\bf R}}$ and ${M = {\bf R}^3 \otimes {\bf R}^N}$; then a function ${\Phi: \Omega \rightarrow M}$ can be interpreted as a collection of ${N}$ trajectories ${q_1,\dots,q_N: {\bf R} \rightarrow {\bf R}^3}$ in space, which we give a physical interpretation as the trajectories of ${N}$ particles. If we assign each particle a positive mass ${m_1,\dots,m_N > 0}$, and also introduce a potential energy function ${V: M \rightarrow {\bf R}}$, then it turns out that Newton’s laws of motion ${F=ma}$ in this context (with the force ${F_i}$ on the ${i^{th}}$ particle being given by the conservative force ${-\nabla_{q_i} V}$) are equivalent to the trajectories ${q_1,\dots,q_N}$ being a critical point of the action functional

$\displaystyle J[\Phi] := \int_{\bf R} \sum_{i=1}^N \frac{1}{2} m_i |\dot q_i(t)|^2 - V( q_1(t),\dots,q_N(t) )\ dt.$

Formally, if ${\Phi = \Phi_0}$ is a critical point of a functional ${J[\Phi]}$, this means that

$\displaystyle \frac{d}{ds} J[ \Phi[s] ]|_{s=0} = 0$

whenever ${s \mapsto \Phi[s]}$ is a (smooth) deformation with ${\Phi[0]=\Phi_0}$ (and with ${\Phi[s]}$ respecting whatever boundary conditions are appropriate). Interchanging the derivative and integral, we (formally, at least) arrive at

$\displaystyle \int_\Omega \frac{d}{ds} L( x, \Phi[s](x), D\Phi[s](x) )|_{s=0}\ dx = 0. \ \ \ \ \ (2)$

Write ${\delta \Phi := \frac{d}{ds} \Phi[s]|_{s=0}}$ for the infinitesimal deformation of ${\Phi_0}$. By the chain rule, ${\frac{d}{ds} L( x, \Phi[s](x), D\Phi[s](x) )|_{s=0}}$ can be expressed in terms of ${x, \Phi_0(x), \delta \Phi(x), D\Phi_0(x), D \delta \Phi(x)}$. In coordinates, we have

$\displaystyle \frac{d}{ds} L( x, \Phi[s](x), D\Phi[s](x) )|_{s=0} = \delta \Phi^i(x) L_{q^i}(x,\Phi_0(x), D\Phi_0(x)) \ \ \ \ \ (3)$

$\displaystyle + \partial_{x^a} \delta \Phi^i(x) L_{\partial_{x^a} q^i} (x,\Phi_0(x), D\Phi_0(x)),$

where we parameterise ${\Sigma}$ by ${x, (q^i)_{i=1,\dots,n}, (\partial_{x^a} q^i)_{a=1,\dots,d; i=1,\dots,n}}$, and we use subscripts on ${L}$ to denote partial derivatives in the various coefficients. (One can of course work in a coordinate-free manner here if one really wants to, but the notation becomes a little cumbersome due to the need to carefully split up the tangent space of ${\Sigma}$, and we will not do so here.) Thus we can view (2) as an integral identity that asserts the vanishing of a certain integral, whose integrand involves ${x, \Phi_0(x), \delta \Phi(x), D\Phi_0(x), D \delta \Phi(x)}$, where ${\delta \Phi}$ vanishes at the boundary but is otherwise unconstrained.

A general rule of thumb in PDE and calculus of variations is that whenever one has an integral identity of the form ${\int_\Omega F(x)\ dx = 0}$ for some class of functions ${F}$ that vanishes on the boundary, then there must be an associated differential identity ${F = \hbox{div} X}$ that justifies this integral identity through Stokes’ theorem. This rule of thumb helps explain why integration by parts is used so frequently in PDE to justify integral identities. The rule of thumb can fail when one is dealing with “global” or “cohomologically non-trivial” integral identities of a topological nature, such as the Gauss-Bonnet or Kazhdan-Warner identities, but is quite reliable for “local” or “cohomologically trivial” identities, such as those arising from calculus of variations.

In any case, if we apply this rule to (2), we expect that the integrand ${\frac{d}{ds} L( x, \Phi[s](x), D\Phi[s](x) )|_{s=0}}$ should be expressible as a spatial divergence. This is indeed the case:

Proposition 1 (Formal) Let ${\Phi = \Phi_0}$ be a critical point of the functional ${J[\Phi]}$ defined in (1). Then for any deformation ${s \mapsto \Phi[s]}$ with ${\Phi[0] = \Phi_0}$, we have

$\displaystyle \frac{d}{ds} L( x, \Phi[s](x), D\Phi[s](x) )|_{s=0} = \hbox{div} X \ \ \ \ \ (4)$

where ${X}$ is the vector field that is expressible in coordinates as

$\displaystyle X^a := \delta \Phi^i(x) L_{\partial_{x^a} q^i}(x,\Phi_0(x), D\Phi_0(x)). \ \ \ \ \ (5)$

Proof: Comparing (4) with (3), we see that the claim is equivalent to the Euler-Lagrange equation

$\displaystyle L_{q^i}(x,\Phi_0(x), D\Phi_0(x)) - \partial_{x^a} L_{\partial_{x^a} q^i}(x,\Phi_0(x), D\Phi_0(x)) = 0. \ \ \ \ \ (6)$

The same computation, together with an integration by parts, shows that (2) may be rewritten as

$\displaystyle \int_\Omega ( L_{q^i}(x,\Phi_0(x), D\Phi_0(x)) - \partial_{x^a} L_{\partial_{x^a} q^i}(x,\Phi_0(x), D\Phi_0(x)) ) \delta \Phi^i(x)\ dx = 0.$

Since ${\delta \Phi^i(x)}$ is unconstrained on the interior of ${\Omega}$, the claim (6) follows (at a formal level, at least). $\Box$

Many variational problems also enjoy one-parameter continuous symmetries: given any field ${\Phi_0}$ (not necessarily a critical point), one can place that field in a one-parameter family ${s \mapsto \Phi[s]}$ with ${\Phi[0] = \Phi_0}$, such that

$\displaystyle J[ \Phi[s] ] = J[ \Phi[0] ]$

for all ${s}$; in particular,

$\displaystyle \frac{d}{ds} J[ \Phi[s] ]|_{s=0} = 0,$

which can be written as (2) as before. Applying the previous rule of thumb, we thus expect another divergence identity

$\displaystyle \frac{d}{ds} L( x, \Phi[s](x), D\Phi[s](x) )|_{s=0} = \hbox{div} Y \ \ \ \ \ (7)$

whenever ${s \mapsto \Phi[s]}$ arises from a continuous one-parameter symmetry. This expectation is indeed the case in many examples. For instance, if the spatial domain ${\Omega}$ is the Euclidean space ${{\bf R}^d}$, and the Lagrangian (when expressed in coordinates) has no direct dependence on the spatial variable ${x}$, thus

$\displaystyle L( x, \Phi(x), D\Phi(x) ) = L( \Phi(x), D\Phi(x) ), \ \ \ \ \ (8)$

then we obtain ${d}$ translation symmetries

$\displaystyle \Phi[s](x) := \Phi(x - s e^a )$

for ${a=1,\dots,d}$, where ${e^1,\dots,e^d}$ is the standard basis for ${{\bf R}^d}$. For a fixed ${a}$, the left-hand side of (7) then becomes

$\displaystyle \frac{d}{ds} L( \Phi(x-se^a), D\Phi(x-se^a) )|_{s=0} = -\partial_{x^a} [ L( \Phi(x), D\Phi(x) ) ]$

$\displaystyle = \hbox{div} Y$

where ${Y(x) = - L(\Phi(x), D\Phi(x)) e^a}$. Another common type of symmetry is a pointwise symmetry, in which

$\displaystyle L( x, \Phi[s](x), D\Phi[s](x) ) = L( x, \Phi[0](x), D\Phi[0](x) ) \ \ \ \ \ (9)$

for all ${x}$, in which case (7) clearly holds with ${Y=0}$.

If we subtract (4) from (7), we obtain the celebrated theorem of Noether linking symmetries with conservation laws:

Theorem 2 (Noether’s theorem) Suppose that ${\Phi_0}$ is a critical point of the functional (1), and let ${\Phi[s]}$ be a one-parameter continuous symmetry with ${\Phi[0] = \Phi_0}$. Let ${X}$ be the vector field in (5), and let ${Y}$ be the vector field in (7). Then we have the pointwise conservation law

$\displaystyle \hbox{div}(X-Y) = 0.$

In particular, for one-dimensional variational problems, in which ${\Omega \subset {\bf R}}$, we have the conservation law ${(X-Y)(t) = (X-Y)(0)}$ for all ${t \in \Omega}$ (assuming of course that ${\Omega}$ is connected and contains ${0}$).

Noether’s theorem gives a systematic way to locate conservation laws for solutions to variational problems. For instance, if ${\Omega \subset {\bf R}}$ and the Lagrangian has no explicit time dependence, thus

$\displaystyle L(t, \Phi(t), \dot \Phi(t)) = L(\Phi(t), \dot \Phi(t)),$

then by using the time translation symmetry ${\Phi[s](t) := \Phi(t-s)}$, we have

$\displaystyle Y(t) = - L( \Phi(t), \dot\Phi(t) )$

as discussed previously, whereas we have ${\delta \Phi(t) = - \dot \Phi(t)}$, and hence by (5)

$\displaystyle X(t) := - \dot \Phi^i(x) L_{\dot q^i}(\Phi(t), \dot \Phi(t)),$

and so Noether’s theorem gives conservation of the Hamiltonian

$\displaystyle H(t) := \dot \Phi^i(x) L_{\dot q^i}(\Phi(t), \dot \Phi(t))- L(\Phi(t), \dot \Phi(t)). \ \ \ \ \ (10)$

For instance, for geodesic flow, the Hamiltonian works out to be

$\displaystyle H(t) = \frac{1}{2} g_{ij}(\gamma(t)) \dot \gamma^i(t) \dot \gamma^j(t),$

so we see that the speed of the geodesic is conserved over time.

For pointwise symmetries (9), ${Y}$ vanishes, and so Noether’s theorem simplifies to ${\hbox{div} X = 0}$; in the one-dimensional case ${\Omega \subset {\bf R}}$, we thus see from (5) that the quantity

$\displaystyle \delta \Phi^i(t) L_{\dot q^i}(t,\Phi_0(t), \dot \Phi_0(t)) \ \ \ \ \ (11)$

is conserved in time. For instance, for the ${N}$-particle system in Example 2, if we have the translation invariance

$\displaystyle V( q_1 + h, \dots, q_N + h ) = V( q_1, \dots, q_N )$

for all ${q_1,\dots,q_N,h \in {\bf R}^3}$, then we have the pointwise translation symmetry

$\displaystyle q_i[s](t) := q_i(t) + s e^j$

for all ${i=1,\dots,N}$, ${s \in{\bf R}}$ and some ${j=1,\dots,3}$, in which case ${\dot q_i(t) = e^j}$, and the conserved quantity (11) becomes

$\displaystyle \sum_{i=1}^n m_i \dot q_i^j(t);$

as ${j=1,\dots,3}$ was arbitrary, this establishes conservation of the total momentum

$\displaystyle \sum_{i=1}^n m_i \dot q_i(t).$

Similarly, if we have the rotation invariance

$\displaystyle V( R q_1, \dots, Rq_N ) = V( q_1, \dots, q_N )$

for any ${q_1,\dots,q_N \in {\bf R}^3}$ and ${R \in SO(3)}$, then we have the pointwise rotation symmetry

$\displaystyle q_i[s](t) := \exp( s A ) q_i(t)$

for any skew-symmetric real ${3 \times 3}$ matrix ${A}$, in which case ${\dot q_i(t) = A q_i(t)}$, and the conserved quantity (11) becomes

$\displaystyle \sum_{i=1}^n m_i \langle A q_i(t), \dot q_i(t) \rangle;$

since ${A}$ is an arbitrary skew-symmetric matrix, this establishes conservation of the total angular momentum

$\displaystyle \sum_{i=1}^n m_i q_i(t) \wedge \dot q_i(t).$

Below the fold, I will describe how Noether’s theorem can be used to locate all of the conserved quantities for the Euler equations of inviscid fluid flow, discussed in this previous post, by interpreting that flow as geodesic flow in an infinite dimensional manifold.

The Euler equations for incompressible inviscid fluids may be written as

$\displaystyle \partial_t u + (u \cdot \nabla) u = -\nabla p$

$\displaystyle \nabla \cdot u = 0$

where ${u: [0,T] \times {\bf R}^n \rightarrow {\bf R}^n}$ is the velocity field, and ${p: [0,T] \times {\bf R}^n \rightarrow {\bf R}}$ is the pressure field. To avoid technicalities we will assume that both fields are smooth, and that ${u}$ is bounded. We will take the dimension ${n}$ to be at least two, with the three-dimensional case ${n=3}$ being of course especially interesting.

The Euler equations are the inviscid limit of the Navier-Stokes equations; as discussed in my previous post, one potential route to establishing finite time blowup for the latter equations when ${n=3}$ is to be able to construct “computers” solving the Euler equations, which generate smaller replicas of themselves in a noise-tolerant manner (as the viscosity term in the Navier-Stokes equation is to be viewed as perturbative noise).

Perhaps the most prominent obstacles to this route are the conservation laws for the Euler equations, which limit the types of final states that a putative computer could reach from a given initial state. Most famously, we have the conservation of energy

$\displaystyle \int_{{\bf R}^n} |u|^2\ dx \ \ \ \ \ (1)$

(assuming sufficient decay of the velocity field at infinity); thus for instance it would not be possible for a computer to generate a replica of itself which had greater total energy than the initial computer. This by itself is not a fatal obstruction (in this paper of mine, I constructed such a “computer” for an averaged Euler equation that still obeyed energy conservation). However, there are other conservation laws also, for instance in three dimensions one also has conservation of helicity

$\displaystyle \int_{{\bf R}^3} u \cdot (\nabla \times u)\ dx \ \ \ \ \ (2)$

and (formally, at least) one has conservation of momentum

$\displaystyle \int_{{\bf R}^3} u\ dx$

and angular momentum

$\displaystyle \int_{{\bf R}^3} x \times u\ dx$

(although, as we shall discuss below, due to the slow decay of ${u}$ at infinity, these integrals have to either be interpreted in a principal value sense, or else replaced with their vorticity-based formulations, namely impulse and moment of impulse). Total vorticity

$\displaystyle \int_{{\bf R}^3} \nabla \times u\ dx$

is also conserved, although it turns out in three dimensions that this quantity vanishes when one assumes sufficient decay at infinity. Then there are the pointwise conservation laws: the vorticity and the volume form are both transported by the fluid flow, while the velocity field (when viewed as a covector) is transported up to a gradient; among other things, this gives the transport of vortex lines as well as Kelvin’s circulation theorem, and can also be used to deduce the helicity conservation law mentioned above. In my opinion, none of these laws actually prohibits a self-replicating computer from existing within the laws of ideal fluid flow, but they do significantly complicate the task of actually designing such a computer, or of the basic “gates” that such a computer would consist of.

Below the fold I would like to record and derive all the conservation laws mentioned above, which to my knowledge essentially form the complete set of known conserved quantities for the Euler equations. The material here (although not the notation) is drawn from this text of Majda and Bertozzi.

This is the ninth thread for the Polymath8b project to obtain new bounds for the quantity

$\displaystyle H_m := \liminf_{n \rightarrow\infty} (p_{n+m} - p_n),$

either for small values of ${m}$ (in particular ${m=1,2}$) or asymptotically as ${m \rightarrow \infty}$. The previous thread may be found here. The currently best known bounds on ${H_m}$ can be found at the wiki page.

The focus is now on bounding ${H_1}$ unconditionally (in particular, without resorting to the Elliott-Halberstam conjecture or its generalisations). We can bound ${H_1 \leq H(k)}$ whenever one can find a symmetric square-integrable function ${F}$ supported on the simplex ${{\cal R}_k := \{ (t_1,\dots,t_k) \in [0,+\infty)^k: t_1+\dots+t_k \leq 1 \}}$ such that

$\displaystyle k \int_{{\cal R}_{k-1}} (\int_0^\infty F(t_1,\dots,t_k)\ dt_k)^2\ dt_1 \dots dt_{k-1} \ \ \ \ \ (1)$

$\displaystyle > 4 \int_{{\cal R}_{k}} F(t_1,\dots,t_k)^2\ dt_1 \dots dt_{k-1} dt_k.$

Our strategy for establishing this has been to restrict ${F}$ to be a linear combination of symmetrised monomials ${[t_1^{a_1} \dots t_k^{a_k}]_{sym}}$ (restricted of course to ${{\cal R}_k}$), where the degree ${a_1+\dots+a_k}$ is small; actually, it seems convenient to work with the slightly different basis ${(1-t_1-\dots-t_k)^i [t_1^{a_1} \dots t_k^{a_k}]_{sym}}$ where the ${a_i}$ are restricted to be even. The criterion (1) then becomes a large quadratic program with explicit but complicated rational coefficients. This approach has lowered ${k}$ down to ${54}$, which led to the bound ${H_1 \leq 270}$.

Actually, we know that the more general criterion

$\displaystyle k \int_{(1-\epsilon) \cdot {\cal R}_{k-1}} (\int_0^\infty F(t_1,\dots,t_k)\ dt_k)^2\ dt_1 \dots dt_{k-1} \ \ \ \ \ (2)$

$\displaystyle > 4 \int F(t_1,\dots,t_k)^2\ dt_1 \dots dt_{k-1} dt_k$

will suffice, whenever ${0 \leq \epsilon < 1}$ and ${F}$ is supported now on ${2 \cdot {\cal R}_k}$ and obeys the vanishing marginal condition ${\int_0^\infty F(t_1,\dots,t_k)\ dt_k = 0}$ whenever ${t_1+\dots+t_k > 1+\epsilon}$. The latter is in particular obeyed when ${F}$ is supported on ${(1+\epsilon) \cdot {\cal R}_k}$. A modification of the preceding strategy has lowered ${k}$ slightly to ${53}$, giving the bound ${H_1 \leq 264}$ which is currently our best record.

However, the quadratic programs here have become extremely large and slow to run, and more efficient algorithms (or possibly more computer power) may be required to advance further.

This is the eighth thread for the Polymath8b project to obtain new bounds for the quantity

$\displaystyle H_m := \liminf_{n \rightarrow\infty} (p_{n+m} - p_n),$

either for small values of ${m}$ (in particular ${m=1,2}$) or asymptotically as ${m \rightarrow \infty}$. The previous thread may be found here. The currently best known bounds on ${H_m}$ can be found at the wiki page.

The big news since the last thread is that we have managed to obtain the (sieve-theoretically) optimal bound of ${H_1 \leq 6}$ assuming the generalised Elliott-Halberstam conjecture (GEH), which pretty much closes off that part of the story. Unconditionally, our bound on ${H_1}$ is still ${H_1 \leq 270}$. This bound was obtained using the “vanilla” Maynard sieve, in which the cutoff ${F}$ was supported in the original simplex ${\{ t_1+\dots+t_k \leq 1\}}$, and only Bombieri-Vinogradov was used. In principle, we can enlarge the sieve support a little bit further now; for instance, we can enlarge to ${\{ t_1+\dots+t_k \leq \frac{k}{k-1} \}}$, but then have to shrink the J integrals to ${\{t_1+\dots+t_{k-1} \leq 1-\epsilon\}}$, provided that the marginals vanish for ${\{ t_1+\dots+t_{k-1} \geq 1+\epsilon \}}$. However, we do not yet know how to numerically work with these expanded problems.

Given the substantial progress made so far, it looks like we are close to the point where we should declare victory and write up the results (though we should take one last look to see if there is any room to improve the ${H_1 \leq 270}$ bounds). There is actually a fair bit to write up:

• Improvements to the Maynard sieve (pushing beyond the simplex, the epsilon trick, and pushing beyond the cube);
• Asymptotic bounds for ${M_k}$ and hence ${H_m}$;
• Explicit bounds for ${H_m, m \geq 2}$ (using the Polymath8a results)
• ${H_1 \leq 270}$;
• ${H_1 \leq 6}$ on GEH (and parity obstructions to any further improvement).

I will try to create a skeleton outline of such a paper in the Polymath8 Dropbox folder soon. It shouldn’t be nearly as big as the Polymath8a paper, but it will still be quite sizeable.

There are multiple purposes to this blog post.

The first purpose is to announce the uploading of the paper “New equidistribution estimates of Zhang type, and bounded gaps between primes” by D.H.J. Polymath, which is the main output of the Polymath8a project on bounded gaps between primes, to the arXiv, and to describe the main results of this paper below the fold.

The second purpose is to roll over the previous thread on all remaining Polymath8a-related matters (e.g. updates on the submission status of the paper) to a fresh thread. (Discussion of the ongoing Polymath8b project is however being kept on a separate thread, to try to reduce confusion.)

The final purpose of this post is to coordinate the writing of a retrospective article on the Polymath8 experience, which has been solicited for the Newsletter of the European Mathematical Society. I suppose that this could encompass both the Polymath8a and Polymath8b projects, even though the second one is still ongoing (but I think we will soon be entering the endgame there). I think there would be two main purposes of such a retrospective article. The first one would be to tell a story about the process of conducting mathematical research, rather than just describe the outcome of such research; this is an important aspect of the subject which is given almost no attention in most mathematical writing, and it would be good to be able to capture some sense of this process while memories are still relatively fresh. The other would be to draw some tentative conclusions with regards to what the strengths and weaknesses of a Polymath project are, and how appropriate such a format would be for other mathematical problems than bounded gaps between primes. In my opinion, the bounded gaps problem had some fairly unique features that made it particularly amenable to a Polymath project, such as (a) a high level of interest amongst the mathematical community in the problem; (b) a very focused objective (“improve ${H}$!”), which naturally provided an obvious metric to measure progress; (c) the modular nature of the project, which allowed for people to focus on one aspect of the problem only, and still make contributions to the final goal; and (d) a very reasonable level of ambition (for instance, we did not attempt to prove the twin prime conjecture, which in my opinion would make a terrible Polymath project at our current level of mathematical technology). This is not an exhaustive list of helpful features of the problem; I would welcome other diagnoses of the project by other participants.

With these two objectives in mind, I propose a format for the retrospective article consisting of a brief introduction to the polymath concept in general and the polymath8 project in particular, followed by a collection of essentially independent contributions by different participants on their own experiences and thoughts. Finally we could have a conclusion section in which we make some general remarks on the polymath project (such as the remarks above). I’ve started a dropbox subfolder for this article (currently in a very skeletal outline form only), and will begin writing a section on my own experiences; other participants are of course encouraged to add their own sections (it is probably best to create separate files for these, and then input them into the main file retrospective.tex, to reduce edit conflicts. If there are participants who wish to contribute but do not currently have access to the Dropbox folder, please email me and I will try to have you added (or else you can supply your thoughts by email, or in the comments to this post; we may have a section for shorter miscellaneous comments from more casual participants, for people who don’t wish to write a lengthy essay on the subject).

As for deadlines, the EMS Newsletter would like a submitted article by mid-April in order to make the June issue, but in the worst case, it will just be held over until the issue after that.

I’ve just uploaded to the arXiv the paper “Finite time blowup for an averaged three-dimensional Navier-Stokes equation“, submitted to J. Amer. Math. Soc.. The main purpose of this paper is to formalise the “supercriticality barrier” for the global regularity problem for the Navier-Stokes equation, which roughly speaking asserts that it is not possible to establish global regularity by any “abstract” approach which only uses upper bound function space estimates on the nonlinear part of the equation, combined with the energy identity. This is done by constructing a modification of the Navier-Stokes equations with a nonlinearity that obeys essentially all of the function space estimates that the true Navier-Stokes nonlinearity does, and which also obeys the energy identity, but for which one can construct solutions that blow up in finite time. Results of this type had been previously established by Montgomery-Smith, Gallagher-Paicu, and Li-Sinai for variants of the Navier-Stokes equation without the energy identity, and by Katz-Pavlovic and by Cheskidov for dyadic analogues of the Navier-Stokes equations in five and higher dimensions that obeyed the energy identity (see also the work of Plechac and Sverak and of Hou and Lei that also suggest blowup for other Navier-Stokes type models obeying the energy identity in five and higher dimensions), but to my knowledge this is the first blowup result for a Navier-Stokes type equation in three dimensions that also obeys the energy identity. Intriguingly, the method of proof in fact hints at a possible route to establishing blowup for the true Navier-Stokes equations, which I am now increasingly inclined to believe is the case (albeit for a very small set of initial data).

To state the results more precisely, recall that the Navier-Stokes equations can be written in the form

$\displaystyle \partial_t u + (u \cdot \nabla) u = \nu \Delta u + \nabla p$

for a divergence-free velocity field ${u}$ and a pressure field ${p}$, where ${\nu>0}$ is the viscosity, which we will normalise to be one. We will work in the non-periodic setting, so the spatial domain is ${{\bf R}^3}$, and for sake of exposition I will not discuss matters of regularity or decay of the solution (but we will always be working with strong notions of solution here rather than weak ones). Applying the Leray projection ${P}$ to divergence-free vector fields to this equation, we can eliminate the pressure, and obtain an evolution equation

$\displaystyle \partial_t u = \Delta u + B(u,u) \ \ \ \ \ (1)$

purely for the velocity field, where ${B}$ is a certain bilinear operator on divergence-free vector fields (specifically, ${B(u,v) = -\frac{1}{2} P( (u \cdot \nabla) v + (v \cdot \nabla) u)}$. The global regularity problem for Navier-Stokes is then equivalent to the global regularity problem for the evolution equation (1).

An important feature of the bilinear operator ${B}$ appearing in (1) is the cancellation law

$\displaystyle \langle B(u,u), u \rangle = 0$

(using the ${L^2}$ inner product on divergence-free vector fields), which leads in particular to the fundamental energy identity

$\displaystyle \frac{1}{2} \int_{{\bf R}^3} |u(T,x)|^2\ dx + \int_0^T \int_{{\bf R}^3} |\nabla u(t,x)|^2\ dx dt = \frac{1}{2} \int_{{\bf R}^3} |u(0,x)|^2\ dx.$

This identity (and its consequences) provide essentially the only known a priori bound on solutions to the Navier-Stokes equations from large data and arbitrary times. Unfortunately, as discussed in this previous post, the quantities controlled by the energy identity are supercritical with respect to scaling, which is the fundamental obstacle that has defeated all attempts to solve the global regularity problem for Navier-Stokes without any additional assumptions on the data or solution (e.g. perturbative hypotheses, or a priori control on a critical norm such as the ${L^\infty_t L^3_x}$ norm).

Our main result is then (slightly informally stated) as follows

Theorem 1 There exists an averaged version ${\tilde B}$ of the bilinear operator ${B}$, of the form

$\displaystyle \tilde B(u,v) := \int_\Omega m_{3,\omega}(D) Rot_{3,\omega}$

$\displaystyle B( m_{1,\omega}(D) Rot_{1,\omega} u, m_{2,\omega}(D) Rot_{2,\omega} v )\ d\mu(\omega)$

for some probability space ${(\Omega, \mu)}$, some spatial rotation operators ${Rot_{i,\omega}}$ for ${i=1,2,3}$, and some Fourier multipliers ${m_{i,\omega}}$ of order ${0}$, for which one still has the cancellation law

$\displaystyle \langle \tilde B(u,u), u \rangle = 0$

and for which the averaged Navier-Stokes equation

$\displaystyle \partial_t u = \Delta u + \tilde B(u,u) \ \ \ \ \ (2)$

admits solutions that blow up in finite time.

(There are some integrability conditions on the Fourier multipliers ${m_{i,\omega}}$ required in the above theorem in order for the conclusion to be non-trivial, but I am omitting them here for sake of exposition.)

Because spatial rotations and Fourier multipliers of order ${0}$ are bounded on most function spaces, ${\tilde B}$ automatically obeys almost all of the upper bound estimates that ${B}$ does. Thus, this theorem blocks any attempt to prove global regularity for the true Navier-Stokes equations which relies purely on the energy identity and on upper bound estimates for the nonlinearity; one must use some additional structure of the nonlinear operator ${B}$ which is not shared by an averaged version ${\tilde B}$. Such additional structure certainly exists – for instance, the Navier-Stokes equation has a vorticity formulation involving only differential operators rather than pseudodifferential ones, whereas a general equation of the form (2) does not. However, “abstract” approaches to global regularity generally do not exploit such structure, and thus cannot be used to affirmatively answer the Navier-Stokes problem.

It turns out that the particular averaged bilinear operator ${B}$ that we will use will be a finite linear combination of local cascade operators, which take the form

$\displaystyle C(u,v) := \sum_{n \in {\bf Z}} (1+\epsilon_0)^{5n/2} \langle u, \psi_{1,n} \rangle \langle v, \psi_{2,n} \rangle \psi_{3,n}$

where ${\epsilon_0>0}$ is a small parameter, ${\psi_1,\psi_2,\psi_3}$ are Schwartz vector fields whose Fourier transform is supported on an annulus, and ${\psi_{i,n}(x) := (1+\epsilon_0)^{3n/2} \psi_i( (1+\epsilon_0)^n x)}$ is an ${L^2}$-rescaled version of ${\psi_i}$ (basically a “wavelet” of wavelength about ${(1+\epsilon_0)^{-n}}$ centred at the origin). Such operators were essentially introduced by Katz and Pavlovic as dyadic models for ${B}$; they have the essentially the same scaling property as ${B}$ (except that one can only scale along powers of ${1+\epsilon_0}$, rather than over all positive reals), and in fact they can be expressed as an average of ${B}$ in the sense of the above theorem, as can be shown after a somewhat tedious amount of Fourier-analytic symbol manipulations.

If we consider nonlinearities ${\tilde B}$ which are a finite linear combination of local cascade operators, then the equation (2) more or less collapses to a system of ODE in certain “wavelet coefficients” of ${u}$. The precise ODE that shows up depends on what precise combination of local cascade operators one is using. Katz and Pavlovic essentially considered a single cascade operator together with its “adjoint” (needed to preserve the energy identity), and arrived (more or less) at the system of ODE

$\displaystyle \partial_t X_n = - (1+\epsilon_0)^{2n} X_n + (1+\epsilon_0)^{\frac{5}{2}(n-1)} X_{n-1}^2 - (1+\epsilon_0)^{\frac{5}{2} n} X_n X_{n+1} \ \ \ \ \ (3)$

where ${X_n: [0,T] \rightarrow {\bf R}}$ are scalar fields for each integer ${n}$. (Actually, Katz-Pavlovic worked with a technical variant of this particular equation, but the differences are not so important for this current discussion.) Note that the quadratic terms on the RHS carry a higher exponent of ${1+\epsilon_0}$ than the dissipation term; this reflects the supercritical nature of this evolution (the energy ${\frac{1}{2} \sum_n X_n^2}$ is monotone decreasing in this flow, so the natural size of ${X_n}$ given the control on the energy is ${O(1)}$). There is a slight technical issue with the dissipation if one wishes to embed (3) into an equation of the form (2), but it is minor and I will not discuss it further here.

In principle, if the ${X_n}$ mode has size comparable to ${1}$ at some time ${t_n}$, then energy should flow from ${X_n}$ to ${X_{n+1}}$ at a rate comparable to ${(1+\epsilon_0)^{\frac{5}{2} n}}$, so that by time ${t_{n+1} \approx t_n + (1+\epsilon_0)^{-\frac{5}{2} n}}$ or so, most of the energy of ${X_n}$ should have drained into the ${X_{n+1}}$ mode (with hardly any energy dissipated). Since the series ${\sum_{n \geq 1} (1+\epsilon_0)^{-\frac{5}{2} n}}$ is summable, this suggests finite time blowup for this ODE as the energy races ever more quickly to higher and higher modes. Such a scenario was indeed established by Katz and Pavlovic (and refined by Cheskidov) if the dissipation strength ${(1+\epsilon)^{2n}}$ was weakened somewhat (the exponent ${2}$ has to be lowered to be less than ${\frac{5}{3}}$). As mentioned above, this is enough to give a version of Theorem 1 in five and higher dimensions.

On the other hand, it was shown a few years ago by Barbato, Morandin, and Romito that (3) in fact admits global smooth solutions (at least in the dyadic case ${\epsilon_0=1}$, and assuming non-negative initial data). Roughly speaking, the problem is that as energy is being transferred from ${X_n}$ to ${X_{n+1}}$, energy is also simultaneously being transferred from ${X_{n+1}}$ to ${X_{n+2}}$, and as such the solution races off to higher modes a bit too prematurely, without absorbing all of the energy from lower modes. This weakens the strength of the blowup to the point where the moderately strong dissipation in (3) is enough to kill the high frequency cascade before a true singularity occurs. Because of this, the original Katz-Pavlovic model cannot quite be used to establish Theorem 1 in three dimensions. (Actually, the original Katz-Pavlovic model had some additional dispersive features which allowed for another proof of global smooth solutions, which is an unpublished result of Nazarov.)

To get around this, I had to “engineer” an ODE system with similar features to (3) (namely, a quadratic nonlinearity, a monotone total energy, and the indicated exponents of ${(1+\epsilon_0)}$ for both the dissipation term and the quadratic terms), but for which the cascade of energy from scale ${n}$ to scale ${n+1}$ was not interrupted by the cascade of energy from scale ${n+1}$ to scale ${n+2}$. To do this, I needed to insert a delay in the cascade process (so that after energy was dumped into scale ${n}$, it would take some time before the energy would start to transfer to scale ${n+1}$), but the process also needed to be abrupt (once the process of energy transfer started, it needed to conclude very quickly, before the delayed transfer for the next scale kicked in). It turned out that one could build a “quadratic circuit” out of some basic “quadratic gates” (analogous to how an electrical circuit could be built out of basic gates such as amplifiers or resistors) that achieved this task, leading to an ODE system essentially of the form

$\displaystyle \partial_t X_{1,n} = - (1+\epsilon_0)^{2n} X_{1,n}$

$\displaystyle + (1+\epsilon_0)^{5n/2} (- \epsilon^{-2} X_{3,n} X_{4,n} - \epsilon X_{1,n} X_{2,n} - \epsilon^2 \exp(-K^{10}) X_{1,n} X_{3,n}$

$\displaystyle + K X_{4,n-1}^2)$

$\displaystyle \partial_t X_{2,n} = - (1+\epsilon_0)^{2n} X_{2,n} + (1+\epsilon_0)^{5n/2} (\epsilon X_{1,n}^2 - \epsilon^{-1} K^{10} X_{3,n}^2)$

$\displaystyle \partial_t X_{3,n} = - (1+\epsilon_0)^{2n} X_{3,n} + (1+\epsilon_0)^{5n/2} (\epsilon^2 \exp(-K^{10}) X_{1,n}^2$

$\displaystyle + \epsilon^{-1} K^{10} X_{2,n} X_{3,n} )$

$\displaystyle \partial_t X_{4,n} =- (1+\epsilon_0)^{2n} X_{4,n} + (1+\epsilon_0)^{5n/2} (\epsilon^{-2} X_{3,n} X_{1,n}$

$\displaystyle - (1+\epsilon_0)^{5/2} K X_{4,n} X_{1,n+1})$

where ${K \geq 1}$ is a suitable large parameter and ${\epsilon > 0}$ is a suitable small parameter (much smaller than ${1/K}$). To visualise the dynamics of such a system, I found it useful to describe this system graphically by a “circuit diagram” that is analogous (but not identical) to the circuit diagrams arising in electrical engineering:

The coupling constants here range widely from being very large to very small; in practice, this makes the ${X_{2,n}}$ and ${X_{3,n}}$ modes absorb very little energy, but exert a sizeable influence on the remaining modes. If a lot of energy is suddenly dumped into ${X_{1,n}}$, what happens next is roughly as follows: for a moderate period of time, nothing much happens other than a trickle of energy into ${X_{2,n}}$, which in turn causes a rapid exponential growth of ${X_{3,n}}$ (from a very low base). After this delay, ${X_{3,n}}$ suddenly crosses a certain threshold, at which point it causes ${X_{1,n}}$ and ${X_{4,n}}$ to exchange energy back and forth with extreme speed. The energy from ${X_{4,n}}$ then rapidly drains into ${X_{1,n+1}}$, and the process begins again (with a slight loss in energy due to the dissipation). If one plots the total energy ${E_n := \frac{1}{2} ( X_{1,n}^2 + X_{2,n}^2 + X_{3,n}^2 + X_{4,n}^2 )}$ as a function of time, it looks schematically like this:

As in the previous heuristic discussion, the time between cascades from one frequency scale to the next decay exponentially, leading to blowup at some finite time ${T}$. (One could describe the dynamics here as being similar to the famous “lighting the beacons” scene in the Lord of the Rings movies, except that (a) as each beacon gets ignited, the previous one is extinguished, as per the energy identity; (b) the time between beacon lightings decrease exponentially; and (c) there is no soundtrack.)

There is a real (but remote) possibility that this sort of construction can be adapted to the true Navier-Stokes equations. The basic blowup mechanism in the averaged equation is that of a von Neumann machine, or more precisely a construct (built within the laws of the inviscid evolution ${\partial_t u = \tilde B(u,u)}$) that, after some time delay, manages to suddenly create a replica of itself at a finer scale (and to largely erase its original instantiation in the process). In principle, such a von Neumann machine could also be built out of the laws of the inviscid form of the Navier-Stokes equations (i.e. the Euler equations). In physical terms, one would have to build the machine purely out of an ideal fluid (i.e. an inviscid incompressible fluid). If one could somehow create enough “logic gates” out of ideal fluid, one could presumably build a sort of “fluid computer”, at which point the task of building a von Neumann machine appears to reduce to a software engineering exercise rather than a PDE problem (providing that the gates are suitably stable with respect to perturbations, but (as with actual computers) this can presumably be done by converting the analog signals of fluid mechanics into a more error-resistant digital form). The key thing missing in this program (in both senses of the word) to establish blowup for Navier-Stokes is to construct the logic gates within the laws of ideal fluids. (Compare with the situation for cellular automata such as Conway’s “Game of Life“, in which Turing complete computers, universal constructors, and replicators have all been built within the laws of that game.)

This is the seventh thread for the Polymath8b project to obtain new bounds for the quantity

$\displaystyle H_m := \liminf_{n \rightarrow\infty} (p_{n+m} - p_n),$

either for small values of ${m}$ (in particular ${m=1,2}$) or asymptotically as ${m \rightarrow \infty}$. The previous thread may be found here. The currently best known bounds on ${H_m}$ can be found at the wiki page.

The current focus is on improving the upper bound on ${H_1}$ under the assumption of the generalised Elliott-Halberstam conjecture (GEH) from ${H_1 \leq 8}$ to ${H_1 \leq 6}$. Very recently, we have been able to exploit GEH more fully, leading to a promising new expansion of the sieve support region. The problem now reduces to the following:

Problem 1 Does there exist a (not necessarily convex) polytope ${R \subset [0,2]^3}$ with quantities ${0 \leq \varepsilon_1,\varepsilon_2,\varepsilon_3 \leq 1}$, and a non-trivial square-integrable function ${F: {\bf R}^3 \rightarrow {\bf R}}$ supported on ${R}$ such that

• ${R + R \subset \{ (x,y,z) \in [0,4]^3: \min(x+y,y+z,z+x) \leq 2 \},}$
• ${\int_0^\infty F(x,y,z)\ dx = 0}$ when ${y+z \geq 1+\varepsilon_1}$;
• ${\int_0^\infty F(x,y,z)\ dy = 0}$ when ${x+z \geq 1+\varepsilon_2}$;
• ${\int_0^\infty F(x,y,z)\ dz = 0}$ when ${x+y \geq 1+\varepsilon_3}$;

and such that we have the inequality

$\displaystyle \int_{y+z \leq 1-\varepsilon_1} (\int_{\bf R} F(x,y,z)\ dx)^2\ dy dz$

$\displaystyle + \int_{z+x \leq 1-\varepsilon_2} (\int_{\bf R} F(x,y,z)\ dy)^2\ dz dx$

$\displaystyle + \int_{x+y \leq 1-\varepsilon_3} (\int_{\bf R} F(x,y,z)\ dz)^2\ dx dy$

$\displaystyle > 2 \int_R F(x,y,z)^2\ dx dy dz?$

An affirmative answer to this question will imply ${H_1 \leq 6}$ on GEH. We are “within two percent” of this claim; we cannot quite reach ${2}$ yet, but have got as far as ${1.962998}$. However, we have not yet fully optimised ${F}$ in the above problem. In particular, the simplex

$\displaystyle R = \{ (x,y,z) \in [0,2]^3: x+y+z \leq 3/2 \}$

is now available, and should lead to some noticeable improvement in the numerology.

There is also a very slim chance that the twin prime conjecture is now provable on GEH. It would require an affirmative solution to the following problem:

Problem 2 Does there exist a (not necessarily convex) polytope ${R \subset [0,2]^2}$ with quantities ${0 \leq \varepsilon_1,\varepsilon_2 \leq 1}$, and a non-trivial square-integrable function ${F: {\bf R}^2 \rightarrow {\bf R}}$ supported on ${R}$ such that

• ${R + R \subset \{ (x,y) \in [0,4]^2: \min(x,y) \leq 2 \}}$

$\displaystyle = [0,2] \times [0,4] \cup [0,4] \times [0,2],$

• ${\int_0^\infty F(x,y)\ dx = 0}$ when ${y \geq 1+\varepsilon_1}$;
• ${\int_0^\infty F(x,y)\ dy = 0}$ when ${x \geq 1+\varepsilon_2}$;

and such that we have the inequality

$\displaystyle \int_{y \leq 1-\varepsilon_1} (\int_{\bf R} F(x,y)\ dx)^2\ dy$

$\displaystyle + \int_{x \leq 1-\varepsilon_2} (\int_{\bf R} F(x,y)\ dy)^2\ dx$

$\displaystyle > 2 \int_R F(x,y)^2\ dx dy?$

We suspect that the answer to this question is negative, but have not formally ruled it out yet.

For the rest of this post, I will justify why positive answers to these sorts of variational problems are sufficient to get bounds on ${H_1}$ (or more generally ${H_m}$).

This is the sixth thread for the Polymath8b project to obtain new bounds for the quantity

$\displaystyle H_m := \liminf_{n \rightarrow\infty} (p_{n+m} - p_n),$

either for small values of ${m}$ (in particular ${m=1,2}$) or asymptotically as ${m \rightarrow \infty}$. The previous thread may be found here. The currently best known bounds on ${H_m}$ can be found at the wiki page (which has recently returned to full functionality, after a partial outage).

The current focus is on improving the upper bound on ${H_1}$ under the assumption of the generalised Elliott-Halberstam conjecture (GEH) from ${H_1 \leq 8}$ to ${H_1 \leq 6}$, which looks to be the limit of the method (see this previous comment for a semi-rigorous reason as to why ${H_1 \leq 4}$ is not possible with this method). With the most general Selberg sieve available, the problem reduces to the following three-dimensional variational one:

Problem 1 Does there exist a (not necessarily convex) polytope ${R \subset [0,1]^3}$ with quantities ${0 \leq \varepsilon_1,\varepsilon_2,\varepsilon_3 \leq 1}$, and a non-trivial square-integrable function ${F: {\bf R}^3 \rightarrow {\bf R}}$ supported on ${R}$ such that

• ${R + R \subset \{ (x,y,z) \in [0,2]^3: \min(x+y,y+z,z+x) \leq 2 \},}$
• ${\int_0^\infty F(x,y,z)\ dx = 0}$ when ${y+z \geq 1+\varepsilon_1}$;
• ${\int_0^\infty F(x,y,z)\ dy = 0}$ when ${x+z \geq 1+\varepsilon_2}$;
• ${\int_0^\infty F(x,y,z)\ dz = 0}$ when ${x+y \geq 1+\varepsilon_3}$;

and such that we have the inequality

$\displaystyle \int_{y+z \leq 1-\varepsilon_1} (\int_{\bf R} F(x,y,z)\ dx)^2\ dy dz$

$\displaystyle + \int_{z+x \leq 1-\varepsilon_2} (\int_{\bf R} F(x,y,z)\ dy)^2\ dz dx$

$\displaystyle + \int_{x+y \leq 1-\varepsilon_3} (\int_{\bf R} F(x,y,z)\ dz)^2\ dx dy$

$\displaystyle > 2 \int_R F(x,y,z)^2\ dx dy dz?$

(Initially it was assumed that ${R}$ was convex, but we have now realised that this is not necessary.)

An affirmative answer to this question will imply ${H_1 \leq 6}$ on GEH. We are “within almost two percent” of this claim; we cannot quite reach ${2}$ yet, but have got as far as ${1.959633}$. However, we have not yet fully optimised ${F}$ in the above problem.

The most promising route so far is to take the symmetric polytope

$\displaystyle R = \{ (x,y,z) \in [0,1]^3: x+y+z \leq 3/2 \}$

with ${F}$ symmetric as well, and ${\varepsilon_1=\varepsilon_2=\varepsilon_3=\varepsilon}$ (we suspect that the optimal ${\varepsilon}$ will be roughly ${1/6}$). (However, it is certainly worth also taking a look at easier model problems, such as the polytope ${{\cal R}'_3 := \{ (x,y,z) \in [0,1]^3: x+y,y+z,z+x \leq 1\}}$, which has no vanishing marginal conditions to contend with; more recently we have been looking at the non-convex polytope ${R = \{x+y,x+z \leq 1 \} \cup \{ x+y,y+z \leq 1 \} \cup \{ x+z,y+z \leq 1\}}$.) Some further details of this particular case are given below the fold.

There should still be some progress to be made in the other regimes of interest – the unconditional bound on ${H_1}$ (currently at ${270}$), and on any further progress in asymptotic bounds for ${H_m}$ for larger ${m}$ – but the current focus is certainly on the bound on ${H_1}$ on GEH, as we seem to be tantalisingly close to an optimal result here.

This is the fifth thread for the Polymath8b project to obtain new bounds for the quantity

$\displaystyle H_m := \liminf_{n \rightarrow\infty} (p_{n+m} - p_n),$

either for small values of ${m}$ (in particular ${m=1,2}$) or asymptotically as ${m \rightarrow \infty}$. The previous thread may be found here. The currently best known bounds on ${H_m}$ can be found at the wiki page (which has recently returned to full functionality, after a partial outage). In particular, the upper bound for ${H_1}$ has been shaved a little from ${272}$ to ${270}$, and we have very recently achieved the bound ${H_1 \leq 8}$ on the generalised Elliott-Halberstam conjecture GEH, formulated as Conjecture 1 of this paper of Bombieri, Friedlander, and Iwaniec. We also have explicit bounds for ${H_m}$ for ${m \leq 5}$, both with and without the assumption of the Elliott-Halberstam conjecture, as well as slightly sharper asymptotics for the upper bound for ${H_m}$ as ${m \rightarrow \infty}$.

The basic strategy for bounding ${H_m}$ still follows the general paradigm first laid out by Goldston, Pintz, Yildirim: given an admissible ${k}$-tuple ${(h_1,\dots,h_k)}$, one needs to locate a non-negative sieve weight ${\nu: {\bf Z} \rightarrow {\bf R}^+}$, supported on an interval ${[x,2x]}$ for a large ${x}$, such that the ratio

$\displaystyle \frac{\sum_{i=1}^k \sum_n \nu(n) 1_{n+h_i \hbox{ prime}}}{\sum_n \nu(n)} \ \ \ \ \ (1)$

is asymptotically larger than ${m}$ as ${x \rightarrow \infty}$; this will show that ${H_m \leq h_k-h_1}$. Thus one wants to locate a sieve weight ${\nu}$ for which one has good lower bounds on the numerator and good upper bounds on the denominator.

One can modify this paradigm slightly, for instance by adding the additional term ${\sum_n \nu(n) 1_{n+h_1,\dots,n+h_k \hbox{ composite}}}$ to the numerator, or by subtracting the term ${\sum_n \nu(n) 1_{n+h_1,n+h_k \hbox{ prime}}}$ from the numerator (which allows one to reduce the bound ${h_k-h_1}$ to ${\max(h_k-h_2,h_{k-1}-h_1)}$); however, the numerical impact of these tweaks have proven to be negligible thus far.

Despite a number of experiments with other sieves, we are still relying primarily on the Selberg sieve

$\displaystyle \nu(n) := 1_{n=b\ (W)} 1_{[x,2x]}(n) \lambda(n)^2$

where ${\lambda(n)}$ is the divisor sum

$\displaystyle \lambda(n) := \sum_{d_1|n+h_1, \dots, d_k|n+h_k} \mu(d_1) \dots \mu(d_k) f( \frac{\log d_1}{\log R}, \dots, \frac{\log d_k}{\log R})$

with ${R = x^{\theta/2}}$, ${\theta}$ is the level of distribution (${\theta=1/2-}$ if relying on Bombieri-Vinogradov, ${\theta=1-}$ if assuming Elliott-Halberstam, and (in principle) ${\theta = \frac{1}{2} + \frac{13}{540}-}$ if using Polymath8a technology), and ${f: [0,+\infty)^k \rightarrow {\bf R}}$ is a smooth, compactly supported function. Most of the progress has come by enlarging the class of cutoff functions ${f}$ one is permitted to use.

The baseline bounds for the numerator and denominator in (1) (as established for instance in this previous post) are as follows. If ${f}$ is supported on the simplex

$\displaystyle {\cal R}_k := \{ (t_1,\dots,t_k) \in [0,+\infty)^k: t_1+\dots+t_k < 1 \},$

and we define the mixed partial derivative ${F: [0,+\infty)^k \rightarrow {\bf R}}$ by

$\displaystyle F(t_1,\dots,t_k) = \frac{\partial^k}{\partial t_1 \dots \partial t_k} f(t_1,\dots,t_k)$

then the denominator in (1) is

$\displaystyle \frac{Bx}{W} (I_k(F) + o(1)) \ \ \ \ \ (2)$

where

$\displaystyle B := (\frac{W}{\phi(W) \log R})^k$

and

$\displaystyle I_k(F) := \int_{[0,+\infty)^k} F(t_1,\dots,t_k)^2\ dt_1 \dots dt_k.$

Similarly, the numerator of (1) is

$\displaystyle \frac{Bx}{W} \frac{2}{\theta} (\sum_{j=1}^m J^{(m)}_k(F) + o(1)) \ \ \ \ \ (3)$

where

$\displaystyle J_k^{(m)}(F) := \int_{[0,+\infty)^{k-1}} (\int_0^\infty F(t_1,\ldots,t_k)\ dt_m)^2\ dt_1 \dots dt_{m-1} dt_{m+1} \dots dt_k.$

Thus, if we let ${M_k}$ be the supremum of the ratio

$\displaystyle \frac{\sum_{m=1}^k J_k^{(m)}(F)}{I_k(F)}$

whenever ${F}$ is supported on ${{\cal R}_k}$ and is non-vanishing, then one can prove ${H_m \leq h_k - h_1}$ whenever

$\displaystyle M_k > \frac{2m}{\theta}.$

We can improve this baseline in a number of ways. Firstly, with regards to the denominator in (1), if one upgrades the Elliott-Halberstam hypothesis ${EH[\theta]}$ to the generalised Elliott-Halberstam hypothesis ${GEH[\theta]}$ (currently known for ${\theta < 1/2}$, thanks to Motohashi, but conjectured for ${\theta < 1}$), the asymptotic (2) holds under the more general hypothesis that ${F}$ is supported in a polytope ${R}$, as long as ${R}$ obeys the inclusion

$\displaystyle R + R \subset \bigcup_{m=1}^k \{ (t_1,\ldots,t_k) \in [0,+\infty)^k: \ \ \ \ \ (4)$

$\displaystyle t_1+\dots+t_{m-1}+t_{m+1}+\dots+t_k < 2; t_m < 2/\theta \} \cup \frac{2}{\theta} \cdot {\cal R}_k;$

examples of polytopes ${R}$ obeying this constraint include the modified simplex

$\displaystyle {\cal R}'_k := \{ (t_1,\ldots,t_k) \in [0,+\infty)^k: t_1+\dots+t_{m-1}+t_{m+1}+\dots+t_k < 1$

$\displaystyle \hbox{ for all } 1 \leq m \leq k \},$

the prism

$\displaystyle {\cal R}_{k-1} \times [0, 1/\theta)$

the dilated simplex

$\displaystyle \frac{1}{\theta} \cdot {\cal R}_k$

and the truncated simplex

$\displaystyle \frac{k}{k-1} \cdot {\cal R}_k \cap [0,1/\theta)^k.$

See this previous post for a proof of these claims.

With regards to the numerator, the asymptotic (3) is valid whenever, for each ${1 \leq m \leq k}$, the marginals ${\int_0^\infty F(t_1,\ldots,t_k)\ dt_m}$ vanish outside of ${{\cal R}_{k-1}}$. This is automatic if ${F}$ is supported on ${{\cal R}_k}$, or on the slightly larger region ${{\cal R}'_k}$, but is an additional constraint when ${F}$ is supported on one of the other polytopes ${R}$ mentioned above.

More recently, we have obtained a more flexible version of the above asymptotic: if the marginals ${\int_0^\infty F(t_1,\ldots,t_k)\ dt_m}$ vanish outside of ${(1+\varepsilon) \cdot {\cal R}_{k-1}}$ for some ${0 < \varepsilon < 1}$, then the numerator of (1) has a lower bound of

$\displaystyle \frac{Bx}{W} \frac{2}{\theta} (\sum_{j=1}^m J^{(m)}_{k,\varepsilon}(F) + o(1))$

where

$\displaystyle J_{k,\varepsilon}^{(m)}(F) := \int_{(1-\varepsilon) \cdot {\cal R}_{k-1}} (\int_0^\infty F(t_1,\ldots,t_k)\ dt_m)^2\ dt_1 \dots dt_{m-1} dt_{m+1} \dots dt_k.$

A proof is given here. Putting all this together, we can conclude

Theorem 1 Suppose we can find ${0 \leq \varepsilon < 1}$ and a function ${F}$ supported on a polytope ${R}$ obeying (4), not identically zero and with all marginals ${\int_0^\infty F(t_1,\ldots,t_k)\ dt_m}$ vanishing outside of ${(1+\varepsilon) \cdot {\cal R}_{k-1}}$, and with

$\displaystyle \frac{\sum_{m=1}^k J_{k,\varepsilon}^{(m)}(F)}{I_k(F)} > \frac{2m}{\theta}.$

Then ${GEH[\theta]}$ implies ${H_m \leq h_k-h_1}$.

In principle, this very flexible criterion for upper bounding ${H_m}$ should lead to better bounds than before, and in particular we have now established ${H_1 \leq 8}$ on GEH.

Another promising direction is to try to improve the analysis at medium ${k}$ (more specifically, in the regime ${k \sim 50}$), which is where we are currently at without EH or GEH through numerical quadratic programming. Right now we are only using ${\theta=1/2}$ and using the baseline ${M_k}$ analysis, basically for two reasons:

• We do not have good numerical formulae for integrating polynomials on any region more complicated than the simplex ${{\cal R}_k}$ in medium dimension.
• The estimates ${MPZ^{(i)}[\varpi,\delta]}$ produced by Polymath8a involve a ${\delta}$ parameter, which introduces additional restrictions on the support of ${F}$ (conservatively, it restricts ${F}$ to ${[0,\delta']^k}$ where ${\delta' := \frac{\delta}{1/4+\varpi}}$ and ${\theta = 1/2 + 2 \varpi}$; it should be possible to be looser than this (as was done in Polymath8a) but this has not been fully explored yet). This then triggers the previous obstacle of having to integrate on something other than a simplex.

However, these look like solvable problems, and so I would expect that further unconditional improvement for ${H_1}$ should be possible.