
The previous thread may be found here.

Numerical progress on these bounds has slowed in recent months, although we have very recently lowered the unconditional bound on from 252 to 246 (see the wiki page for more detailed results). While there may still be scope for further improvement (particularly with respect to bounds for with , which we have not focused on for a while), it looks like we have reached the point of diminishing returns, and it is time to turn to the task of writing up the results.

A draft version of the paper so far may be found here (with the directory of source files here). Currently, the introduction and the sieve-theoretic portions of the paper are written up, although the sieve-theoretic arguments are surprisingly lengthy, and some simplification (or other reorganisation) may well be possible. Other portions of the paper that have not yet been written up include the asymptotic analysis of for large k (leading in particular to results for m=2,3,4,5), and a description of the quadratic programming that is used to estimate for small and medium k. Also, we will eventually need an appendix to summarise the material from Polymath8a that we would use to generate various narrow admissible tuples.

One issue here is that our current unconditional bounds on for m=2,3,4,5 rely on a distributional estimate on the primes which we believed to be true in Polymath8a, but never actually worked out (among other things, there were some delicate algebraic geometry issues concerning the vanishing of certain cohomology groups that were never resolved). This issue does not affect the m=1 calculations, which only use the Bombieri-Vinogradov theorem or else assume the generalised Elliott-Halberstam conjecture. As such, we will have to rework the computations for these , given that the task of trying to attain the conjectured distributional estimate on the primes would be a significant amount of work that is rather disjoint from the rest of the Polymath8b writeup. One could simply dust off the old Maple code for this (e.g. one could tweak the code here, with the constraint 1080*varpi/13 + 330*delta/13 < 1 being replaced by 600*varpi/7 + 180*delta/7 < 1), but there is also a chance that our asymptotic bounds for (currently given in messy detail here) could be sharpened. I plan to look at this issue fairly soon.
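For illustration, here is a hypothetical Python sketch of how the two linear constraints quoted above compare numerically (the coefficients are the ones in the text; the function names and the sample grid are invented). On this grid, every pair satisfying the replacement constraint also satisfies the old one, i.e. the replacement is the more restrictive of the two.

```python
# Hypothetical sketch: compare the feasible regions of the two linear
# constraints on (varpi, delta) quoted in the text.  Function names and
# the sample grid are illustrative only.

def old_constraint(varpi, delta):
    # constraint assumed in the earlier Maple code
    return 1080 * varpi / 13 + 330 * delta / 13 < 1

def new_constraint(varpi, delta):
    # replacement constraint
    return 600 * varpi / 7 + 180 * delta / 7 < 1

# Scan a small grid of (varpi, delta) pairs.
pairs = [(v / 10000, d / 10000) for v in range(120) for d in range(120)]
only_new = [p for p in pairs if new_constraint(*p) and not old_constraint(*p)]
# Empty: the new constraint implies the old one (its coefficients are larger),
# so its feasible region is strictly smaller.
print(f"grid points feasible only under the new constraint: {len(only_new)}")
```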

Also, there are a number of smaller observations (e.g. the parity problem barrier that prevents us from ever getting a better bound on than 6) that should also go into the paper at some point; the current outline of the paper as given in the draft is not necessarily comprehensive.

Filed under: polymath

The surface is said to be ruled if, for a Zariski open dense set of points , there exists a line through for some non-zero which is completely contained in , thus

for all . Also, a point is said to be a flecnode if there exists a line through for some non-zero which is tangent to to third order, in the sense that

for . Clearly, if is a ruled surface, then a Zariski open dense set of points on are flecnodes. We then have the remarkable theorem of Cayley and Salmon asserting the converse:

Theorem 1 (Cayley-Salmon theorem) Let be an irreducible polynomial with non-empty. Suppose that a Zariski dense set of points in are flecnodes. Then is a ruled surface.

Among other things, this theorem was used in the celebrated result of Guth and Katz that almost solved the Erdős distance problem in two dimensions, as discussed in this previous blog post. Vanishing to third order is necessary: observe that in a surface of negative curvature, such as the saddle , every point on the surface is tangent to second order to a line (the line in the direction for which the second fundamental form vanishes).

The original proof of the Cayley-Salmon theorem, dating back to at least 1915, is not easily accessible and not written in modern language. A modern proof of this theorem (together with substantial generalisations, for instance to higher dimensions) is given by Landsberg; the proof uses the machinery of modern algebraic geometry. The purpose of this post is to record an alternate proof of the Cayley-Salmon theorem based on classical differential geometry (in particular, the notion of torsion of a curve) and basic ODE methods (in particular, Gronwall’s inequality and the Picard existence theorem). The idea is to “integrate” the lines indicated by the flecnode to produce smooth curves on the surface ; one then uses the vanishing (1) and some basic calculus to conclude that these curves have zero torsion and are thus planar curves. Some further manipulation using (1) (now just to second order instead of third) then shows that these curves are in fact straight lines, giving the ruling on the surface.

Update: János Kollár has informed me that the above theorem was essentially known to Monge in 1809; see his recent arXiv note for more details.

I thank Larry Guth and Micha Sharir for conversations leading to this post.

** — 1. Proof — **

Let denote the smooth points of ; then is a smooth surface that is a Zariski open dense subset of , and hence Zariski dense in . We consider the projective tangent bundle of ; this is a smooth three-dimensional manifold, which is a bundle of copies of the projective line over , with elements consisting of a point in and the projective class of a direction that is tangent to at and is non-zero. Since and are both irreducible varieties, it is easy to see that is also an irreducible variety.

Inside , we consider the subset of points which obey the flecnode condition (1) for . By hypothesis, the projection of to is Zariski dense. On the other hand, is clearly an algebraic set. Thus the dimension of is at least , and there is at least one component whose projection to is two-dimensional (i.e. is dominant). In particular we can find an irreducible algebraic surface in whose projection to is open dense (not just in the Zariski sense, but also in the differential geometry sense). By removing the singular points of , we may assume that is a smooth surface.

We now claim that the projection map is generically a local diffeomorphism, thus has full rank for a Zariski dense set of points in . This is a simple consequence of Sard’s theorem, but for our purposes it is also instructive to see an ODE proof: if fails to have full rank generically, then it must have rank one generically or rank zero generically. If it has rank one generically, one can use the Picard existence theorem to locally foliate an open dense subset of by curves with the property that for each , the derivative lies in the kernel of , so that if we write , then for all , and so is constant; thus the curves each lie in a single fibre of . This locally describes as a one-dimensional smooth family of curves inside the fibre of , and so the image is locally one-dimensional, contradicting the two-dimensional nature of . A similar argument works when has rank zero generically.

Since is a local diffeomorphism generically, we may apply the inverse function theorem to conclude that on an open dense subset of , we can locally invert this map, which in particular gives *smooth* local maps from open subsets of to unit tangent vectors at such that the flecnode condition (1) is satisfied for all such and .

By the Picard existence theorem, we may thus locally foliate by curves with the property that

for all ; thus has unit speed and is always tangent to a flecnode direction. Thus, by (1) we have

for . Expanding this out in coordinates by the chain rule (and using the usual summation conventions), using to denote the components of , and to denote the first partial derivatives of for , to denote the second partial derivatives, and so forth, we have

We can obtain further differential equations by differentiating the above equations in . For instance, if we differentiate (3) in we obtain

and hence by (4)

Similarly, if we differentiate (4) in we obtain

and hence by (5)

Finally, if we differentiate (6) in we obtain

and hence by (7)

The equations (3), (6), (8) have a simple geometric interpretation: the first three derivatives are all orthogonal to the gradient . Generically, this gradient is non-zero, and we are in three dimensions, so we conclude that are always coplanar. Equivalently, the torsion of the curve vanishes, and hence the curve is necessarily planar (locally, at least). Another way to see this is to start with the identity

where is the cross product, and conclude that is a scalar multiple of whenever it is non-vanishing, which by Gronwall’s inequality shows that has fixed orientation whenever it is non-vanishing.
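The planarity criterion can be checked symbolically in a toy case. The curve below is an invented example confined to the plane x + y + z = 0; as the argument predicts, the scalar triple product of its first three derivatives (whose vanishing expresses zero torsion) is identically zero.

```python
import sympy as sp

t = sp.symbols('t', real=True)
# An invented curve lying in the plane x + y + z = 0.
r = sp.Matrix([sp.cos(t), sp.sin(t), -sp.cos(t) - sp.sin(t)])
r1, r2, r3 = r.diff(t), r.diff(t, 2), r.diff(t, 3)
# Scalar triple product det(r', r'', r'''): this vanishes exactly when the
# first three derivatives are coplanar, i.e. the curve has zero torsion.
triple = sp.simplify(sp.Matrix.hstack(r1, r2, r3).det())
print(triple)  # prints 0
```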

So there is a plane in in which locally lies. If vanished on this plane, then , being irreducible, would be just and we would be done, so we may assume that is non-vanishing here, thus is at most one-dimensional. On the other hand, (3), (6) show that are both orthogonal to the gradient of restricted to , which is generically non-zero; as we now only have two dimensions, this implies that are parallel. Thus the curvature of now also vanishes, which implies that is a straight line. Hence we have locally foliated at least a small open neighbourhood in by straight lines, which ensures that is ruled as desired.

Filed under: expository, math.AG, math.DG Tagged: Cayley-Salmon theorem, flecnode, ruled surface

and difference sets

as well as iterated sumsets such as , , and so forth. Here, are finite non-empty subsets of some additive group (classically one took or , but nowadays one usually considers more general additive groups). Some basic estimates in this vein are the following:

Lemma 1 (Ruzsa covering lemma) Let be finite non-empty subsets of . Then may be covered by at most translates of .

*Proof:* Consider a maximal set of disjoint translates of by elements . These translates have cardinality , are disjoint, and lie in , so there are at most of them. By maximality, for any , must intersect at least one of the selected , thus , and the claim follows.
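The greedy argument in this proof is easy to test numerically. The sketch below (sets and parameters invented for illustration) builds a maximal disjoint family of translates x + B with x in A and checks both conclusions: the number of selected translates is at most |A+B|/|B|, and the translates x + (B−B) cover A.

```python
import random

def covering_translates(A, B):
    """Greedily select a maximal family of disjoint translates x + B, x in A,
    following the proof of the Ruzsa covering lemma."""
    centres, used = [], set()
    for x in sorted(A):
        tx = {x + b for b in B}
        if tx.isdisjoint(used):
            centres.append(x)
            used |= tx
    return centres

random.seed(0)
A = set(random.sample(range(50), 12))
B = set(random.sample(range(20), 6))
X = covering_translates(A, B)
sumset = {a + b for a in A for b in B}
# The disjoint translates all lie in A + B, so there are at most |A+B|/|B|.
assert len(X) <= len(sumset) // len(B)
# By maximality, A is covered by the translates x + (B - B).
BmB = {b1 - b2 for b1 in B for b2 in B}
assert all(any(a - x in BmB for x in X) for a in A)
print(len(X), len(sumset) // len(B))
```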

Lemma 2 (Ruzsa triangle inequality) Let be finite non-empty subsets of . Then .

*Proof:* Consider the addition map from to . Every element of has a preimage of this map of cardinality at least , thanks to the obvious identity for each . Since has cardinality , the claim follows.
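The standard form of this inequality, |A| |B−C| ≤ |A−B| |A−C|, can likewise be spot-checked on random integer sets; the harness below is an invented sketch.

```python
import random

def diff(X, Y):
    """Difference set X - Y."""
    return {x - y for x in X for y in Y}

random.seed(1)
for _ in range(100):
    A = set(random.sample(range(40), random.randint(1, 8)))
    B = set(random.sample(range(40), random.randint(1, 8)))
    C = set(random.sample(range(40), random.randint(1, 8)))
    # Ruzsa triangle inequality: |A| |B - C| <= |A - B| |A - C|
    assert len(A) * len(diff(B, C)) <= len(diff(A, B)) * len(diff(A, C))
print("all random instances verified")
```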

Such estimates (which are covered, incidentally, in Section 2 of my book with Van Vu) are particularly useful for controlling finite sets of small doubling, in the sense that for some bounded . (There are deeper theorems, most notably Freiman’s theorem, which give more control than elementary Ruzsa calculus does; however, the known bounds in the latter theorem are worse than polynomial in (although it is conjectured otherwise), whereas the elementary estimates are almost all polynomial in .)

However, there are some settings in which the standard sum set estimates are not quite applicable. One such setting is the continuous setting, where one is dealing with bounded open sets in an additive Lie group (e.g. or a torus ) rather than a finite setting. Here, one can largely replicate the discrete sum set estimates by working with a Haar measure in place of cardinality; this is the approach taken for instance in this paper of mine. However, there is another setting, which one might dub the “discretised” setting (as opposed to the “discrete” setting or “continuous” setting), in which the sets remain finite (or at least discretisable to be finite), but for which there is a certain amount of “roundoff error” coming from the discretisation. As a typical example (working now in a non-commutative multiplicative setting rather than an additive one), consider the orthogonal group of orthogonal matrices, and let be the matrices obtained by starting with all of the orthogonal matrices in and rounding each coefficient of each matrix in this set to the nearest multiple of , for some small . This forms a finite set (whose cardinality grows, as , like a certain negative power of ). In the limit , the set is not a set of small doubling in the discrete sense. However, is still close to in a metric sense, being contained in the -neighbourhood of . Another key example comes from graphs of maps from a subset of one additive group to another . If is “approximately additive” in the sense that for all , is close to in some metric, then might not have small doubling in the discrete sense (because could take a large number of values), but could be considered a set of small doubling in a discretised sense.

One would like to have a sum set (or product set) theory that can handle these cases, particularly in “high-dimensional” settings in which the standard methods of passing back and forth between continuous, discrete, or discretised settings behave poorly from a quantitative point of view due to the exponentially large doubling constant of balls. One way to do this is to impose a translation invariant metric on the underlying group (reverting back to additive notation), and replace the notion of cardinality by that of metric entropy. There are a number of almost equivalent ways to define this concept:

Definition 3 Let be a metric space, let be a subset of , and let be a radius.

- The *packing number* is the largest number of points one can pack inside such that the balls are disjoint.
- The *internal covering number* is the fewest number of points such that the balls cover .
- The *external covering number* is the fewest number of points such that the balls cover .
- The *metric entropy* is the largest number of points one can find in that are -separated, thus for all .

It is an easy exercise to verify the inequalities

for any , and that is non-increasing in and non-decreasing in for the three choices (but monotonicity in can fail for !). It turns out that the external covering number is slightly more convenient than the other notions of metric entropy, so we will abbreviate . The cardinality can be viewed as the limit of the entropies as .
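As a sanity check on these notions, here is a small invented example in the plane: a greedily built maximal r-separated subset S of a point cloud E is simultaneously r-separated (so |S| is at most the metric entropy at radius r) and an r-cover of E (so the external covering number at radius r is at most |S|), which is one of the links in the chain of inequalities above.

```python
import math
import random

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def maximal_separated(points, r):
    """Greedily build a maximal r-separated subset."""
    chosen = []
    for p in points:
        if all(dist(p, q) >= r for q in chosen):
            chosen.append(p)
    return chosen

random.seed(2)
E = [(random.random(), random.random()) for _ in range(200)]
r = 0.1
S = maximal_separated(E, r)
# S is r-separated, so |S| is at most the metric entropy of E at radius r.
assert all(dist(p, q) >= r for i, p in enumerate(S) for q in S[:i])
# By maximality, every point of E is within r of some point of S, so S is
# an r-cover and the external covering number at radius r is at most |S|.
assert all(any(dist(p, q) < r for q in S) for p in E)
print(len(S))
```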

If we have the bounded doubling property that is covered by translates of for each , and one has a Haar measure on which assigns a positive finite mass to each ball, then any of the above entropies is comparable to , as can be seen by simple volume packing arguments. Thus in the bounded doubling setting one can usually use the measure-theoretic sum set theory to derive entropy-theoretic sumset bounds (see e.g. this paper of mine for an example of this). However, it turns out that even in the absence of bounded doubling, one still has an entropy analogue of most of the elementary sum set theory, except that one has to accept some degradation in the radius parameter by some absolute constant. Such losses can be acceptable in applications in which the underlying sets are largely “transverse” to the balls , so that the -entropy of is largely independent of ; this is a situation which arises in particular in the case of graphs discussed above, if one works with “vertical” metrics whose balls extend primarily in the vertical direction. (I hope to present a specific application of this type here in the near future.)

Henceforth we work in an additive group equipped with a translation-invariant metric . (One can also generalise things slightly by allowing the metric to attain the values or , without changing much of the analysis below.) By the Heine-Borel theorem, any precompact set will have finite entropy for any . We now have analogues of the two basic Ruzsa lemmas above:

Lemma 4 (Ruzsa covering lemma) Let be precompact non-empty subsets of , and let . Then may be covered by at most translates of .

*Proof:* Let be a maximal set of points such that the sets are all disjoint. Then the sets are disjoint in and have entropy , and furthermore any ball of radius can intersect at most one of the . We conclude that , so . If , then must intersect one of the , so , and the claim follows.

Lemma 5 (Ruzsa triangle inequality) Let be precompact non-empty subsets of , and let . Then .

*Proof:* Consider the addition map from to . The domain may be covered by product balls . Every element of has a preimage of this map which projects to a translate of , and thus must meet at least of these product balls. However, if two elements of are separated by a distance of at least , then no product ball can intersect both preimages. We thus see that , and the claim follows.

Below the fold we will record some further metric entropy analogues of sum set estimates (basically redoing much of Chapter 2 of my book with Van Vu). Unfortunately there does not seem to be a direct way to abstractly deduce metric entropy results from their sum set analogues (basically due to the failure of a certain strong version of Freiman’s theorem, as discussed in this previous post); nevertheless, the proofs of the discrete arguments are elementary enough that they can be modified with a small amount of effort to handle the entropy case. (In fact, there should be a very general model-theoretic framework in which both the discrete and entropy arguments can be processed in a unified manner; see this paper of Hrushovski for one such framework.)

It is also likely that many of the arguments here extend to the non-commutative setting, but for simplicity we will not pursue such generalisations here.

** — 1. Approximate groups — **

In discrete sum set theory, a key concept is that of a *-approximate group* – a finite symmetric subset of containing the origin such that is covered by at most translates of . The analogous concept here will be that of a *-approximate group*: a precompact symmetric subset of containing the origin such that is covered by at most copies of . Such sets obey good iterated doubling properties; for instance, is covered by at most copies of for any . They can be generated from sets of small tripling:

Lemma 6 Let be a precompact non-empty subset of , and let . If or , then is a -approximate group.

*Proof:* From Lemma 5 we have

(for an appropriate choice of sign) and

and thus by Lemma 4, may be covered by at most copies of , giving the claim.

** — 2. From small doubling to small tripling or quadrupling — **

As we saw above, Lemma 5 and Lemma 4 are already very powerful once one has some sort of control on triple or higher sums such as , , or . But if one only controls a double sum such as or , it is a bit trickier to proceed. Here is one estimate (somewhat analogous to Proposition 2.18 from my book with Van Vu, but with slightly worse numerology):

Lemma 7 Let be a precompact non-empty subset of , and let . If , then .

One can combine this lemma with Lemma 5 to obtain similar conclusions starting with a hypothesis on rather than ; we leave this to the interested reader. Of course, the conclusion can also be combined with Lemma 6; we again leave this as an exercise.

*Proof:* Write . Then , so we may find a -separated subset of with . By hypothesis, we may cover by balls with . Call a centre *popular* if contains at least differences with (counting multiplicity), and let denote the set of popular centres. Then at most of the pairs have lying in an unpopular ball , thus we have for at least pairs in . Thus, by the pigeonhole principle, there exists such that at least elements of lie in . Thus

and thus

Next, for any , we consider the set of pairs such that . We may write

for some and . By definition of , we can find distinct pairs for with such that for all . As is -separated and has diameter at most , the must be distinct in , and similarly for the . We then have

for . Each lies in and thus lies in for some , and similarly for some . Then

and so . Also, since and the are -separated, we see that the are distinct as varies. We conclude that . On the other hand, the total number of pairs is , and any two -separated points in generate disjoint sets . We conclude that there can be at most -separated points in , thus

and thus

By Lemma 5 and (1), we conclude that

and the claim follows.

** — 3. The Balog-Szemeredi-Gowers lemma — **

One of the most difficult, but powerful, components of the elementary sum set theory is the tool now known as the *Balog-Szemerédi-Gowers lemma*, which converts control on partial sumsets (or equivalently, lower bounds on “additive energy”) to control on total sumsets, after suitable refinements of the sets. Here is one metric entropy version of this lemma.

Lemma 8 (Balog-Szemerédi-Gowers) Let and , and let be precompact subsets of . Suppose that and

where we endow with the sup norm metric. Then there exist subsets , of respectively with

and

Again, this lemma may be usefully combined with the previous sum set estimates, much as was done in my book with Van Vu; I leave the details to the interested reader.

*Proof:* Let , be maximal -separated subsets of respectively, thus , and similarly .

By hypothesis, we can find at least quadruples which are -separated in , such that

for all such quadruples. By construction of , each such quadruple can be associated to a nearby quadruple with

and thus by the triangle inequality

Also by the triangle inequality we see that each can be associated to at most one of the quadruples , and as the are -separated, the are -separated. We conclude that there is a set of at least quadruples in obeying (2) that are -separated.

Call a pair *popular* if there are at least of the above quadruples in obeying (2) with the indicated first two coefficients. The unpopular pairs absorb at most of the quadruples, so at least of the quadruples are associated to popular pairs . On the other hand, as the quadruples are -separated, we see from (2) and the triangle inequality that for each there is at most one giving rise to a quadruple . Thus each can be associated to at most quadruples, and we conclude that the set of popular pairs has size at least . In particular this shows that .

We now apply the graph-theoretic Balog-Szemerédi-Gowers lemma (see Corollary 6.19 of my book with Van Vu) to conclude that there exists a subset of and of with

such that for every and there exist pairs such that lie in . Since was already -separated, we conclude that

Now fix and , and let be one of the above pairs. As is popular, we can thus find pairs such that

furthermore, the lie in an -separated set. Similarly, we can find pairs and pairs such that

with the and also lying in an -separated set. In particular, we see that given , uniquely determine , and uniquely determine , so a single sextuple can arise from at most one pair ; in particular, we see that sextuples are associated to each pair . Taking alternating combinations of (3), (4), (5) we see that

In particular, if and are two pairs in with at least apart, then a single sextuple can be associated to at most one of these pairs. Since the number of sextuples is at most , we conclude that there are at most pairs with -separated differences , thus as required.

Filed under: expository, math.CO, math.MG Tagged: additive combinatorics, metric entropy, sum set estimates

In the previous blog post, the Euler equations for inviscid incompressible fluid flow were interpreted in a Lagrangian fashion, and then Noether’s theorem invoked to derive the known conservation laws for these equations. In a bit more detail: starting with *Lagrangian space* and *Eulerian space* , we let be the space of volume-preserving, orientation-preserving maps from Lagrangian space to Eulerian space. Given a curve , we can define the *Lagrangian velocity field* as the time derivative of , and the *Eulerian velocity field* . The volume-preserving nature of ensures that is a divergence-free vector field:

If we formally define the functional

then one can show that the critical points of this functional (with appropriate boundary conditions) obey the Euler equations

for some pressure field . As discussed in the previous post, the time translation symmetry of this functional yields conservation of the Hamiltonian

the rigid motion symmetries of Eulerian space give conservation of the total momentum

and total angular momentum

and the diffeomorphism symmetries of Lagrangian space give conservation of circulation

for any closed loop in , or equivalently pointwise conservation of the Lagrangian vorticity , where is the -form associated with the vector field using the Euclidean metric on , with denoting pullback by .

It turns out that one can generalise the above calculations. Given any self-adjoint operator on divergence-free vector fields , we can define the functional

as we shall see below the fold, critical points of this functional (with appropriate boundary conditions) obey the generalised Euler equations

for some pressure field , where in coordinates is with the usual summation conventions. (When , , and this term can be absorbed into the pressure , and we recover the usual Euler equations.) Time translation symmetry then gives conservation of the Hamiltonian

If the operator commutes with rigid motions on , then we have conservation of total momentum

and total angular momentum

and the diffeomorphism symmetries of Lagrangian space give conservation of circulation

or pointwise conservation of the Lagrangian vorticity . These applications of Noether’s theorem proceed exactly as in the previous post; we leave the details to the interested reader.

One particular special case of interest arises in two dimensions , when is the inverse derivative . The vorticity is a -form, which in the two-dimensional setting may be identified with a scalar. In coordinates, if we write , then

Since is also divergence-free, we may therefore write

where the stream function is given by the formula

If we take the curl of the generalised Euler equation (2), we obtain (after some computation) the surface quasi-geostrophic equation

This equation has strong analogies with the three-dimensional incompressible Euler equations, and can be viewed as a simplified model for that system; see this paper of Constantin, Majda, and Tabak for details.
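The stream-function computation above is easy to carry out spectrally. The following is a minimal sketch with assumed sign and normalisation conventions (the grid size, the sample scalar, and all variable names are invented): we recover psi from a mean-zero scalar theta by dividing by |k| in Fourier space, form the perpendicular-gradient velocity, and check that it is automatically divergence-free.

```python
import numpy as np

# Assumed conventions: torus [0, 2*pi)^2, psi = (-Delta)^{-1/2} theta,
# u = (-d_y psi, d_x psi).  Everything here is an invented illustration.
n = 64
x = np.arange(n) * 2 * np.pi / n
X, Y = np.meshgrid(x, x, indexing='ij')
theta = np.cos(3 * X) * np.sin(2 * Y)      # sample mean-zero scalar

k = np.fft.fftfreq(n, d=1.0 / n)           # integer wavenumbers
KX, KY = np.meshgrid(k, k, indexing='ij')
mod = np.sqrt(KX**2 + KY**2)
mod[0, 0] = 1.0                            # avoid division by zero at k = 0

theta_hat = np.fft.fft2(theta)
psi_hat = theta_hat / mod                  # (-Delta)^{-1/2} in Fourier space
psi_hat[0, 0] = 0.0
psi = np.real(np.fft.ifft2(psi_hat))

# Velocity u = (-d_y psi, d_x psi); its divergence vanishes identically.
u_hat = -1j * KY * psi_hat
v_hat = 1j * KX * psi_hat
div_hat = 1j * KX * u_hat + 1j * KY * v_hat
print(np.max(np.abs(div_hat)))             # ~0 up to roundoff
```

Since the sample scalar lives on the single shell |k| = sqrt(13), one can also check directly that psi = theta / sqrt(13) here.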

Now we can specialise the general conservation laws derived previously to this setting. The conserved Hamiltonian is

(a law previously observed for this equation in the abovementioned paper of Constantin, Majda, and Tabak). As commutes with rigid motions, we also have (formally, at least) conservation of momentum

(which up to trivial transformations is also expressible in impulse form as , after integration by parts), and conservation of angular momentum

(which up to trivial transformations is ). Finally, diffeomorphism invariance gives pointwise conservation of Lagrangian vorticity , thus is transported by the flow (which is also evident from (3)). In particular, all integrals of the form for a fixed function are conserved by the flow.

** — 1. Euler-Lagrange calculations — **

We now justify the claim that stationary points of the functional obey (2). We consider continuous deformations of the critical point , thus now depends on both and . We already have the Eulerian velocity field , which is related to the derivative of by the formula

similarly we may introduce a deformation field by

The vector field is divergence-free and has to obey appropriate vanishing conditions at infinity, but is otherwise unconstrained. If we compute using the above two equations and the chain rule, we arrive at the “zero-curvature” condition

On the other hand, as is a critical point, we have

when . Differentiating under the integral sign and using the self-adjoint nature of , the left-hand side is

Inserting (4) and integrating by parts (and using the divergence-free nature of ), this expression can be rewritten as

Since is essentially an arbitrary divergence-free vector field, the expression inside parentheses must vanish, and the equation (2) follows.

Filed under: expository, math.AP, math.MP Tagged: Euler equations, Noether's theorem, surface quasi-geostrophic equation

It is a remarkable fact in the theory of differential equations that many of the ordinary and partial differential equations that are of interest (particularly in geometric PDE, or PDE arising from mathematical physics) admit a variational formulation; thus, a collection of one or more fields on a domain taking values in a space will solve the differential equation of interest if and only if is a critical point of the functional

involving the fields and their first derivatives , where the Lagrangian is a function on the vector bundle over consisting of triples with , , and a linear transformation; we also usually keep the boundary data of fixed in case has a non-trivial boundary, although we will ignore these issues here. (We also ignore the possibility of having additional constraints imposed on and , which require the machinery of Lagrange multipliers to deal with, but which will only serve as a distraction for the current discussion.) It is common to use local coordinates to parameterise as and as , in which case can be viewed locally as a function on .

Example 1 (Geodesic flow) Take and to be a Riemannian manifold, which we will write locally in coordinates as with metric for . A geodesic is then a critical point (keeping fixed) of the energy functional or in coordinates (ignoring coordinate patch issues, and using the usual summation conventions)

As discussed in this previous post, both the Euler equations for rigid body motion, and the Euler equations for incompressible inviscid flow, can be interpreted as geodesic flow (though in the latter case, one has to work *really* formally, as the manifold is now infinite dimensional).

More generally, if is itself a Riemannian manifold, which we write locally in coordinates as with metric for , then a harmonic map is a critical point of the energy functional

or in coordinates (again ignoring coordinate patch issues)

If we replace the Riemannian manifold by a Lorentzian manifold, such as Minkowski space , then the notion of a harmonic map is replaced by that of a wave map, which generalises the scalar wave equation (which corresponds to the case ).

Example 2 (-particle interactions) Take and ; then a function can be interpreted as a collection of trajectories in space, which we give a physical interpretation as the trajectories of particles. If we assign each particle a positive mass , and also introduce a potential energy function , then it turns out that Newton’s laws of motion in this context (with the force on the particle being given by the conservative force ) are equivalent to the trajectories being a critical point of the action functional
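This equivalence can be checked numerically in a hedged toy case: for a single particle of mass 1 in the invented potential V(q) = q²/2, Newton's law gives q(t) = cos t, and the discretised action should be stationary there (first-order directional derivatives nearly zero) but not along, say, the straight-line path between the same endpoints.

```python
import math

# Invented example: one particle, mass 1, V(q) = q^2/2, discretised action
# S = sum_i dt * ( (1/2) ((q_{i+1} - q_i)/dt)^2 - V(q_i) ).
dt, T = 0.001, 1.0
steps = int(T / dt)

def action(q):
    s = 0.0
    for i in range(len(q) - 1):
        v = (q[i + 1] - q[i]) / dt
        s += dt * (0.5 * v * v - 0.5 * q[i] ** 2)
    return s

def directional_derivative(q, eta, eps=1e-5):
    """First-order change of the action along the perturbation eta."""
    q_pert = [qi + eps * ei for qi, ei in zip(q, eta)]
    return (action(q_pert) - action(q)) / eps

# Smooth perturbation vanishing at both endpoints.
eta = [math.sin(math.pi * i * dt / T) for i in range(steps + 1)]

q_true = [math.cos(i * dt) for i in range(steps + 1)]        # Newtonian path
q_line = [1 + (math.cos(T) - 1) * i * dt / T
          for i in range(steps + 1)]                         # same endpoints

print(abs(directional_derivative(q_true, eta)))  # ~0: critical point
print(abs(directional_derivative(q_line, eta)))  # order 1: not critical
```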

Formally, if is a critical point of a functional , this means that

whenever is a (smooth) deformation with (and with respecting whatever boundary conditions are appropriate). Interchanging the derivative and integral, we (formally, at least) arrive at

Write for the infinitesimal deformation of . By the chain rule, can be expressed in terms of . In coordinates, we have

where we parameterise by , and we use subscripts on to denote partial derivatives in the various coefficients. (One can of course work in a coordinate-free manner here if one really wants to, but the notation becomes a little cumbersome due to the need to carefully split up the tangent space of , and we will not do so here.) Thus we can view (2) as an integral identity that asserts the vanishing of a certain integral, whose integrand involves , where vanishes at the boundary but is otherwise unconstrained.

A general rule of thumb in PDE and calculus of variations is that whenever one has an integral identity of the form for some class of functions that vanishes on the boundary, then there must be an associated differential identity that justifies this integral identity through Stokes’ theorem. This rule of thumb helps explain why integration by parts is used so frequently in PDE to justify integral identities. The rule of thumb can fail when one is dealing with “global” or “cohomologically non-trivial” integral identities of a topological nature, such as the Gauss-Bonnet or Kazhdan-Warner identities, but is quite reliable for “local” or “cohomologically trivial” identities, such as those arising from calculus of variations.

In any case, if we apply this rule to (2), we expect that the integrand should be expressible as a spatial divergence. This is indeed the case:

Proposition 1 (Formal). Let be a critical point of the functional defined in (1). Then for any deformation with , we have where is the vector field that is expressible in coordinates as

*Proof:* Comparing (4) with (3), we see that the claim is equivalent to the Euler-Lagrange equation

The same computation, together with an integration by parts, shows that (2) may be rewritten as

Since is unconstrained on the interior of , the claim (6) follows (at a formal level, at least).
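For orientation, in generic coordinates (fields $u \colon D \to M$ with a Lagrangian $L = L(x, u, \partial u)$; these symbols are assumptions of this sketch, not the notation fixed above) the Euler–Lagrange equation takes the shape:

```latex
% Generic coordinate form of the Euler--Lagrange equation:
\[
\frac{\partial}{\partial x^j}
  \left( \frac{\partial L}{\partial(\partial_{x^j} u^i)}(x, u, \partial u) \right)
  = \frac{\partial L}{\partial u^i}(x, u, \partial u).
\]
```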

Many variational problems also enjoy one-parameter continuous *symmetries*: given any field (not necessarily a critical point), one can place that field in a one-parameter family with , such that

for all ; in particular,

which can be written as (2) as before. Applying the previous rule of thumb, we thus expect another divergence identity

whenever arises from a continuous one-parameter symmetry. This expectation is indeed borne out in many examples. For instance, if the spatial domain is the Euclidean space , and the Lagrangian (when expressed in coordinates) has no direct dependence on the spatial variable , thus

then we obtain translation symmetries

for , where is the standard basis for . For a fixed , the left-hand side of (7) then becomes

where . Another common type of symmetry is a *pointwise* symmetry, in which

for all , in which case (7) clearly holds with .

If we subtract (4) from (7), we obtain the celebrated theorem of Noether linking symmetries with conservation laws:

Theorem 2 (Noether’s theorem). Suppose that is a critical point of the functional (1), and let be a one-parameter continuous symmetry with . Let be the vector field in (5), and let be the vector field in (7). Then we have the pointwise conservation law

In particular, for one-dimensional variational problems, in which , we have the conservation law for all (assuming of course that is connected and contains ).

Noether’s theorem gives a systematic way to locate conservation laws for solutions to variational problems. For instance, if and the Lagrangian has no explicit time dependence, thus

then by using the time translation symmetry , we have

as discussed previously, whereas we have , and hence by (5)

and so Noether’s theorem gives conservation of the *Hamiltonian*

For instance, for geodesic flow, the Hamiltonian works out to be

so we see that the speed of the geodesic is conserved over time.
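Concretely, writing the geodesic Lagrangian in the usual form (an assumed but standard notation, with metric $g$), the computation is:

```latex
% For L = \tfrac12 g_{\alpha\beta}(q)\,\dot q^\alpha \dot q^\beta (geodesic flow),
% the Hamiltonian is
\[
H = \dot q^\alpha \frac{\partial L}{\partial \dot q^\alpha} - L
  = \tfrac12 g_{\alpha\beta}(q)\,\dot q^\alpha \dot q^\beta,
\]
% i.e. half the squared speed, so conservation of H is conservation of speed.
```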

For pointwise symmetries (9), vanishes, and so Noether’s theorem simplifies to ; in the one-dimensional case , we thus see from (5) that the quantity

is conserved in time. For instance, for the -particle system in Example 2, if we have the translation invariance

for all , then we have the pointwise translation symmetry

for all , and some , in which case , and the conserved quantity (11) becomes

as was arbitrary, this establishes conservation of the *total momentum*

Similarly, if we have the rotation invariance

for any and , then we have the pointwise rotation symmetry

for any skew-symmetric real matrix , in which case , and the conserved quantity (11) becomes

since is an arbitrary skew-symmetric matrix, this establishes conservation of the *total angular momentum*

Below the fold, I will describe how Noether’s theorem can be used to locate all of the conserved quantities for the Euler equations of inviscid fluid flow, discussed in this previous post, by interpreting that flow as geodesic flow in an infinite dimensional manifold.

** — 1. Euler’s equations — **

The geometric setup for the geodesic interpretation of Euler’s equations of fluid flow is as follows. We will need two copies and of Euclidean space , with two different structures. Firstly, we will have Lagrangian space

which is (viewed as a smooth manifold), together with the standard volume form . This space should be thought of as the space of “labels” of the particles of the fluid, and its coordinates are known as *Lagrangian coordinates*. The symmetry group of this space is the group of orientation-preserving and volume-preserving diffeomorphisms.

Secondly, we will need the *Eulerian space*

which is the smooth manifold together with the Euclidean metric and the standard volume form . This space is the physical space of “positions” of the particles of the fluid. The symmetry group of this space is the group of orientation-preserving rigid motions of Euclidean space.

Let be the space of diffeomorphisms from Lagrangian space to Eulerian space that preserve volume and orientation; this can be viewed as an infinite-dimensional manifold. A single element of describes the positions of an incompressible fluid at a snapshot in time; an incompressible fluid flow is then described by a curve . The time derivative of such a curve can be viewed in Eulerian coordinates as the *velocity field*

If one then defines the Lagrangian

then one can show that the critical points of this Lagrangian (formally) correspond to solutions to the Euler equations using the correspondence (12): see this previous blog post for details.

Applying a volume-preserving change of coordinates, the Lagrangian can also be expressed as

There are then three types of symmetries that are evident for this Lagrangian: time symmetry; symmetry on the Eulerian space; and symmetry on the Lagrangian space.

We begin with time symmetry, , which comes from the fact that the Lagrangian does not depend explicitly on the time variable . As discussed before, this gives conservation of the Hamiltonian (10), which (formally, at least) becomes

thus giving the familiar energy conservation law for the Euler equations.

Now we use the symmetry group that acts on the Eulerian space , and hence on fluid flows . This is a pointwise symmetry of the Lagrangian, and formally gives conservation of total momentum

and total angular momentum

in exact analogy with the situation for the -body system. (Indeed, one can formally view the Euler equations as an limit of a certain family of -body systems, which is how these equations are physically derived.)

Finally, we consider symmetries on the Lagrangian space . Any divergence-free vector field on gives a one-parameter group of volume-preserving, orientation-preserving diffeomorphisms on , which then act on fluid flows by the formula

This is a pointwise symmetry of the Lagrangian, with infinitesimal derivative

Applying (11), we thus (formally) conclude that the quantity

is conserved. We can write this in coordinates as

We can specialise this conservation law by working with specific choices of divergence-free vector field . For instance, suppose we have a closed loop , which we parameterise by unit speed: . For an infinitesimal , we can then create a divergence-free vector field by setting when lies in the (transverse) -neighbourhood of , and zero otherwise. It is geometrically obvious that this field is divergence-free (up to errors of ). The conserved quantity (13) is then equal to

up to lower order terms, so that the quantity

is conserved. Writing for the curve , we see from the chain rule that this is equal to

giving Kelvin’s circulation theorem.

More generally, we can generate a divergence-free vector field from an alternating -vector by taking a further divergence:

Integrating by parts, we can then write (13) as

since is an arbitrary alternating -tensor, we conclude that for each and , the quantity

is conserved in time. Writing and using the chain rule, this becomes

which after interchange of the indices may be rewritten in terms of the vorticity as

giving the pointwise conservation of the pullback of the vorticity in Lagrangian coordinates. (Of course, this is just the differential form of Kelvin’s circulation theorem; it also implies conservation of the vortex stream lines in Lagrangian coordinates.)

Finally, as the vorticity is divergence-free (when viewed as a polar vector field), the pullback is also. If we then set to be the vector field associated to the conserved quantity , the quantity (13) can then be rewritten as the helicity

which is then also conserved; a similar argument gives conservation of the helicity on any set that is the union of stream lines.

Remark 1. The above Lagrangian mechanics calculations can also be recast into a Hamiltonian mechanics formalism; see for instance this paper of Olver for a Hamiltonian perspective on the conservation laws for the Euler equations.

Filed under: expository, math.AP, math.CA Tagged: calculus of variations, conservation laws, Euler-Arnold equation, incompressible Euler equations, Noether's theorem, symmetry

where is the velocity field, and is the pressure field. To avoid technicalities we will assume that both fields are smooth, and that is bounded. We will take the dimension to be at least two, with the three-dimensional case being of course especially interesting.

The Euler equations are the inviscid limit of the Navier-Stokes equations; as discussed in my previous post, one potential route to establishing finite time blowup for the latter equations when is to be able to construct “computers” solving the Euler equations, which generate smaller replicas of themselves in a noise-tolerant manner (as the viscosity term in the Navier-Stokes equation is to be viewed as perturbative noise).

Perhaps the most prominent obstacles to this route are the *conservation laws* for the Euler equations, which limit the types of final states that a putative computer could reach from a given initial state. Most famously, we have the conservation of energy

(assuming sufficient decay of the velocity field at infinity); thus for instance it would not be possible for a computer to generate a replica of itself which had greater total energy than the initial computer. This by itself is not a fatal obstruction (in this paper of mine, I constructed such a “computer” for an averaged Euler equation that still obeyed energy conservation). However, there are other conservation laws also, for instance in three dimensions one also has conservation of helicity

and (formally, at least) one has conservation of momentum

and angular momentum

(although, as we shall discuss below, due to the slow decay of at infinity, these integrals have to either be interpreted in a principal value sense, or else replaced with their vorticity-based formulations, namely impulse and moment of impulse). Total vorticity

is also conserved, although it turns out in three dimensions that this quantity vanishes when one assumes sufficient decay at infinity. Then there are the pointwise conservation laws: the vorticity and the volume form are both transported by the fluid flow, while the velocity field (when viewed as a covector) is transported up to a gradient; among other things, this gives the transport of vortex lines as well as Kelvin’s circulation theorem, and can also be used to deduce the helicity conservation law mentioned above. In my opinion, none of these laws actually prohibits a self-replicating computer from existing within the laws of ideal fluid flow, but they do significantly complicate the task of actually designing such a computer, or of the basic “gates” that such a computer would consist of.

Below the fold I would like to record and derive all the conservation laws mentioned above, which to my knowledge essentially form the complete set of known conserved quantities for the Euler equations. The material here (although not the notation) is drawn from this text of Majda and Bertozzi.

For reasons which may become clearer later, I will rewrite the Euler equations in the language of Riemannian geometry, in particular using the abstract index notation of Penrose, and using the Euclidean metric on to raise and lower indices, and to define the covariant derivative through the Levi-Civita connection (which, in Cartesian coordinates, is just the usual partial derivative evaluated componentwise). The velocity field is now written as ; contracting against the metric gives a -form , which I will call the *covelocity*, and also write as . The Euler equations then become

In particular we have

which leads to the conservation of energy (1) upon integrating in space.
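In standard vector-calculus notation, the pointwise identity behind this step can be written as follows (a sketch obtained by contracting the Euler equation with the velocity and using incompressibility; the symbols $u$, $p$ are the assumed velocity and pressure fields):

```latex
% Local energy identity for the Euler equations
% \partial_t u + (u\cdot\nabla) u = -\nabla p, \quad \nabla\cdot u = 0:
\[
\partial_t \Big( \frac{|u|^2}{2} \Big)
  + \nabla \cdot \Big( u \Big( \frac{|u|^2}{2} + p \Big) \Big) = 0,
\]
% which, integrated in space (with sufficient decay), gives conservation
% of the total kinetic energy.
```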

In the usual treatment of the Euler equations, it is common to introduce the material derivative

Here, we shall adopt the subtly different (but closely related) approach of using the *material Lie derivative*

where is the Lie derivative along the vector field . For scalar fields , the material Lie derivative is the same as the material derivative:

However, the two notions differ when applied to vector fields or forms, with the material Lie derivative having better covariance properties than the material derivative. When applied to vector fields , we have

and so

Similarly, for -forms , we have

and similarly for -forms we have

leading to similar formulae comparing and for forms.
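In abstract index notation, the resulting comparison formulas are the standard Lie-derivative identities (sign conventions as usual; a sketch rather than a quotation of the text’s own displays):

```latex
% With D_t = \partial_t + u^b \nabla_b (material derivative) and
% \tilde D_t = \partial_t + \mathcal{L}_u (material Lie derivative),
% acting on a vector field X^a and a 1-form \theta_a:
\[
\tilde D_t X^a      = D_t X^a - X^b \nabla_b u^a, \qquad
\tilde D_t \theta_a = D_t \theta_a + \theta_b \nabla_a u^b.
\]
```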

Since , the material Lie derivative of the velocity field is just the time derivative:

The material Lie derivative of the *covelocity* field is however more interesting:

In particular, we see that the material Lie derivative of the covelocity is a gradient:

Since the integral of a gradient along any closed loop is zero, we obtain

Theorem 1 (Kelvin’s circulation theorem). Let be a time-dependent loop in which is transported by the flow (thus for any scalar function ). Then

Now we take an exterior derivative of the covelocity to obtain the *vorticity*

In abstract index notation, is the -form

As exterior derivatives commute with diffeomorphisms, they also commute with Lie derivatives, so in particular

Since was a gradient, its exterior derivative vanishes, and we thus have *transport of vorticity*:

(This fact was also interpreted as conservation of exterior momentum in this previous blog post.) This fact also follows from Kelvin’s circulation theorem, after first applying Stokes’ theorem to rewrite as for a spanning surface that is transported by the flow.

If we let be the usual volume -form on , then the divergence-free nature of (and the time-independent nature of ) implies that is also transported by the flow:

If we thus define the *polar vorticity* to be the -vector that is the Hodge star of with respect to this volume form, thus

for all -forms , then we see from (5), (6) that the polar vorticity is also transported by the flow:

In two dimensions , the polar vorticity is just a scalar, which by abuse of notation is also denoted (in coordinates, ), and (7) becomes the well-known transport of scalar vorticity:

In three dimensions , is a vector field which by abuse of notation is also denoted (in coordinates, ), and (7) becomes the well-known vorticity equation:

From (7) we also see that the vortex lines are transported by the flow; in fact we have the stronger statement that if is transported by the flow and obeys

at the initial time , then it continues to do so at all later times .

In three dimensions, we may contract the polar vorticity against the covelocity to obtain a scalar . We may then combine (7) and (4) to obtain

Now the exterior derivative of vanishes, so that is divergence-free, and so annihilates . We therefore conclude conservation of helicity (2). In fact we conclude the stronger statement that if is any time-dependent region in which is preserved by (i.e. it is the union of vortex lines) and is transported by the flow, then is conserved in time. This is consistent with Kelvin’s circulation theorem, since one can use Fubini’s theorem to compute the integral by first computing the integral of on each of the vortex lines in , and then integrating against on the space of vortex lines in (which is a two-dimensional space on which naturally descends to become an area form). All of these quantities are transported by the flow.

Finally, we consider the conservation of various moments of the velocity and vorticity. Here it is best to return to material derivatives instead of material Lie derivatives , basically because the flow along does not preserve the Euclidean metric or the flat connection , making the interchange of Lie derivatives with the integration of vector-valued quantities a little tricky.

Because we will be considering linear integrals of or rather than quadratic integrals, there can be some difficulty in ensuring absolute integrability of the integrals used; for instance, in three dimensions the Biot-Savart law suggests that could decay as slowly as , even if the vorticity is compactly supported. However, the vorticity transport equation (7) tells us (in any dimension) that if the vorticity is compactly supported at time zero, then it remains compactly supported at later times (with the support being transported by the flow). In practice, this means that we will be able to justify operations such as integration by parts if there is at least one factor of the vorticity present.

We begin with the total vorticity

which is well-defined as a -form thanks to the flat connection. Formally, if we write and integrate by parts, this vorticity should vanish; however if has slow decay then this is not necessarily the case. For instance, if is a smooth mollification of the 2D Biot-Savart kernel then the total vorticity is one (times the standard -form). In three dimensions, though, there is a trick that allows one to establish vanishing of the total polar vorticity

and hence also the total vorticity. Namely, if is the scaling vector field , then

and integration by parts (now involving the compactly supported vorticity ) gives the required vanishing. An application of Fubini’s theorem then shows that the total vorticity also vanishes in four and higher dimensions.
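The integration by parts here can equivalently be phrased as follows (same conclusion as the scaling-vector-field argument; standard notation assumed, with $\omega^j$ the polar vorticity): since $\nabla_j \omega^j = 0$, each component of the polar vorticity is itself a total derivative.

```latex
% For divergence-free, compactly supported \omega in three dimensions:
\[
\omega^i = \nabla_j \big( x^i\, \omega^j \big),
\qquad\text{hence}\qquad
\int_{\mathbf{R}^3} \omega^i \, dx
  = \int_{\mathbf{R}^3} \nabla_j \big( x^i \omega^j \big)\, dx = 0.
\]
```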

In any dimension, though, the total vorticity (and hence also total polar vorticity) is conserved. Indeed, from (5) and (3) we have

where we have used the vanishing of the exterior derivative of vorticity, as well as the divergence-free nature of . This expresses as a total derivative

giving conservation of total vorticity.

Now we look at total velocity

which (up to a scaling factor representing the density of the incompressible fluid) has the physical interpretation as the total momentum of the fluid. We have

which *formally* suggests that total velocity is conserved. However, in practice usually decays too slowly to justify this calculation, unless one works in a suitable principal value sense. We shall take a different tack, noting that

Thus, *when has enough decay*, one has

however, the right-hand side remains well defined even when decays slowly, assuming that the vorticity is compactly supported. It is thus natural to then define the impulse

in three dimensions, this would be . The above considerations suggest that the impulse should be another conserved quantity, and indeed it is. To see this, we first compute using (8):

and so it will suffice to show that is also a total derivative. But it is:

Finally, we look at the total angular momentum

Again, we have

which as before formally suggests that total angular momentum should be conserved. As with total momentum, in practice the velocity field decays too slowly to justify this calculation, unless one works carefully with principal value integrals (and uses quite precise asymptotics on the decay of at infinity). Once again, one can avoid these technicalities by recasting this quantity in terms of vorticity. Using to denote antisymmetrisation in the indices, we observe that

and so we have

when there is sufficient decay of the velocity field. Again, the right-hand side makes sense whenever the vorticity is compactly supported. If we then define the *moment of impulse*

then we expect this quantity to also be conserved by the flow. This is indeed the case, and can be verified by a rather lengthy calculation similar to that used to establish conservation of impulse; we omit the details here as they are rather tedious and unenlightening, with a key step being the establishment of the fact that is a total derivative, by manipulating the identity (9).

Filed under: expository, math.AP Tagged: conservation laws, Euler equations, helicity, vorticity

either for small values of (in particular ) or asymptotically as . The previous thread may be found here. The currently best known bounds on can be found at the wiki page.

The focus is now on bounding unconditionally (in particular, without resorting to the Elliott-Halberstam conjecture or its generalisations). We can bound whenever one can find a symmetric square-integrable function supported on the simplex such that

Our strategy for establishing this has been to restrict to be a linear combination of symmetrised monomials (restricted of course to ), where the degree is small; actually, it seems convenient to work with the slightly different basis where the are restricted to be even. The criterion (1) then becomes a large quadratic program with explicit but complicated rational coefficients. This approach has lowered down to , which led to the bound .
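As an illustration of the shape of this computation (not the actual Polymath8b code, which works with symmetrised bases, the epsilon trick, and much larger degree), here is a minimal sketch in Python: it computes a lower bound on the sieve-theoretic quantity by maximising the ratio of the two quadratic forms over polynomials of bounded degree on the simplex, which reduces to a generalised eigenvalue problem. The function names and the plain (unsymmetrised) monomial basis are choices of this sketch; the closed-form Dirichlet integrals over the simplex make all matrix entries exact rationals before rounding to floats.

```python
from itertools import product
from math import factorial

import numpy as np

def multi_indices(k, max_deg):
    """All exponent tuples (a_1, ..., a_k) of total degree at most max_deg."""
    return [a for a in product(range(max_deg + 1), repeat=k) if sum(a) <= max_deg]

def simplex_moment(exps):
    """Dirichlet integral of prod_i x_i^{e_i} over {x_i >= 0, sum x_i <= 1}:
    equal to (prod_i e_i!) / (n + sum_i e_i)! in n variables."""
    num = 1
    for e in exps:
        num *= factorial(e)
    return num / factorial(len(exps) + sum(exps))

def maynard_ratio(k, max_deg):
    """Lower bound on the Maynard-style ratio (sum_m J_m(F)) / I(F),
    maximised over polynomials F of degree <= max_deg on the simplex;
    with a monomial basis this is a generalised eigenvalue problem."""
    basis = multi_indices(k, max_deg)
    n = len(basis)
    A = np.zeros((n, n))  # Gram matrix of I(F) = int_simplex F^2
    B = np.zeros((n, n))  # Gram matrix of sum_m J_m(F)
    for i, a in enumerate(basis):
        for j, b in enumerate(basis):
            c = tuple(ai + bi for ai, bi in zip(a, b))
            A[i, j] = simplex_moment(c)
            for m in range(k):
                # J_m: integrate x_m out first (a monomial x_m^{a_m} gives
                # (1-s)^{a_m+1}/(a_m+1)), multiply, then integrate the rest.
                rest = c[:m] + c[m + 1:]
                num = 1
                for e in rest:
                    num *= factorial(e)
                num *= factorial(a[m] + b[m] + 2)
                den = factorial((k - 1) + sum(rest) + a[m] + b[m] + 2)
                B[i, j] += num / (den * (a[m] + 1) * (b[m] + 1))
    # Largest lambda with B v = lambda A v; A is positive definite.
    return max(np.linalg.eigvals(np.linalg.solve(A, B)).real)
```

For instance, with k = 2 and the constant polynomial alone the ratio is exactly 4/3, and enlarging the degree can only improve the lower bound, since the polynomial spaces are nested.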

Actually, we know that the more general criterion

will suffice, whenever and is supported now on and obeys the vanishing marginal condition whenever . The latter is in particular obeyed when is supported on . A modification of the preceding strategy has lowered slightly to , giving the bound which is currently our best record.

However, the quadratic programs here have become extremely large and slow to run, and more efficient algorithms (or possibly more computer power) may be required to advance further.

Filed under: math.NA, polymath Tagged: polymath8

either for small values of (in particular ) or asymptotically as . The previous thread may be found here. The currently best known bounds on can be found at the wiki page.

The big news since the last thread is that we have managed to obtain the (sieve-theoretically) optimal bound of assuming the generalised Elliott-Halberstam conjecture (GEH), which pretty much closes off that part of the story. Unconditionally, our bound on is still . This bound was obtained using the “vanilla” Maynard sieve, in which the cutoff was supported in the original simplex , and only Bombieri-Vinogradov was used. In principle, we can enlarge the sieve support a little bit further now; for instance, we can enlarge to , but then have to shrink the J integrals to , provided that the marginals vanish for . However, we do not yet know how to numerically work with these expanded problems.

Given the substantial progress made so far, it looks like we are close to the point where we should declare victory and write up the results (though we should take one last look to see if there is any room to improve the bounds). There is actually a fair bit to write up:

- Improvements to the Maynard sieve (pushing beyond the simplex, the epsilon trick, and pushing beyond the cube);
- Asymptotic bounds for and hence ;
- Explicit bounds for (using the Polymath8a results)
- ;
- on GEH (and parity obstructions to any further improvement).

I will try to create a skeleton outline of such a paper in the Polymath8 Dropbox folder soon. It shouldn’t be nearly as big as the Polymath8a paper, but it will still be quite sizeable.

Filed under: math.NT, polymath Tagged: polymath8

The first purpose is to announce the uploading of the paper “New equidistribution estimates of Zhang type, and bounded gaps between primes” by D.H.J. Polymath, which is the main output of the Polymath8a project on bounded gaps between primes, to the arXiv, and to describe the main results of this paper below the fold.

The second purpose is to roll over the previous thread on all remaining Polymath8a-related matters (e.g. updates on the submission status of the paper) to a fresh thread. (Discussion of the ongoing Polymath8b project is however being kept on a separate thread, to try to reduce confusion.)

The final purpose of this post is to coordinate the writing of a retrospective article on the Polymath8 experience, which has been solicited for the Newsletter of the European Mathematical Society. I suppose that this could encompass both the Polymath8a and Polymath8b projects, even though the second one is still ongoing (but I think we will soon be entering the endgame there). I think there would be two main purposes of such a retrospective article. The first one would be to tell a story about the *process* of conducting mathematical research, rather than just describe the *outcome* of such research; this is an important aspect of the subject which is given almost no attention in most mathematical writing, and it would be good to be able to capture some sense of this process while memories are still relatively fresh. The other would be to draw some tentative conclusions with regards to what the strengths and weaknesses of a Polymath project are, and how appropriate such a format would be for other mathematical problems than bounded gaps between primes. In my opinion, the bounded gaps problem had some fairly unique features that made it particularly amenable to a Polymath project, such as (a) a high level of interest amongst the mathematical community in the problem; (b) a very focused objective (“improve !”), which naturally provided an obvious metric to measure progress; (c) the modular nature of the project, which allowed for people to focus on one aspect of the problem only, and still make contributions to the final goal; and (d) a very reasonable level of ambition (for instance, we did not attempt to prove the twin prime conjecture, which in my opinion would make a terrible Polymath project at our current level of mathematical technology). This is not an exhaustive list of helpful features of the problem; I would welcome other diagnoses of the project by other participants.

With these two objectives in mind, I propose a format for the retrospective article consisting of a brief introduction to the polymath concept in general and the polymath8 project in particular, followed by a collection of essentially independent contributions by different participants on their own experiences and thoughts. Finally we could have a conclusion section in which we make some general remarks on the polymath project (such as the remarks above). I’ve started a dropbox subfolder for this article (currently in a very skeletal outline form only), and will begin writing a section on my own experiences; other participants are of course encouraged to add their own sections (it is probably best to create separate files for these, and then input them into the main file retrospective.tex, to reduce edit conflicts). If there are participants who wish to contribute but do not currently have access to the Dropbox folder, please email me and I will try to have you added (or else you can supply your thoughts by email, or in the comments to this post; we may have a section for shorter miscellaneous comments from more casual participants, for people who don’t wish to write a lengthy essay on the subject).

As for deadlines, the EMS Newsletter would like a submitted article by mid-April in order to make the June issue, but in the worst case, it will just be held over until the issue after that.

** — 1. Description of Polymath8a results — **

Let denote the quantity

where denotes the prime. Thus for instance the notorious twin prime conjecture is equivalent to the claim that . However, even establishing the finite nature of unconditionally was an open problem until the celebrated work of Zhang last year, who established the bound

Zhang’s argument, which built upon earlier work of Goldston, Pintz, and Yildirim, can be summarised as follows. For any natural number , define an *admissible -tuple* to be a tuple of increasing integers, which avoids at least one residue class modulo for each prime . For instance, is an admissible -tuple, but is not. The Hardy-Littlewood prime tuples conjecture asserts that if is an admissible -tuple, then there exist infinitely many such that are simultaneously prime. This conjecture is currently out of reach for any ; for instance, the case when and the tuple is is the twin prime conjecture. However, Zhang was able to prove a weaker claim, which we call , for sufficiently large . Specifically, (following the notation of Pintz) let denote the assertion that given any admissible -tuple , one has infinitely many such that *at least two* of the are prime. It is easy to see that if holds and is an admissible -tuple, then . So to bound , it suffices to show that holds for some , and then find as narrow an admissible -tuple as possible.
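The admissibility condition above is easy to check by brute force; here is a minimal Python sketch (the function name and the specific example tuples are choices of this illustration, not taken from the text):

```python
def is_admissible(h):
    """Check that the increasing integer tuple h is admissible, i.e. for
    every prime p it misses at least one residue class mod p.  Only primes
    p <= len(h) need checking: for p > len(h), the len(h) entries cannot
    cover all p residue classes."""
    k = len(h)
    for p in range(2, k + 1):
        if all(p % d for d in range(2, p)):  # trial-division primality test
            if len({x % p for x in h}) == p:  # every class mod p is hit
                return False
    return True
```

For example, the classical triple (0, 2, 6) is admissible, while (0, 2, 4) is not, since its entries cover all three residue classes modulo 3.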

Zhang was able to obtain for , and then took the first primes larger than to be the admissible -tuple, observing that this tuple had diameter at most . (Actually, it has diameter , as observed by Trudgian.) The earliest phase of the Polymath8a project consisted of using increasingly sophisticated methods to search for narrow admissible tuples of a given cardinality; in the case of this particular , we were able to find an admissible tuple whose diameter was . On the other hand, an application of the large sieve inequalities shows that admissible -tuples asymptotically must have size at least (and we conjecture that the narrowest -tuple in fact has size ), so there is a definite limit to how much one can improve the bound on purely from finding ever narrower admissible tuples. (As part of the Polymath8a project, a database of narrow tuples was set up here (and is still accepting submissions), building upon previous data of Engelsma.)

To make further progress, one has to analyse how the result is proven. Here, Zhang follows the arguments of Goldston, Pintz, and Yildirim, which are based on constructing a sieve function , supported on (say) the interval for a large , such that the sum

has good upper bounds, and the sums

have good lower bounds for . Provided that the ratio between the lower and upper bounds is big enough, one can then easily deduce (essentially from the pigeonhole principle).

One then needs to find a good choice of , which on the one hand is simple enough that the sums (1), (2) can be bounded rigorously, but on the other hand is sophisticated enough that one gets a good ratio between (2) and (1). Goldston, Pintz, and Yildirim eventually settled on a choice essentially of the form

for some auxiliary parameter and some ; this is a variant of the Selberg sieve. With this choice, they were already able to establish upper bounds of as strong as on the Elliott-Halberstam conjecture, which asserts that

for all and , and to obtain the weaker result without this conjecture. Furthermore, any nontrivial progress on the Elliott-Halberstam conjecture (beyond what is provided by the Bombieri-Vinogradov theorem, which covers the case ) would give some finite bound on .

Even after all the recent progress on bounded gaps, we still do not have any direct progress on the Elliott-Halberstam conjecture (3) for any . However, Zhang (and independently, Motohashi and Pintz) observed that one does not need the full strength of (3) in order to obtain the conclusions of Goldston-Pintz-Yildirim. Firstly, one does not need all residue classes here, but only those classes that are the roots of a certain polynomial. Secondly, one does not need all moduli here, but can restrict attention to *smooth* (or *friable*) moduli – moduli with no large prime factors – as the error incurred by ignoring all other moduli turns out to be exponentially small in . With these caveats, Zhang was able to obtain a restricted form of (3) with as large as , which he then used to obtain as small as .

Actually, Zhang’s treatment of the truncation error is not optimal, and by being more careful here (and by relaxing the requirement of smooth moduli to the less stringent requirement of “densely divisible” moduli) we were able to reduce down to . Furthermore, by replacing the monomial with the more flexible cutoff and then optimising in (a computation first made in unpublished work of Conrey, and then in the paper of Farkas, Pintz, and Revesz, with the optimal turning out to come from a Bessel function), one could reduce to be as small as (leading to a bound of that ended up being ).

To go beyond this, we had to unpack Zhang’s proof of (a weakened version of) the Elliott-Halberstam type bound (3). His approach follows a well-known sequence of papers by Bombieri, Fouvry, Friedlander, and Iwaniec on various restricted breakthroughs beyond the Bombieri-Vinogradov barrier, although with the key difference that Zhang did not use automorphic form techniques, which (at our current level of understanding) are almost entirely restricted to the regime where the residue class is fixed in (as opposed to varying amongst the roots of a polynomial modulo , which is what is needed for the current application). However, the remaining steps are familiar: first one uses the Heath-Brown identity to decompose (a variant of) the expression in (3) into some simpler bilinear and trilinear sums, which Zhang called “Type I”, “Type II”, and “Type III” (though one should caution that these are slightly different from the “Type I” and “Type II” sums arising from Vaughan-type identities). The Type I and Type II sums turn out to be treatable using a careful combination of the Cauchy-Schwarz inequality (as embodied in tools such as the dispersion method of Linnik), the Polya-Vinogradov completion-of-sums method, and estimates on one-dimensional exponential sums (which are variants of Kloosterman sums) which can ultimately be handled by the Riemann hypothesis for curves over finite fields, first established by Weil (and which can in this particular context also be proven by the elementary method of Stepanov). The Type III sums can be treated by a variant of these methods, except that one-dimensional exponential sum estimates are insufficient; Zhang instead needed to turn to the three-dimensional exponential sum estimates of Birch and Bombieri to get an adequate amount of cancellation, and these estimates ultimately arose from the deep work of Deligne on the Riemann hypothesis for higher dimensional varieties (see this previous blog post for a discussion of these hypotheses).
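The Heath-Brown identity used in this decomposition can be stated as follows (this is the standard formulation, valid for all $n \le x$ and any fixed $K \ge 1$):

```latex
% Heath-Brown identity: for n <= x and any fixed K >= 1,
\Lambda(n) \;=\; \sum_{j=1}^{K} (-1)^{j-1} \binom{K}{j}
  \sum_{m_1, \dots, m_j \le x^{1/K}} \mu(m_1) \cdots \mu(m_j)
  \sum_{m_1 \cdots m_j\, n_1 \cdots n_j \,=\, n} \log n_1.
```

Expanding this out and decomposing the resulting variables dyadically is what produces the various bilinear and trilinear sums described above.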

In our work, we were able to improve the Cauchy-Schwarz components of these arguments in a number of ways, with the most significant gain coming from applying the “-van der Corput -process” of Graham and Ringrose to the Type I sums; we also have a slightly different way to handle the Type III sums (following a recent preprint of Fouvry, Kowalski, and Michel), based on correlations of hyper-Kloosterman sums (again coming from Deligne’s work), which gives significantly better results for these sums (so much so, in fact, that the Type III sums are no longer the dominant obstruction to further improvement of the numerology). Putting all these computations together, we can stretch Zhang’s improvement to Bombieri-Vinogradov by about an order of magnitude, with now allowed to be as large as rather than . This leads to a value of as low as , which in turn leads to the bound . These latter bounds have since been improved by Maynard and by Polymath8b, mostly by significant improvements to the sieve-theoretic part of the argument (and no longer using any distributional result on the primes beyond the Bombieri-Vinogradov theorem), but the distribution result of Polymath8a is still the best distribution result known on the primes, and may well have other applications beyond the bounded gaps problem.

Interestingly, the -van der Corput -process is strong enough, in fact, that we can still get non-trivial bounds of (weakened versions of) the form (3) even if we don’t attempt to estimate the Type III sums, so in particular we can obtain a Zhang-type distribution theorem even without using Deligne’s theorems, with now reaching as large as .

Filed under: math.NT, paper Tagged: polymath8

To state the results more precisely, recall that the Navier-Stokes equations can be written in the form

for a divergence-free velocity field and a pressure field , where is the viscosity, which we will normalise to be one. We will work in the non-periodic setting, so the spatial domain is , and for sake of exposition I will not discuss matters of regularity or decay of the solution (but we will always be working with strong notions of solution here rather than weak ones). Applying the Leray projection onto divergence-free vector fields to this equation, we can eliminate the pressure, and obtain an evolution equation

purely for the velocity field, where is a certain bilinear operator on divergence-free vector fields (specifically, ). The global regularity problem for Navier-Stokes is then equivalent to the global regularity problem for the evolution equation (1).
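For the record, in conventional notation (which may differ cosmetically from the paper's normalisations) the Navier-Stokes system and its Leray-projected form read:

```latex
% Navier-Stokes with viscosity normalised to 1:
\partial_t u + (u \cdot \nabla) u \;=\; \Delta u - \nabla p,
\qquad \nabla \cdot u = 0,
% and, after applying the Leray projection P onto divergence-free fields,
\partial_t u \;=\; \Delta u + B(u,u),
\qquad B(u,u) \;:=\; -P\bigl( (u \cdot \nabla) u \bigr).
```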

An important feature of the bilinear operator appearing in (1) is the cancellation law

(using the inner product on divergence-free vector fields), which leads in particular to the fundamental energy identity

This identity (and its consequences) provide essentially the only known *a priori* bound on solutions to the Navier-Stokes equations for large data and arbitrary times. Unfortunately, as discussed in this previous post, the quantities controlled by the energy identity are supercritical with respect to scaling, which is the fundamental obstacle that has defeated all attempts to solve the global regularity problem for Navier-Stokes without any additional assumptions on the data or solution (e.g. perturbative hypotheses, or *a priori* control on a critical norm such as the norm).
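For concreteness, the cancellation law and the resulting energy identity take the standard form (again in conventional notation):

```latex
% Cancellation law for the Leray-projected nonlinearity:
\langle B(u,u), u \rangle_{L^2(\mathbf{R}^3)} \;=\; 0,
% which, upon pairing the equation with u, yields the energy identity
\frac{1}{2} \int_{\mathbf{R}^3} |u(T)|^2 \, dx
  \;+\; \int_0^T \!\!\int_{\mathbf{R}^3} |\nabla u(t)|^2 \, dx \, dt
  \;=\; \frac{1}{2} \int_{\mathbf{R}^3} |u(0)|^2 \, dx.
```

Under the natural scaling $u(t,x) \mapsto \lambda u(\lambda^2 t, \lambda x)$ the energy scales like $\lambda^{-1}$ in three dimensions, which is the supercriticality referred to above.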

Our main result is then (slightly informally stated) as follows

Theorem 1. There exists an *averaged* version of the bilinear operator , of the form

for some probability space , some spatial rotation operators for , and some Fourier multipliers of order , for which one still has the cancellation law

(There are some integrability conditions on the Fourier multipliers required in the above theorem in order for the conclusion to be non-trivial, but I am omitting them here for sake of exposition.)

Because spatial rotations and Fourier multipliers of order are bounded on most function spaces, automatically obeys almost all of the upper bound estimates that does. Thus, this theorem blocks any attempt to prove global regularity for the true Navier-Stokes equations which relies purely on the energy identity and on upper bound estimates for the nonlinearity; one must use some additional structure of the nonlinear operator which is not shared by an averaged version . Such additional structure certainly exists – for instance, the Navier-Stokes equation has a vorticity formulation involving only differential operators rather than pseudodifferential ones, whereas a general equation of the form (2) does not. However, “abstract” approaches to global regularity generally do not exploit such structure, and thus cannot be used to affirmatively answer the Navier-Stokes problem.

It turns out that the particular averaged bilinear operator that we will use will be a finite linear combination of *local cascade operators*, which take the form

where is a small parameter, are Schwartz vector fields whose Fourier transform is supported on an annulus, and is an -rescaled version of (basically a “wavelet” of wavelength about centred at the origin). Such operators were essentially introduced by Katz and Pavlovic as dyadic models for ; they have essentially the same scaling property as (except that one can only scale along powers of , rather than over all positive reals), and in fact they can be expressed as an average of in the sense of the above theorem, as can be shown after a somewhat tedious amount of Fourier-analytic symbol manipulations.

If we consider nonlinearities which are a finite linear combination of local cascade operators, then the equation (2) more or less collapses to a system of ODE in certain “wavelet coefficients” of . The precise ODE that shows up depends on what precise combination of local cascade operators one is using. Katz and Pavlovic essentially considered a single cascade operator together with its “adjoint” (needed to preserve the energy identity), and arrived (more or less) at the system of ODE

where are scalar fields for each integer . (Actually, Katz-Pavlovic worked with a technical variant of this particular equation, but the differences are not so important for this current discussion.) Note that the quadratic terms on the RHS carry a higher exponent of than the dissipation term; this reflects the supercritical nature of this evolution (the energy is monotone decreasing in this flow, so the natural size of given the control on the energy is ). There is a slight technical issue with the dissipation if one wishes to embed (3) into an equation of the form (2), but it is minor and I will not discuss it further here.

In principle, if the mode has size comparable to at some time , then energy should flow from to at a rate comparable to , so that by time or so, most of the energy of should have drained into the mode (with hardly any energy dissipated). Since the series is summable, this suggests finite time blowup for this ODE as the energy races ever more quickly to higher and higher modes. Such a scenario was indeed established by Katz and Pavlovic (and refined by Cheskidov) if the dissipation strength was weakened somewhat (the exponent has to be lowered to be less than ). As mentioned above, this is enough to give a version of Theorem 1 in five and higher dimensions.
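The cancellation structure and the dissipation-versus-cascade competition described above can be sketched numerically. The following is a qualitative illustration of a dyadic ("shell") cascade model in the spirit of Katz-Pavlovic; the inter-scale ratio `lam` and the nonlinearity exponent `alpha` below are hypothetical illustrative choices, not the exact exponents used in the paper.

```python
import numpy as np

lam, alpha = 2.0, 2.5    # illustrative inter-scale ratio and cascade exponent
N = 10                   # truncate to N modes X_0, ..., X_{N-1}
dt, steps = 1e-6, 200_000

def rhs(X):
    n = np.arange(N)
    dX = -lam ** (2 * n) * X                             # dissipation -lam^{2n} X_n
    dX[1:] += lam ** (alpha * n[:-1]) * X[:-1] ** 2      # inflow from scale n-1
    dX[:-1] -= lam ** (alpha * n[:-1]) * X[:-1] * X[1:]  # outflow to scale n+1
    return dX

X = np.zeros(N)
X[0] = 1.0               # all initial energy in the coarsest mode
energy0 = 0.5 * np.sum(X ** 2)

for _ in range(steps):   # forward Euler: adequate for a qualitative picture
    X += dt * rhs(X)

energy = 0.5 * np.sum(X ** 2)
# The nonlinear terms telescope when paired against X, so they move energy
# between scales without creating any; only the dissipation changes the total.
assert np.isfinite(X).all() and energy < energy0
```

Pairing the nonlinear terms against the solution telescopes to zero, mirroring the energy identity; whether the resulting cascade outruns the dissipation is exactly the question distinguishing the Katz-Pavlovic analysis from the Barbato-Morandin-Romito global regularity result discussed below.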

On the other hand, it was shown a few years ago by Barbato, Morandin, and Romito that (3) in fact admits global smooth solutions (at least in the dyadic case , and assuming non-negative initial data). Roughly speaking, the problem is that as energy is being transferred from to , energy is also simultaneously being transferred from to , and as such the solution races off to higher modes a bit too prematurely, without absorbing all of the energy from lower modes. This weakens the strength of the blowup to the point where the moderately strong dissipation in (3) is enough to kill the high frequency cascade before a true singularity occurs. Because of this, the original Katz-Pavlovic model cannot quite be used to establish Theorem 1 in three dimensions. (Actually, the original Katz-Pavlovic model had some additional dispersive features which allowed for another proof of global smooth solutions, which is an unpublished result of Nazarov.)

To get around this, I had to “engineer” an ODE system with similar features to (3) (namely, a quadratic nonlinearity, a monotone total energy, and the indicated exponents of for both the dissipation term and the quadratic terms), but for which the cascade of energy from scale to scale was not interrupted by the cascade of energy from scale to scale . To do this, I needed to insert a *delay* in the cascade process (so that after energy was dumped into scale , it would take some time before the energy would start to transfer to scale ), but the process also needed to be *abrupt* (once the process of energy transfer started, it needed to conclude very quickly, before the delayed transfer for the next scale kicked in). It turned out that one could build a “quadratic circuit” out of some basic “quadratic gates” (analogous to how an electrical circuit could be built out of basic gates such as amplifiers or resistors) that achieved this task, leading to an ODE system essentially of the form

where is a suitable large parameter and is a suitable small parameter (much smaller than ). To visualise the dynamics of such a system, I found it useful to describe this system graphically by a “circuit diagram” that is analogous (but not identical) to the circuit diagrams arising in electrical engineering:

The coupling constants here range widely from being very large to very small; in practice, this makes the and modes absorb very little energy, but exert a sizeable influence on the remaining modes. If a lot of energy is suddenly dumped into , what happens next is roughly as follows: for a moderate period of time, nothing much happens other than a trickle of energy into , which in turn causes a rapid exponential growth of (from a very low base). After this delay, suddenly crosses a certain threshold, at which point it causes and to exchange energy back and forth with extreme speed. The energy from then rapidly drains into , and the process begins again (with a slight loss in energy due to the dissipation). If one plots the total energy as a function of time, it looks schematically like this:

As in the previous heuristic discussion, the time between cascades from one frequency scale to the next decays exponentially, leading to blowup at some finite time . (One could describe the dynamics here as being similar to the famous “lighting the beacons” scene in the Lord of the Rings movies, except that (a) as each beacon gets ignited, the previous one is extinguished, as per the energy identity; (b) the time between beacon lightings decreases exponentially; and (c) there is no soundtrack.)

There is a real (but remote) possibility that this sort of construction can be adapted to the true Navier-Stokes equations. The basic blowup mechanism in the averaged equation is that of a von Neumann machine, or more precisely a construct (built within the laws of the inviscid evolution ) that, after some time delay, manages to suddenly create a replica of itself at a finer scale (and to largely erase its original instantiation in the process). In principle, such a von Neumann machine could also be built out of the laws of the inviscid form of the Navier-Stokes equations (i.e. the Euler equations). In physical terms, one would have to build the machine purely out of an ideal fluid (i.e. an inviscid incompressible fluid). If one could somehow create enough “logic gates” out of ideal fluid, one could presumably build a sort of “fluid computer”, at which point the task of building a von Neumann machine appears to reduce to a software engineering exercise rather than a PDE problem (providing that the gates are suitably stable with respect to perturbations, but (as with actual computers) this can presumably be done by converting the analog signals of fluid mechanics into a more error-resistant digital form). The key thing missing in this program (in both senses of the word) to establish blowup for Navier-Stokes is to construct the logic gates within the laws of ideal fluids. (Compare with the situation for cellular automata such as Conway’s “Game of Life“, in which Turing complete computers, universal constructors, and replicators have all been built within the laws of that game.)

Filed under: math.AP, math.CA, paper Tagged: dyadic models, finite time blowup, Navier-Stokes equations