I have just uploaded to the arXiv my paper “On the universality of the incompressible Euler equation on compact manifolds, II. Non-rigidity of Euler flows“, submitted to Pure and Applied Functional Analysis. This paper continues my attempts to establish “universality” properties of the Euler equations on Riemannian manifolds, as I conjecture that the freedom to set the metric ought to allow one to “program” such Euler flows to exhibit a wide range of behaviour, and in particular to achieve finite time blowup (if the dimension is sufficiently large, at least).

In coordinates, the Euler equations read

$$\partial_t u^\alpha + u^\beta \nabla_\beta u^\alpha = -\nabla^\alpha p, \qquad \nabla_\alpha u^\alpha = 0,$$

where $p$ is the pressure field and $u$ is the velocity field, and $\nabla$ denotes the Levi-Civita connection with the usual Penrose abstract index notation conventions; we restrict attention here to the case where $u, p$ are smooth and the underlying manifold $M$ is compact, smooth, orientable, connected, and without boundary. Let’s call $u$ an *Euler flow* on $M$ (for the time interval $[0,T]$) if it solves the above system of equations for some pressure $p$, and an *incompressible flow* if it just obeys the divergence-free relation $\nabla_\alpha u^\alpha = 0$. Thus every Euler flow is an incompressible flow, but the converse is certainly not true; for instance the various conservation laws of the Euler equation, such as conservation of energy, will already block most incompressible flows from being an Euler flow, or even from being approximated in a reasonably strong topology by such Euler flows.

However, one can ask if an incompressible flow can be *extended* to an Euler flow by adding some additional dimensions to . In my paper, I formalise this by considering warped products of which (as a smooth manifold) are products of with a torus, with a metric given by

for , where are the coordinates of the torus , and are smooth positive coefficients for ; in order to preserve the incompressibility condition, we also require the volume preservation property

though in practice we can quickly dispose of this condition by adding one further “dummy” dimension to the torus . We say that an incompressible flow is *extendible to an Euler flow* if there exists a warped product extending , and an Euler flow on of the form

for some “swirl” fields . The situation here is motivated by the familiar situation of studying axisymmetric Euler flows on , which in cylindrical coordinates take the form

The base component

of this flow is then a flow on the two-dimensional plane which is not quite incompressible (due to the failure of the volume preservation condition (2) in this case) but still satisfies a system of equations (coupled with a passive scalar field that is basically the square of the swirl ) that is reminiscent of the Boussinesq equations.

On a fixed -dimensional manifold , let denote the space of incompressible flows , equipped with the smooth topology (in spacetime), and let denote the space of such flows that are extendible to Euler flows. Our main theorem is

Theorem 1

- (i) (Generic inextendibility) Assume . Then is of the first category in (the countable union of nowhere dense sets in ).
- (ii) (Non-rigidity) Assume (with an arbitrary metric ). Then is somewhere dense in (that is, the closure of has non-empty interior).

More informally, starting with an incompressible flow , one usually cannot extend it to an Euler flow just by extending the manifold, warping the metric, and adding swirl coefficients, even if one is allowed to select the dimension of the extension, as well as the metric and coefficients, arbitrarily. However, many such flows can be *perturbed* to be extendible in such a manner (though different perturbations will require different extensions, in particular the dimension of the extension will not be fixed). Among other things, this means that conservation laws such as energy (or momentum, helicity, or circulation) no longer present an obstruction when one is allowed to perform an extension; basically, this is because the swirl components of the extension can exchange energy (or momentum, etc.) with the base components in an essentially arbitrary fashion.

These results fall short of my hopes to use the ability to extend the manifold to create universal behaviour in Euler flows, because each flow requires a different extension in order to achieve the desired dynamics. Still, it does seem to provide a little bit of support to the idea that high-dimensional Euler flows are quite “flexible” in their behaviour, though not completely so due to the generic inextendibility phenomenon. This flexibility reminds me a little bit of the flexibility of weak solutions to equations such as the Euler equations provided by the “h-principle” of Gromov and its variants (as discussed in these recent notes), although in this case the flexibility comes from adding additional dimensions, rather than by repeatedly adding high-frequency corrections to the solution.

The proof of part (i) of the theorem basically proceeds by a dimension counting argument (similar to that in the proof of Proposition 9 of these recent lecture notes of mine). Heuristically, the point is that an arbitrary incompressible flow is essentially determined by independent functions of space and time, whereas the warping factors are functions of space only, the pressure field is one function of space and time, and the swirl fields are technically functions of both space and time, but have the same number of degrees of freedom as a function just of space, because they solve an evolution equation. When , this means that there are fewer unknown functions of space and time than prescribed functions of space and time, which is the source of the generic inextendibility. This simple argument breaks down when , but we do not know whether the claim is actually false in this case.

The proof of part (ii) proceeds by direct calculation of the effect of the warping factors and swirl velocities, which effectively create a forcing term (of Boussinesq type) in the first equation of (1) that is a combination of functions of the Eulerian spatial coordinates (coming from the warping factors) and the Lagrangian spatial coordinates (which arise from the swirl velocities, which are passively transported by the flow). In a non-empty open subset of , the combination of these coordinates becomes a non-degenerate set of coordinates for spacetime, and one can then use the Stone-Weierstrass theorem to conclude. The requirement that be topologically a torus is a technical hypothesis in order to avoid topological obstructions such as the hairy ball theorem, but it may be that the hypothesis can be dropped (and it may in fact be true, in the case at least, that is dense in all of , not just in a non-empty open subset).

Kaisa Matomäki, Maksym Radziwill, and I just uploaded to the arXiv our paper “Fourier uniformity of bounded multiplicative functions in short intervals on average“. This paper is the outcome of our attempts during the MSRI program in analytic number theory last year to attack the local Fourier uniformity conjecture for the Liouville function . This conjecture generalises a landmark result of Matomäki and Radziwill, who show (among other things) that one has the asymptotic

whenever and goes to infinity as . Informally, this says that the Liouville function has small mean for almost all short intervals . The remarkable thing about this theorem is that there is no lower bound on how goes to infinity with ; one can take for instance . This lack of lower bound was crucial when I applied this result (or more precisely, a generalisation of this result to arbitrary non-pretentious bounded multiplicative functions) a few years ago to solve the Erdös discrepancy problem, as well as a logarithmically averaged two-point Chowla conjecture, for instance it implies that

The local Fourier uniformity conjecture asserts the stronger asymptotic

under the same hypotheses on and . As I worked out in a previous paper, this conjecture would imply a logarithmically averaged three-point Chowla conjecture, implying for instance that

This particular bound also follows from some slightly different arguments of Joni Teräväinen and myself, but the implication would also work for other non-pretentious bounded multiplicative functions, whereas the arguments of Joni and myself rely more heavily on the specific properties of the Liouville function (in particular that $\lambda(p) = -1$ for all primes $p$).
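For concreteness, here is a small numerical sketch (my own illustration, not code from the paper) of the Liouville function $\lambda(n) = (-1)^{\Omega(n)}$ and the kind of short-interval averages that the Matomäki-Radziwill theorem controls; the function names and parameters below are my own choices.

```python
def num_prime_factors(n):
    # Omega(n): number of prime factors of n, counted with multiplicity
    count, d = 0, 2
    while d * d <= n:
        while n % d == 0:
            count += 1
            n //= d
        d += 1
    if n > 1:
        count += 1
    return count

def liouville(n):
    # lambda(n) = (-1)^Omega(n); completely multiplicative with lambda(p) = -1
    return (-1) ** num_prime_factors(n)

def short_interval_mean(x, H):
    # (1/H) * sum over x < n <= x+H of lambda(n): the quantity that is small
    # for almost all short intervals
    return sum(liouville(n) for n in range(x + 1, x + H + 1)) / H

print(liouville(2), liouville(12))  # -1 -1  (12 = 2*2*3 has Omega = 3)
print(short_interval_mean(10**6, 1000))  # typically quite small
```

The interval length 1000 at height $10^6$ is purely for illustration; the theorem allows the interval length to grow arbitrarily slowly.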

There is also a higher order version of the local Fourier uniformity conjecture in which the linear phase is replaced with a polynomial phase such as , or more generally a nilsequence ; as shown in my previous paper, this conjecture implies (and is in fact equivalent to, after logarithmic averaging) a logarithmically averaged version of the full Chowla conjecture (not just the two-point or three-point versions), as well as a logarithmically averaged version of the Sarnak conjecture.

The main result of the current paper is to obtain some cases of the local Fourier uniformity conjecture:

Theorem 1The asymptotic (2) is true when for a fixed .

Previously this was known for by the work of Zhan (who in fact proved the stronger pointwise assertion for in this case). In a previous paper with Kaisa and Maksym, we also proved a weak version

of (2) for any growing arbitrarily slowly with ; this is stronger than (1) (and is in fact proven by a variant of the method) but significantly weaker than (2), because in the latter the worst-case is permitted to depend on the parameter, whereas in (3) must remain independent of .

Unfortunately, the restriction is not strong enough to give applications to Chowla-type conjectures (one would need something more like for this). However, it can still be used to control some sums that had not previously been manageable. For instance, a quick application of the circle method lets one use the above theorem to derive the asymptotic

whenever for a fixed , where is the von Mangoldt function. Amusingly, the seemingly simpler question of establishing the expected asymptotic for

is only known in the range (from the work of Zaccagnini). Thus we have a rare example of a number theory sum that becomes *easier* to control when one inserts a Liouville function!

We now give an informal description of the strategy of proof of the theorem (though for numerous technical reasons, the actual proof deviates in some respects from the description given here). If (2) failed, then for many values of we would have the lower bound

for some frequency . We informally describe this correlation between and by writing

for (informally, one should view this as asserting that “behaves like” a constant multiple of ). For sake of discussion, suppose we have this relationship for *all* , not just *many*.

As mentioned before, the main difficulty here is to understand how varies with . As it turns out, the multiplicativity properties of the Liouville function place a significant constraint on this dependence. Indeed, if we let $p$ be a fairly small prime (e.g. of size for some ), then we can use the identity $\lambda(pn) = -\lambda(n)$ for the Liouville function to conclude (at least heuristically) from (4) that

for . (In practice, we will have this sort of claim for *many* primes rather than *all* primes , after using tools such as the Turán-Kubilius inequality, but we ignore this distinction for this informal argument.)

Now let and be primes comparable to some fixed range such that

and

on essentially the same range of (two nearby intervals of length ). This suggests that the frequencies and should be close to each other modulo , in particular one should expect the relationship

Comparing this with (5) one is led to the expectation that should depend inversely on in some sense (for instance one can check that

would solve (6) if ; by Taylor expansion, this would correspond to a global approximation of the form ). One now has a problem of an additive combinatorial flavour (or of a “local to global” flavour), namely to leverage the relation (6) to obtain global control on that resembles (7).

A key obstacle in solving (6) efficiently is the fact that one only knows that and are close modulo , rather than close on the real line. One can start resolving this problem by the Chinese remainder theorem, using the fact that we have the freedom to shift (say) by an arbitrary integer. After doing so, one can arrange matters so that one in fact has the relationship

whenever and obey (5). (This may force to become extremely large, on the order of , but this will not concern us.)
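The shifting step here rests on the Chinese remainder theorem; as a minimal illustration (my own, not from the paper), one can always realise prescribed residues modulo two coprime moduli simultaneously:

```python
def crt(a, p, b, q):
    # the unique x mod p*q with x = a (mod p) and x = b (mod q),
    # for coprime moduli p, q (Python 3.8+ for pow(p, -1, q))
    return (a + p * (((b - a) * pow(p, -1, q)) % q)) % (p * q)

x = crt(3, 5, 4, 7)
print(x, x % 5, x % 7)  # 18 3 4
```

In the argument above one shifts a frequency by an integer so that two closeness conditions, one modulo each prime, hold at once; the representative may of course become very large, as noted.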

Now suppose that we have and primes such that

For every prime , we can find an such that is within of both and . Applying (8) twice we obtain

and

and thus by the triangle inequality we have

for all ; hence by the Chinese remainder theorem

In practice, in the regime that we are considering, the modulus is so huge we can effectively ignore it (in the spirit of the Lefschetz principle); so let us pretend that we in fact have

whenever and obey (9).

Now let be an integer to be chosen later, and suppose we have primes such that the difference

is small but non-zero. If is chosen so that

(where one is somewhat loose about what means) then one can then find real numbers such that

for , with the convention that . We then have

which telescopes to

and thus

and hence

In particular, for each , we expect to be able to write

for some . This quantity can vary with ; but from (10) and a short calculation we see that

whenever obey (9) for some .

Now imagine a “graph” in which the vertices are elements of , and two elements are joined by an edge if (9) holds for some . Because of exponential sum estimates on , this graph turns out to be essentially an “expander” in the sense that any two vertices can be connected (in multiple ways) by fairly short paths in this graph (if one is allowed to modify one of or by ). As a consequence, we can assume that this quantity is essentially constant in (cf. the application of the ergodic theorem in this previous blog post), thus we now have

for most and some . By Taylor expansion, this implies that

on for most , thus

But this can be shown to contradict the Matomäki-Radziwill theorem (because the multiplicative function is known to be non-pretentious).

About six years ago on this blog, I started thinking about trying to make a web-based game based around high-school algebra, and ended up using Scratch to write a short but playable puzzle game in which one solves linear equations for an unknown using a restricted set of moves. (At almost the same time, there were a number of more professionally made games released along similar lines, most notably Dragonbox.)

Since then, I have thought a couple of times about whether there were other parts of mathematics which could be gamified in a similar fashion. Shortly after my first blog posts on this topic, I experimented with a similar gamification of Lewis Carroll’s classic list of logic puzzles, but the results were quite clunky, and I was never satisfied with them.

Over the last few weeks I returned to this topic though, thinking in particular about how to gamify the rules of inference of propositional logic, in a manner that at least vaguely resembles how mathematicians actually go about making logical arguments (e.g., splitting into cases, arguing by contradiction, using previous results as lemmas to help with subsequent ones, and so forth). The rules of inference are a list of a dozen or so deductive rules concerning propositional sentences (things like “($A$ AND $B$) OR (NOT $C$)”, where $A, B, C$ are some formulas). A typical such rule is Modus Ponens: if the sentence $A$ is known to be true, and the implication “$A$ IMPLIES $B$” is also known to be true, then one can deduce that $B$ is also true. Furthermore, in this deductive calculus it is possible to temporarily introduce some unproven statements as an assumption, only to discharge them later. In particular, we have the deduction theorem: if, after making an assumption $A$, one is able to derive the statement $B$, then one can conclude that the implication “$A$ IMPLIES $B$” is true without any further assumption.
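As a rough sketch of how Modus Ponens can be mechanised (a toy illustration of my own, not the actual QED code), one can represent implications as tuples and repeatedly close a set of known facts under the rule:

```python
def modus_ponens_closure(known):
    # repeatedly apply Modus Ponens: from A and ("IMPLIES", A, B), deduce B
    known = set(known)
    changed = True
    while changed:
        changed = False
        for s in list(known):
            if (isinstance(s, tuple) and s[0] == "IMPLIES"
                    and s[1] in known and s[2] not in known):
                known.add(s[2])
                changed = True
    return known

facts = {"A", ("IMPLIES", "A", "B"), ("IMPLIES", "B", "C")}
print("C" in modus_ponens_closure(facts))  # True: A gives B, then B gives C
```

A real proof assistant also has to manage assumptions and their discharge (the deduction theorem), which is what the nested sub-environments in the game handle.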

It took a while for me to come up with a workable game-like graphical interface for all of this, but I finally managed to set one up, now using Javascript instead of Scratch (which would be hopelessly inadequate for this task); indeed, part of the motivation of this project was to finally learn how to program in Javascript, which turned out to be not as formidable as I had feared (certainly having experience with other C-like languages like C++, Java, or Lua, as well as some prior knowledge of HTML, was very helpful). The main code for this project is available here. Using this code, I have created an interactive textbook in the style of a computer game, which I have titled “QED”. This text contains thirty-odd exercises arranged in twelve sections that function as game “levels”, in which one has to use a given set of rules of inference, together with a given set of hypotheses, to reach a desired conclusion. The set of available rules increases as one advances through the text; in particular, each new section gives one or more rules, and additionally each exercise one solves automatically becomes a new deduction rule one can exploit in later levels, much as lemmas and propositions are used in actual mathematics to prove more difficult theorems. The text automatically tries to match available deduction rules to the sentences one clicks on or drags, to try to minimise the amount of manual input one needs to actually make a deduction.

Most of one’s proof activity takes place in a “root environment” of statements that are known to be true (under the given hypothesis), but for more advanced exercises one has to also work in sub-environments in which additional assumptions are made. I found the graphical metaphor of nested boxes to be useful to depict this tree of sub-environments, and it seems to combine well with the drag-and-drop interface.

The text also logs one’s moves in a more traditional proof format, which shows how the mechanics of the game correspond to a traditional mathematical argument. My hope is that this will give students a way to understand the underlying concept of forming a proof in a manner that is more difficult to achieve using traditional, non-interactive textbooks.

I have tried to organise the exercises in a game-like progression in which one first works with easy levels that train the player on a small number of moves, and then introduce more advanced moves one at a time. As such, the order in which the rules of inference are introduced is a little idiosyncratic. The most powerful rule (the law of the excluded middle, which is what separates classical logic from intuitionistic logic) is saved for the final section of the text.

Anyway, I am now satisfied enough with the state of the code and the interactive text that I am willing to make both available (and open source; I selected a CC-BY licence for both), and would be happy to receive feedback on any aspect of either. In principle one could extend the game mechanics to other mathematical topics than the propositional calculus – the rules of inference for first-order logic being an obvious next candidate – but it seems to make sense to focus just on propositional logic for now.

I have just uploaded to the arXiv my paper “Commutators close to the identity“, submitted to the Journal of Operator Theory. This paper resulted from some progress I made on the problem discussed in this previous post. Recall in that post the following result of Popa: if are bounded operators on a Hilbert space whose commutator is close to the identity in the sense that

for some , then one has the lower bound

In the other direction, for any , there are examples of operators obeying (1) such that

In this paper we improve the upper bound to come closer to the lower bound:

Theorem 1For any , and any infinite-dimensional , there exist operators obeying (1) such that

One can probably improve the exponent somewhat by a modification of the methods, though it does not seem likely that one can lower it all the way to without a substantially new idea. Nevertheless I believe it plausible that the lower bound (2) is close to optimal.
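One classical piece of context here (standard background, not specific to the paper): a commutator of matrices can never equal the identity, since its trace vanishes, which is why this whole problem lives on an infinite-dimensional Hilbert space. A quick numerical check of the trace identity, using plain nested lists as matrices:

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(A, B):
    # [A, B] = AB - BA
    AB, BA = matmul(A, B), matmul(B, A)
    n = len(A)
    return [[AB[i][j] - BA[i][j] for j in range(n)] for i in range(n)]

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 1]]
C = commutator(A, B)
print(C[0][0] + C[1][1])  # 0: the trace of any matrix commutator vanishes
```

In infinite dimensions the trace obstruction disappears, but quantitative obstructions such as Popa's lower bound remain.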

We now sketch the methods of proof. The construction giving (3) proceeded by first identifying with the algebra of matrices that have entries in . It is then possible to find two matrices whose commutator takes the form

for some bounded operator (for instance one can take to be an isometry). If one then conjugates by the diagonal operator , one can ensure that (1) and (3) both hold.

It is natural to adapt this strategy to matrices rather than matrices, where is a parameter at one’s disposal. If one can find matrices that are almost upper triangular (in that only the entries on or above the lower diagonal are non-zero), whose commutator only differs from the identity in the top right corner, thus

for some , then by conjugating by a diagonal matrix such as for some and optimising in , one can improve the bound in (3) to ; if the bounds in the implied constant in the are polynomial in , one can then optimise in to obtain a bound of the form (4) (perhaps with the exponent replaced by a different constant).

The task is then to find almost upper triangular matrices whose commutator takes the required form. The lower diagonals of must then commute; it took me a while to realise that one could (usually) conjugate one of the matrices, say by a suitable diagonal matrix, so that the lower diagonal consisted entirely of the identity operator, which would make the other lower diagonal consist of a single operator, say . After a lot of further lengthy experimentation, I eventually realised that one could conjugate further by unipotent upper triangular matrices so that all remaining entries other than those on the far right column vanished. Thus, without too much loss of generality, one can assume that takes the normal form

for some , solving the system of equations

It turns out to be possible to solve this system of equations by a contraction mapping argument if one takes to be a “Hilbert’s hotel” pair of isometries as in the previous post, though the contraction is very slight, leading to polynomial losses in in the implied constant.

There is a further question raised in Popa’s paper which I was unable to resolve. As a special case of one of the main theorems (Theorem 2.1) of that paper, the following result was shown: if obeys the bounds

(where denotes the space of all operators of the form with and compact), then there exist operators with such that . (In fact, Popa’s result covers a more general situation in which one is working in a properly infinite algebra with non-trivial centre.) We sketch a proof of this result as follows. Suppose that and for some . A standard greedy algorithm argument (see this paper of Brown and Pearcy) allows one to find orthonormal vectors for such that for each , one has for some comparable to , and some orthogonal to all of the . After some conjugation (and a suitable identification of with ), one can thus place in a normal form

where is an isometry with infinite deficiency, and have norm . Setting , it then suffices to solve the commutator equation

with ; note the similarity with (3).

By the usual Hilbert’s hotel construction, one can complement with another isometry obeying the “Hilbert’s hotel” identity

and also , . Proceeding as in the previous post, we can try the ansatz

for some operators , leading to the system of equations

Using the first equation to solve for , the second to then solve for , and the third to then solve for , one can obtain matrices with the required properties.

Thus far, my attempts to extend this construction to larger matrices with good bounds on have been unsuccessful. A model problem would be to express

as a commutator with significantly smaller than . The construction in my paper achieves something like this, but with replaced by a more complicated operator. One would also need variants of this result in which one is allowed to perturb the above operator by an arbitrary finite rank operator of bounded operator norm.

Kevin Ford, Sergei Konyagin, James Maynard, Carl Pomerance, and I have uploaded to the arXiv our paper “Long gaps in sieved sets“, submitted to J. Europ. Math. Soc.

This paper originated from the MSRI program in analytic number theory last year, and was centred around variants of the question of finding large gaps between primes. As discussed for instance in this previous post, it is now known that within the set of primes , one can find infinitely many adjacent elements whose gap obeys a lower bound of the form

where denotes the -fold iterated logarithm. This compares with the trivial bound of that one can obtain from the prime number theorem and the pigeonhole principle. Several years ago, Pomerance posed the question of whether analogous improvements to the trivial bound can be obtained for such sets as

Here there is the obvious initial issue that this set is not even known to be infinite (this is the fourth Landau problem), but let us assume for the sake of discussion that this set is indeed infinite, so that we have an infinite number of gaps to speak of. Standard sieve theory techniques give upper bounds for the density of that are comparable (up to an absolute constant) to the prime number theorem bounds for , so again we can obtain a trivial bound of for the gaps of . In this paper we improve this to

for an absolute constant ; this is not as strong as the corresponding bound for , but still improves over the trivial bound. In fact we can handle more general “sifted sets” than just . Recall from the sieve of Eratosthenes that the elements of in, say, the interval can be obtained by removing from one residue class modulo for each prime up to , namely the class mod . In a similar vein, the elements of in can be obtained by removing for each prime up to zero, one, or two residue classes modulo , depending on whether is a quadratic residue modulo . On the average, one residue class will be removed (this is a very basic case of the Chebotarev density theorem), so this sieving system is “one-dimensional on the average”. Roughly speaking, our arguments apply to any other set of numbers arising from a sieving system that is one-dimensional on average. (One can consider other dimensions also, but unfortunately our methods seem to give results that are worse than a trivial bound when the dimension is less than or greater than one.)
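The sieving data described here is easy to compute directly. The following sketch (my own illustration, with the cutoff 500 chosen arbitrarily) counts, for each prime $p$, the residue classes removed for the set of $n$ with $n^2+1$ prime, and confirms that on average about one class per prime is removed:

```python
import math

def primes_up_to(N):
    sieve = [True] * (N + 1)
    sieve[0] = sieve[1] = False
    for i in range(2, int(math.isqrt(N)) + 1):
        if sieve[i]:
            for m in range(i * i, N + 1, i):
                sieve[m] = False
    return [p for p in range(2, N + 1) if sieve[p]]

def removed_classes(p):
    # number of residue classes n mod p with p dividing n^2 + 1:
    # 2 if p = 1 mod 4, 0 if p = 3 mod 4, and 1 if p = 2
    return sum(1 for n in range(p) if (n * n + 1) % p == 0)

ps = primes_up_to(500)
counts = [removed_classes(p) for p in ps]
print(sorted(set(counts)))          # [0, 1, 2]
print(sum(counts) / len(counts))    # close to 1: one class per prime on average
```

This is the sense in which the sieving system is “one-dimensional on average”.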

The standard “Erdős-Rankin” method for constructing long gaps between primes proceeds by trying to line up some residue classes modulo small primes so that they collectively occupy a long interval. Key tools in doing so are the smooth number estimates of de Bruijn and others, which among other things assert that if one removes from an interval such as all the residue classes mod for between and for some fixed , then the set of survivors has exceptionally small density (roughly of the order of , with the precise density given by the Dickman function), in marked contrast to the situation in which one randomly removes one residue class for each such prime , in which the density is more like . One generally exploits this phenomenon to sieve out almost all the elements of a long interval using some of the primes available, and then using the remaining primes to cover up the remaining elements that have not already been sifted out. In the more recent work on this problem, advanced combinatorial tools such as hypergraph covering lemmas are used for the latter task.
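As a quick numerical illustration of the smooth number phenomenon (my own sketch, with the cutoff $10^5$ chosen arbitrarily), one can compare the density of $\sqrt{x}$-smooth numbers up to $x$ against the Dickman function value $\rho(2) = 1 - \log 2 \approx 0.31$:

```python
import math

def smooth_density(x):
    # fraction of 2 <= n <= x whose prime factors are all at most sqrt(x),
    # computed via a largest-prime-factor sieve
    y = math.isqrt(x)
    lpf = [0] * (x + 1)
    for p in range(2, x + 1):
        if lpf[p] == 0:  # p is prime; mark it as the (current) largest
            for m in range(p, x + 1, p):
                lpf[m] = p  # overwritten by larger prime factors later
    return sum(1 for n in range(2, x + 1) if lpf[n] <= y) / x

print(smooth_density(10**5))  # roughly Dickman rho(2) = 1 - ln 2, about 0.31
```

The convergence to $\rho(2)$ is slow, so the finite-range density overshoots the limit slightly; still, it is visibly much smaller than what random removal of one class per prime would leave behind.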

In the case of , there does not appear to be any analogue of smooth numbers, in the sense that there is no obvious way to arrange the residue classes so that they have significantly fewer survivors than a random arrangement. Instead we adopt the following semi-random strategy to cover an interval by residue classes. Firstly, we randomly remove residue classes for primes up to some intermediate threshold (smaller than by a logarithmic factor), leaving behind a preliminary sifted set . Then, for each prime between and another intermediate threshold , we remove a residue class mod that maximises (or nearly maximises) its intersection with . This ends up reducing the number of survivors to be significantly below what one would achieve if one selects residue classes randomly, particularly if one also uses the hypergraph covering lemma from our previous paper. Finally, we cover each of the remaining survivors by a residue class from a remaining available prime.

Brad Rodgers and I have uploaded to the arXiv our paper “The De Bruijn-Newman constant is non-negative“. This paper affirms a conjecture of Newman regarding the extent to which the Riemann hypothesis, if true, is only “barely so”. To describe the conjecture, let us begin with the Riemann xi function

where is the Gamma function and is the Riemann zeta function. Initially, this function is only defined for , but, as was already known to Riemann, we can manipulate it into a form that extends to the entire complex plane as follows. Firstly, in view of the standard identity , we can write

and hence

By a rescaling, one may write

and similarly

and thus (after applying Fubini’s theorem)

We’ll make the change of variables to obtain

If we introduce the mild renormalisation

of , we then conclude (at least for ) that

which one can verify to be rapidly decreasing both as and as , with the decrease as faster than any exponential. In particular extends holomorphically to the upper half plane.

If we normalize the Fourier transform of a (Schwartz) function as , it is well known that the Gaussian is its own Fourier transform. The creation operator interacts with the Fourier transform by the identity

Since , this implies that the function

is its own Fourier transform. (One can view the polynomial as a renormalised version of the fourth Hermite polynomial.) Taking a suitable linear combination of this with , we conclude that

is also its own Fourier transform. Rescaling by and then multiplying by , we conclude that the Fourier transform of

is

and hence by the Poisson summation formula (using symmetry and vanishing at to unfold the summation in (2) to the integers rather than the natural numbers) we obtain the functional equation

which implies that and are even functions (in particular, now extends to an entire function). From this symmetry we can also rewrite (1) as

which now gives a convergent expression for the entire function for all complex . As is even and real-valued on , is even and also obeys the functional equation , which is equivalent to the usual functional equation for the Riemann zeta function. The Riemann hypothesis is equivalent to the claim that all the zeroes of are real.
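One can check numerically that the Gaussian $e^{-\pi x^2}$ is indeed its own Fourier transform under the normalisation used above (a quick sketch of my own, using a plain Riemann sum rather than any special-purpose library):

```python
import cmath
import math

def fourier(f, xi, L=8.0, N=4000):
    # Riemann-sum approximation to \hat f(xi) = \int f(x) e^{-2 pi i x xi} dx,
    # truncated to [-L, L]; very accurate for rapidly decaying analytic f
    h = 2 * L / N
    return sum(f(-L + k * h) * cmath.exp(-2j * math.pi * (-L + k * h) * xi)
               for k in range(N)) * h

g = lambda x: math.exp(-math.pi * x * x)
errs = [abs(fourier(g, xi) - g(xi)) for xi in (0.0, 0.5, 1.0)]
print(max(errs) < 1e-6)  # True: the Gaussian is its own Fourier transform
```

The same numerical setup can be used to sanity-check the self-duality of linear combinations such as the Hermite-type function mentioned above.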

De Bruijn introduced the family of deformations of , defined for all and by the formula

From a PDE perspective, one can view as the evolution of under the backwards heat equation . As with , the are all even entire functions that obey the functional equation , and one can ask an analogue of the Riemann hypothesis for each such , namely whether all the zeroes of are real. De Bruijn showed that these hypotheses were monotone in : if had all real zeroes for some , then would also have all zeroes real for any . Newman later sharpened this claim by showing the existence of a finite number , now known as the *de Bruijn-Newman constant*, with the property that had all zeroes real if and only if . Thus, the Riemann hypothesis is equivalent to the inequality . Newman then conjectured the complementary bound ; in his words, this conjecture asserted that if the Riemann hypothesis is true, then it is only “barely so”, in that the reality of all the zeroes is destroyed by applying heat flow for even an arbitrarily small amount of time. Over time, a significant amount of evidence was established in favour of this conjecture; most recently, in 2011, Saouter, Gourdon, and Demichel showed that .

In this paper we finish off the proof of Newman's conjecture, that is to say we show that Λ ≥ 0. The proof is by contradiction: assuming that Λ < 0 (which, among other things, implies the truth of the Riemann hypothesis), we use the properties of backwards heat evolution to reach a contradiction.

Very roughly, the argument proceeds as follows. As observed by Csordas, Smith, and Varga (and also discussed in this previous blog post), the backwards heat evolution of the H_t induces a nice ODE dynamics on the zeroes x_j(t) of H_t, namely that they solve the ODE

for all (one has to interpret the sum in a principal value sense as it is not absolutely convergent, but let us ignore this technicality for the current discussion). Intuitively, this ODE is asserting that the zeroes repel each other, somewhat like positively charged particles (but note that the dynamics is first-order, as opposed to the second-order laws of Newtonian mechanics). Formally, a steady state (or equilibrium) of this dynamics is reached when the zeroes are arranged in an arithmetic progression. (Note for instance that for any positive , the functions obey the same backwards heat equation as , and their zeroes lie on a fixed arithmetic progression.) The strategy is then to show that the dynamics from time to time creates a *convergence to local equilibrium*, in which the zeroes locally resemble an arithmetic progression at time . This will be in contradiction with known results on the pair correlation of zeroes (or on related statistics, such as the fluctuations of gaps between zeroes), such as the results of Montgomery (actually, for technical reasons, it is slightly more convenient for us to use related results of Conrey, Ghosh, Goldston, Gonek, and Heath-Brown). Another way of thinking about this is that even very slight deviations from local equilibrium (such as a small number of gaps that are slightly smaller than the average spacing) will almost immediately lead to zeroes colliding with each other and leaving the real line as one evolves backwards in time (i.e., under the *forward* heat flow). This is a refinement of the strategy used in previous lower bounds on Λ, in which "Lehmer pairs" (pairs of zeroes of the zeta function that were unusually close to each other) were used to limit the extent to which the evolution could be continued backwards in time while keeping all zeroes real.
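The repulsion dynamics can be illustrated in a toy finite simulation (an illustration only, with the normalising constant in the ODE fixed to 1, which may differ from the paper's conventions): an unusually narrow gap between two zeroes is the feature that evolves fastest.

```python
import math

def step(xs, dt):
    """One forward-Euler step of the toy repulsive dynamics
    dx_j/dt = sum_{k != j} 1/(x_j - x_k)  (constant normalised to 1)."""
    n = len(xs)
    vs = [sum(1.0 / (xs[j] - xs[k]) for k in range(n) if k != j)
          for j in range(n)]
    return [x + dt * v for x, v in zip(xs, vs)]

# Start with one unusually narrow gap; repulsion widens it fastest.
xs = [-1.0, 0.0, 0.05, 1.0]
for _ in range(1000):
    xs = step(xs, 1e-5)
gaps = [b - a for a, b in zip(xs, xs[1:])]
print(gaps)  # the narrow middle gap has grown substantially
```

Running this shows the gap of size 0.05 expanding towards the size of its neighbours, in line with the intuition that deviations from equal spacing are rapidly smoothed out by the flow.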

How does one obtain this convergence to local equilibrium? We proceed by broad analogy with the “local relaxation flow” method of Erdos, Schlein, and Yau in random matrix theory, in which one combines some initial control on zeroes (which, in the case of the Erdos-Schlein-Yau method, is referred to with terms such as “local semicircular law”) with convexity properties of a relevant Hamiltonian that can be used to force the zeroes towards equilibrium.

We first discuss the initial control on zeroes. For H = H_0, we have the classical Riemann-von Mangoldt formula, which asserts that the number of zeroes in the interval is as . (We have a factor of here instead of the more familiar due to the way H is normalised.) This implies for instance that for a fixed , the number of zeroes in the interval is . Actually, because we get to assume the Riemann hypothesis, we can sharpen this to , a result of Littlewood (see this previous blog post for a proof). Ideally, we would like to obtain similar control for the other H_t as well. Unfortunately, we were only able to obtain the weaker claims that the number of zeroes of H_t in is , and that the number of zeroes in is , that is to say we only get good control on the distribution of zeroes at scales rather than at scales . Ultimately this is because we were only able to get control (and in particular, lower bounds) on H_t with high precision when (whereas H has good estimates as soon as is larger than (say) ). This control is obtained by expressing H_t in terms of some contour integrals and using the method of steepest descent (actually it is slightly simpler to rely instead on the Stirling approximation for the Gamma function, which can in turn be proven by steepest descent methods). Fortunately, it turns out that this weaker control is still (barely) enough for the rest of our argument to go through.
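As a quick numerical illustration of the Riemann-von Mangoldt formula (stated here in the usual zeta normalisation; recall the blog's H is normalised differently, which changes the constant factors), the classical main term already predicts the zero counts very accurately:

```python
import math

def rvm_estimate(T):
    """Riemann-von Mangoldt main term for the number of zeta zeroes
    with imaginary part in (0, T]:
    N(T) ~ (T/2pi) log(T/2pi) - T/2pi + 7/8."""
    x = T / (2 * math.pi)
    return x * math.log(x) - x + 7.0 / 8.0

# Known: there are exactly 29 zeta zeroes with imaginary part in (0, 100].
print(round(rvm_estimate(100)))
```

The estimate at T = 100 rounds to 29, matching the true count; the error term O(log T) (improvable to o(log T) on RH, by Littlewood's result mentioned above) is invisible at this scale.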

Once one has the initial control on zeroes, we now need to force convergence to local equilibrium by exploiting convexity of a Hamiltonian. Here, the relevant Hamiltonian is

ignoring for now the rather important technical issue that this sum is not actually absolutely convergent. (Because of this, we will need to truncate and renormalise the Hamiltonian in a number of ways which we will not detail here.) The ODE (3) is formally the gradient flow for this Hamiltonian. Furthermore, this Hamiltonian is a convex function of the (because is a convex function on ). We therefore expect the Hamiltonian to be a decreasing function of time, and that the derivative should be an increasing function of time. As time passes, the derivative of the Hamiltonian would then be expected to converge to zero, which should imply convergence to local equilibrium.
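In a finite toy model one can verify directly that the repulsion ODE is the gradient flow of a log-Coulomb Hamiltonian (again with illustrative normalisations; the helper names and constants are my own, not the paper's):

```python
import math

def hamiltonian(xs):
    # Toy Hamiltonian: H = sum over pairs j<k of log 1/|x_j - x_k|
    return -sum(math.log(abs(xs[j] - xs[k]))
                for j in range(len(xs)) for k in range(j + 1, len(xs)))

def ode_velocity(xs, j):
    # Right-hand side of the toy repulsion ODE for the j-th zero
    return sum(1.0 / (xs[j] - xs[k]) for k in range(len(xs)) if k != j)

# Check by finite differences that the ODE vector field is -grad H.
xs = [-1.3, -0.2, 0.4, 2.1]
h = 1e-6
for j in range(len(xs)):
    xp = xs.copy(); xp[j] += h
    xm = xs.copy(); xm[j] -= h
    grad_j = (hamiltonian(xp) - hamiltonian(xm)) / (2 * h)
    assert abs(-grad_j - ode_velocity(xs, j)) < 1e-5
print("ODE matches gradient flow of H")
```

This is the finite-dimensional shadow of the formal statement in the text; the actual Hamiltonian in the paper is an infinite (and divergent) sum requiring truncation and renormalisation.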

Formally, the derivative of the above Hamiltonian is

Again, there is the important technical issue that this quantity is infinite; but it turns out that if we renormalise the Hamiltonian appropriately, then the energy will also become suitably renormalised, and in particular will vanish when the zeroes are arranged in an arithmetic progression, and be positive otherwise. One can also formally calculate the derivative of the energy to be a somewhat complicated but manifestly non-negative quantity (a sum of squares); see this previous blog post for analogous computations in the case of heat flow on polynomials. After flowing from time to time , and using some crude initial bounds on the Hamiltonian and energy in this region (coming from the Riemann-von Mangoldt type formulae mentioned above and some further manipulations), we can eventually show that the (renormalisation of the) energy at time zero is small, which forces the zeroes to locally resemble an arithmetic progression, which gives the required convergence to local equilibrium.
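The expected monotonicity can also be seen in the finite toy model: along the (simulated) gradient flow, the toy Hamiltonian decreases at every step. This is only a sketch under the toy normalisations above, not the paper's renormalised quantities:

```python
import math

def hamiltonian(xs):
    # Toy Hamiltonian: sum over pairs of log 1/|x_j - x_k|
    return -sum(math.log(abs(xs[j] - xs[k]))
                for j in range(len(xs)) for k in range(j + 1, len(xs)))

def flow_step(xs, dt):
    # Forward-Euler step of the toy repulsion ODE (gradient descent on H)
    vs = [sum(1.0 / (xs[j] - xs[k]) for k in range(len(xs)) if k != j)
          for j in range(len(xs))]
    return [x + dt * v for x, v in zip(xs, vs)]

xs = [-1.0, -0.1, 0.3, 1.2]
values = [hamiltonian(xs)]
for _ in range(2000):
    xs = flow_step(xs, 1e-5)
    values.append(hamiltonian(xs))
assert all(b < a for a, b in zip(values, values[1:]))
print("H is monotone decreasing along the flow")
```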

There are a number of technicalities involved in making the above sketch of argument rigorous (for instance, justifying interchanges of derivatives and infinite sums turns out to be a little bit delicate). I will highlight here one particular technical point. One of the ways in which we make expressions such as the energy finite is to truncate the indices to an interval to create a truncated energy . In typical situations, we would then expect to be decreasing, which will greatly help in bounding (in particular it would allow one to control by time-averaged quantities such as , which can in turn be controlled using variants of (4)). However, there are boundary effects at both ends of that could in principle add a large amount of energy into , which is bad news as it could conceivably make undesirably large even if integrated energies such as remain adequately controlled. As it turns out, such boundary effects are negligible as long as there is a large gap between adjacent zeroes at the boundary of – it is only narrow gaps that can rapidly transmit energy across the boundary of . Now, narrow gaps can certainly exist (indeed, the GUE hypothesis predicts these happen a positive fraction of the time); but the pigeonhole principle (together with the Riemann-von Mangoldt formula) can allow us to pick the endpoints of the interval so that no narrow gaps appear at the boundary of for any given time . However, there was a technical problem: this argument did not allow one to find a single interval that avoided narrow gaps for *all* times simultaneously – the pigeonhole principle could produce a different interval for each time! Since the set of times involved was uncountable, this was a serious issue. (In physical terms, the problem was that there might be very fast "longitudinal waves" in the dynamics that, at each time, cause some gaps between zeroes to be highly compressed, but the specific gap that was narrow changed very rapidly with time.
Such waves could, in principle, import a huge amount of energy into by time .) To resolve this, we borrowed a PDE trick of Bourgain's, in which the pigeonhole principle was coupled with local conservation laws. More specifically, we use the phenomenon that very narrow gaps take a nontrivial amount of time to expand back to a reasonable size (this can be seen by comparing the evolution of this gap with solutions of the scalar ODE , which represents the fastest rate at which a gap such as can expand). Thus, if a gap is reasonably large at some time , it will also stay reasonably large at slightly earlier times for some moderately small . This lets one locate an interval that has manageable boundary effects during the times in , so in particular is basically non-increasing in this time interval. Unfortunately, this interval is a little bit too short to cover all of ; however, it turns out that one can iterate the above construction and find a nested sequence of intervals , with each non-increasing in a different time interval , and with all of the time intervals covering . This turns out to be enough (together with the obvious fact that is monotone in ) to still control for some reasonably sized interval , as required for the rest of the arguments.
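The comparison ODE for an isolated gap can be checked in a two-zero toy model. With the illustrative normalisation dx_j/dt = Σ_{k≠j} 1/(x_j − x_k), the gap a = x_2 − x_1 between two isolated zeroes obeys da/dt = 2/a, which integrates exactly to a(t) = sqrt(a(0)² + 4t); in particular a gap of size a needs time comparable to a² to expand appreciably, which is the phenomenon invoked above:

```python
import math

def simulate_gap(a0, t, steps=200000):
    """Evolve two zeroes at mutual distance a0 under the toy repulsion
    ODE dx_j/dt = 1/(x_j - x_k) for time t, and return the final gap."""
    x1, x2 = -a0 / 2, a0 / 2
    dt = t / steps
    for _ in range(steps):
        v1 = 1.0 / (x1 - x2)
        v2 = 1.0 / (x2 - x1)
        x1 += dt * v1
        x2 += dt * v2
    return x2 - x1

# Compare against the closed-form solution a(t) = sqrt(a0^2 + 4t).
a0, t = 0.01, 0.1
assert abs(simulate_gap(a0, t) - math.sqrt(a0 ** 2 + 4 * t)) < 1e-3
print("gap law a(t) = sqrt(a0^2 + 4t) verified")
```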

ADDED LATER: the following analogy (involving functions with just two zeroes, rather than an infinite number of zeroes) may help clarify the relation between this result and the Riemann hypothesis (and in particular why this result does not make the Riemann hypothesis any easier to prove; in fact, it confirms the delicate nature of that hypothesis). Suppose one had a quadratic polynomial P of the form P(z) = z² + Λ, where Λ was an unknown real constant. Suppose that one was for some reason interested in the analogue of the "Riemann hypothesis" for P, namely that all the zeroes of P are real. A priori, there are three scenarios:

- (Riemann hypothesis false) Λ > 0, and P has zeroes off the real axis.
- (Riemann hypothesis true, but barely so) Λ = 0, and both zeroes of P are on the real axis; however, any slight perturbation of Λ in the positive direction would move zeroes off the real axis.
- (Riemann hypothesis true, with room to spare) Λ < 0, and both zeroes of P are on the real axis. Furthermore, any slight perturbation of P will also have both its zeroes on the real axis.

The analogue of our result in this case is that Λ ≥ 0, thus ruling out the third of the three scenarios here. In this simple example in which only two zeroes are involved, one can think of the inequality Λ ≥ 0 as asserting that if the zeroes of P are real, then they must be repeated. In our result (in which there are an infinity of zeroes, that become increasingly dense near infinity), and in view of the convergence to local equilibrium properties of (3), the analogous assertion is that if the zeroes of H are real, then they do not behave locally as if they were in arithmetic progression.
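The three scenarios can be tabulated in a few lines (using, as in the analogy above, the toy polynomial z² + Λ; `zeroes` is an illustrative helper):

```python
import cmath

def zeroes(lam):
    """Roots of the toy polynomial P(z) = z^2 + lam."""
    r = cmath.sqrt(-lam)
    return (-r, r)

# lam > 0: zeroes off the real axis ("Riemann hypothesis false")
assert all(abs(z.imag) > 0 for z in zeroes(1.0))
# lam = 0: a repeated real zero ("true, but barely so")
assert zeroes(0.0) == (0, 0)
# lam < 0: two distinct real zeroes, stable under perturbation ("room to spare")
assert all(z.imag == 0 for z in zeroes(-1.0))
print("three scenarios verified")
```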

Kaisa Matomaki, Maksym Radziwill, and I have uploaded to the arXiv our paper “Correlations of the von Mangoldt and higher divisor functions II. Divisor correlations in short ranges“. This is a sequel of sorts to our previous paper on divisor correlations, though the proof techniques in this paper are rather different. As with the previous paper, our interest is in correlations such as

for medium-sized h and large x, where k and l are natural numbers and d_k denotes the k-th divisor function (actually our methods can also treat a generalisation in which k is non-integer, but for simplicity let us stick with the integer case for this discussion). Our methods also allow for one of the divisor function factors to be replaced with a von Mangoldt function, but (in contrast to the previous paper) we cannot treat the case when both factors are von Mangoldt.
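For concreteness (an illustration only, not the paper's computation), the correlation sums in question can be evaluated by brute force for small parameters; here d_k is computed by repeated Dirichlet convolution of the constant function 1 with itself:

```python
def divisor_k(N, k):
    """d_k(n) for 1 <= n <= N: the number of ways to write n as an
    ordered product of k natural numbers, via repeated Dirichlet
    convolution with the constant function 1."""
    d = [0] + [1] * N          # d_1(n) = 1
    for _ in range(k - 1):
        e = [0] * (N + 1)
        for a in range(1, N + 1):
            for m in range(a, N + 1, a):
                e[m] += d[m // a]
        d = e
    return d

# A small instance of the correlation sum with k = l = 2 and shift h = 1.
N, h = 1000, 1
d2 = divisor_k(N, 2)
corr = sum(d2[n] * d2[n + h] for n in range(1, N - h + 1))
print(corr)
```

Such direct computation is only feasible for tiny N; the whole difficulty of the paper is obtaining asymptotics uniformly for almost all shifts in very short ranges.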

As discussed in this previous post, one heuristically expects an asymptotic of the form

for any fixed , where is a certain explicit (but rather complicated) polynomial of degree . Such asymptotics are known when , but remain open for . In the previous paper, we were able to obtain a weaker bound of the form

for of the shifts , whenever the shift range lies between and . But the methods become increasingly hard to use as gets smaller. In this paper, we use a rather different method to obtain the even weaker bound

for of the shifts , where can now be as short as . The constant can be improved, but there are serious obstacles to using our method to go below (as the exceptionally large values of then begin to dominate). This can be viewed as an analogue of our previous paper on correlations of bounded multiplicative functions on average, in which the functions are now unbounded, and indeed our proof strategy is based in large part on that paper (but with many significant new technical complications).

We now discuss some of the ingredients of the proof. Unsurprisingly, the first step is the circle method, expressing (1) in terms of exponential sums such as

Actually, it is convenient to first prune slightly by zeroing out this function on "atypical" numbers that have an unusually small or large number of factors in a certain sense, but let us ignore this technicality for this discussion. The contribution of the "major arc" can be treated by standard techniques (and is the source of the main term); the main difficulty comes from treating the contribution of the "minor arc" .
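A toy illustration (mine, not the paper's) of the major/minor arc dichotomy: the exponential sum of the divisor function is as large as possible at the major arc point α = 0, while a generic irrational α produces substantial cancellation.

```python
import cmath
import math

def divisors_sieve(N):
    """tau(n) = d_2(n) for n <= N, by a standard divisor sieve."""
    d = [0] * (N + 1)
    for a in range(1, N + 1):
        for m in range(a, N + 1, a):
            d[m] += 1
    return d

def exp_sum(d, alpha):
    """S(alpha) = sum_{n <= N} d(n) e(n alpha), where e(x) = exp(2 pi i x)."""
    return sum(d[n] * cmath.exp(2j * math.pi * n * alpha)
               for n in range(1, len(d)))

N = 1000
d = divisors_sieve(N)
major = abs(exp_sum(d, 0.0))               # alpha = 0: no cancellation at all
minor = abs(exp_sum(d, math.sqrt(2) - 1))  # a "generic" irrational alpha
print(major, minor)
```

Running this shows `minor` far smaller than `major`; controlling such minor-arc sums rigorously, without losing logarithmic factors, is the delicate part of the argument sketched below.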

In our previous paper on bounded multiplicative functions, we used Plancherel’s theorem to estimate the global norm , and then also used the Katai-Bourgain-Sarnak-Ziegler orthogonality criterion to control local norms , where was a minor arc interval of length about , and these two estimates together were sufficient to get a good bound on correlations by an application of Hölder’s inequality. For , it is more convenient to use Dirichlet series methods (and Ramaré-type factorisations of such Dirichlet series) to control local norms on minor arcs, in the spirit of the proof of the Matomaki-Radziwill theorem; a key point is to develop “log-free” mean value theorems for Dirichlet series associated to functions such as , so as not to wipe out the (rather small) savings one will get over the trivial bound from this method. On the other hand, the global bound will definitely be unusable, because the sum has too many unwanted factors of . Fortunately, we can substitute this global bound with a “large values” bound that controls expressions such as

for a moderate number of disjoint intervals , with a bound that is slightly better (for a medium-sized power of ) than what one would have obtained by bounding each integral separately. (One needs to save more than for the argument to work; we end up saving a factor of about .) This large values estimate is probably the most novel contribution of the paper. After taking the Fourier transform, matters basically reduce to getting a good estimate for

where is the midpoint of ; thus we need some upper bound on the large local Fourier coefficients of . These coefficients are difficult to calculate directly, but, in the spirit of a paper of Ben Green and myself, we can try to replace by a more tractable and “pseudorandom” majorant for which the local Fourier coefficients are computable (on average). After a standard duality argument, one ends up having to control expressions such as

after various averaging in the parameters. These local Fourier coefficients of turn out to be small on average unless is “major arc”. One then is left with a mostly combinatorial problem of trying to bound how often this major arc scenario occurs. This is very close to a computation in the previously mentioned paper of Ben and myself; there is a technical wrinkle in that the are not as well separated as they were in my paper with Ben, but it turns out that one can modify the arguments in that paper to still obtain a satisfactory estimate in this case (after first grouping nearby frequencies together, and modifying the duality argument accordingly).

I have just uploaded to the arXiv the paper “An inverse theorem for an inequality of Kneser“, submitted to a special issue of the Proceedings of the Steklov Institute of Mathematics in honour of Sergei Konyagin. It concerns an inequality of Kneser discussed previously in this blog, namely that
μ(A + B) ≥ min(μ(A) + μ(B), 1)   (1)
whenever A, B are compact non-empty subsets of a compact connected additive group G with probability Haar measure μ. (A later result of Kemperman extended this inequality to the nonabelian case.) This inequality is non-trivial in the regime
0 < μ(A) + μ(B) < 1.   (2)
The connectedness of G is essential, as otherwise one could form counterexamples involving proper subgroups of G of positive measure. In the blog post, I indicated how this inequality (together with a more "robust" strengthening of it) could be deduced from submodularity inequalities such as
μ(A + (B ∩ C)) + μ(A + (B ∪ C)) ≤ μ(A + B) + μ(A + C)   (3)
which in turn easily follows from the identity A + (B ∪ C) = (A + B) ∪ (A + C) and the inclusion A + (B ∩ C) ⊆ (A + B) ∩ (A + C), combined with the inclusion-exclusion formula.
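A discrete sanity check (not in the paper): in the cyclic group Z/p with p prime, the analogue of Kneser's inequality is the Cauchy-Davenport theorem |A + B| ≥ min(|A| + |B| − 1, p); Z/p has no proper non-trivial subgroups, mirroring the connectedness hypothesis on G. The submodularity inequality also holds exactly at the level of cardinalities.

```python
import random

def sumset(A, B, p):
    return {(a + b) % p for a in A for b in B}

p = 101
random.seed(0)
for _ in range(200):
    A = set(random.sample(range(p), random.randint(1, 40)))
    B = set(random.sample(range(p), random.randint(1, 40)))
    C = set(random.sample(range(p), random.randint(1, 40)))
    # Cauchy-Davenport: |A + B| >= min(|A| + |B| - 1, p)
    assert len(sumset(A, B, p)) >= min(len(A) + len(B) - 1, p)
    # Cardinality version of the submodularity inequality (3)
    lhs = len(sumset(A, B & C, p)) + len(sumset(A, B | C, p))
    assert lhs <= len(sumset(A, B, p)) + len(sumset(A, C, p))

# Equality case: "arcs" (intervals), matching the equality examples below.
assert len(sumset(set(range(10)), set(range(15)), p)) == 24
print("verified on random examples")
```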

In the non-trivial regime (2), equality can be attained in (1), for instance by taking G to be the unit circle and A, B to be arcs in that circle (obeying (2)). A bit more generally, if G is an arbitrary connected compact abelian group and ξ: G → ℝ/ℤ is a non-trivial character (i.e., a continuous homomorphism), then ξ must be surjective (as the unit circle has no non-trivial connected proper closed subgroups), and one can take A = ξ⁻¹(I) and B = ξ⁻¹(J) for some arcs I, J in that circle (again choosing the measures of these arcs to obey (2)). The main result of this paper is an inverse theorem that asserts that this is the only way in which equality can occur in (1) (assuming (2)); furthermore, if (1) is close to being satisfied with equality and (2) holds, then A, B must be close (in measure) to an example of the above form. Actually, for technical reasons (and for the applications we have in mind), it is important to establish an inverse theorem not just for (1), but for the more robust version mentioned earlier (in which the sumset A + B is replaced by the partial sumset consisting of "popular" sums).

Roughly speaking, the idea is as follows. Let us informally call (A, B) a *critical pair* if (2) holds and the inequality (1) (or more precisely, a robust version of this inequality) is almost obeyed with equality. The notion of a critical pair obeys some useful closure properties. Firstly, it is symmetric in A and B, and invariant with respect to translation of either A or B. Furthermore, from the submodularity inequality (3), one can show that if (A, B₁) and (A, B₂) are critical pairs (with μ(B₁ ∩ B₂) and μ(B₁ ∪ B₂) positive), then (A, B₁ ∩ B₂) and (A, B₁ ∪ B₂) are also critical pairs. (Note that this is consistent with the claim that critical pairs only occur when A, B come from arcs of a circle.) Similarly, from the associativity (A + B) + C = A + (B + C), one can show that if (A, B) and (A + B, C) are critical pairs, then so are (A, B + C) and (B, C).

One can combine these closure properties to obtain further ones. For instance, suppose is such that . Then (cheating a little bit), one can show that is also a critical pair, basically because is the union of the , , the are all critical pairs, and the all intersect each other. This argument doesn’t quite work as stated because one has to apply the closure property under union an uncountable number of times, but it turns out that if one works with the robust version of sumsets and uses a random sampling argument to approximate by the union of finitely many of the , then the argument can be made to work.

Using all of these closure properties, it turns out that one can start with an arbitrary critical pair and end up with a small set such that and are also critical pairs for all (say), where is the -fold sumset of . (Intuitively, if are thought of as secretly coming from the pullback of arcs by some character , then should be the pullback of a much shorter arc by the same character.) In particular, exhibits linear growth, in that for all . One can now use standard technology from inverse sumset theory to show first that has a very large Fourier coefficient (and thus is biased with respect to some character ), and secondly that is in fact almost of the form for some arc , from which it is not difficult to conclude similar statements for and and thus finish the proof of the inverse theorem.
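The linear growth of iterated sumsets of an "arc" can be seen in a discrete toy model, with Z/N standing in for the circle (my illustration, not from the paper):

```python
N = 1000            # working in Z/N as a discrete stand-in for the circle
A = set(range(17))  # an "arc" of 17 consecutive residues

def add_sets(X, Y):
    return {(x + y) % N for x in X for y in Y}

kA = A
sizes = [len(kA)]
for _ in range(5):
    kA = add_sets(kA, A)
    sizes.append(len(kA))
print(sizes)  # [17, 33, 49, 65, 81, 97]: |kA| = k|A| - (k - 1), linear in k
```

This linear growth (as opposed to the exponential growth typical of "generic" sets) is exactly the structural input that inverse sumset theory converts into a large Fourier coefficient, i.e. a correlation with a character.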

In order to make the above argument rigorous, one has to be more precise about what the modifier “almost” means in the definition of a critical pair. I chose to do this in the language of “cheap” nonstandard analysis (aka asymptotic analysis), as discussed in this previous blog post; one could also have used the full-strength version of nonstandard analysis, but this does not seem to convey any substantial advantages. (One can also work in a more traditional “non-asymptotic” framework, but this requires one to keep much more careful account of various small error terms and leads to a messier argument.)

*[Update, Nov 15: Corrected the attribution of the inequality (1) to Kneser instead of Kemperman. Thanks to John Griesmer for pointing out the error.]*

Joni Teräväinen and I have just uploaded to the arXiv our paper “Odd order cases of the logarithmically averaged Chowla conjecture“, submitted to J. Numb. Thy. Bordeaux. This paper gives an alternate route to one of the main results of our previous paper, and more specifically reproves the asymptotic

for all odd and all integers (that is to say, all the odd order cases of the logarithmically averaged Chowla conjecture). Our previous argument relies heavily on some deep ergodic theory results of Bergelson-Host-Kra, Leibman, and Le (and was applicable to more general multiplicative functions than the Liouville function ); here we give a shorter proof that avoids ergodic theory (but instead requires the Gowers uniformity of the (W-tricked) von Mangoldt function, established in several papers of Ben Green, Tamar Ziegler, and myself). The proof follows the lines sketched in the previous blog post. In principle, due to the avoidance of ergodic theory, the arguments here have a greater chance to be made quantitative; however, at present the known bounds on the Gowers uniformity of the von Mangoldt function are qualitative, except at the level, which is unfortunate since the first non-trivial odd case requires quantitative control on the level. (But it may be possible to make the Gowers uniformity bounds for quantitative if one assumes GRH, although when one puts everything together, the actual decay rate obtained in (1) is likely to be poor.)
