
I’ve just uploaded to the arXiv my paper “On the universality of potential well dynamics”, submitted to Dynamics of PDE. This is a spinoff from my previous paper on blowup of nonlinear wave equations, inspired by some conversations with Sungjin Oh. Here we focus mainly on the zero-dimensional case of such equations, namely the potential well equation

$\displaystyle \partial_{tt} u = - (\nabla F)(u) \ \ \ \ \ (1)$

for a particle ${u: {\bf R} \rightarrow {\bf R}^m}$ trapped in a potential well with potential ${F: {\bf R}^m \rightarrow {\bf R}}$, with ${F(z) \rightarrow +\infty}$ as ${z \rightarrow \infty}$. This ODE always admits global solutions from arbitrary initial positions ${u(0)}$ and initial velocities ${\partial_t u(0)}$, thanks to conservation of the Hamiltonian ${\frac{1}{2} |\partial_t u|^2 + F(u)}$. As this Hamiltonian is coercive (in that its level sets are compact), solutions to this equation are always almost periodic. On the other hand, as can already be seen using the harmonic oscillator ${\partial_{tt} u = - k^2 u}$ (and direct sums of this system), this equation can generate periodic solutions, as well as quasiperiodic solutions.
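
As a rough numerical illustration of the conservation law (this is not from the paper), here is a minimal sketch that integrates (1) with a leapfrog scheme and tracks the Hamiltonian. The quartic potential ${F(u) = |u|^4/4}$, the initial data, the step size, and all function names are assumptions chosen purely for illustration.

```python
def F(u):
    # Hypothetical coercive potential F(u) = |u|^4 / 4 (an assumption for this sketch).
    r2 = sum(x * x for x in u)
    return 0.25 * r2 * r2


def grad_F(u):
    # Gradient of the potential above: (grad F)(u) = |u|^2 u.
    r2 = sum(x * x for x in u)
    return [r2 * x for x in u]


def hamiltonian(u, v):
    # H = |v|^2 / 2 + F(u), where v stands for the velocity d/dt u.
    return 0.5 * sum(x * x for x in v) + F(u)


def leapfrog(u, v, dt, steps):
    """Integrate u'' = -(grad F)(u) and record the Hamiltonian after every step."""
    u, v = list(u), list(v)
    energies = [hamiltonian(u, v)]
    g = grad_F(u)
    for _ in range(steps):
        v = [vi - 0.5 * dt * gi for vi, gi in zip(v, g)]  # half kick
        u = [ui + dt * vi for ui, vi in zip(u, v)]        # drift
        g = grad_F(u)
        v = [vi - 0.5 * dt * gi for vi, gi in zip(v, g)]  # half kick
        energies.append(hamiltonian(u, v))
    return u, v, energies


u0, v0 = [1.0, 0.0], [0.0, 0.5]
_, _, energies = leapfrog(u0, v0, dt=1e-3, steps=20000)
drift = max(abs(e - energies[0]) for e in energies)
print("initial energy:", energies[0], "max drift:", drift)
```

The discrete scheme only conserves the Hamiltonian approximately, so the small drift it reports is a discretisation effect rather than a feature of the ODE itself.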

All quasiperiodic motions are almost periodic. However, there are many examples of dynamical systems that admit solutions that are almost periodic but not quasiperiodic. So one can pose the question: are the dynamics of potential wells universal in the sense that they can capture all almost periodic solutions?

A precise question can be phrased as follows. Let ${M}$ be a compact manifold, and let ${X}$ be a smooth vector field on ${M}$; to avoid degeneracies, let us take ${X}$ to be non-singular in the sense that it is everywhere non-vanishing. Then the trajectories of the first-order ODE

$\displaystyle \partial_t u = X(u) \ \ \ \ \ (2)$

for ${u: {\bf R} \rightarrow M}$ are always global and almost periodic. Can we then find a (coercive) potential ${F: {\bf R}^m \rightarrow {\bf R}}$ for some ${m}$, as well as a smooth embedding ${\phi: M \rightarrow {\bf R}^m}$, such that every solution ${u}$ to (2) pushes forward under ${\phi}$ to a solution to (1)? (Actually, for technical reasons it is preferable to map into the phase space ${{\bf R}^m \times {\bf R}^m}$, rather than position space ${{\bf R}^m}$, but let us ignore this detail for this discussion.)

It turns out that the answer is no; there is a very specific obstruction. Given a pair ${(M,X)}$ as above, define a strongly adapted ${1}$-form to be a ${1}$-form ${\phi}$ on ${M}$ such that ${\phi(X)}$ is pointwise positive, and the Lie derivative ${{\mathcal L}_X \phi}$ is an exact ${1}$-form. We then have

Theorem 1 A smooth compact non-singular dynamics ${(M,X)}$ can be embedded smoothly in a potential well system if and only if it admits a strongly adapted ${1}$-form.

For the “only if” direction, the key point is that potential wells (viewed as a Hamiltonian flow on the phase space ${{\bf R}^m \times {\bf R}^m}$) admit a strongly adapted ${1}$-form, namely the canonical ${1}$-form ${p dq}$, whose Lie derivative is the derivative ${dL}$ of the Lagrangian ${L := \frac{1}{2} |\partial_t u|^2 - F(u)}$ and is thus exact. The converse “if” direction is mainly a consequence of the Nash embedding theorem, and follows the arguments used in my previous paper.
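
The exactness claim can be checked by a short computation with Cartan's formula; the following is a sketch in notation that is my own for this illustration. Write ${q = u}$ and ${p = \partial_t u}$, let ${\theta := \sum_i p_i dq_i}$ be the canonical ${1}$-form, let ${H := \frac{1}{2} |p|^2 + F(q)}$, and let ${X := \sum_i p_i \partial_{q_i} - \sum_i (\partial_i F)(q) \partial_{p_i}}$ be the vector field generating the flow. Then ${\iota_X \theta = |p|^2}$ and ${\iota_X d\theta = \iota_X \sum_i dp_i \wedge dq_i = - \sum_i (\partial_i F)(q) dq_i - \sum_i p_i dp_i = -dH}$, so that

$\displaystyle {\mathcal L}_X \theta = d( \iota_X \theta ) + \iota_X d\theta = d( |p|^2 ) - d( \frac{1}{2} |p|^2 + F(q) ) = d( \frac{1}{2} |p|^2 - F(q) ) = dL.$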

Interestingly, the same obstruction also works for potential wells in a more general Riemannian manifold than ${{\bf R}^m}$, or for nonlinear wave equations with a potential; combining the two, the obstruction is also present for wave maps with a potential.

It is then natural to ask whether this obstruction is non-trivial, in the sense that there are at least some examples of dynamics ${(M,X)}$ that do not support strongly adapted ${1}$-forms (and hence cannot be modeled smoothly by the dynamics of a potential well, nonlinear wave equation, or wave maps). I posed this question on MathOverflow, and Robert Bryant provided a very nice construction, showing that the vector field ${(\sin(2\pi x), \cos(2\pi x))}$ on the ${2}$-torus ${({\bf R}/{\bf Z})^2}$ had no strongly adapted ${1}$-forms, and hence the dynamics of this vector field cannot be smoothly reproduced by a potential well, nonlinear wave equation, or wave map.

On the other hand, the suspension of any diffeomorphism does support a strongly adapted ${1}$-form (the derivative ${dt}$ of the time coordinate), and using this and the previous theorem I was able to embed a universal Turing machine into a potential well. In particular, there are flows for an explicitly describable potential well whose trajectories have behavior that is undecidable using the usual ZFC axioms of set theory! So potential well dynamics are “effectively” universal, despite the presence of the aforementioned obstruction.

In my previous work on blowup for Navier-Stokes like equations, I speculated that if one could somehow replicate a universal Turing machine within the Euler equations, one could use this machine to create a “von Neumann machine” that replicated smaller versions of itself, which on iteration would lead to a finite time blowup. Now that such a mechanism is present in nonlinear wave equations, it is tempting to try to make this scheme work in that setting. Of course, in my previous paper I had already demonstrated finite time blowup, at least in a three-dimensional setting, but that was a relatively simple discretely self-similar blowup in which no computation occurred. This more complicated blowup scheme would be significantly more effort to set up, but would be proof-of-concept that the same scheme would in principle be possible for the Navier-Stokes equations, assuming somehow that one can embed a universal Turing machine into the Euler equations. (But I’m still hopelessly stuck on how to accomplish this latter task…)

Kaisa Matomaki, Maksym Radziwill, and I have uploaded to the arXiv our paper “Correlations of the von Mangoldt and higher divisor functions I. Long shift ranges”, submitted to Proceedings of the London Mathematical Society. This paper is concerned with the estimation of correlations such as

$\displaystyle \sum_{n \leq X} \Lambda(n) \Lambda(n+h) \ \ \ \ \ (1)$

for medium-sized ${h}$ and large ${X}$, where ${\Lambda}$ is the von Mangoldt function; we also consider variants of this sum in which one of the von Mangoldt functions is replaced with a (higher order) divisor function, but for the sake of discussion let us focus just on the sum (1). Understanding this sum is very closely related to the problem of finding pairs of primes that differ by ${h}$; for instance, if one could establish a lower bound

$\displaystyle \sum_{n \leq X} \Lambda(n) \Lambda(n+2) \gg X$

then this would easily imply the twin prime conjecture.
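
To sketch why, note that the only terms in the sum not coming from twin primes are those in which ${n}$ or ${n+2}$ is a proper prime power. There are ${O(\sqrt{X})}$ proper prime powers up to ${X+2}$, and each term is at most ${\log^2(X+2)}$, so

$\displaystyle \sum_{n \leq X} \Lambda(n) \Lambda(n+2) = \sum_{p \leq X: p, p+2 \hbox{ prime}} \log p \log(p+2) + O( \sqrt{X} \log^2 X ).$

If there were only finitely many twin primes, the first sum on the right-hand side would be bounded, and the left-hand side would be ${O(\sqrt{X} \log^2 X)}$, which is incompatible with a lower bound of ${\gg X}$ for large ${X}$.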

The (first) Hardy-Littlewood conjecture asserts an asymptotic

$\displaystyle \sum_{n \leq X} \Lambda(n) \Lambda(n+h) = {\mathfrak S}(h) X + o(X) \ \ \ \ \ (2)$

as ${X \rightarrow \infty}$ for any fixed positive ${h}$, where the singular series ${{\mathfrak S}(h)}$ is an arithmetic factor arising from the irregularity of distribution of ${\Lambda}$ at small moduli, defined explicitly by

$\displaystyle {\mathfrak S}(h) := 2 \Pi_2 \prod_{p|h; p>2} \frac{p-1}{p-2}$

when ${h}$ is even, and ${{\mathfrak S}(h)=0}$ when ${h}$ is odd, where

$\displaystyle \Pi_2 := \prod_{p>2} (1-\frac{1}{(p-1)^2}) = 0.66016\dots$

is (half of) the twin prime constant. See for instance this previous blog post for a heuristic explanation of this conjecture. From the previous discussion we see that (2) for ${h=2}$ would imply the twin prime conjecture. Sieve theoretic methods are only able to provide an upper bound of the form ${ \sum_{n \leq X} \Lambda(n) \Lambda(n+h) \ll {\mathfrak S}(h) X}$.
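
For readers who want to see the numerology, here is a small sketch (not from the paper) that approximates ${\Pi_2}$ by a truncated Euler product, evaluates the singular series using the local factor ${\frac{p-1}{p-2}}$ at each odd prime ${p}$ dividing ${h}$, and compares the correlation sum with ${{\mathfrak S}(h) X}$ at the modest height ${X = 10^5}$. The cutoff, the function names, and the choice of ${X}$ are all assumptions made for illustration; agreement at such a small height is only suggestive.

```python
import math


def primes_up_to(n):
    # Sieve of Eratosthenes.
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(n) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [i for i in range(n + 1) if sieve[i]]


def twin_prime_pi2(primes):
    # Truncated Euler product for Pi_2 = prod_{p > 2} (1 - 1/(p-1)^2).
    prod = 1.0
    for p in primes:
        if p > 2:
            prod *= 1.0 - 1.0 / (p - 1) ** 2
    return prod


def singular_series(h, pi2, primes):
    # S(h) = 2 Pi_2 prod_{p | h, p > 2} (p-1)/(p-2) for even h, and 0 for odd h.
    # Assumes the list `primes` contains every prime factor of h.
    if h % 2 == 1:
        return 0.0
    s = 2.0 * pi2
    for p in primes:
        if p > 2 and h % p == 0:
            s *= (p - 1) / (p - 2)
    return s


def von_mangoldt_table(n, primes):
    # lam[q] = log p if q is a power of the prime p, and 0 otherwise.
    lam = [0.0] * (n + 1)
    for p in primes:
        logp = math.log(p)
        q = p
        while q <= n:
            lam[q] = logp
            q *= p
    return lam


def correlation(X, h, lam):
    # The sum of Lambda(n) Lambda(n+h) over n <= X.
    return sum(lam[n] * lam[n + h] for n in range(1, X + 1))


X, h = 10**5, 2
PRIMES = primes_up_to(X + h)
PI2 = twin_prime_pi2(PRIMES)
LAM = von_mangoldt_table(X + h, PRIMES)
ratio = correlation(X, h, LAM) / X
print("Pi_2 ~", PI2)
print("S(2) ~", singular_series(2, PI2, PRIMES), "empirical ratio ~", ratio)
```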

Needless to say, apart from the trivial case of odd ${h}$, there are no values of ${h}$ for which the Hardy-Littlewood conjecture is known. However, there are some results that say that this conjecture holds “on the average”: in particular, if ${H}$ is a quantity depending on ${X}$ that is somewhat large, there are results that show that (2) holds for most (i.e. for a proportion ${1-o(1)}$) of the ${h}$ between ${0}$ and ${H}$. Ideally one would like to get ${H}$ as small as possible; in particular, one can view the full Hardy-Littlewood conjecture as the endpoint case when ${H}$ is bounded.

The first results in this direction were by van der Corput and by Lavrik, who established such a result with ${H = X}$ (with a subsequent refinement by Balog); Wolke lowered ${H}$ to ${X^{5/8+\varepsilon}}$, and Mikawa lowered ${H}$ further to ${X^{1/3+\varepsilon}}$. The main result of this paper is a further lowering of ${H}$ to ${X^{8/33+\varepsilon}}$. In fact (as in the preceding works) we get a better error term than ${o(X)}$, namely an error of the shape ${O_A( X \log^{-A} X)}$ for any ${A}$.

Our arguments initially proceed along standard lines. One can use the Hardy-Littlewood circle method to express the correlation in (2) as an integral involving exponential sums ${S(\alpha) := \sum_{n \leq X} \Lambda(n) e(\alpha n)}$. The contribution of “major arc” ${\alpha}$ is known by a standard computation to recover the main term ${{\mathfrak S}(h) X}$ plus acceptable errors, so it is a matter of controlling the “minor arcs”. After averaging in ${h}$ and using the Plancherel identity, one is basically faced with establishing a bound of the form

$\displaystyle \int_{\beta-1/H}^{\beta+1/H} |S(\alpha)|^2\ d\alpha \ll_A X \log^{-A} X$

for any “minor arc” ${\beta}$. If ${\beta}$ is somewhat close to a low height rational ${a/q}$ (specifically, if it is within ${X^{-1/6-\varepsilon}}$ of such a rational with ${q = O(\log^{O(1)} X)}$), then this type of estimate is roughly of comparable strength (by another application of Plancherel) to the best available prime number theorem in short intervals on the average, namely that the prime number theorem holds for most intervals of the form ${[x, x + x^{1/6+\varepsilon}]}$, and we can handle this case using standard mean value theorems for Dirichlet series. So we can restrict attention to the “strongly minor arc” case where ${\beta}$ is far from such rationals.

The next step (following some ideas we found in a paper of Zhan) is to rewrite this estimate not in terms of the exponential sums ${S(\alpha) := \sum_{n \leq X} \Lambda(n) e(\alpha n)}$, but rather in terms of the Dirichlet polynomial ${F(s) := \sum_{n \sim X} \frac{\Lambda(n)}{n^s}}$. After a certain amount of computation (including some oscillatory integral estimates arising from stationary phase), one is eventually reduced to the task of establishing an estimate of the form

$\displaystyle \int_{t \sim \lambda X} (\int_{t-\lambda H}^{t+\lambda H} |F(\frac{1}{2}+it')|\ dt')^2\ dt \ll_A \lambda^2 H^2 X \log^{-A} X$

for any ${X^{-1/6-\varepsilon} \ll \lambda \ll \log^{-B} X}$ (with ${B}$ sufficiently large depending on ${A}$).

The next step, which is again standard, is the use of the Heath-Brown identity (as discussed for instance in this previous blog post) to split up ${\Lambda}$ into a number of components that have a Dirichlet convolution structure. Because the exponent ${8/33}$ we are shooting for is less than ${1/4}$, we end up with five types of components that arise, which we call “Type ${d_1}$”, “Type ${d_2}$”, “Type ${d_3}$”, “Type ${d_4}$”, and “Type II”. The “Type II” sums are Dirichlet convolutions involving a factor supported on a range ${[X^\varepsilon, X^{-\varepsilon} H]}$ and are quite easy to deal with; the “Type ${d_j}$” terms are Dirichlet convolutions that resemble (non-degenerate portions of) the ${j^{th}}$ divisor function, formed from convolving together ${j}$ portions of ${1}$. The “Type ${d_1}$” and “Type ${d_2}$” terms can be estimated satisfactorily by standard moment estimates for Dirichlet polynomials; this already recovers the result of Mikawa (and our argument is in fact slightly more elementary in that no Kloosterman sum estimates are required). It is the treatment of the “Type ${d_3}$” and “Type ${d_4}$” sums that requires some new analysis, with the Type ${d_3}$ terms turning out to be the most delicate. After using an existing moment estimate of Jutila for Dirichlet L-functions, matters reduce to obtaining a family of estimates, a typical one of which (relating to the more difficult Type ${d_3}$ sums) is of the form

$\displaystyle \int_{t - H}^{t+H} |M( \frac{1}{2} + it')|^2\ dt' \ll X^{\varepsilon^2} H \ \ \ \ \ (3)$

for “typical” ordinates ${t}$ of size ${X}$, where ${M}$ is the Dirichlet polynomial ${M(s) := \sum_{n \sim X^{1/3}} \frac{1}{n^s}}$ (a fragment of the Riemann zeta function). The precise definition of “typical” is a little technical (because of the complicated nature of Jutila’s estimate) and will not be detailed here. Such a claim would follow easily from the Lindelöf hypothesis (which would imply that ${M(1/2 + it) \ll X^{o(1)}}$) but of course we would like to have an unconditional result.

At this point, having exhausted all the Dirichlet polynomial estimates that are usefully available, we return to “physical space”. Using some further Fourier-analytic and oscillatory integral computations, we can estimate the left-hand side of (3) by an expression that is roughly of the shape

$\displaystyle \frac{H}{X^{1/3}} \sum_{\ell \sim X^{1/3}/H} |\sum_{m \sim X^{1/3}} e( \frac{t}{2\pi} \log \frac{m+\ell}{m-\ell} )|.$

The phase ${\frac{t}{2\pi} \log \frac{m+\ell}{m-\ell}}$ can be Taylor expanded as the sum of ${\frac{t \ell}{\pi m}}$ and a lower order term ${\frac{t \ell^3}{3\pi m^3}}$, plus negligible errors. If we could discard the lower order term then we would get quite a good bound using the exponential sum estimates of Robert and Sargos, which control averages of exponential sums with purely monomial phases, with the averaging allowing us to exploit the hypothesis that ${t}$ is “typical”. Figuring out how to get rid of this lower order term caused some inefficiency in our arguments; the best we could do (after much experimentation) was to use Fourier analysis to shorten the sums, estimate a one-parameter average exponential sum with a binomial phase by a two-parameter average with a monomial phase, and then use the van der Corput ${B}$ process followed by the estimates of Robert and Sargos. This rather complicated procedure works up to ${H = X^{8/33+\varepsilon}}$; it may be possible that some alternate way to proceed here could improve the exponent somewhat.
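
To spell out the Taylor expansion (a heuristic sketch, using only the orders of magnitude stated above), one has for ${|\ell| < m}$

$\displaystyle \frac{t}{2\pi} \log \frac{m+\ell}{m-\ell} = \frac{t}{\pi} ( \frac{\ell}{m} + \frac{\ell^3}{3m^3} + \frac{\ell^5}{5m^5} + \dots ) = \frac{t \ell}{\pi m} + \frac{t \ell^3}{3\pi m^3} + \frac{t \ell^5}{5\pi m^5} + \dots.$

With ${t \sim X}$, ${m \sim X^{1/3}}$ and ${\ell \sim X^{1/3}/H}$, the ratio ${\ell/m}$ is of size about ${1/H}$, so these three terms have size about ${X/H}$, ${X/H^3}$ and ${X/H^5}$ respectively. At ${H = X^{8/33}}$ these are ${X^{25/33}}$, ${X^{9/33}}$ and ${X^{-7/33}}$: the cubic term is still large and cannot simply be dropped, whereas the quintic and higher terms are small.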

In a sequel to this paper, we will use a somewhat different method to reduce ${H}$ to a much smaller value of ${\log^{O(1)} X}$, but only if we replace the correlations ${\sum_{n \leq X} \Lambda(n) \Lambda(n+h)}$ by either ${\sum_{n \leq X} \Lambda(n) d_k(n+h)}$ or ${\sum_{n \leq X} d_k(n) d_l(n+h)}$, and also we now only save a ${o(1)}$ in the error term rather than ${O_A(\log^{-A} X)}$.

In July I will be spending a week at Park City, being one of the mini-course lecturers in the Graduate Summer School component of the Park City Summer Session on random matrices. I have chosen to give some lectures on least singular values of random matrices, the circular law, and the Lindeberg exchange method in random matrix theory; this is a slightly different set of topics than I had initially advertised (which was instead about the Lindeberg exchange method and the local relaxation flow method), but after consulting with the other mini-course lecturers I felt that this would be a more complementary set of topics. I have uploaded a draft of my lecture notes (some portion of which is derived from my monograph on the subject); as always, comments and corrections are welcome.

[Update, June 23: notes revised and reformatted to PCMI format. -T.]

Ben Green and I have (finally!) uploaded to the arXiv our paper “New bounds for Szemerédi’s theorem, III: A polylogarithmic bound for ${r_4(N)}$”, submitted to Mathematika. This is the sequel to two previous papers (and an erratum to the former paper), concerning quantitative versions of Szemerédi’s theorem in the case of length four progressions. This sequel has been delayed for over a decade for a number of reasons, but we have finally managed to write the arguments up to our satisfaction and submit it (to a special issue of Mathematika honouring the work of Klaus Roth).

For any natural number ${N}$, define ${r_4(N)}$ to be the largest cardinality of a subset ${A}$ of ${[N] = \{1,\dots,N\}}$ which does not contain any non-trivial arithmetic progressions ${a, a+r, a+2r, a+3r}$ of length four (where “non-trivial” means that ${r}$ is non-zero). Trivially we have ${r_4(N) \leq N}$. In 1969, Szemerédi showed that ${r_4(N) = o(N)}$. However, the decay rate that could be theoretically extracted from this argument (and from several subsequent proofs of this bound, including one by Roth) was quite poor. The first significant quantitative bound on this quantity was by Gowers, who showed that ${r_4(N) \ll N (\log \log N)^{-c}}$ for some absolute constant ${c>0}$. In the second paper in the above-mentioned series, we managed to improve this bound to ${r_4(N) \ll N \exp( - c \sqrt{\log \log N})}$. In this paper, we improve the bound further to ${r_4(N) \ll N (\log N)^{-c}}$, which seems to be the limit of the methods. (We remark that if we could take ${c}$ to be larger than one, this would imply the length four case of a well known conjecture of Erdős that any set of natural numbers whose sum of reciprocals diverges would contain arbitrarily long arithmetic progressions. Thanks to the work of Sanders and of Bloom, the corresponding case of the conjecture for length three progressions is nearly settled, as it is known that for the analogous bound on ${r_3(N)}$ one can take any ${c}$ less than one.)
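
To make the definition of ${r_4(N)}$ concrete, here is a brute-force sketch (not from the paper) that computes it for very small ${N}$ by exhaustive search. The function names are my own, and the search is exponential in ${N}$, so this only illustrates the definition and says nothing about the asymptotic bounds discussed above.

```python
from itertools import combinations


def has_4ap(subset):
    # True if the set contains a, a+r, a+2r, a+3r for some r > 0.
    s = set(subset)
    top = max(s)
    for a in s:
        for r in range(1, (top - a) // 3 + 1):
            if a + r in s and a + 2 * r in s and a + 3 * r in s:
                return True
    return False


def r4(N):
    # Largest size of a subset of {1, ..., N} with no non-trivial 4-term progression.
    for k in range(N, 0, -1):
        for subset in combinations(range(1, N + 1), k):
            if not has_4ap(subset):
                return k
    return 0


print([r4(N) for N in range(1, 8)])
```

For instance, when ${N=7}$ the progressions to avoid are ${1,2,3,4}$; ${2,3,4,5}$; ${3,4,5,6}$; ${4,5,6,7}$ and ${1,3,5,7}$, and no single element meets all five, which is why two elements have to be removed.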

Most of the previous work on bounding ${r_4(N)}$ relied in some form or another on the density increment argument introduced by Roth back in 1953; roughly speaking, the idea is to show that if a dense subset ${A}$ of ${[N]}$ fails to contain arithmetic progressions of length four, then one can locate a long subprogression of ${[N]}$ in which ${A}$ has increased density. This was the basic method for instance underlying our previous bound ${r_4(N) \ll N \exp( - c \sqrt{\log \log N})}$, as well as a finite field analogue of the bound ${r_4(N) \ll N (\log N)^{-c}}$; however we encountered significant technical difficulties for several years in extending this argument to obtain the result of the current paper. Our method is instead based on “energy increment arguments”, and more specifically on establishing a quantitative version of a Khintchine-type recurrence theorem, similar to the qualitative recurrence theorems established (in the ergodic theory context) by Bergelson-Host-Kra, and (in the current combinatorial context) by Ben Green and myself.

One way to phrase the latter recurrence theorem is as follows. Suppose that ${A \subset [N]}$ has density ${\delta}$. Then one would expect a “randomly” selected arithmetic progression ${{\bf a}, {\bf a}+{\bf r}, {\bf a}+2{\bf r}, {\bf a}+3{\bf r}}$ in ${[N]}$ (using the convention that random variables will be in boldface) to be contained in ${A}$ with probability about ${\delta^4}$. This is not true in general, however it was shown by Ben and myself that for any ${\eta>0}$, there was a set of shifts ${r \in [-N,N]}$ of cardinality ${\gg_{\delta,\eta} N}$, such that for any such ${r}$ one had

$\displaystyle {\bf P}( {\bf a}, {\bf a}+r, {\bf a}+2r, {\bf a}+3r \in A ) \geq \delta^4 - \eta$

if ${{\bf a}}$ was chosen uniformly at random from ${[N]}$. This easily implies that ${r_4(N) = o(N)}$, but does not give a particularly good bound on the decay rate, because the implied constant in the cardinality lower bound ${\gg_{\delta,\eta} N}$ is quite poor (in fact of tower-exponential type, due to the use of regularity lemmas!), and so one has to take ${N}$ to be extremely large compared to ${\delta,\eta}$ to avoid the possibility that the set of shifts in the above theorem consists only of the trivial shift ${r=0}$.

We do not know how to improve the lower bound on the set of shifts to the point where it can give bounds that are competitive with those in this paper. However, we can obtain better quantitative results if we permit ourselves to couple together the two parameters ${{\bf a}}$ and ${{\bf r}}$ of the length four progression. Namely, with ${A}$, ${\delta}$, ${\eta}$ as above, we are able to show that there exist random variables ${{\bf a}, {\bf r}}$, not necessarily independent, such that

$\displaystyle {\bf P}( {\bf a}, {\bf a}+{\bf r}, {\bf a}+2{\bf r}, {\bf a}+3{\bf r} \in A ) \geq \delta^4 - \eta \ \ \ \ \ (1)$

and such that we have the non-degeneracy bound

$\displaystyle {\bf P}( {\bf r} = 0 ) \ll \exp( - \eta^{-O(1)} ) / N.$

This then easily implies the main theorem.

The energy increment method is then deployed to locate a good pair ${({\bf a}, {\bf r})}$ of random variables that will obey the above bounds. One can get some intuition on how to proceed here by considering some model cases. Firstly one can consider a “globally quadratically structured” case in which the indicator function ${1_A}$ “behaves like” a globally quadratic function such as ${F( \alpha n^2 )}$, for some irrational ${\alpha}$ and some smooth periodic function ${F: {\bf R}/{\bf Z} \rightarrow {\bf R}}$ of mean ${\delta}$. If one then takes ${{\bf a}, {\bf r}}$ to be uniformly distributed in ${[N]}$ and ${[-\varepsilon N, \varepsilon N]}$ respectively for some small ${\varepsilon>0}$, with no coupling between the two variables, then the left-hand side of (1) is approximately of the form

$\displaystyle \int_{(x,y,z,w) \in ({\bf R}/{\bf Z})^4: x-3y+3z-w = 0} F(x) F(y) F(z) F(w) \ \ \ \ \ (2)$

where the integral is with respect to the probability Haar measure, and the constraint ${x-3y+3z-w=0}$ ultimately arises from the algebraic constraint

$\displaystyle \alpha {\bf a}^2 - 3 \alpha ({\bf a}+{\bf r})^2 + 3 \alpha ({\bf a}+2{\bf r})^2 - \alpha ({\bf a}+3{\bf r})^2 = 0.$

However, an application of the Cauchy-Schwarz inequality and Fubini’s theorem shows that the integral in (2) is at least ${(\int_{{\bf R}/{\bf Z}} F)^4}$, which (morally at least) gives (1) in this case.
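
One way to carry this out, as a sketch: parameterise the subgroup ${\{x-3y+3z-w=0\}}$ by ${(u,y,z) \in ({\bf R}/{\bf Z})^3}$ via ${x = u+3y}$ and ${w = u+3z}$, and set ${G(u) := \int_{{\bf R}/{\bf Z}} F(u+3y) F(y)\ dy}$, which is real-valued since ${F}$ is. Then by Fubini's theorem the integral in (2) is ${\int_{{\bf R}/{\bf Z}} G(u)^2\ du}$, and Cauchy-Schwarz followed by another application of Fubini's theorem gives

$\displaystyle \int_{{\bf R}/{\bf Z}} G(u)^2\ du \geq (\int_{{\bf R}/{\bf Z}} G(u)\ du)^2 = (\int_{{\bf R}/{\bf Z}} \int_{{\bf R}/{\bf Z}} F(u+3y) F(y)\ du\ dy)^2 = (\int_{{\bf R}/{\bf Z}} F)^4.$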

Due to the nature of the energy increment argument, it also becomes necessary to consider “locally quadratically structured” cases, in which ${[N]}$ is partitioned into some number of structured pieces ${B_c}$ (think of these as arithmetic progressions, or as “Bohr sets”), and on each piece ${B_c}$, ${1_A}$ behaves like a locally quadratic function such as ${F_c( \alpha_c n^2 )}$, where ${\alpha_c}$ now varies with ${c}$, and the mean of ${F_c}$ will be approximately ${\delta}$ on the average after averaging in ${c}$ (weighted by the size of the pieces ${B_c}$). Now one should select ${{\bf a}}$ and ${{\bf r}}$ in the following coupled manner: first one chooses ${{\bf a}}$ uniformly from ${[N]}$, then one defines ${{\bf c}}$ to be the label ${c}$ such that ${{\bf a} \in B_c}$, and then selects ${{\bf r}}$ uniformly from a set ${B_{c,\varepsilon}}$ which is related to ${B_c}$ in much the same way that ${[-\varepsilon N, \varepsilon N]}$ is related to ${[N]}$. If one does this correctly, the analogue of (2) becomes

$\displaystyle {\bf E} \int_{(x,y,z,w) \in ({\bf R}/{\bf Z})^4: x-3y+3z-w = 0} F_{\mathbf c}(x) F_{\mathbf c}(y) F_{\mathbf c}(z) F_{\mathbf c}(w),$

and one can again use Cauchy-Schwarz and Fubini’s theorem to conclude.

The general case proceeds, very roughly, by an iterative argument. At each stage of the iteration, one has some sort of quadratic model of ${1_A}$ which involves a decomposition of ${[N]}$ into structured pieces ${B_c}$, and a quadratic approximation to ${1_A}$ on each piece. If this approximation is accurate enough (or more precisely, if a certain (averaged) local Gowers uniformity norm ${U^3}$ of the error is small enough) to model the count in (1) (for random variables ${{\bf a}, {\bf r}}$ determined by the above partition of ${[N]}$ into pieces ${B_c}$), and if the frequencies (such as ${\alpha_c}$) involved in the quadratic approximation are “high rank” or “linearly independent over the rationals” in a suitably quantitative sense, then some version of the above arguments can be made to work. If there are some unwanted linear dependencies in the frequencies, we can do some linear algebra to eliminate one of the frequencies (using some geometry of numbers to keep the quantitative bounds under control) and continue the iteration. If instead the approximation is too inaccurate, then the error will be large in a certain averaged local Gowers uniformity norm ${U^3}$. A significant fraction of the paper is then devoted to establishing a quantitative inverse theorem for that norm that concludes (with good bounds) that the error must then locally correlate with locally quadratic phases, which can be used to refine the quadratic approximation to ${1_A}$ in a manner that significantly increases its “energy” (basically an ${L^2}$ norm). Such energy increments cannot continue indefinitely, and when they terminate we obtain the desired claim.

There are existing inverse theorems for ${U^3}$ type norms in the literature, going back to the pioneering work of Gowers mentioned previously, and relying on arithmetic combinatorics tools such as Freiman’s theorem and the Balog-Szemerédi-Gowers lemma, which are good for analysing the “${1\%}$-structured homomorphisms” that arise in Gowers’ argument. However, when we applied these methods to the local Gowers norms we obtained inferior quantitative results that were not strong enough for our application. Instead, we use arguments from a different paper of Gowers in which he tackled Szemerédi’s theorem for arbitrary length progressions. This method produces “${99\%}$-structured homomorphisms” associated to any function with large Gowers uniformity norm; however the catch is that such homomorphisms are initially supported only on a sparse unstructured set, rather than a structured set such as a Bohr set. To proceed further, one first has to locate inside the sparse unstructured set a sparse pseudorandom subset of a Bohr set, and then use “error-correction” type methods (such as “majority-vote” based algorithms) to locally upgrade this ${99\%}$-structured homomorphism on pseudorandom subsets of Bohr sets to a ${100\%}$-structured homomorphism on the entirety of a Bohr set. It is then possible to use some “approximate cohomology” tools to “integrate” these homomorphisms (and discern a key “local symmetry” property of these homomorphisms) to locate the desired local quadratic structure (in much the same fashion that a ${1}$-form on ${{\bf R}^n}$ that varies linearly with the coordinates can be integrated to be the derivative of a quadratic function if we know that the ${1}$-form is closed). These portions of the paper are unfortunately rather technical, but broadly follow the methods already used in previous literature.

Daniel Kane and I have just uploaded to the arXiv our paper “A bound on partitioning clusters”, submitted to the Electronic Journal of Combinatorics. In this short and elementary paper, we consider a question that arose from biomathematical applications: given a finite family ${X}$ of sets (or “clusters”), how many ways can there be of partitioning a set ${A \in X}$ in this family as the disjoint union ${A = A_1 \uplus A_2}$ of two other sets ${A_1, A_2}$ in this family? That is to say, what is the best upper bound one can place on the quantity

$\displaystyle | \{ (A,A_1,A_2) \in X^3: A = A_1 \uplus A_2 \}|$

in terms of the cardinality ${|X|}$ of ${X}$? A trivial upper bound would be ${|X|^2}$, since this is the number of possible pairs ${(A_1,A_2)}$, and ${A_1,A_2}$ clearly determine ${A}$. In our paper, we establish the improved bound

$\displaystyle | \{ (A,A_1,A_2) \in X^3: A = A_1 \uplus A_2 \}| \leq |X|^{3/p}$

where ${p}$ is the somewhat strange exponent

$\displaystyle p := \log_3 \frac{27}{4} = 1.73814\dots, \ \ \ \ \ (1)$

so that ${3/p = 1.72598\dots}$. Furthermore, this exponent is best possible!

Actually, the latter claim is quite easy to show: one takes ${X}$ to be all the subsets of ${\{1,\dots,n\}}$ of cardinality either ${n/3}$ or ${2n/3}$, for ${n}$ a multiple of ${3}$, and the claim follows readily from Stirling’s formula. So it is perhaps the former claim that is more interesting (since many combinatorial proof techniques, such as those based on inequalities such as the Cauchy-Schwarz inequality, tend to produce exponents that are rational or at least algebraic). We follow the common, though unintuitive, trick of generalising a problem to make it simpler. Firstly, one generalises the bound to the “trilinear” bound

$\displaystyle | \{ (A_1,A_2,A_3) \in X_1 \times X_2 \times X_3: A_3 = A_1 \uplus A_2 \}|$

$\displaystyle \leq |X_1|^{1/p} |X_2|^{1/p} |X_3|^{1/p}$

for arbitrary finite collections ${X_1,X_2,X_3}$ of sets. One can place all the sets in ${X_1,X_2,X_3}$ inside a single finite set such as ${\{1,\dots,n\}}$, and then by replacing every set ${A_3}$ in ${X_3}$ by its complement in ${\{1,\dots,n\}}$, one can phrase the inequality in the equivalent form

$\displaystyle | \{ (A_1,A_2,A_3) \in X_1 \times X_2 \times X_3: \{1,\dots,n\} =A_1 \uplus A_2 \uplus A_3 \}|$

$\displaystyle \leq |X_1|^{1/p} |X_2|^{1/p} |X_3|^{1/p}$

for arbitrary collections ${X_1,X_2,X_3}$ of subsets of ${\{1,\dots,n\}}$. We generalise further by turning sets into functions, replacing the estimate with the slightly stronger convolution estimate

$\displaystyle f_1 * f_2 * f_3 (1,\dots,1) \leq \|f_1\|_{\ell^p(\{0,1\}^n)} \|f_2\|_{\ell^p(\{0,1\}^n)} \|f_3\|_{\ell^p(\{0,1\}^n)}$

for arbitrary functions ${f_1,f_2,f_3}$ on the Hamming cube ${\{0,1\}^n}$, where the convolution is on the integer lattice ${\bf Z}^n$ rather than on the finite field vector space ${\bf F}_2^n$. The advantage of working in this general setting is that it becomes very easy to apply induction on the dimension ${n}$; indeed, to prove this estimate for arbitrary ${n}$ it suffices to do so for ${n=1}$. This reduces matters to establishing the elementary inequality

$\displaystyle (ab(1-c))^{1/p} + (bc(1-a))^{1/p} + (ca(1-b))^{1/p} \leq 1$

for all ${0 \leq a,b,c \leq 1}$, which can be done by a combination of undergraduate multivariable calculus and a little bit of numerical computation. (The left-hand side turns out to have local maxima at ${(1,1,0), (1,0,1), (0,1,1), (2/3,2/3,2/3)}$, with the latter being the cause of the numerology (1).)
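
The numerology can be sanity-checked numerically; the following sketch is not a proof and is not the computation used in the paper. It evaluates the left-hand side of the elementary inequality on a grid of the unit cube and at ${(2/3,2/3,2/3)}$, and also computes the exponent achieved by the family of subsets of ${\{1,\dots,n\}}$ of cardinality ${n/3}$ or ${2n/3}$, for which ${|X| = 2 \binom{n}{n/3}}$ and the number of triples is ${\binom{n}{2n/3} \binom{2n/3}{n/3}}$. The grid resolution, the value of ${n}$ and the function names are assumptions made for illustration.

```python
import math

# The exponent p = log_3(27/4) from (1).
P = math.log(27 / 4) / math.log(3)


def lhs(a, b, c):
    # Left-hand side of the elementary inequality.
    e = 1.0 / P
    return (a * b * (1 - c)) ** e + (b * c * (1 - a)) ** e + (c * a * (1 - b)) ** e


def grid_max(steps):
    # Maximum of lhs over the grid {0, 1/steps, ..., 1}^3.
    pts = [i / steps for i in range(steps + 1)]
    return max(lhs(a, b, c) for a in pts for b in pts for c in pts)


def extremal_exponent(n):
    # log(#triples) / log|X| for X = subsets of {1..n} of size n/3 or 2n/3.
    # Assumes n is a positive multiple of 3.
    k = n // 3
    size_X = 2 * math.comb(n, k)
    triples = math.comb(n, 2 * k) * math.comb(2 * k, k)
    return math.log(triples) / math.log(size_X)


print("p =", P, "3/p =", 3 / P)
print("lhs(2/3,2/3,2/3) =", lhs(2 / 3, 2 / 3, 2 / 3))
print("grid maximum =", grid_max(30))
print("exponent at n = 3000:", extremal_exponent(3000))
```

The exponent for the extremal family approaches ${3/p}$ from below only slowly, because of the polynomial factors in Stirling's formula.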

The same sort of argument also gives an energy bound

$\displaystyle E(A,A) \leq |A|^{\log_2 6}$

for any subset ${A \subset \{0,1\}^n}$ of the Hamming cube, where

$\displaystyle E(A,A) := |\{(a_1,a_2,a_3,a_4) \in A^4: a_1+a_2 = a_3 + a_4 \}|$

is the additive energy of ${A}$. The example ${A = \{0,1\}^n}$ shows that the exponent ${\log_2 6}$ cannot be improved.
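
Here is a small brute-force sketch (not from the paper) of this energy bound. I am assuming that the addition in the definition of ${E(A,A)}$ takes place in the integer lattice ${{\bf Z}^n}$, as with the convolution above; the function names are my own. It confirms that the full cube ${\{0,1\}^n}$ has energy ${6^n}$ for small ${n}$, and checks the bound on every non-empty subset of ${\{0,1\}^2}$.

```python
import math
from collections import Counter
from itertools import combinations, product


def cube(n):
    # The Hamming cube {0,1}^n as a list of tuples.
    return list(product((0, 1), repeat=n))


def additive_energy(A):
    # E(A,A) = number of (a1,a2,a3,a4) in A^4 with a1 + a2 = a3 + a4,
    # with addition taken coordinatewise in Z^n (an assumption of this sketch).
    sums = Counter(tuple(x + y for x, y in zip(a, b)) for a in A for b in A)
    return sum(count * count for count in sums.values())


def bound_holds(A, tol=1e-9):
    # Checks E(A,A) <= |A|^{log_2 6}, up to floating point tolerance.
    return additive_energy(A) <= len(A) ** math.log2(6) + tol


print([additive_energy(cube(n)) for n in range(1, 4)])
small_cube = cube(2)
all_ok = all(
    bound_holds(list(A))
    for k in range(1, len(small_cube) + 1)
    for A in combinations(small_cube, k)
)
print("bound holds on all non-empty subsets of {0,1}^2:", all_ok)
```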

I’ve just uploaded to the arXiv my paper Finite time blowup for a supercritical defocusing nonlinear Schrödinger system, submitted to Analysis and PDE. This paper is an analogue of a recent paper of mine in which I constructed a supercritical defocusing nonlinear wave (NLW) system ${-\partial_{tt} u + \Delta u = (\nabla F)(u)}$ which exhibited smooth solutions that developed singularities in finite time. Here, we achieve essentially the same conclusion for the (inhomogeneous) supercritical defocusing nonlinear Schrödinger (NLS) equation

$\displaystyle i \partial_t u + \Delta u = (\nabla F)(u) + G \ \ \ \ \ (1)$

where ${u: {\bf R} \times {\bf R}^d \rightarrow {\bf C}^m}$ is now a system of scalar fields, ${F: {\bf C}^m \rightarrow {\bf R}}$ is a potential which is strictly positive and homogeneous of degree ${p+1}$ (and invariant under phase rotations ${u \mapsto e^{i\theta} u}$), and ${G: {\bf R} \times {\bf R}^d \rightarrow {\bf C}^m}$ is a smooth compactly supported forcing term, needed for technical reasons.

To oversimplify somewhat, the equation (1) is known to be globally regular in the energy-subcritical case when ${d \leq 2}$, or when ${d \geq 3}$ and ${p < 1+\frac{4}{d-2}}$; global regularity is also known (but is significantly more difficult to establish) in the energy-critical case when ${d \geq 3}$ and ${p = 1 +\frac{4}{d-2}}$. (This is an oversimplification for a number of reasons, in particular in higher dimensions one only knows global well-posedness instead of global regularity. See this previous post for some exploration of this issue in the context of nonlinear wave equations.) The main result of this paper is to show that global regularity can break down in the remaining energy-supercritical case when ${d \geq 3}$ and ${p > 1 + \frac{4}{d-2}}$, at least when the target dimension ${m}$ is allowed to be sufficiently large depending on the spatial dimension ${d}$ (I did not try to achieve the optimal value of ${m}$ here, but the argument gives a value of ${m}$ that grows quadratically in ${d}$). Unfortunately, this result does not directly impact the most interesting case of the defocusing scalar NLS equation

$\displaystyle i \partial_t u + \Delta u = |u|^{p-1} u \ \ \ \ \ (2)$

in which ${m=1}$; however it does establish a rigorous barrier to any attempt to prove global regularity for the scalar NLS equation, in that such an attempt needs to crucially use some property of the scalar NLS that is not shared by the more general systems in (1). For instance, any approach that is primarily based on the conservation laws of mass, momentum, and energy (which are common to both (1) and (2)) will not be sufficient to establish global regularity of supercritical defocusing scalar NLS.
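The trichotomy of regimes described above is simple enough to encode directly. This is a small illustrative helper (the name `energy_regime` is hypothetical); it only compares ${p}$ against the threshold ${1 + \frac{4}{d-2}}$ and says nothing about the oversimplifications noted earlier.

```python
from fractions import Fraction

def energy_regime(d, p):
    """Classify the exponent p in spatial dimension d relative to 1 + 4/(d-2)."""
    if d <= 2:
        # Every exponent is energy-subcritical in dimensions one and two.
        return "subcritical"
    critical_p = 1 + Fraction(4, d - 2)
    if p < critical_p:
        return "subcritical"
    if p == critical_p:
        return "critical"
    return "supercritical"

for d, p in [(2, 100), (3, 3), (3, 5), (3, 7), (4, 3)]:
    print(d, p, energy_regime(d, p))
```

Using `Fraction` keeps the comparison with the critical exponent exact when ${p}$ is an integer or a rational number.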

The method of proof in this paper is broadly similar to that in the previous paper for NLW, but with a number of additional technical complications. Both proofs begin by reducing matters to constructing a discretely self-similar solution. In the case of NLW, this solution lived on a forward light cone ${\{ (t,x): |x| \leq t \}}$ and obeyed a self-similarity

$\displaystyle u(2t, 2x) = 2^{-\frac{2}{p-1}} u(t,x).$

The ability to restrict to a light cone arose from the finite speed of propagation properties of NLW. For NLS, the solution will instead live on the domain

$\displaystyle H_d := ([0,+\infty) \times {\bf R}^d) \backslash \{(0,0)\}$

and obey a parabolic self-similarity

$\displaystyle u(4t, 2x) = 2^{-\frac{2}{p-1}} u(t,x)$

and solve the homogeneous version ${G=0}$ of (1). (The inhomogeneity ${G}$ emerges when one truncates the self-similar solution so that the initial data is compactly supported in space.) A key technical point is that ${u}$ has to be smooth everywhere in ${H_d}$, including the boundary component ${\{ (0,x): x \in {\bf R}^d \backslash \{0\}\}}$. This unfortunately rules out many of the existing constructions of self-similar solutions, which typically will have some sort of singularity at the spatial origin.

The remaining steps of the argument can broadly be described as quantifier elimination: one systematically eliminates each of the degrees of freedom of the problem in turn by locating the necessary and sufficient conditions required of the remaining degrees of freedom in order for the constraints of a particular degree of freedom to be satisfiable. The first such degree of freedom to eliminate is the potential function ${F}$. The task here is to determine what constraints must exist on a putative solution ${u}$ in order for there to exist a (positive, homogeneous, smooth away from origin) potential ${F}$ obeying the homogeneous NLS equation

$\displaystyle i \partial_t u + \Delta u = (\nabla F)(u).$

Firstly, the requirement that ${F}$ be homogeneous implies the Euler identity

$\displaystyle \langle (\nabla F)(u), u \rangle = (p+1) F(u)$

(where ${\langle,\rangle}$ denotes the standard real inner product on ${{\bf C}^m}$), while the requirement that ${F}$ be phase invariant similarly yields the variant identity

$\displaystyle \langle (\nabla F)(u), iu \rangle = 0,$

so if one defines the potential energy field to be ${V = F(u)}$, we obtain from the chain rule the equations

$\displaystyle \langle i \partial_t u + \Delta u, u \rangle = (p+1) V$

$\displaystyle \langle i \partial_t u + \Delta u, iu \rangle = 0$

$\displaystyle \langle i \partial_t u + \Delta u, \partial_t u \rangle = \partial_t V$

$\displaystyle \langle i \partial_t u + \Delta u, \partial_{x_j} u \rangle = \partial_{x_j} V.$

Conversely, it turns out (roughly speaking) that if one can locate fields ${u}$ and ${V}$ obeying the above equations (as well as some other technical regularity and non-degeneracy conditions), then one can find an ${F}$ with all the required properties. The first of these equations can be thought of as a definition of the potential energy field ${V}$, and the other three equations are basically disguised versions of the conservation laws of mass, energy, and momentum respectively. The construction of ${F}$ relies on a classical extension theorem of Seeley that is a relative of the Whitney extension theorem.
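The Euler identity and the phase-invariance identity used above can be sanity-checked numerically on a model potential. The sketch below assumes the illustrative choice ${F(u) = |u|^{p+1}}$ with ${p=5}$ (this is not the potential constructed in the paper) and computes ${\langle (\nabla F)(u), w \rangle}$ as a directional derivative by central differences.

```python
# Hypothetical exponent p; the model potential is homogeneous of degree p + 1.
P = 5.0

def F(u):
    """Model potential F(u) = |u|^(p+1) on C^m (illustrative, phase invariant)."""
    return sum(abs(z) ** 2 for z in u) ** ((P + 1) / 2)

def directional_derivative(u, w, h=1e-6):
    """Approximate <(grad F)(u), w>, the derivative of F at u in the real direction w."""
    plus = [z + h * d for z, d in zip(u, w)]
    minus = [z - h * d for z, d in zip(u, w)]
    return (F(plus) - F(minus)) / (2 * h)

u = [1.0 + 2.0j, -0.5 + 0.3j]     # an arbitrary point of C^2
iu = [1j * z for z in u]

euler_lhs = directional_derivative(u, u)    # should be close to (p+1) F(u)
phase_lhs = directional_derivative(u, iu)   # should be close to 0

print(euler_lhs, (P + 1) * F(u))
print(phase_lhs)
```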

Now that the potential ${F}$ is eliminated, the next degree of freedom to eliminate is the solution field ${u}$. One can observe that the above equations involving ${u}$ and ${V}$ can be expressed instead in terms of ${V}$ and the Gram-type matrix ${G[u,u]}$ of ${u}$, which is a ${(2d+4) \times (2d+4)}$ matrix consisting of the inner products ${\langle D_1 u, D_2 u \rangle}$ where ${D_1,D_2}$ range amongst the ${2d+4}$ differential operators

$\displaystyle D_1,D_2 \in \{ 1, i, \partial_t, i\partial_t, \partial_{x_1},\dots,\partial_{x_d}, i\partial_{x_1}, \dots, i\partial_{x_d}\}.$

To eliminate ${u}$, one thus needs to answer the question of what properties are required of a ${(2d+4) \times (2d+4)}$ matrix ${G}$ for it to be the Gram-type matrix ${G = G[u,u]}$ of a field ${u}$. Amongst some obvious necessary conditions are that ${G}$ needs to be symmetric and positive semi-definite; there are also additional constraints coming from identities such as

$\displaystyle \partial_t \langle u, u \rangle = 2 \langle u, \partial_t u \rangle$

$\displaystyle \langle i u, \partial_t u \rangle = - \langle u, i \partial_t u \rangle$

and

$\displaystyle \partial_{x_j} \langle iu, \partial_{x_k} u \rangle - \partial_{x_k} \langle iu, \partial_{x_j} u \rangle = 2 \langle i \partial_{x_j} u, \partial_{x_k} u \rangle.$
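To see why the last of these identities holds, one needs only the product rule, the symmetry of mixed partial derivatives, and the antisymmetry ${\langle iv, w \rangle = - \langle v, iw \rangle}$ of the real inner product. Expanding the left-hand side by the product rule gives

$\displaystyle \langle i \partial_{x_j} u, \partial_{x_k} u \rangle + \langle iu, \partial_{x_j} \partial_{x_k} u \rangle - \langle i \partial_{x_k} u, \partial_{x_j} u \rangle - \langle iu, \partial_{x_k} \partial_{x_j} u \rangle.$

The second and fourth terms cancel, and by antisymmetry ${- \langle i \partial_{x_k} u, \partial_{x_j} u \rangle = \langle \partial_{x_k} u, i \partial_{x_j} u \rangle = \langle i \partial_{x_j} u, \partial_{x_k} u \rangle}$, so the two surviving terms add up to the stated right-hand side.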

Ideally one would like a theorem that asserts (for ${m}$ large enough) that as long as ${G}$ obeys all of the “obvious” constraints, then there exists a suitably non-degenerate map ${u}$ such that ${G = G[u,u]}$. In the case of NLW, the analogous claim was basically a consequence of the Nash embedding theorem (which can be viewed as a theorem about the solvability of the system of equations ${\langle \partial_{x_j} u, \partial_{x_k} u \rangle = g_{jk}}$ for a given positive definite symmetric set of fields ${g_{jk}}$). However, the presence of the complex structure in the NLS case poses some significant technical challenges (note for instance that the naive complex version of the Nash embedding theorem is false, due to obstructions such as Liouville’s theorem that prevent a compact complex manifold from being embeddable holomorphically in ${{\bf C}^m}$). Nevertheless, by adapting the proof of the Nash embedding theorem (in particular, the simplified proof of Gunther that avoids the need to use the Nash-Moser iteration scheme) we were able to obtain a partial complex analogue of the Nash embedding theorem that sufficed for our application; it required an artificial additional “curl-free” hypothesis on the Gram-type matrix ${G[u,u]}$, but fortunately this hypothesis ends up being automatic in our construction. Also, this version of the Nash embedding theorem is unable to prescribe the component ${\langle \partial_t u, \partial_t u \rangle}$ of the Gram-type matrix ${G[u,u]}$, but fortunately this component is not used in any of the conservation laws and so the loss of this component does not cause any difficulty.

After applying the above-mentioned Nash-embedding theorem, the task is now to locate a matrix ${G}$ obeying all the hypotheses of that theorem, as well as the conservation laws for mass, momentum, and energy (after defining the potential energy field ${V}$ in terms of ${G}$). This is quite a lot of fields and constraints, but one can cut down significantly on the degrees of freedom by requiring that ${G}$ is spherically symmetric (in a tensorial sense) and also continuously self-similar (not just discretely self-similar). Note that this hypothesis is weaker than the assertion that the original field ${u}$ is spherically symmetric and continuously self-similar; indeed we do not know if non-trivial solutions of this type actually exist. These symmetry hypotheses reduce the number of independent components of the ${(2d+4) \times (2d+4)}$ matrix ${G}$ to just six: ${g_{1,1}, g_{1,i\partial_t}, g_{1,i\partial_r}, g_{\partial_r, \partial_r}, g_{\partial_\omega, \partial_\omega}, g_{\partial_r, \partial_t}}$, which now take as their domain the ${1+1}$-dimensional space

$\displaystyle H_1 := ([0,+\infty) \times {\bf R}) \backslash \{(0,0)\}.$

One now has to construct these six fields, together with a potential energy field ${v}$, that obey a number of constraints, notably some positive definiteness constraints as well as the aforementioned conservation laws for mass, momentum, and energy.

The field ${g_{1,i\partial_t}}$ only arises in the equation for the potential ${v}$ (coming from Euler’s identity) and can easily be eliminated. Similarly, the field ${g_{\partial_r,\partial_t}}$ only makes an appearance in the current of the energy conservation law, and so can also be easily eliminated so long as the total energy is conserved. But in the energy-supercritical case, the total energy is infinite, and so it is relatively easy to eliminate the field ${g_{\partial_r, \partial_t}}$ from the problem also. This leaves us with the task of constructing just five fields ${g_{1,1}, g_{1,i\partial_r}, g_{\partial_r,\partial_r}, g_{\partial_\omega,\partial_\omega}, v}$ obeying a number of positivity conditions, symmetry conditions, regularity conditions, and conservation laws for mass and momentum.

The potential field ${v}$ can effectively be absorbed into the angular stress field ${g_{\partial_\omega,\partial_\omega}}$ (after placing an appropriate counterbalancing term in the radial stress field ${g_{\partial_r, \partial_r}}$ so as not to disrupt the conservation laws), so we can also eliminate this field. The angular stress field ${g_{\partial_\omega, \partial_\omega}}$ is then only constrained through the momentum conservation law and a requirement of positivity; one can then eliminate this field by converting the momentum conservation law from an equality to an inequality. Finally, the radial stress field ${g_{\partial_r, \partial_r}}$ is also only constrained through a positive definiteness constraint and the momentum conservation inequality, so it can also be eliminated from the problem after some further modification of the momentum conservation inequality.

The task then reduces to locating just two fields ${g_{1,1}, g_{1,i\partial_r}}$ that obey a mass conservation law

$\displaystyle \partial_t g_{1,1} = 2 \left(\partial_r + \frac{d-1}{r} \right) g_{1,i\partial_r}$

together with an additional inequality that is the remnant of the momentum conservation law. One can solve for the mass conservation law in terms of a single scalar field ${W}$ using the ansatz

$\displaystyle g_{1,1} = 2 r^{1-d} \partial_r (r^d W)$

$\displaystyle g_{1,i\partial_r} = r^{1-d} \partial_t (r^d W)$

so the problem has finally been simplified to the task of locating a single scalar field ${W}$ with some scaling and homogeneity properties that obeys a certain differential inequality relating to momentum conservation. This turns out to be possible by explicitly writing down a specific scalar field ${W}$ using some asymptotic parameters and cutoff functions.
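As a quick check that the ansatz does solve the mass conservation law, one can substitute the formula for ${g_{1,i\partial_r}}$ into the right-hand side and apply the product rule:

$\displaystyle 2 \left(\partial_r + \frac{d-1}{r}\right) \left( r^{1-d} \partial_t (r^d W) \right) = 2(1-d) r^{-d} \partial_t (r^d W) + 2 r^{1-d} \partial_r \partial_t (r^d W) + 2(d-1) r^{-d} \partial_t (r^d W).$

The first and third terms cancel, leaving ${2 r^{1-d} \partial_t \partial_r (r^d W) = \partial_t g_{1,1}}$, as required.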

I’ve just uploaded to the arXiv my paper “An integration approach to the Toeplitz square peg problem”, submitted to Forum of Mathematics, Sigma. This paper resulted from my recent attempts to solve the Toeplitz square peg problem (also known as the inscribed square problem):

Conjecture 1 (Toeplitz square peg problem) Let ${\gamma}$ be a simple closed curve in the plane. Is it necessarily the case that ${\gamma}$ contains four vertices of a square?

See this recent survey of Matschke in the Notices of the AMS for the latest results on this problem.

The route I took to the results in this paper was somewhat convoluted. I was motivated to look at this problem after lecturing recently on the Jordan curve theorem in my class. The problem is superficially similar to the Jordan curve theorem in that the result is known (and rather easy to prove) if ${\gamma}$ is sufficiently regular (e.g. if it is a polygonal path), but seems to be significantly more difficult when the curve is merely assumed to be continuous. Roughly speaking, all the known positive results on the problem have proceeded using (in some form or another) tools from homology: note for instance that one can view the conjecture as asking whether the four-dimensional subset ${\gamma^4}$ of the eight-dimensional space ${({\bf R}^2)^4}$ necessarily intersects the four-dimensional space ${\mathtt{Squares} \subset ({\bf R}^2)^4}$ consisting of the quadruples ${(v_1,v_2,v_3,v_4)}$ traversing a square in (say) anti-clockwise order; this space is a four-dimensional linear subspace of ${({\bf R}^2)^4}$, with a two-dimensional subspace of “degenerate” squares ${(v,v,v,v)}$ removed. If one ignores this degenerate subspace, one can use intersection theory to conclude (under reasonable “transversality” hypotheses) that ${\gamma^4}$ intersects ${\mathtt{Squares}}$ an odd number of times (up to the cyclic symmetries of the square), which is basically how Conjecture 1 is proven in the regular case. Unfortunately, if one then takes a limit and considers what happens when ${\gamma}$ is just a continuous curve, the odd number of squares created by these homological arguments could conceivably all degenerate to points, thus blocking one from proving the conjecture in the general case.

Inspired by my previous work on finite time blowup for various PDEs, I first tried looking for a counterexample in the category of (locally) self-similar curves that are smooth (or piecewise linear) away from a single origin where it can oscillate infinitely often; this is basically the smoothest type of curve that was not already covered by previous results. By a rescaling and compactness argument, it is not difficult to see that such a counterexample would exist if there was a counterexample to the following periodic version of the conjecture:

Conjecture 2 (Periodic square peg problem) Let ${\gamma_1, \gamma_2}$ be two disjoint simple closed piecewise linear curves in the cylinder ${({\bf R}/{\bf Z}) \times {\bf R}}$ which have a winding number of one, that is to say they are homologous to the loop ${x \mapsto (x,0)}$ from ${{\bf R}/{\bf Z}}$ to ${({\bf R}/{\bf Z}) \times {\bf R}}$. Then the union of ${\gamma_1}$ and ${\gamma_2}$ contains the four vertices of a square.

In contrast to Conjecture 1, which is known for polygonal paths, Conjecture 2 is still open even under the hypothesis of polygonal paths; the homological arguments alluded to previously now show that the number of inscribed squares in the periodic setting is even rather than odd, which is not enough to conclude the conjecture. (This flipping of parity from odd to even due to an infinite amount of oscillation is reminiscent of the “Eilenberg-Mazur swindle”, discussed in this previous post.)

I therefore tried to construct counterexamples to Conjecture 2. I began perturbatively, looking at curves ${\gamma_1, \gamma_2}$ that were small perturbations of constant functions. After some initial Taylor expansion, I was blocked from forming such a counterexample because an inspection of the leading Taylor coefficients required one to construct a continuous periodic function of mean zero that never vanished, which of course was impossible by the intermediate value theorem. I kept expanding to higher and higher order to try to evade this obstruction (this, incidentally, was when I discovered this cute application of Lagrange reversion) but no matter how high an accuracy I went (I think I ended up expanding to sixth order in a perturbative parameter ${\varepsilon}$ before figuring out what was going on!), this obstruction kept resurfacing again and again. I eventually figured out that this obstruction was being caused by a “conserved integral of motion” for both Conjecture 2 and Conjecture 1, which can in fact be used to largely rule out perturbative constructions. This yielded a new positive result for both conjectures:

Theorem 3

• (i) Conjecture 1 holds when ${\gamma}$ is the union ${\{ (t,f(t)): t \in [t_0,t_1]\} \cup \{ (t,g(t)): t \in [t_0,t_1]\}}$ of the graphs of two Lipschitz functions ${f,g: [t_0,t_1] \rightarrow {\bf R}}$ of Lipschitz constant less than one that agree at the endpoints.
• (ii) Conjecture 2 holds when ${\gamma_1, \gamma_2}$ are graphs of Lipschitz functions ${f: {\bf R}/{\bf Z} \rightarrow {\bf R}, g: {\bf R}/{\bf Z} \rightarrow {\bf R}}$ of Lipschitz constant less than one.

We sketch the proof of Theorem 3(i) as follows (the proof of Theorem 3(ii) is very similar). Let ${\gamma_1: [t_0, t_1] \rightarrow {\bf R}^2}$ be the curve ${\gamma_1(t) := (t,f(t))}$, thus ${\gamma_1}$ traverses one of the two graphs that comprise ${\gamma}$. For each time ${t \in [t_0,t_1]}$, there is a unique square with first vertex ${\gamma_1(t)}$ (and the other three vertices, traversed in anticlockwise order, denoted ${\gamma_2(t), \gamma_3(t), \gamma_4(t)}$) such that ${\gamma_2(t)}$ also lies in the graph of ${f}$ and ${\gamma_4(t)}$ also lies in the graph of ${g}$ (actually for technical reasons we have to extend ${f,g}$ by constants to all of ${{\bf R}}$ in order for this claim to be true). To see this, we simply rotate the graph of ${g}$ clockwise by ${\frac{\pi}{2}}$ around ${\gamma_1(t)}$, where (by the Lipschitz hypotheses) it must hit the graph of ${f}$ in a unique point, which is ${\gamma_2(t)}$, and which then determines the other two vertices ${\gamma_3(t), \gamma_4(t)}$ of the square. The curve ${\gamma_3(t)}$ has the same starting and ending point as the graph of ${f}$ or ${g}$; using the Lipschitz hypothesis one can show this curve is simple. If the curve ever hits the graph of ${g}$ other than at the endpoints, we have created an inscribed square, so we may assume for contradiction that ${\gamma_3(t)}$ avoids the graph of ${g}$, and hence by the Jordan curve theorem the two curves enclose some non-empty bounded open region ${\Omega}$.

Now for the conserved integral of motion. If we integrate the ${1}$-form ${y\ dx}$ on each of the four curves ${\gamma_1, \gamma_2, \gamma_3, \gamma_4}$, we obtain the identity

$\displaystyle \int_{\gamma_1} y\ dx - \int_{\gamma_2} y\ dx + \int_{\gamma_3} y\ dx - \int_{\gamma_4} y\ dx = 0.$

This identity can be established by the following calculation: one can parameterise

$\displaystyle \gamma_1(t) = (x(t), y(t))$

$\displaystyle \gamma_2(t) = (x(t)+a(t), y(t)+b(t))$

$\displaystyle \gamma_3(t) = (x(t)+a(t)-b(t), y(t)+a(t)+b(t))$

$\displaystyle \gamma_4(t) = (x(t)-b(t), y(t)+a(t))$

for some Lipschitz functions ${x,y,a,b: [t_0,t_1] \rightarrow {\bf R}}$; thus for instance ${\int_{\gamma_1} y\ dx = \int_{t_0}^{t_1} y(t)\ dx(t)}$. Inserting these parameterisations and doing some canceling, one can write the above integral as

$\displaystyle \int_{t_0}^{t_1} d \frac{a(t)^2-b(t)^2}{2}$

which vanishes because ${a(t), b(t)}$ (which represent the sidelengths of the squares determined by ${\gamma_1(t), \gamma_2(t), \gamma_3(t), \gamma_4(t)}$) vanish at the endpoints ${t=t_0,t_1}$.
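The cancellation can also be observed numerically. The sketch below uses arbitrary test functions ${x,y,a,b}$ (hypothetical choices, with ${a,b}$ vanishing at both endpoints of ${[0,1]}$) and approximates each ${\int y\ dx}$ by a trapezoid-type sum; because that sum is bilinear in ${(y,x)}$, the discrete identity telescopes in the same way as the continuous one, so the alternating sum vanishes up to rounding error.

```python
import math

def line_integral_y_dx(xs, ys):
    """Trapezoid-type approximation of the integral of y dx along a sampled curve."""
    return sum(0.5 * (ys[i] + ys[i + 1]) * (xs[i + 1] - xs[i])
               for i in range(len(xs) - 1))

N = 2000
ts = [i / N for i in range(N + 1)]

# Hypothetical test data: x, y are arbitrary; a, b vanish at t = 0 and t = 1.
x = [t + 0.1 * math.sin(3 * t) for t in ts]
y = [math.cos(2 * t) for t in ts]
a = [math.sin(math.pi * t) * (1 + t) for t in ts]
b = [t * (1 - t) * math.exp(t) for t in ts]

# The four vertices of the square, following the parameterisation in the text.
g1 = (x, y)
g2 = ([xi + ai for xi, ai in zip(x, a)], [yi + bi for yi, bi in zip(y, b)])
g3 = ([xi + ai - bi for xi, ai, bi in zip(x, a, b)],
      [yi + ai + bi for yi, ai, bi in zip(y, a, b)])
g4 = ([xi - bi for xi, bi in zip(x, b)], [yi + ai for yi, ai in zip(y, a)])

total = (line_integral_y_dx(*g1) - line_integral_y_dx(*g2)
         + line_integral_y_dx(*g3) - line_integral_y_dx(*g4))
print("alternating sum of the four integrals:", total)
```

Replacing `a` or `b` by a function that does not vanish at the endpoints makes the alternating sum equal to the boundary term ${\frac{a^2-b^2}{2}}$ evaluated between the endpoints instead.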

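Since ${\gamma_1}$ and ${\gamma_2}$ both run along the graph of ${f}$ between the same endpoints, their integrals cancel, while ${\gamma_4}$ runs along the graph of ${g}$. Using this conserved integral of motion, one can therefore show that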

$\displaystyle \int_{\gamma_3} y\ dx = \int_{t_0}^{t_1} g(t)\ dt$

which by Stokes’ theorem then implies that the bounded open region ${\Omega}$ mentioned previously has zero area, which is absurd.

This argument hinged on the curve ${\gamma_3}$ being simple, so that the Jordan curve theorem could apply. Once one left the perturbative regime of curves of small Lipschitz constant, it became possible for ${\gamma_3}$ to be self-crossing, but nevertheless there still seemed to be some sort of integral obstruction. I eventually isolated the problem in the form of a strengthened version of Conjecture 2:

Conjecture 4 (Area formulation of square peg problem) Let ${\gamma_1, \gamma_2, \gamma_3, \gamma_4: {\bf R}/{\bf Z} \rightarrow ({\bf R}/{\bf Z}) \times {\bf R}}$ be simple closed piecewise linear curves of winding number ${1}$ obeying the area identity

$\displaystyle \int_{\gamma_1} y\ dx - \int_{\gamma_2} y\ dx + \int_{\gamma_3} y\ dx - \int_{\gamma_4} y\ dx = 0$

(note the ${1}$-form ${y\ dx}$ is still well defined on the cylinder ${({\bf R}/{\bf Z}) \times {\bf R}}$; note also that the curves ${\gamma_1,\gamma_2,\gamma_3,\gamma_4}$ are allowed to cross each other.) Then there exists a (possibly degenerate) square with vertices (traversed in anticlockwise order) lying on ${\gamma_1, \gamma_2, \gamma_3, \gamma_4}$ respectively.

It is not difficult to see that Conjecture 4 implies Conjecture 2. Actually I believe that the converse implication is at least morally true, in that any counterexample to Conjecture 4 can be eventually transformed to a counterexample to Conjecture 2 and Conjecture 1. The conserved integral of motion argument can establish Conjecture 4 in many cases, for instance if ${\gamma_2,\gamma_4}$ are graphs of functions of Lipschitz constant less than one.

Conjecture 4 has a model special case, when one of the ${\gamma_i}$ is assumed to just be a horizontal loop. In this case, the problem collapses to that of producing an intersection between two three-dimensional subsets of a six-dimensional space, rather than two four-dimensional subsets of an eight-dimensional space. More precisely, some elementary transformations reveal that this special case of Conjecture 4 can be formulated in the following fashion in which the geometric notion of a square is replaced by the additive notion of a triple of real numbers summing to zero:

Conjecture 5 (Special case of area formulation) Let ${\gamma_1, \gamma_2, \gamma_3: {\bf R}/{\bf Z} \rightarrow ({\bf R}/{\bf Z}) \times {\bf R}}$ be simple closed piecewise linear curves of winding number ${1}$ obeying the area identity

$\displaystyle \int_{\gamma_1} y\ dx + \int_{\gamma_2} y\ dx + \int_{\gamma_3} y\ dx = 0.$

Then there exist ${x \in {\bf R}/{\bf Z}}$ and ${y_1,y_2,y_3 \in {\bf R}}$ with ${y_1+y_2+y_3=0}$ such that ${(x,y_i) \in \gamma_i}$ for ${i=1,2,3}$.

This conjecture is easy to establish if one of the curves, say ${\gamma_3}$, is the graph ${\{ (t,f(t)): t \in {\bf R}/{\bf Z}\}}$ of some piecewise linear function ${f: {\bf R}/{\bf Z} \rightarrow {\bf R}}$, since in that case the curve ${\gamma_1}$ and the curve ${\tilde \gamma_2 := \{ (x, -y-f(x)): (x,y) \in \gamma_2 \}}$ enclose the same area in the sense that ${\int_{\gamma_1} y\ dx = \int_{\tilde \gamma_2} y\ dx}$, and hence must intersect by the Jordan curve theorem (otherwise they would enclose a non-zero amount of area between them), giving the claim. But when none of the ${\gamma_1,\gamma_2,\gamma_3}$ are graphs, the situation becomes combinatorially more complicated.
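The equality of areas used in this easy case can be verified in one line. Since ${\gamma_2}$ has winding number ${1}$ and ${\gamma_3}$ is the graph of ${f}$, one has ${\int_{\gamma_2} f(x)\ dx = \int_{{\bf R}/{\bf Z}} f(t)\ dt = \int_{\gamma_3} y\ dx}$, and hence

$\displaystyle \int_{\tilde \gamma_2} y\ dx = - \int_{\gamma_2} y\ dx - \int_{\gamma_2} f(x)\ dx = - \int_{\gamma_2} y\ dx - \int_{\gamma_3} y\ dx = \int_{\gamma_1} y\ dx,$

where the final step is the area identity assumed in Conjecture 5.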

Using some elementary homological arguments (e.g. breaking up closed ${1}$-cycles into closed paths) and working with a generic horizontal slice of the curves, I was able to show that Conjecture 5 was equivalent to a one-dimensional problem that was largely combinatorial in nature, revolving around the sign patterns of various triple sums ${y_{1,a} + y_{2,b} + y_{3,c}}$ with ${y_{1,a}, y_{2,b}, y_{3,c}}$ drawn from various finite sets of reals.

Conjecture 6 (Combinatorial form) Let ${k_1,k_2,k_3}$ be odd natural numbers, and for each ${i=1,2,3}$, let ${y_{i,1},\dots,y_{i,k_i}}$ be distinct real numbers; we adopt the convention that ${y_{i,0}=y_{i,k_i+1}=-\infty}$. Assume the following axioms:

• (i) For any ${1 \leq p \leq k_1, 1 \leq q \leq k_2, 1 \leq r \leq k_3}$, the sums ${y_{1,p} + y_{2,q} + y_{3,r}}$ are non-zero.
• (ii) (Non-crossing) For any ${i=1,2,3}$ and ${0 \leq p < q \leq k_i}$ with the same parity, the pairs ${\{ y_{i,p}, y_{i,p+1}\}}$ and ${\{y_{i,q}, y_{i,q+1}\}}$ are non-crossing in the sense that

$\displaystyle \sum_{a \in \{p,p+1\}} \sum_{b \in \{q,q+1\}} (-1)^{a+b} \mathrm{sgn}( y_{i,a} - y_{i,b} ) = 0.$

• (iii) (Non-crossing sums) For any ${0 \leq p \leq k_1}$, ${0 \leq q \leq k_2}$, ${0 \leq r \leq k_3}$ of the same parity, one has

$\displaystyle \sum_{a \in \{p,p+1\}} \sum_{b \in \{q,q+1\}} \sum_{c \in \{r,r+1\}} (-1)^{a+b+c} \mathrm{sgn}( y_{1,a} + y_{2,b} + y_{3,c} ) = 0.$

Then one has

$\displaystyle \sum_{i=1}^3 \sum_{p=1}^{k_i} (-1)^{p-1} y_{i,p} < 0.$

Roughly speaking, Conjecture 6 and Conjecture 5 are connected by constructing curves ${\gamma_i}$ to connect ${(0, y_{i,p})}$ to ${(0,y_{i,p+1})}$ for ${0 \leq p \leq k_i}$ by various paths, which either lie to the right of the ${y}$ axis (when ${p}$ is odd) or to the left of the ${y}$ axis (when ${p}$ is even). The axiom (ii) is asserting that the numbers ${-\infty, y_{i,1},\dots,y_{i,k_i}}$ are ordered according to the permutation of a meander (formed by gluing together two non-crossing perfect matchings).
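For small experiments with Conjecture 6, the three axioms and the alternating sum in the conclusion can be evaluated mechanically. The following is a minimal sketch under the stated convention ${y_{i,0}=y_{i,k_i+1}=-\infty}$; all function names are hypothetical, and the code checks the hypotheses and evaluates the conclusion for given data rather than proving anything.

```python
import math
from itertools import product

def sgn(x):
    """Sign of x as -1, 0 or 1."""
    return (x > 0) - (x < 0)

def pad(ys):
    """Apply the convention y_{i,0} = y_{i,k_i+1} = -infinity."""
    return [-math.inf] + list(ys) + [-math.inf]

def nonzero_sums(y1, y2, y3):
    """Axiom (i): no triple sum vanishes."""
    return all(a + b + c != 0 for a in y1 for b in y2 for c in y3)

def non_crossing(ys):
    """Axiom (ii) for a single sequence y_{i,1}, ..., y_{i,k_i} (k_i odd)."""
    y, k = pad(ys), len(ys)
    for p in range(k + 1):
        for q in range(p + 2, k + 1, 2):  # q > p with the same parity as p
            total = sum((-1) ** (a + b) * sgn(y[a] - y[b])
                        for a in (p, p + 1) for b in (q, q + 1))
            if total != 0:
                return False
    return True

def non_crossing_sums(y1, y2, y3):
    """Axiom (iii): the signed sum over each same-parity triple (p, q, r) vanishes."""
    Y = [pad(y1), pad(y2), pad(y3)]
    for p, q, r in product(*(range(len(ys) + 1) for ys in (y1, y2, y3))):
        if not (p % 2 == q % 2 == r % 2):
            continue
        total = sum((-1) ** (a + b + c) * sgn(Y[0][a] + Y[1][b] + Y[2][c])
                    for a in (p, p + 1) for b in (q, q + 1) for c in (r, r + 1))
        if total != 0:
            return False
    return True

def satisfies_axioms(y1, y2, y3):
    return (nonzero_sums(y1, y2, y3)
            and all(non_crossing(ys) for ys in (y1, y2, y3))
            and non_crossing_sums(y1, y2, y3))

def alternating_sum(y1, y2, y3):
    """The quantity sum_i sum_p (-1)^(p-1) y_{i,p} appearing in the conclusion."""
    return sum((-1) ** p * y for ys in (y1, y2, y3) for p, y in enumerate(ys))

example = ([-1.0], [-2.0], [0.5])
print(satisfies_axioms(*example), alternating_sum(*example))
```

In the simplest case ${k_1=k_2=k_3=1}$, axiom (iii) reduces to the statement that ${y_{1,1}+y_{2,1}+y_{3,1}}$ is negative, which is exactly the conclusion; the interesting cases are those with longer sequences.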

Using various ad hoc arguments involving “winding numbers”, it is possible to prove this conjecture in many cases (e.g. if one of the ${k_i}$ is at most ${3}$), to the extent that I have now become confident that this conjecture is true (and have now come full circle from trying to disprove Conjecture 1 to now believing that this conjecture holds also). But it seems that there is some non-trivial combinatorial argument to be made if one is to prove this conjecture; purely homological arguments seem to partially resolve the problem, but are not sufficient by themselves.

While I was not able to resolve the square peg problem, I think these results do provide a roadmap to attacking it, first by focusing on the combinatorial conjecture in Conjecture 6 (or its equivalent form in Conjecture 5), then after that is resolved moving on to Conjecture 4, and then finally to Conjecture 1.

Fifteen years ago, I wrote a paper entitled Global regularity of wave maps. II. Small energy in two dimensions, in which I established global regularity of wave maps from two spatial dimensions to the unit sphere, assuming that the initial data had small energy. Recently, Hao Jia (personal communication) discovered a small gap in the argument that requires a slightly non-trivial fix. The issue does not really affect the subsequent literature, because the main result has since been reproven and extended by methods that avoid the gap (see in particular this subsequent paper of Tataru), but I have decided to describe the gap and its fix on this blog.

I will assume familiarity with the notation of my paper. In Section 10, some complicated spaces ${S[k] = S[k]({\bf R}^{1+n})}$ are constructed for each frequency scale ${k}$, and then a further space ${S(c) = S(c)({\bf R}^{1+n})}$ is constructed for a given frequency envelope ${c}$ by the formula

$\displaystyle \| \phi \|_{S(c)({\bf R}^{1+n})} := \|\phi \|_{L^\infty_t L^\infty_x({\bf R}^{1+n})} + \sup_k c_k^{-1} \| \phi_k \|_{S[k]({\bf R}^{1+n})} \ \ \ \ \ (1)$

where ${\phi_k := P_k \phi}$ is the Littlewood-Paley projection of ${\phi}$ to frequency magnitudes ${\sim 2^k}$. Then, given a spacetime slab ${[-T,T] \times {\bf R}^n}$, we define the restrictions

$\displaystyle \| \phi \|_{S(c)([-T,T] \times {\bf R}^n)} := \inf \{ \| \tilde \phi \|_{S(c)({\bf R}^{1+n})}: \tilde \phi \downharpoonright_{[-T,T] \times {\bf R}^n} = \phi \}$

where the infimum is taken over all extensions ${\tilde \phi}$ of ${\phi}$ to the Minkowski spacetime ${{\bf R}^{1+n}}$; similarly one defines

$\displaystyle \| \phi_k \|_{S[k]([-T,T] \times {\bf R}^n)} := \inf \{ \| \tilde \phi_k \|_{S[k]({\bf R}^{1+n})}: \tilde \phi_k \downharpoonright_{[-T,T] \times {\bf R}^n} = \phi_k \}.$

The gap in the paper is as follows: it was implicitly assumed that one could restrict (1) to the slab ${[-T,T] \times {\bf R}^n}$ to obtain the equality

$\displaystyle \| \phi \|_{S(c)([-T,T] \times {\bf R}^n)} = \|\phi \|_{L^\infty_t L^\infty_x([-T,T] \times {\bf R}^n)} + \sup_k c_k^{-1} \| \phi_k \|_{S[k]([-T,T] \times {\bf R}^n)}.$

(This equality is implicitly used to establish the bound (36) in the paper.) Unfortunately, (1) only gives the lower bound, not the upper bound, and it is the upper bound which is needed here. The problem is that the extensions ${\tilde \phi_k}$ of ${\phi_k}$ that are optimal for computing ${\| \phi_k \|_{S[k]([-T,T] \times {\bf R}^n)}}$ are not necessarily the Littlewood-Paley projections of the extensions ${\tilde \phi}$ of ${\phi}$ that are optimal for computing ${\| \phi \|_{S(c)([-T,T] \times {\bf R}^n)}}$.

To remedy the problem, one has to prove an upper bound of the form

$\displaystyle \| \phi \|_{S(c)([-T,T] \times {\bf R}^n)} \lesssim \|\phi \|_{L^\infty_t L^\infty_x([-T,T] \times {\bf R}^n)} + \sup_k c_k^{-1} \| \phi_k \|_{S[k]([-T,T] \times {\bf R}^n)}$

for all Schwartz ${\phi}$ (actually we need affinely Schwartz ${\phi}$, but one can easily normalise to the Schwartz case). Without loss of generality we may normalise the RHS to be ${1}$. Thus

$\displaystyle \|\phi \|_{L^\infty_t L^\infty_x([-T,T] \times {\bf R}^n)} \leq 1 \ \ \ \ \ (2)$

and

$\displaystyle \|P_k \phi \|_{S[k]([-T,T] \times {\bf R}^n)} \leq c_k \ \ \ \ \ (3)$

for each ${k}$, and one has to find a single extension ${\tilde \phi}$ of ${\phi}$ such that

$\displaystyle \|\tilde \phi \|_{L^\infty_t L^\infty_x({\bf R}^{1+n})} \lesssim 1 \ \ \ \ \ (4)$

and

$\displaystyle \|P_k \tilde \phi \|_{S[k]({\bf R}^{1+n})} \lesssim c_k \ \ \ \ \ (5)$

for each ${k}$. Achieving a ${\tilde \phi}$ that obeys (4) is trivial (just extend ${\phi}$ by zero), but such extensions do not necessarily obey (5). On the other hand, from (3) we can find extensions ${\tilde \phi_k}$ of ${P_k \phi}$ such that

$\displaystyle \|\tilde \phi_k \|_{S[k]({\bf R}^{1+n})} \lesssim c_k; \ \ \ \ \ (6)$

the extension ${\tilde \phi := \sum_k \tilde \phi_k}$ will then obey (5) (here we use Lemma 9 from my paper), but unfortunately is not guaranteed to obey (4) (the ${S[k]}$ norm does control the ${L^\infty_t L^\infty_x}$ norm, but a key point about frequency envelopes for the small energy regularity problem is that the coefficients ${c_k}$, while bounded, are not necessarily summable).

This can be fixed as follows. For each ${k}$ we introduce a time cutoff ${\eta_k}$ supported on ${[-T-2^{-k}, T+2^{-k}]}$ that equals ${1}$ on ${[-T-2^{-k-1},T+2^{-k-1}]}$ and obeys the usual derivative estimates in between (the ${j^{th}}$ time derivative of size ${O_j(2^{jk})}$ for each ${j}$). Later we will prove the truncation estimate

$\displaystyle \| \eta_k \tilde \phi_k \|_{S[k]({\bf R}^{1+n})} \lesssim \| \tilde \phi_k \|_{S[k]({\bf R}^{1+n})}. \ \ \ \ \ (7)$

Assuming this estimate, if we set ${\tilde \phi := \sum_k \eta_k \tilde \phi_k}$, then using Lemma 9 in my paper and (6), (7) (and the local stability of frequency envelopes) we have the required property (5). (There is a technical issue arising from the fact that ${\tilde \phi}$ is not necessarily Schwartz due to slow decay at temporal infinity, but by considering partial sums in the ${k}$ summation and taking limits we can check that ${\tilde \phi}$ is the strong limit of Schwartz functions, which suffices here; we omit the details for the sake of exposition.) So the only issue is to establish (4), that is to say that

$\displaystyle \| \sum_k \eta_k(t) \tilde \phi_k(t) \|_{L^\infty_x({\bf R}^n)} \lesssim 1$

for all ${t \in {\bf R}}$.

For ${t \in [-T,T]}$ this is immediate from (2). Now suppose that ${t \in [T+2^{-k_0-1}, T+2^{-k_0}]}$ for some integer ${k_0}$ (the case when ${t \in [-T-2^{-k_0}, -T-2^{-k_0-1}]}$ is treated similarly). Then ${\eta_k(t) = 1}$ for ${k < k_0}$ and ${\eta_k(t) = 0}$ for ${k > k_0}$, so we can split

$\displaystyle \sum_k \eta_k(t) \tilde \phi_k(t) = \Phi_1 + \Phi_2 + \Phi_3$

where

$\displaystyle \Phi_1 := \sum_{k < k_0} \tilde \phi_k(T)$

$\displaystyle \Phi_2 := \sum_{k < k_0} \left( \tilde \phi_k(t) - \tilde \phi_k(T) \right)$

$\displaystyle \Phi_3 := \eta_{k_0}(t) \tilde \phi_{k_0}(t).$

The contribution of the ${\Phi_3}$ term is acceptable by (6) and estimate (82) from my paper. The term ${\Phi_1}$ sums to ${P_{<k_0} \phi(T)}$, which is acceptable by (2). So it remains to control the ${L^\infty_x}$ norm of ${\Phi_2}$. By the triangle inequality and the fundamental theorem of calculus, we can bound

$\displaystyle \| \Phi_2 \|_{L^\infty_x} \leq (t-T) \sum_{k < k_0} \| \partial_t \tilde \phi_k \|_{L^\infty_t L^\infty_x({\bf R}^{1+n})}.$

By hypothesis, ${t-T \leq 2^{-k_0}}$. Using the first term in (79) of my paper and Bernstein’s inequality followed by (6) we have

$\displaystyle \| \partial_t \tilde \phi_k \|_{L^\infty_t L^\infty_x({\bf R}^{1+n})} \lesssim 2^k \| \tilde \phi_k \|_{S[k]({\bf R}^{1+n})} \lesssim 2^k;$

and then we are done by summing the geometric series in ${k}$.
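For the record, here is the summation written out. This is only a routine check: it takes the bound ${t - T \leq 2^{-k_0}}$ and the derivative estimate above as given, and uses nothing about ${k}$ beyond the fact that it ranges over integers below ${k_0}$:

$\displaystyle \| \Phi_2 \|_{L^\infty_x} \lesssim 2^{-k_0} \sum_{k < k_0} 2^k \leq 2^{-k_0} \cdot 2^{k_0} = 1.$

The point is that the factor ${2^{k-k_0}}$ gained from the short time interval supplies the decay in ${k}$ that the frequency envelope coefficients ${c_k}$ (which are bounded but not necessarily summable) could not provide on their own.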

It remains to prove the truncation estimate (7). This estimate is similar in spirit to the algebra estimates already in my paper, but unfortunately does not seem to follow immediately from these estimates as written, and so one has to repeat the somewhat lengthy decompositions and case checkings used to prove these estimates. We do this below the fold.

I’ve just posted to the arXiv my paper “Finite time blowup for Lagrangian modifications of the three-dimensional Euler equation“. This paper is loosely in the spirit of other recent papers of mine in which I explore how close one can get to supercritical PDE of physical interest (such as the Euler and Navier-Stokes equations), while still being able to rigorously demonstrate finite time blowup for at least some choices of initial data. Here, the PDE we are trying to get close to is the incompressible inviscid Euler equations

$\displaystyle \partial_t u + (u \cdot \nabla) u = - \nabla p$

$\displaystyle \nabla \cdot u = 0$

in three spatial dimensions, where ${u}$ is the velocity vector field and ${p}$ is the pressure field. In vorticity form, and viewing the vorticity ${\omega}$ as a ${2}$-form (rather than a vector), we can rewrite this system using the language of differential geometry as

$\displaystyle \partial_t \omega + {\mathcal L}_u \omega = 0$

$\displaystyle u = \delta \tilde \eta^{-1} \Delta^{-1} \omega$

where ${{\mathcal L}_u}$ is the Lie derivative along ${u}$, ${\delta}$ is the codifferential (the adjoint of the differential ${d}$, or equivalently the negative of the divergence operator) that sends ${k+1}$-vector fields to ${k}$-vector fields, ${\Delta}$ is the Hodge Laplacian, and ${\tilde \eta}$ is the identification of ${k}$-vector fields with ${k}$-forms induced by the Euclidean metric ${\tilde \eta}$. The equation ${u = \delta \tilde \eta^{-1} \Delta^{-1} \omega}$ can be viewed as the Biot-Savart law recovering velocity from vorticity, expressed in the language of differential geometry.
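To make the Biot-Savart law a little more concrete, here is a minimal numerical sketch. It makes several simplifying assumptions that are not in the paper: it works on the periodic box ${[0,2\pi)^3}$ rather than ${{\bf R}^3}$, it uses classical vector calculus (vorticity as a vector field, ${\omega = \nabla \times u}$) rather than differential forms, and it inverts the curl on mean-zero divergence-free fields via the Fourier multiplier ${\hat u(k) = i k \times \hat \omega(k) / |k|^2}$. The function name `biot_savart_periodic` is purely illustrative.

```python
import numpy as np

def biot_savart_periodic(omega):
    """Recover a mean-zero, divergence-free velocity u with curl u = omega.

    omega has shape (3, n, n, n) and is sampled on a uniform grid of the
    periodic box [0, 2*pi)^3 (an assumption made for this sketch).
    """
    n = omega.shape[1]
    # Integer wavenumbers 0, 1, ..., -1 in numpy's FFT ordering.
    freqs = np.fft.fftfreq(n, d=1.0 / n)
    kx, ky, kz = np.meshgrid(freqs, freqs, freqs, indexing="ij")
    k = np.stack([kx, ky, kz])
    k_sq = kx**2 + ky**2 + kz**2
    k_sq[0, 0, 0] = 1.0  # avoid dividing by zero; the mean mode is zeroed below

    omega_hat = np.fft.fftn(omega, axes=(1, 2, 3))
    # Fourier-side Biot-Savart law: u_hat = i k x omega_hat / |k|^2.
    u_hat = 1j * np.cross(k, omega_hat, axis=0) / k_sq
    u_hat[:, 0, 0, 0] = 0.0
    return np.real(np.fft.ifftn(u_hat, axes=(1, 2, 3)))

if __name__ == "__main__":
    # An ABC (Beltrami) flow satisfies curl u = u, so feeding it in as the
    # vorticity should return the same field.
    n = 16
    grid = 2 * np.pi * np.arange(n) / n
    X, Y, Z = np.meshgrid(grid, grid, grid, indexing="ij")
    u = np.stack([np.sin(Z) + np.cos(Y), np.sin(X) + np.cos(Z), np.sin(Y) + np.cos(X)])
    print(np.max(np.abs(biot_savart_periodic(u) - u)))
```

Replacing the multiplier ${1/|k|^2}$ in this sketch by some other symbol is one way to experiment with the effect of changing the operator that recovers velocity from vorticity.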

One can then generalise this system by replacing the operator ${\tilde \eta^{-1} \Delta^{-1}}$ by a more general operator ${A}$ from ${2}$-forms to ${2}$-vector fields, giving rise to what I call the generalised Euler equations

$\displaystyle \partial_t \omega + {\mathcal L}_u \omega = 0$

$\displaystyle u = \delta A \omega.$

For example, the surface quasi-geostrophic (SQG) equations can be written in this form, as discussed in this previous post. One can view ${A \omega}$ (up to Hodge duality) as a vector potential for the velocity ${u}$, so it is natural to refer to ${A}$ as a vector potential operator.

The generalised Euler equations carry much of the same geometric structure as the true Euler equations. For instance, the transport equation ${\partial_t \omega + {\mathcal L}_u \omega = 0}$ is equivalent to the Kelvin circulation theorem, which in three dimensions also implies the transport of vortex streamlines and the conservation of helicity. If ${A}$ is self-adjoint and positive definite, then the famous Euler-Poincaré interpretation of the true Euler equations as geodesic flow on an infinite dimensional Riemannian manifold of volume preserving diffeomorphisms (as discussed in this previous post) extends to the generalised Euler equations (with the operator ${A}$ determining the new Riemannian metric to place on this manifold). In particular, the generalised Euler equations have a Lagrangian formulation, and so by Noether’s theorem we expect any continuous symmetry of the Lagrangian to lead to conserved quantities. Indeed, we have a conserved Hamiltonian ${\frac{1}{2} \int \langle \omega, A \omega \rangle}$, and any spatial symmetry of ${A}$ leads to a conserved impulse (e.g. translation invariance leads to a conserved momentum, and rotation invariance leads to a conserved angular momentum). If ${A}$ behaves like a pseudodifferential operator of order ${-2}$ (as is the case with the true vector potential operator ${\tilde \eta^{-1} \Delta^{-1}}$), then it turns out that one can use energy methods to recover the same sort of classical local existence theory as for the true Euler equations (up to and including the famous Beale-Kato-Majda criterion for blowup).

The true Euler equations are suspected of admitting smooth localised solutions which blow up in finite time; there is now substantial numerical evidence for this blowup, but it has not been proven rigorously. The main purpose of this paper is to show that such finite time blowup can at least be established for certain generalised Euler equations that are somewhat close to the true Euler equations. This is similar in spirit to my previous paper on finite time blowup on averaged Navier-Stokes equations, with the main new feature here being that the modified equation continues to have a Lagrangian structure and a vorticity formulation, which was not the case with the averaged Navier-Stokes equation. On the other hand, the arguments here are not able to handle the presence of viscosity (basically because they rely crucially on the Kelvin circulation theorem, which is not available in the viscous case).

In fact, three different blowup constructions are presented (for three different choices of vector potential operator ${A}$). The first is a variant of one discussed previously on this blog, in which a “neck pinch” singularity for a vortex tube is created by using a non-self-adjoint vector potential operator. Here the velocity at the neck of the vortex tube is determined by the circulation of the vorticity somewhat further away from that neck, which when combined with conservation of circulation is enough to guarantee finite time blowup. This is a relatively easy construction of finite time blowup, and has the advantage of being rather stable (any initial data flowing through a narrow tube with a large positive circulation will blow up in finite time). On the other hand, it is not so surprising in the non-self-adjoint case that finite time blowup can occur, as there is no conserved energy.

The second blowup construction is based on a connection between the two-dimensional SQG equation and the three-dimensional generalised Euler equations, discussed in this previous post. Namely, any solution to the former can be lifted to a “two and a half-dimensional” solution to the latter, in which the velocity and vorticity are translation-invariant in the vertical direction (but the velocity is still allowed to contain vertical components, so the flow is not completely horizontal). The same embedding also works to lift solutions to generalised SQG equations in two dimensions to solutions to generalised Euler equations in three dimensions. Conveniently, even if the vector potential operator for the generalised SQG equation fails to be self-adjoint, one can ensure that the three-dimensional vector potential operator is self-adjoint. Using this trick, together with a two-dimensional version of the first blowup construction, one can then construct a generalised Euler equation in three dimensions with a vector potential that is both self-adjoint and positive definite, and still admits solutions that blow up in finite time, though the blowup is now a vortex sheet creasing on a line, rather than a vortex tube pinching at a point.

This eliminates the main defect of the first blowup construction, but introduces two others. Firstly, the blowup is less stable, as it relies crucially on the initial data being translation-invariant in the vertical direction. Secondly, the solution is not spatially localised in the vertical direction (though it can be viewed as a compactly supported solution on the manifold ${{\bf R}^2 \times {\bf R}/{\bf Z}}$, rather than ${{\bf R}^3}$). The third and final blowup construction of the paper addresses the final defect, by replacing vertical translation symmetry with axial rotation symmetry around the vertical axis (basically, replacing Cartesian coordinates with cylindrical coordinates). It turns out that there is a more complicated way to embed two-dimensional generalised SQG equations into three-dimensional generalised Euler equations in which the solutions to the latter are now axially symmetric (but are allowed to “swirl” in the sense that the velocity field can have a non-zero angular component), while still keeping the vector potential operator self-adjoint and positive definite; the blowup is now that of a vortex ring creasing on a circle.

As with the previous papers in this series, these blowup constructions do not directly imply finite time blowup for the true Euler equations, but they do at least provide a barrier to establishing global regularity for these latter equations, in that one is forced to use some property of the true Euler equations that is not shared by these generalisations. They also suggest some possible blowup mechanisms for the true Euler equations (although unfortunately these mechanisms do not seem compatible with the addition of viscosity, so they do not seem to suggest a viable Navier-Stokes blowup mechanism).

I’ve just uploaded to the arXiv my paper “Equivalence of the logarithmically averaged Chowla and Sarnak conjectures“, submitted to the Festschrift “Number Theory – Diophantine problems, uniform distribution and applications” in honour of Robert F. Tichy. This paper is a spinoff of my previous paper establishing a logarithmically averaged version of the Chowla (and Elliott) conjectures in the two-point case. In that paper, the estimate

$\displaystyle \sum_{n \leq x} \frac{\lambda(n) \lambda(n+h)}{n} = o( \log x )$

as ${x \rightarrow \infty}$ was demonstrated, where ${h}$ was any positive integer and ${\lambda}$ denoted the Liouville function. The proof proceeded using a method I call the “entropy decrement argument”, which ultimately reduced matters to establishing a bound of the form

$\displaystyle \sum_{n \leq x} \frac{|\sum_{h \leq H} \lambda(n+h) e( \alpha h)|}{n} = o( H \log x )$

whenever ${H}$ was a slowly growing function of ${x}$. This was in turn established in a previous paper of Matomaki, Radziwill, and myself, using the recent breakthrough of Matomaki and Radziwill.
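The logarithmically averaged two-point sum above is easy to experiment with numerically. The following is a small sketch, not taken from the paper: it computes the Liouville function with a smallest-prime-factor sieve and reports the two-point sum divided by ${\log x}$. The function names are illustrative, and of course a finite computation only hints at the ${o(\log x)}$ behaviour rather than proving anything.

```python
import math

def liouville_sieve(limit):
    """Return lam with lam[n] = lambda(n) for 1 <= n <= limit (lam[0] is unused)."""
    # Smallest-prime-factor sieve.
    spf = list(range(limit + 1))
    for p in range(2, math.isqrt(limit) + 1):
        if spf[p] == p:  # p is prime
            for m in range(p * p, limit + 1, p):
                if spf[m] == m:
                    spf[m] = p
    lam = [0] * (limit + 1)
    lam[1] = 1
    for n in range(2, limit + 1):
        # Stripping one prime factor from n flips the sign of lambda.
        lam[n] = -lam[n // spf[n]]
    return lam

def log_averaged_two_point(x, h):
    """Compute (1 / log x) * sum_{n <= x} lambda(n) lambda(n + h) / n."""
    lam = liouville_sieve(x + h)
    total = sum(lam[n] * lam[n + h] / n for n in range(1, x + 1))
    return total / math.log(x)

if __name__ == "__main__":
    for x in (10**3, 10**4, 10**5):
        print(x, log_averaged_two_point(x, 1))
```

Since the weights ${1/n}$ favour small ${n}$, the first few terms dominate at these scales, which is one reason the decay of the normalised sum is slow to see numerically.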

It is natural to see to what extent the arguments can be adapted to attack the higher-point cases of the logarithmically averaged Chowla conjecture (ignoring for this post the more general Elliott conjecture for other bounded multiplicative functions than the Liouville function). That is to say, one would like to prove that

$\displaystyle \sum_{n \leq x} \frac{\lambda(n+h_1) \dots \lambda(n+h_k)}{n} = o( \log x )$

as ${x \rightarrow \infty}$ for any fixed distinct integers ${h_1,\dots,h_k}$. As it turns out (and as is detailed in the current paper), the entropy decrement argument extends to this setting (after using some known facts about linear equations in primes), and allows one to reduce the above estimate to an estimate of the form

$\displaystyle \sum_{n \leq x} \frac{1}{n} \| \lambda \|_{U^d[n, n+H]} = o( \log x )$

for ${H}$ a slowly growing function of ${x}$ and some fixed ${d}$ (in fact we can take ${d=k-1}$ for ${k \geq 3}$), where ${U^d}$ is the (normalised) local Gowers uniformity norm. (In the case ${k=3}$, ${d=2}$, this becomes the Fourier-uniformity conjecture discussed in this previous post.) If one then applies the (now proven) inverse conjecture for the Gowers norms, this estimate is in turn equivalent to the more complicated looking assertion

$\displaystyle \sum_{n \leq x} \frac{1}{n} \sup |\sum_{h \leq H} \lambda(n+h) F( g^h x )| = o( H \log x ) \ \ \ \ \ (1)$

where the supremum is over all possible choices of nilsequences ${h \mapsto F(g^h x)}$ of controlled step and complexity (see the paper for definitions of these terms).
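The simplest nilsequences are the linear phases ${h \mapsto e(\alpha h)}$, and for these the inner supremum can be approximated numerically with a fast Fourier transform. The sketch below is an illustration under stated assumptions rather than anything from the paper: it only treats linear phases, it replaces the supremum over ${\alpha}$ by a maximum over a finite grid (so it gives a lower bound for the true supremum), and it normalises by ${H \log x}$ in the same way as the short exponential sum bound quoted earlier. The names `local_fourier_average` and `oversample` are hypothetical.

```python
import math
import numpy as np

def liouville_array(limit):
    """Return an integer array lam with lam[n] = lambda(n) for 1 <= n <= limit."""
    spf = list(range(limit + 1))  # smallest prime factor of each integer
    for p in range(2, math.isqrt(limit) + 1):
        if spf[p] == p:
            for m in range(p * p, limit + 1, p):
                if spf[m] == m:
                    spf[m] = p
    lam = [0] * (limit + 1)
    lam[1] = 1
    for n in range(2, limit + 1):
        lam[n] = -lam[n // spf[n]]
    return np.array(lam)

def local_fourier_average(x, H, oversample=8):
    """Approximate (1 / (H log x)) * sum_{n <= x} (1/n) max_alpha |sum_{h <= H} lambda(n+h) e(alpha h)|.

    The maximum is taken over the grid alpha = j / (oversample * H) only.
    """
    lam = liouville_array(x + H)
    grid_size = oversample * H
    total = 0.0
    for n in range(1, x + 1):
        window = lam[n + 1 : n + H + 1]  # lambda(n+1), ..., lambda(n+H)
        # Zero-padded FFT samples the exponential sum on the grid of alpha;
        # shifting the index h does not change absolute values.
        spectrum = np.fft.fft(window, n=grid_size)
        total += np.abs(spectrum).max() / n
    return total / (H * math.log(x))

if __name__ == "__main__":
    for H in (8, 16, 32):
        print(H, local_fourier_average(2000, H))
```

For a random sign pattern one would expect the inner maximum to be of size roughly ${\sqrt{H \log H}}$, so the printed ratios should drift downwards as ${H}$ grows if ${\lambda}$ behaves pseudorandomly on short intervals.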

The main novelty in the paper (elaborating upon a previous comment I had made on this blog) is to observe that this latter estimate in turn follows from the logarithmically averaged form of Sarnak’s conjecture (discussed in this previous post), namely that

$\displaystyle \sum_{n \leq x} \frac{1}{n} \lambda(n) F( T^n x )= o( \log x )$

whenever ${n \mapsto F(T^n x)}$ is a zero entropy (i.e. deterministic) sequence. Morally speaking, this follows from the well-known fact that nilsequences have zero entropy, but the presence of the supremum in (1) means that we need a little bit more; roughly speaking, we need the class of nilsequences of a given step and complexity to have “uniformly zero entropy” in some sense.
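As a toy instance of the logarithmically averaged Sarnak sum, one can take the dynamical system to be an irrational rotation of the circle, which has zero entropy. The following sketch is an illustration only: the choices ${T: y \mapsto y + \alpha \hbox{ mod } 1}$, ${F(y) = \cos(2\pi y)}$ and the starting point `start` are assumptions made for the example, and the function name is hypothetical. It prints the sum divided by ${\log x}$.

```python
import math

def liouville_sieve(limit):
    """Return lam with lam[n] = lambda(n) for 1 <= n <= limit (lam[0] is unused)."""
    spf = list(range(limit + 1))  # smallest prime factor of each integer
    for p in range(2, math.isqrt(limit) + 1):
        if spf[p] == p:
            for m in range(p * p, limit + 1, p):
                if spf[m] == m:
                    spf[m] = p
    lam = [0] * (limit + 1)
    lam[1] = 1
    for n in range(2, limit + 1):
        lam[n] = -lam[n // spf[n]]
    return lam

def log_averaged_rotation_sum(x, alpha, start=0.0):
    """Compute (1 / log x) * sum_{n <= x} lambda(n) cos(2 pi (start + n alpha)) / n."""
    lam = liouville_sieve(x)
    total = sum(
        lam[n] * math.cos(2 * math.pi * (start + n * alpha)) / n
        for n in range(1, x + 1)
    )
    return total / math.log(x)

if __name__ == "__main__":
    for x in (10**3, 10**4, 10**5):
        print(x, log_averaged_rotation_sum(x, math.sqrt(2)))
```

Rotations are a case where much more than the logarithmically averaged statement is classically known, so this is only meant to make the shape of the sum concrete.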

On the other hand, it was already known (see previous post) that the Chowla conjecture implied the Sarnak conjecture, and similarly for the logarithmically averaged form of the two conjectures. Putting all these implications together, we obtain the pleasant fact that the logarithmically averaged Sarnak and Chowla conjectures are equivalent, which is the main result of the current paper. There have been a large number of special cases of the Sarnak conjecture worked out (when the deterministic sequence involved came from a special dynamical system), so these results can now also be viewed as partial progress towards the Chowla conjecture (at least with logarithmic averaging). However, my feeling is that the full resolution of these conjectures will not come from these sorts of special cases; instead, conjectures like the Fourier-uniformity conjecture in this previous post look more promising to attack.

It would also be nice to get rid of the pesky logarithmic averaging, but this seems to be an inherent requirement of the entropy decrement argument method, so one would probably have to find a way to avoid that argument if one were to remove the log averaging.