You are currently browsing the monthly archive for December 2017.

Kaisa Matomaki, Maksym Radziwill, and I have uploaded to the arXiv our paper “Correlations of the von Mangoldt and higher divisor functions II. Divisor correlations in short ranges”. This is a sequel of sorts to our previous paper on divisor correlations, though the proof techniques in this paper are rather different. As with the previous paper, our interest is in correlations such as

$\displaystyle \sum_{n \leq X} d_k(n) d_l(n+h) \ \ \ \ \ (1)$

for medium-sized ${h}$ and large ${X}$, where ${k \geq l \geq 1}$ are natural numbers and ${d_k(n) = \sum_{n = m_1 \dots m_k} 1}$ is the ${k^{th}}$ divisor function (actually our methods can also treat a generalisation in which ${k}$ is non-integer, but for simplicity let us stick with the integer case for this discussion). Our methods also allow for one of the divisor function factors to be replaced with a von Mangoldt function, but (in contrast to the previous paper) we cannot treat the case when both factors are von Mangoldt.
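For readers who want to experiment with (1) numerically, here is a minimal brute-force sketch in Python. It builds ${d_k}$ by repeated Dirichlet convolution with the constant function ${1}$ and then sums the correlation directly. The function names `divisor_function` and `divisor_correlation` are illustrative choices, not anything from the paper, and the sketch only handles shifts ${h \geq 0}$ and small ${X}$.

```python
def divisor_function(k, limit):
    """Return a list d with d[n] = d_k(n) for 1 <= n <= limit (d[0] is unused)."""
    # d_1 is identically 1; d_{j+1}(n) is the sum of d_j(m) over divisors m of n.
    d = [0] + [1] * limit
    for _ in range(k - 1):
        nxt = [0] * (limit + 1)
        for m in range(1, limit + 1):
            # add d_j(m) to every multiple n of m
            for n in range(m, limit + 1, m):
                nxt[n] += d[m]
        d = nxt
    return d


def divisor_correlation(k, l, h, X):
    """Brute-force value of sum_{n <= X} d_k(n) d_l(n+h), assuming h >= 0."""
    dk = divisor_function(k, X)
    dl = divisor_function(l, X + h)
    return sum(dk[n] * dl[n + h] for n in range(1, X + 1))
```

For instance, `divisor_correlation(3, 2, 1, 10**4)` evaluates the sum (1) with ${k=3}$, ${l=2}$, ${h=1}$ and ${X = 10^4}$; this is only practical for modest ${X}$.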

As discussed in this previous post, one heuristically expects an asymptotic of the form

$\displaystyle \sum_{n \leq X} d_k(n) d_l(n+h) = P_{k,l,h}( \log X ) X + O( X^{1/2+\varepsilon})$

for any fixed ${\varepsilon>0}$, where ${P_{k,l,h}}$ is a certain explicit (but rather complicated) polynomial of degree ${k+l-2}$. Such asymptotics are known when ${l \leq 2}$, but remain open for ${k \geq l \geq 3}$. In the previous paper, we were able to obtain a weaker bound of the form

$\displaystyle \sum_{n \leq X} d_k(n) d_l(n+h) = P_{k,l,h}( \log X ) X + O_A( X \log^{-A} X)$

for ${1-O_A(\log^{-A} X)}$ of the shifts ${-H \leq h \leq H}$, whenever the shift range ${H}$ lies between ${X^{8/33+\varepsilon}}$ and ${X^{1-\varepsilon}}$. But the methods become increasingly hard to use as ${H}$ gets smaller. In this paper, we use a rather different method to obtain the even weaker bound

$\displaystyle \sum_{n \leq X} d_k(n) d_l(n+h) = (1+o(1)) P_{k,l,h}( \log X ) X$

for ${1-o(1)}$ of the shifts ${-H \leq h \leq H}$, where ${H}$ can now be as short as ${H = \log^{10^4 k \log k} X}$. The constant ${10^4}$ can be improved, but there are serious obstacles to using our method to go below ${\log^{k \log k} X}$ (as the exceptionally large values of ${d_k}$ then begin to dominate). This can be viewed as an analogue to our previous paper on correlations of bounded multiplicative functions on average, in which the functions ${d_k,d_l}$ are now unbounded, and indeed our proof strategy is based in large part on that paper (but with many significant new technical complications).

We now discuss some of the ingredients of the proof. Unsurprisingly, the first step is the circle method, expressing (1) in terms of exponential sums such as

$\displaystyle S(\alpha) := \sum_{n \leq X} d_k(n) e(\alpha n).$

Actually, it is convenient to first prune ${d_k}$ slightly by zeroing out this function on “atypical” numbers ${n}$ that have an unusually small or large number of factors in a certain sense, but let us ignore this technicality for this discussion. The contribution of ${S(\alpha)}$ for “major arc” ${\alpha}$ can be treated by standard techniques (and is the source of the main term ${P_{k,l,h}(\log X) X}$); the main difficulty comes from treating the contribution of “minor arc” ${\alpha}$.
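To get a feel for the exponential sum, one can evaluate ${\sum_{n \leq X} d_k(n) e(\alpha n)}$ directly for small ${X}$. The sketch below assumes the usual convention ${e(x) := e^{2\pi i x}}$ (not spelled out above) and ignores the pruning of atypical ${n}$; the names `divisor_table` and `exponential_sum` are hypothetical.

```python
import cmath


def divisor_table(k, X):
    """Return a list d with d[n] = d_k(n) for 1 <= n <= X (d[0] is unused)."""
    d = [0] + [1] * X
    for _ in range(k - 1):
        nxt = [0] * (X + 1)
        for m in range(1, X + 1):
            for n in range(m, X + 1, m):
                nxt[n] += d[m]
        d = nxt
    return d


def exponential_sum(k, X, alpha):
    """S(alpha) = sum_{n <= X} d_k(n) e(alpha n), with e(x) = exp(2 pi i x)."""
    d = divisor_table(k, X)
    return sum(d[n] * cmath.exp(2j * cmath.pi * alpha * n) for n in range(1, X + 1))
```

Comparing `abs(exponential_sum(3, 2000, a))` for ${\alpha}$ near a rational with small denominator against a "generic" ${\alpha}$ gives a rough picture of the major arc and minor arc behaviour, though at this scale it is only suggestive.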

In our previous paper on bounded multiplicative functions, we used Plancherel’s theorem to estimate the global ${L^2}$ norm ${\int_{{\bf R}/{\bf Z}} |S(\alpha)|^2\ d\alpha}$, and then also used the Katai-Bourgain-Sarnak-Ziegler orthogonality criterion to control local ${L^2}$ norms ${\int_I |S(\alpha)|^2\ d\alpha}$, where ${I}$ was a minor arc interval of length about ${1/H}$, and these two estimates together were sufficient to get a good bound on correlations by an application of Hölder’s inequality. For ${d_k}$, it is more convenient to use Dirichlet series methods (and Ramaré-type factorisations of such Dirichlet series) to control local ${L^2}$ norms on minor arcs, in the spirit of the proof of the Matomaki-Radziwill theorem; a key point is to develop “log-free” mean value theorems for Dirichlet series associated to functions such as ${d_k}$, so as not to wipe out the (rather small) savings one will get over the trivial bound from this method. On the other hand, the global ${L^2}$ bound will definitely be unusable, because the ${\ell^2}$ sum ${\sum_{n \leq X} d_k(n)^2}$ has too many unwanted factors of ${\log X}$. Fortunately, we can substitute this global ${L^2}$ bound with a “large values” bound that controls expressions such as

$\displaystyle \sum_{i=1}^J \int_{I_i} |S(\alpha)|^2\ d\alpha$

for a moderate number of disjoint intervals ${I_1,\dots,I_J}$, with a bound that is slightly better (for ${J}$ a medium-sized power of ${\log X}$) than what one would have obtained by bounding each integral ${\int_{I_i} |S(\alpha)|^2\ d\alpha}$ separately. (One needs to save more than ${J^{1/2}}$ for the argument to work; we end up saving a factor of about ${J^{3/4}}$.) This large values estimate is probably the most novel contribution of the paper. After taking the Fourier transform, matters basically reduce to getting a good estimate for

$\displaystyle \sum_{i=1}^J (\int_X^{2X} |\sum_{x \leq n \leq x+H} d_k(n) e(\alpha_i n)|^2\ dx)^{1/2},$

where ${\alpha_i}$ is the midpoint of ${I_i}$; thus we need some upper bound on the large local Fourier coefficients of ${d_k}$. These coefficients are difficult to calculate directly, but, in the spirit of a paper of Ben Green and myself, we can try to replace ${d_k}$ by a more tractable and “pseudorandom” majorant ${\tilde d_k}$ for which the local Fourier coefficients are computable (on average). After a standard duality argument, one ends up having to control expressions such as

$\displaystyle |\sum_{x \leq n \leq x+H} \tilde d_k(n) e((\alpha_i -\alpha_{i'}) n)|$

after various averaging in the ${x, i,i'}$ parameters. These local Fourier coefficients of ${\tilde d_k}$ turn out to be small on average unless ${\alpha_i -\alpha_{i'}}$ is “major arc”. One then is left with a mostly combinatorial problem of trying to bound how often this major arc scenario occurs. This is very close to a computation in the previously mentioned paper of Ben and myself; there is a technical wrinkle in that the ${\alpha_i}$ are not as well separated as they were in my paper with Ben, but it turns out that one can modify the arguments in that paper to still obtain a satisfactory estimate in this case (after first grouping nearby frequencies ${\alpha_i}$ together, and modifying the duality argument accordingly).

In the tradition of “Polymath projects”, the problem posed in the previous two blog posts has now been solved, thanks to the cumulative effect of many small contributions by many participants (including, but not limited to, Sean Eberhard, Tobias Fritz, Siddharta Gadgil, Tobias Hartnick, Chris Jerdonek, Apoorva Khare, Antonio Machiavelo, Pace Nielsen, Andy Putman, Will Sawin, Alexander Shamov, Lior Silberman, and David Speyer). In this post I’ll write down a streamlined resolution, eliding a number of important but ultimately removable partial steps and insights made by the above contributors en route to the solution.

Theorem 1 Let ${G = (G,\cdot)}$ be a group. Suppose one has a “seminorm” function ${\| \|: G \rightarrow [0,+\infty)}$ which obeys the triangle inequality

$\displaystyle \|xy \| \leq \|x\| + \|y\|$

for all ${x,y \in G}$, with equality whenever ${x=y}$. Then the seminorm factors through the abelianisation map ${G \rightarrow G/[G,G]}$.

Proof: By the triangle inequality, it suffices to show that ${\| [x,y]\| = 0}$ for all ${x,y \in G}$, where ${[x,y] := xyx^{-1}y^{-1}}$ is the commutator.

We first establish some basic facts. Firstly, by hypothesis we have ${\|x^2\| = 2 \|x\|}$, and hence ${\|x^n \| = n \|x\|}$ whenever ${n}$ is a power of two. On the other hand, by the triangle inequality we have ${\|x^n \| \leq n\|x\|}$ for all positive ${n}$, and hence by the triangle inequality again (applied to ${x^{2^m} = x^n x^{2^m-n}}$ for a power of two ${2^m \geq n}$) we also have the matching lower bound, thus

$\displaystyle \|x^n \| = n \|x\|$

for all ${n > 0}$. The claim is also true for ${n=0}$ (apply the preceding bound with ${x=1}$ and ${n=2}$). By replacing ${\|x\|}$ with ${\max(\|x\|, \|x^{-1}\|)}$ if necessary we may now also assume without loss of generality that ${\|x^{-1} \| = \|x\|}$, thus

$\displaystyle \|x^n \| = |n| \|x\| \ \ \ \ \ (1)$

for all integers ${n}$.

Next, for any ${x,y \in G}$, and any natural number ${n}$, we have

$\displaystyle \|yxy^{-1} \| = \frac{1}{n} \| (yxy^{-1})^n \|$

$\displaystyle = \frac{1}{n} \| y x^n y^{-1} \|$

$\displaystyle \leq \frac{1}{n} ( \|y\| + n \|x\| + \|y^{-1}\| )$

so on taking limits as ${n \rightarrow \infty}$ we have ${\|yxy^{-1} \| \leq \|x\|}$. Replacing ${x,y}$ by ${yxy^{-1},y^{-1}}$ gives the matching lower bound, thus we have the conjugation invariance

$\displaystyle \|yxy^{-1} \| = \|x\|. \ \ \ \ \ (2)$

Next, we observe that if ${x,y,z,w}$ are such that ${x}$ is conjugate to both ${wy}$ and ${zw^{-1}}$, then one has the inequality

$\displaystyle \|x\| \leq \frac{1}{2} ( \|y \| + \| z \| ). \ \ \ \ \ (3)$

Indeed, if we write ${x = swys^{-1} = t zw^{-1} t^{-1}}$ for some ${s,t \in G}$, then for any natural number ${n}$ one has

$\displaystyle \|x\| = \frac{1}{2n} \| x^n x^n \|$

$\displaystyle = \frac{1}{2n} \| swy \dots wy s^{-1}t zw^{-1} \dots zw^{-1} t^{-1} \|$

where the ${wy}$ and ${zw^{-1}}$ terms each appear ${n}$ times. From (2) we see that conjugation by ${w}$ does not affect the norm. Using this and the triangle inequality several times, we conclude that

$\displaystyle \|x\| \leq \frac{1}{2n} ( \|s\| + n \|y\| + \| s^{-1} t\| + n \|z\| + \|t^{-1} \| ),$

and the claim (3) follows by sending ${n \rightarrow \infty}$.

The following special case of (3) will be of particular interest. Let ${x,y \in G}$, and for any integers ${m,k}$, define the quantity

$\displaystyle f(m,k) := \| x^m [x,y]^k \|.$

Observe that ${x^m [x,y]^k}$ is conjugate to both ${x (x^{m-1} [x,y]^k)}$ and to ${(y^{-1} x^m [x,y]^{k-1} xy) x^{-1}}$, hence by (3) one has

$\displaystyle \| x^m [x,y]^k \| \leq \frac{1}{2} ( \| x^{m-1} [x,y]^k \| + \| y^{-1} x^{m} [x,y]^{k-1} xy \|)$

which by (2) leads to the recursive inequality

$\displaystyle f(m,k) \leq \frac{1}{2} (f(m-1,k) + f(m+1,k-1)).$

We can write this in probabilistic notation as

$\displaystyle f(m,k) \leq {\bf E} f( (m,k) + X )$

where ${X}$ is a random vector that takes the values ${(-1,0)}$ and ${(1,-1)}$ with probability ${1/2}$ each. Iterating this, we conclude in particular that for any large natural number ${n}$, one has

$\displaystyle f(0,n) \leq {\bf E} f( Z )$

where ${Z := (0,n) + X_1 + \dots + X_{2n}}$ and ${X_1,\dots,X_{2n}}$ are iid copies of ${X}$. We can write ${Z = (1,-1/2) (Y_1 + \dots + Y_{2n})}$ where $Y_1,\dots,Y_{2n} = \pm 1$ are iid signs.  By the triangle inequality, we thus have

$\displaystyle f( Z ) \leq |Y_1+\dots+Y_{2n}| (\|x\| + \frac{1}{2} \| [x,y] \|),$

noting that $Y_1+\dots+Y_{2n}$ is an even integer.  On the other hand, $Y_1+\dots+Y_{2n}$ has mean zero and variance $2n$, hence by Cauchy-Schwarz

$\displaystyle f(0,n) \leq \sqrt{2n}( \|x\| + \frac{1}{2} \| [x,y] \|).$

But by (1), the left-hand side is equal to ${n \| [x,y]\|}$. Dividing by ${n}$ and then sending ${n \rightarrow \infty}$, we obtain the claim. $\Box$
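The final step uses two elementary facts about the random walk: the endpoint ${Z}$ equals ${(1,-1/2)(Y_1+\dots+Y_{2n})}$, and ${{\bf E} |Y_1+\dots+Y_{2n}| \leq \sqrt{2n}}$. As a sanity check, here is a small Python sketch that computes the expectation exactly from the binomial distribution and tracks the walk for a given choice of signs. The helper names `expected_abs_sum` and `walk_endpoint` are made up for this illustration.

```python
from fractions import Fraction
from math import comb


def expected_abs_sum(n):
    """Exact value of E|Y_1 + ... + Y_{2n}| for iid uniform signs Y_i = +-1."""
    total = Fraction(0)
    for j in range(2 * n + 1):      # j = number of +1 signs
        s = 2 * j - 2 * n           # value of Y_1 + ... + Y_{2n}
        total += Fraction(comb(2 * n, j), 4 ** n) * abs(s)
    return total


def walk_endpoint(n, signs):
    """Endpoint Z = (0, n) + X_1 + ... + X_{2n}.

    A sign +1 corresponds to the step (1, -1), a sign -1 to the step (-1, 0).
    """
    m, k = 0, n
    for y in signs:
        if y == 1:
            m, k = m + 1, k - 1
        else:
            m = m - 1
    return m, k
```

With signs summing to ${S}$, `walk_endpoint` returns ${(S, -S/2)}$, matching the formula for ${Z}$; and `expected_abs_sum(n) ** 2 <= 2 * n` can be checked exactly for any small ${n}$, which is the Cauchy-Schwarz bound used above.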

The above theorem reduces such seminorms to abelian groups. It is easy to see from (1) that any torsion element of such groups has zero seminorm, so we can in fact restrict to torsion-free groups, which we now write using additive notation ${G = (G,+)}$, thus for instance ${\| nx \| = |n| \|x\|}$ for ${n \in {\bf Z}}$. We think of ${G}$ as a ${{\bf Z}}$-module. One can then extend the seminorm to the associated ${{\bf Q}}$-vector space ${G \otimes_{\bf Z} {\bf Q}}$ by the formula ${\|\frac{a}{b} x\| := \frac{|a|}{|b|} \|x\|}$, and then to the associated ${{\bf R}}$-vector space ${G \otimes_{\bf Z} {\bf R}}$ by continuity, at which point it becomes a genuine seminorm (provided we have ensured the symmetry condition ${\|x\| = \|x^{-1}\|}$, which reads ${\|x\| = \|-x\|}$ in additive notation). Conversely, any seminorm on ${G \otimes_{\bf Z} {\bf R}}$ induces a seminorm on ${G}$. (These arguments also appear in this paper of Khare and Rajaratnam.)

This post is a continuation of the previous post, which has attracted a large number of comments. I’m recording here some calculations that arose from those comments (particularly those of Pace Nielsen, Lior Silberman, Tobias Fritz, and Apoorva Khare). Please feel free to either continue these calculations or to discuss other approaches to the problem, such as those mentioned in the remaining comments to the previous post.

Let ${F_2}$ be the free group on two generators ${a,b}$, and let ${\| \|: F_2 \rightarrow {\bf R}^+}$ be a quantity obeying the triangle inequality

$\displaystyle \| xy\| \leq \|x \| + \|y\|$

and the linear growth property

$\displaystyle \| x^n \| = |n| \| x\|$

for all ${x,y \in F_2}$ and integers ${n \in {\bf Z}}$; this implies the conjugation invariance

$\displaystyle \| y^{-1} x y \| = \|x\|$

or equivalently

$\displaystyle \| xy \| = \| yx\|$

We consider inequalities of the form

$\displaystyle \| xyx^{-1}y^{-1} \| \leq \alpha \|x\| + \beta \| y\| \ \ \ \ \ (1)$

or

$\displaystyle \| xyx^{-2}y^{-1} \| \leq \gamma \|x\| + \delta \| y\| \ \ \ \ \ (2)$

for various real numbers ${\alpha,\beta,\gamma,\delta}$. For instance, since

$\displaystyle \| xyx^{-1}y^{-1} \| \leq \| xyx^{-1}\| + \|y^{-1} \| = \|y\| + \|y\|$

we have (1) for ${(\alpha,\beta) = (0,2)}$. We also have the following further relations:

Proposition 1

• (i) If (1) holds for ${(\alpha,\beta)}$, then it holds for ${(\beta,\alpha)}$.
• (ii) If (1) holds for ${(\alpha,\beta)}$, then (2) holds for ${(\alpha+1, \frac{\beta}{2})}$.
• (iii) If (2) holds for ${(\gamma,\delta)}$, then (1) holds for ${(\frac{2\gamma}{3}, \frac{2\delta}{3})}$.
• (iv) If (1) holds for ${(\alpha,\beta)}$ and (2) holds for ${(\gamma,\delta)}$, then (1) holds for ${(\frac{2\alpha+1+\gamma}{4}, \frac{\delta+\beta}{4})}$.

Proof: For (i) we simply observe that

$\displaystyle \| xyx^{-1} y^{-1} \| = \| (xyx^{-1} y^{-1})^{-1} \| = \| y^{-1} x^{-1} y x \| = \| y x y^{-1} x^{-1} \|.$

For (ii), we calculate

$\displaystyle \| xyx^{-2}y^{-1} \| = \frac{1}{2}\| (xyx^{-2}y^{-1})^2 \|$

$\displaystyle = \frac{1}{2} \| (xyx^{-2}y^{-1} x) (yx^{-2} y^{-1}) \|$

$\displaystyle \leq \frac{1}{2} (\| xyx^{-2}y^{-1} x\| + \|yx^{-2} y^{-1}\|)$

$\displaystyle \leq \frac{1}{2} ( \| x^2 y x^{-2} y^{-1} \| + 2 \|x\| )$

$\displaystyle \leq \frac{1}{2} ( 2 \alpha \|x\| + \beta \|y\| + 2 \|x\|)$

giving the claim.

For (iii), we calculate

$\displaystyle \| xyx^{-1}y^{-1}\| = \frac{1}{3} \| (xyx^{-1}y^{-1})^3 \|$

$\displaystyle = \frac{1}{3} \| (xyx) (x^{-2} y^{-1} xy) (xyx)^{-1} (x^2 y x^{-1} y^{-1}) \|$

$\displaystyle \leq \frac{1}{3} ( \| x^{-2} y^{-1} xy\| + \| x^2 y x^{-1} y^{-1}\| )$

$\displaystyle = \frac{1}{3} ( \| xy x^{-2} y^{-1} \| + \|x^{-1} y^{-1} x^2 y \| )$

$\displaystyle \leq \frac{1}{3} ( \gamma \|x\| + \delta \|y\| + \gamma \|x\| + \delta \|y\|)$

giving the claim.

For (iv), we calculate

$\displaystyle \| xyx^{-1}y^{-1}\| = \frac{1}{4} \| (xyx^{-1}y^{-1})^4 \|$

$\displaystyle = \frac{1}{4} \| (xy) (x^{-1} y^{-1} x) (y x^{-1} y^{-1}) (xyx^{-1}) (xy)^{-1} (x^2yx^{-1}y^{-1}) \|$

$\displaystyle \leq \frac{1}{4} ( \| (x^{-1} y^{-1} x) (y x^{-1} y^{-1}) (xyx^{-1}) \| + \|x^2yx^{-1}y^{-1}\| )$

$\displaystyle \leq \frac{1}{4} ( \|(y x^{-1} y^{-1}) (xy^{-1}x^{-1})(x^{-1} y x) \| + \gamma \|x\| + \delta \|y\|)$

$\displaystyle \leq \frac{1}{4} ( \|x\| + \|(xy^{-1}x^{-1})(x^{-1} y x) \| + \gamma \|x\| + \delta \|y\|)$

$\displaystyle = \frac{1}{4} ( \|x\| + \|x^{-2} y x^2 y^{-1} \|+ \gamma \|x\| + \delta \|y\|)$

$\displaystyle \leq \frac{1}{4} ( \|x\| + 2\alpha \|x\| + \beta \|y\| + \gamma \|x\| + \delta \|y\|)$

giving the claim. $\Box$
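The word identities used in the second line of each of the calculations for (ii), (iii), (iv) are easy to get wrong by hand, so here is a short Python sketch that checks them by free reduction. Words are encoded as strings, with lowercase letters for generators and uppercase letters for their inverses; this encoding and the helper names are assumptions made for the illustration.

```python
def reduce_word(word):
    """Freely reduce a word (lowercase = generator, uppercase = its inverse)."""
    out = []
    for c in word:
        if out and out[-1] == c.swapcase():
            out.pop()           # cancel a letter against its inverse
        else:
            out.append(c)
    return "".join(out)


def inverse(word):
    """Inverse of a word: reverse it and invert every letter."""
    return word[::-1].swapcase()


def power(word, n):
    return reduce_word(word * n)


commutator = "xyXY"   # x y x^{-1} y^{-1}
twisted = "xyXXY"     # x y x^{-2} y^{-1}

# left-hand side and the list of factors on the right-hand side
identities = {
    "(ii)": (power(twisted, 2), ["xyXXYx", "yXXY"]),
    "(iii)": (power(commutator, 3), ["xyx", "XXYxy", inverse("xyx"), "xxyXY"]),
    "(iv)": (power(commutator, 4),
             ["xy", "XYx", "yXY", "xyX", inverse("xy"), "xxyXY"]),
}


def check(name):
    lhs, factors = identities[name]
    return lhs == reduce_word("".join(factors))
```

Calling `check("(iii)")`, say, confirms that the product of the four factors in the proof of (iii) freely reduces to the cube of the commutator.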

Here is a typical application of the above estimates. If (1) holds for ${(\alpha,\beta)}$, then by part (i) it holds for ${(\beta,\alpha)}$, then by (ii) (2) holds for ${(\beta+1,\frac{\alpha}{2})}$, then by (iv) (1) holds for ${(\frac{3\beta+2}{4}, \frac{3\alpha}{8})}$. The map ${(\alpha,\beta) \mapsto (\frac{3\beta+2}{4}, \frac{3\alpha}{8})}$ has fixed point ${(\alpha,\beta) = (\frac{16}{23}, \frac{6}{23})}$, thus

$\displaystyle \| xyx^{-1}y^{-1} \| \leq \frac{16}{23} \|x\| + \frac{6}{23} \|y\|.$

For instance, if ${\|a\|, \|b\| \leq 1}$, then ${\|aba^{-1}b^{-1} \| \leq 22/23 = 0.95652\dots}$.
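The fixed point can be checked mechanically. The following Python sketch iterates the map ${(\alpha,\beta) \mapsto (\frac{3\beta+2}{4}, \frac{3\alpha}{8})}$ in exact rational arithmetic, starting from the trivial bound ${\|xyx^{-1}y^{-1}\| \leq 2\|y\|}$, that is ${(\alpha,\beta) = (0,2)}$. The function names `improve` and `iterate` are illustrative.

```python
from fractions import Fraction


def improve(alpha, beta):
    """One round of (i), (ii), (iv) from Proposition 1 applied to a bound (1)."""
    return (3 * beta + 2) / 4, 3 * alpha / 8


def iterate(alpha, beta, rounds):
    """Apply `improve` repeatedly, keeping exact rational values."""
    for _ in range(rounds):
        alpha, beta = improve(alpha, beta)
    return alpha, beta


FIXED_POINT = (Fraction(16, 23), Fraction(6, 23))
```

Two applications of the map send ${\alpha}$ to ${\frac{9}{32}\alpha + \frac{1}{2}}$, a contraction, so `iterate(Fraction(0), Fraction(2), 60)` lands extremely close to `FIXED_POINT`, while `improve(*FIXED_POINT)` returns `FIXED_POINT` exactly.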

Here is a curious question posed to me by Apoorva Khare that I do not know the answer to. Let ${F_2}$ be the free group on two generators ${a,b}$. Does there exist a metric ${d}$ on this group which is

• bi-invariant, thus ${d(xg,yg)=d(gx,gy) = d(x,y)}$ for all ${x,y,g \in F_2}$; and
• linear growth in the sense that ${d(x^n,1) = n d(x,1)}$ for all ${x \in F_2}$ and all natural numbers ${n}$?

By defining the “norm” of an element ${x \in F_2}$ to be ${\| x\| := d(x,1)}$, an equivalent formulation of the problem asks if there exists a non-negative norm function ${\| \|: F_2 \rightarrow {\bf R}^+}$ that obeys the conjugation invariance

$\displaystyle \| gxg^{-1} \| = \|x \| \ \ \ \ \ (1)$

for all ${x,g \in F_2}$, the triangle inequality

$\displaystyle \| xy \| \leq \| x\| + \| y\| \ \ \ \ \ (2)$

for all ${x,y \in F_2}$, and the linear growth

$\displaystyle \| x^n \| = |n| \|x\| \ \ \ \ \ (3)$

for all ${x \in F_2}$ and ${n \in {\bf Z}}$, and such that ${\|x\| > 0}$ for all non-identity ${x \in F_2}$. Indeed, if such a norm exists then one can just take ${d(x,y) := \| x y^{-1} \|}$ to give the desired metric.

One can normalise the norm of the generators to be at most ${1}$, thus

$\displaystyle \| a \|, \| b \| \leq 1.$

This can then be used to upper bound the norm of other words in ${F_2}$. For instance, from (1), (3) one has

$\displaystyle \| aba^{-1} \|, \| b^{-1} a b \|, \| a^{-1} b^{-1} a \|, \| ba^{-1}b^{-1}\| \leq 1.$

A bit less trivially, from (3), (2), (1) one can bound commutators as

$\displaystyle \| aba^{-1} b^{-1} \| = \frac{1}{3} \| (aba^{-1} b^{-1})^3 \|$

$\displaystyle = \frac{1}{3} \| (aba^{-1}) (b^{-1} ab) (a^{-1} b^{-1} a) (b a^{-1} b^{-1}) \|$

$\displaystyle \leq \frac{4}{3}.$

In a similar spirit one has

$\displaystyle \| aba^{-2} b^{-1} \| = \frac{1}{2} \| (aba^{-2} b^{-1})^2 \|$

$\displaystyle = \frac{1}{2} \| (aba^{-1}) (a^{-1} b^{-1} a) (ba^{-1} b^{-1}) (ba^{-1} b^{-1}) \|$

$\displaystyle \leq 2.$

What is not clear to me is if one can keep arguing like this to continually improve the upper bounds on the norm ${\| g\|}$ of a given non-trivial group element ${g}$ to the point where this norm must in fact vanish, which would demonstrate that no metric with the above properties on ${F_2}$ would exist (and in fact would impose strong constraints on similar metrics existing on other groups as well). It is also tempting to use some ideas from geometric group theory (e.g. asymptotic cones) to try to understand these metrics further, though I wasn’t able to get very far with this approach. Anyway, this feels like a problem that might be somewhat receptive to a more crowdsourced attack, so I am posing it here in case any readers wish to try to make progress on it.
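For anyone wishing to hunt for further bounds of this type, it helps to automate the bookkeeping. The Python sketch below freely reduces words in ${F_2}$ (lowercase `a`, `b` for the generators, uppercase for their inverses), checks that a power of a word factors into a given list of conjugates of ${a^{\pm 1}, b^{\pm 1}}$, and returns the resulting bound (number of factors divided by the power) under the normalisation ${\|a\|, \|b\| \leq 1}$. The encoding and helper names are assumptions for this illustration.

```python
from fractions import Fraction


def reduce_word(word):
    """Freely reduce a word (lowercase = generator, uppercase = its inverse)."""
    out = []
    for c in word:
        if out and out[-1] == c.swapcase():
            out.pop()
        else:
            out.append(c)
    return "".join(out)


def is_conjugate_of_generator(word):
    """True if the word is g c g^{-1} with c one of a, b, a^{-1}, b^{-1}."""
    w = reduce_word(word)
    while len(w) > 1 and w[0] == w[-1].swapcase():
        w = w[1:-1]             # strip one layer of conjugation
    return len(w) == 1


def bound_from_factorisation(word, power, factors):
    """Upper bound for the norm of `word`, given word^power = product of factors."""
    if reduce_word(word * power) != reduce_word("".join(factors)):
        raise ValueError("factors do not multiply to the given power")
    if not all(is_conjugate_of_generator(f) for f in factors):
        raise ValueError("every factor must be a conjugate of a generator")
    return Fraction(len(factors), power)
```

For example, `bound_from_factorisation("abAB", 3, ["abA", "Bab", "ABa", "bAB"])` encodes the cube of the commutator as the product ${(aba^{-1})(b^{-1}ab)(a^{-1}b^{-1}a)(ba^{-1}b^{-1})}$ and recovers the bound ${4/3}$, while `bound_from_factorisation("abAAB", 2, ["abA", "ABa", "bAB", "bAB"])` recovers the bound ${2}$.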

The Boussinesq equations for inviscid, incompressible two-dimensional fluid flow in the presence of gravity are given by

$\displaystyle (\partial_t + u_x \partial_x+ u_y \partial_y) u_x = -\partial_x p \ \ \ \ \ (1)$

$\displaystyle (\partial_t + u_x \partial_x+ u_y \partial_y) u_y = \rho - \partial_y p \ \ \ \ \ (2)$

$\displaystyle (\partial_t + u_x \partial_x+ u_y \partial_y) \rho = 0 \ \ \ \ \ (3)$

$\displaystyle \partial_x u_x + \partial_y u_y = 0 \ \ \ \ \ (4)$

where ${u: {\bf R} \times {\bf R}^2 \rightarrow {\bf R}^2}$ is the velocity field, ${p: {\bf R} \times {\bf R}^2 \rightarrow {\bf R}}$ is the pressure field, and ${\rho: {\bf R} \times {\bf R}^2 \rightarrow {\bf R}}$ is the density field (or, in some physical interpretations, the temperature field). In this post we shall restrict ourselves to formal manipulations, assuming implicitly that all fields are regular enough (or sufficiently decaying at spatial infinity) that the manipulations are justified. Using the material derivative ${D_t := \partial_t + u_x \partial_x + u_y \partial_y}$, one can abbreviate these equations as

$\displaystyle D_t u_x = -\partial_x p$

$\displaystyle D_t u_y = \rho - \partial_y p$

$\displaystyle D_t \rho = 0$

$\displaystyle \partial_x u_x + \partial_y u_y = 0.$

One can eliminate the role of the pressure ${p}$ by working with the vorticity ${\omega := \partial_x u_y - \partial_y u_x}$. A standard calculation then leads us to the equivalent “vorticity-stream” formulation

$\displaystyle D_t \omega = \partial_x \rho$

$\displaystyle D_t \rho = 0$

$\displaystyle \omega = \partial_x u_y - \partial_y u_x$

$\displaystyle \partial_x u_x + \partial_y u_y = 0$

of the Boussinesq equations. The latter two equations can be used to recover the velocity field ${u}$ from the vorticity ${\omega}$ by the Biot-Savart law

$\displaystyle u_x := -\partial_y \Delta^{-1} \omega; \quad u_y = \partial_x \Delta^{-1} \omega.$
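As a quick consistency check on the signs in the Biot-Savart law, one can test it on a single Fourier mode. The sketch below takes ${\omega = \sin(Ax+By)}$ with hypothetical wave numbers ${A=2}$, ${B=3}$, for which ${\Delta^{-1} \omega = -\omega/(A^2+B^2)}$, writes down the resulting velocity in closed form, and verifies by central differences that it is divergence-free with vorticity ${\omega}$. This is purely illustrative and says nothing about the dynamics.

```python
import math

A, B = 2.0, 3.0  # hypothetical wave numbers of the test mode


def omega(x, y):
    return math.sin(A * x + B * y)


def velocity(x, y):
    """(u_x, u_y) from the Biot-Savart law for the mode above.

    Here Delta^{-1} omega = -sin(Ax+By)/(A^2+B^2), so
    u_x = -d/dy Delta^{-1} omega and u_y = d/dx Delta^{-1} omega are:
    """
    c = math.cos(A * x + B * y) / (A * A + B * B)
    return B * c, -A * c


def curl_and_div(x, y, h=1e-5):
    """Central-difference values of (d_x u_y - d_y u_x, d_x u_x + d_y u_y)."""
    ux = lambda s, t: velocity(s, t)[0]
    uy = lambda s, t: velocity(s, t)[1]
    dx = lambda f: (f(x + h, y) - f(x - h, y)) / (2 * h)
    dy = lambda f: (f(x, y + h) - f(x, y - h)) / (2 * h)
    return dx(uy) - dy(ux), dx(ux) + dy(uy)
```

At any sample point, the first output of `curl_and_div` should agree with `omega` and the second should vanish, up to discretisation error.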

It has long been observed (see e.g. Section 5.4.1 of Bertozzi-Majda) that the Boussinesq equations are very similar, though not quite identical, to the three-dimensional inviscid incompressible Euler equations under the hypothesis of axial symmetry (with swirl). The Euler equations are

$\displaystyle \partial_t u + (u \cdot \nabla) u = - \nabla p$

$\displaystyle \nabla \cdot u = 0$

where now the velocity field ${u: {\bf R} \times {\bf R}^3 \rightarrow {\bf R}^3}$ and pressure field ${p: {\bf R} \times {\bf R}^3 \rightarrow {\bf R}}$ are over the three-dimensional domain ${{\bf R}^3}$. If one expresses ${{\bf R}^3}$ in cylindrical coordinates ${(z,r,\theta)}$ then one can write the velocity vector field ${u}$ in these coordinates as

$\displaystyle u = u^z \frac{d}{dz} + u^r \frac{d}{dr} + u^\theta \frac{d}{d\theta}.$

If we make the axial symmetry assumption that these components, as well as ${p}$, do not depend on the ${\theta}$ variable, thus

$\displaystyle \partial_\theta u^z, \partial_\theta u^r, \partial_\theta u^\theta, \partial_\theta p = 0,$

then after some calculation (which we give below the fold) one can eventually reduce the Euler equations to the system

$\displaystyle \tilde D_t \omega = \frac{1}{r^4} \partial_z \rho \ \ \ \ \ (5)$

$\displaystyle \tilde D_t \rho = 0 \ \ \ \ \ (6)$

$\displaystyle \omega = \frac{1}{r} (\partial_z u^r - \partial_r u^z) \ \ \ \ \ (7)$

$\displaystyle \partial_z(ru^z) + \partial_r(ru^r) = 0 \ \ \ \ \ (8)$

where ${\tilde D_t := \partial_t + u^z \partial_z + u^r \partial_r}$ is the modified material derivative, and ${\rho}$ is the field ${\rho := (r u^\theta)^2}$. This is almost identical with the Boussinesq equations except for some additional powers of ${r}$; thus, the intuition is that the Boussinesq equations are a simplified model for axially symmetric Euler flows when one stays away from the axis ${r=0}$ and also does not wander off to ${r=\infty}$.

However, this heuristic is not rigorous; the above calculations do not actually give an embedding of the Boussinesq equations into Euler. (The equations do match on the cylinder ${r=1}$, but this is a measure zero subset of the domain, and so is not enough to give an embedding on any non-trivial region of space.) Recently, while playing around with trying to embed other equations into the Euler equations, I discovered that it is possible to make such an embedding into a four-dimensional Euler equation, albeit on a slightly curved manifold rather than in Euclidean space. More precisely, we use the Ebin-Marsden generalisation

$\displaystyle \partial_t u + \nabla_u u = - \mathrm{grad}_g p$

$\displaystyle \mathrm{div}_g u = 0$

of the Euler equations to an arbitrary Riemannian manifold ${(M,g)}$ (ignoring any issues of boundary conditions for this discussion), where ${u: {\bf R} \rightarrow \Gamma(TM)}$ is a time-dependent vector field, ${p: {\bf R} \rightarrow C^\infty(M)}$ is a time-dependent scalar field, and ${\nabla_u}$ is the covariant derivative along ${u}$ using the Levi-Civita connection ${\nabla}$. In Penrose abstract index notation (using the Levi-Civita connection ${\nabla}$, and raising and lowering indices using the metric ${g = g_{ij}}$), the equations of motion become

$\displaystyle \partial_t u^i + u^j \nabla_j u^i = - \nabla^i p \ \ \ \ \ (9)$

$\displaystyle \nabla_i u^i = 0;$

in coordinates, this becomes

$\displaystyle \partial_t u^i + u^j (\partial_j u^i + \Gamma^i_{jk} u^k) = - g^{ij} \partial_j p$

$\displaystyle \partial_i u^i + \Gamma^i_{ik} u^k = 0 \ \ \ \ \ (10)$

where the Christoffel symbols ${\Gamma^i_{jk}}$ are given by the formula

$\displaystyle \Gamma^i_{jk} := \frac{1}{2} g^{il} (\partial_j g_{lk} + \partial_k g_{lj} - \partial_l g_{jk}),$

where ${g^{il}}$ is the inverse to the metric tensor ${g_{il}}$. If the coordinates are chosen so that the volume form ${dg}$ is the Euclidean volume form ${dx}$, thus ${\mathrm{det}(g)=1}$, then on differentiating we have ${g^{ij} \partial_k g_{ij} = 0}$, and hence ${\Gamma^i_{ik} = 0}$, and so the divergence-free equation (10) simplifies in this case to ${\partial_i u^i = 0}$. The Ebin-Marsden Euler equations are the natural generalisation of the Euler equations to arbitrary manifolds; for instance, they (formally) conserve the kinetic energy

$\displaystyle \frac{1}{2} \int_M |u|_g^2\ dg = \frac{1}{2} \int_M g_{ij} u^i u^j\ dg$

and can be viewed as the formal geodesic flow equation on the infinite-dimensional manifold of volume-preserving diffeomorphisms on ${M}$ (see this previous post for a discussion of this in the flat space case).

The specific four-dimensional manifold in question is the space ${{\bf R} \times {\bf R}^+ \times {\bf R}/{\bf Z} \times {\bf R}/{\bf Z}}$ with metric

$\displaystyle dx^2 + dy^2 + y^{-1} dz^2 + y dw^2$

and solutions to the Boussinesq equation on ${{\bf R} \times {\bf R}^+}$ can be transformed into solutions to the Euler equations on this manifold. This is part of a more general family of embeddings into the Euler equations in which passive scalar fields (such as the field ${\rho}$ appearing in the Boussinesq equations) can be incorporated into the dynamics via fluctuations in the Riemannian metric ${g}$. I am writing the details below the fold (partly for my own benefit).
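Note that the metric above has determinant ${1 \cdot 1 \cdot y^{-1} \cdot y = 1}$ in the coordinates ${(x,y,z,w)}$, so the contracted Christoffel symbols ${\Gamma^i_{ik}}$ should vanish and the divergence-free condition should reduce to ${\partial_i u^i = 0}$. Here is a minimal numerical sketch of this in Python, using the Christoffel formula given earlier specialised to a diagonal metric depending only on ${y}$, with derivatives approximated by central differences. The function names are hypothetical and the indices ${0,1,2,3}$ stand for ${x,y,z,w}$.

```python
def metric(y):
    """Diagonal entries of g = dx^2 + dy^2 + y^{-1} dz^2 + y dw^2."""
    return [1.0, 1.0, 1.0 / y, y]


def d_metric(y, h=1e-6):
    """Central-difference derivative of the diagonal entries with respect to y."""
    gp, gm = metric(y + h), metric(y - h)
    return [(p - m) / (2 * h) for p, m in zip(gp, gm)]


def christoffel(i, j, k, y):
    """Gamma^i_{jk} for the diagonal metric above (indices 0..3 = x, y, z, w)."""
    g, dg = metric(y), d_metric(y)

    def partial(a, l, m):
        # partial_a g_{lm}: nonzero only for a = y (index 1) and l = m
        return dg[l] if (a == 1 and l == m) else 0.0

    # g^{il} is diagonal, so only l = i contributes to the sum over l
    return 0.5 / g[i] * (partial(j, i, k) + partial(k, i, j) - partial(i, j, k))


def contracted_christoffel(k, y):
    """Gamma^i_{ik}, summed over i."""
    return sum(christoffel(i, i, k, y) for i in range(4))
```

For instance, `contracted_christoffel(1, 2.0)` should be zero up to rounding, while individual symbols such as `christoffel(1, 2, 2, 2.0)` (that is, ${\Gamma^y_{zz} = \frac{1}{2y^2}}$ at ${y=2}$) are not.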