
The Poincaré upper half-plane ${{\mathbf H} := \{ z: \hbox{Im}(z) > 0 \}}$ (with a boundary consisting of the real line ${{\bf R}}$ together with the point at infinity ${\infty}$) carries an action of the projective special linear group

$\displaystyle \hbox{PSL}_2({\bf R}) := \{ \begin{pmatrix} a & b \\ c & d \end{pmatrix}: a,b,c,d \in {\bf R}: ad-bc = 1 \} / \{\pm 1\}$

via fractional linear transformations:

$\displaystyle \begin{pmatrix} a & b \\ c & d \end{pmatrix} z := \frac{az+b}{cz+d}. \ \ \ \ \ (1)$

Here and in the rest of the post we will abuse notation by identifying elements ${\begin{pmatrix} a & b \\ c & d \end{pmatrix}}$ of the special linear group ${\hbox{SL}_2({\bf R})}$ with their equivalence class ${\{ \pm \begin{pmatrix} a & b \\ c & d \end{pmatrix} \}}$ in ${\hbox{PSL}_2({\bf R})}$; this will occasionally create or remove a factor of two in our formulae, but otherwise has very little effect, though one has to check that various definitions and expressions (such as (1)) are unaffected if one replaces a matrix ${\begin{pmatrix} a & b \\ c & d \end{pmatrix}}$ by its negation ${\begin{pmatrix} -a & -b \\ -c & -d \end{pmatrix}}$. In particular, we recommend that the reader ignore the signs ${\pm}$ that appear from time to time in the discussion below.

As the action of ${\hbox{PSL}_2({\bf R})}$ on ${{\mathbf H}}$ is transitive, and any given point in ${{\mathbf H}}$ (e.g. ${i}$) has a stabiliser isomorphic to the projective rotation group ${\hbox{PSO}_2({\bf R})}$, we can view the Poincaré upper half-plane ${{\mathbf H}}$ as a homogeneous space for ${\hbox{PSL}_2({\bf R})}$, and more specifically the quotient space of ${\hbox{PSL}_2({\bf R})}$ by a maximal compact subgroup ${\hbox{PSO}_2({\bf R})}$. In fact, we can make the half-plane a symmetric space for ${\hbox{PSL}_2({\bf R})}$, by endowing ${{\mathbf H}}$ with the Riemannian metric

$\displaystyle dg^2 := \frac{dx^2 + dy^2}{y^2}$

(using Cartesian coordinates ${z=x+iy}$), which is invariant with respect to the ${\hbox{PSL}_2({\bf R})}$ action. Like any other Riemannian metric, the metric on ${{\mathbf H}}$ generates a number of other important geometric objects on ${{\mathbf H}}$, such as the distance function ${d(z_1,z_2)}$, which is given by the formula

$\displaystyle 2(\cosh(d(z_1,z_2))-1) = \frac{|z_1-z_2|^2}{\hbox{Im}(z_1) \hbox{Im}(z_2)}, \ \ \ \ \ (2)$

the volume measure ${\mu = \mu_{\mathbf H}}$, which can be computed to be

$\displaystyle d\mu = \frac{dx dy}{y^2},$

and the Laplace-Beltrami operator, which can be computed to be ${\Delta = y^2 (\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2})}$ (here we use the negative definite sign convention for ${\Delta}$). As the metric ${dg}$ was ${\hbox{PSL}_2({\bf R})}$-invariant, all of these quantities arising from the metric are similarly ${\hbox{PSL}_2({\bf R})}$-invariant in the appropriate sense.
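As a quick numerical sanity check of the invariance claims above, here is a minimal Python sketch. The helper names `mobius` and `hyperbolic_distance`, the matrix `g` and the sample points are illustrative choices rather than anything from the discussion; the distance is simply solved from formula (2).

```python
import math

def mobius(g, z):
    """Apply g = (a, b, c, d) to z by the fractional linear transformation (1)."""
    a, b, c, d = g
    return (a * z + b) / (c * z + d)

def hyperbolic_distance(z1, z2):
    """Solve formula (2) for d(z1, z2)."""
    ratio = abs(z1 - z2) ** 2 / (z1.imag * z2.imag)
    return math.acosh(1.0 + ratio / 2.0)

# Illustrative element of SL_2(R): ad - bc = 2 * 1.5 - 1 * 2 = 1.
g = (2.0, 1.0, 2.0, 1.5)
minus_g = tuple(-entry for entry in g)
z1, z2 = complex(0.3, 0.7), complex(-1.2, 2.5)
w1, w2 = mobius(g, z1), mobius(g, z2)

print(w1.imag > 0 and w2.imag > 0)        # the half-plane is preserved
print(abs(mobius(minus_g, z1) - w1))      # g and -g act identically
print(hyperbolic_distance(z1, z2), hyperbolic_distance(w1, w2))
```

The two printed distances should agree up to rounding, which is the invariance of ${d}$ under the action in this one instance.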

The Gauss curvature of the Poincaré half-plane can be computed to be the constant ${-1}$, thus ${{\mathbf H}}$ is a model for two-dimensional hyperbolic geometry, in much the same way that the unit sphere ${S^2}$ in ${{\bf R}^3}$ is a model for two-dimensional spherical geometry (or ${{\bf R}^2}$ is a model for two-dimensional Euclidean geometry). (Indeed, ${{\mathbf H}}$ is isomorphic (via projection to a null hyperplane) to the upper unit hyperboloid ${\{ (x,t) \in {\bf R}^{2+1}: t = \sqrt{1+|x|^2}\}}$ in the Minkowski spacetime ${{\bf R}^{2+1}}$, which is the direct analogue of the unit sphere in Euclidean spacetime ${{\bf R}^3}$ or the plane ${{\bf R}^2}$ in Galilean spacetime ${{\bf R}^2 \times {\bf R}}$.)

One can inject arithmetic into this geometric structure by passing from the Lie group ${\hbox{PSL}_2({\bf R})}$ to the full modular group

$\displaystyle \hbox{PSL}_2({\bf Z}) := \{ \begin{pmatrix} a & b \\ c & d \end{pmatrix}: a,b,c,d \in {\bf Z}: ad-bc = 1 \} / \{\pm 1\}$

or congruence subgroups such as

$\displaystyle \Gamma_0(q) := \{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \hbox{PSL}_2({\bf Z}): c = 0\ (q) \} / \{ \pm 1 \} \ \ \ \ \ (3)$

for a natural number ${q}$, or to the discrete stabiliser ${\Gamma_\infty}$ of the point at infinity:

$\displaystyle \Gamma_\infty := \{ \pm \begin{pmatrix} 1 & b \\ 0 & 1 \end{pmatrix}: b \in {\bf Z} \} / \{\pm 1\}. \ \ \ \ \ (4)$

These are discrete subgroups of ${\hbox{PSL}_2({\bf R})}$, nested by the subgroup inclusions

$\displaystyle \Gamma_\infty \leq \Gamma_0(q) \leq \Gamma_0(1)=\hbox{PSL}_2({\bf Z}) \leq \hbox{PSL}_2({\bf R}).$

There are many further discrete subgroups of ${\hbox{PSL}_2({\bf R})}$ (known collectively as Fuchsian groups) that one could consider, but we will focus attention on these three groups in this post.

Any discrete subgroup ${\Gamma}$ of ${\hbox{PSL}_2({\bf R})}$ generates a quotient space ${\Gamma \backslash {\mathbf H}}$, which in general will be a non-compact two-dimensional orbifold. One can understand such a quotient space by working with a fundamental domain ${\hbox{Fund}( \Gamma \backslash {\mathbf H})}$ – a set consisting of a single representative of each of the orbits ${\Gamma z}$ of ${\Gamma}$ in ${{\mathbf H}}$. This fundamental domain is by no means uniquely defined, but if the fundamental domain is chosen with some reasonable amount of regularity, one can view ${\Gamma \backslash {\mathbf H}}$ as the fundamental domain with the boundaries glued together in an appropriate sense. Among other things, fundamental domains can be used to induce a volume measure ${\mu = \mu_{\Gamma \backslash {\mathbf H}}}$ on ${\Gamma \backslash {\mathbf H}}$ from the volume measure ${\mu = \mu_{\mathbf H}}$ on ${{\mathbf H}}$ (restricted to a fundamental domain). By abuse of notation we will refer to both measures simply as ${\mu}$ when there is no chance of confusion.

For instance, a fundamental domain for ${\Gamma_\infty \backslash {\mathbf H}}$ is given (up to null sets) by the strip ${\{ z \in {\mathbf H}: |\hbox{Re}(z)| < \frac{1}{2} \}}$, with ${\Gamma_\infty \backslash {\mathbf H}}$ identifiable with the cylinder formed by gluing together the two sides of the strip. A fundamental domain for ${\hbox{PSL}_2({\bf Z}) \backslash {\mathbf H}}$ is famously given (again up to null sets) by an upper portion ${\{ z \in {\mathbf H}: |\hbox{Re}(z)| < \frac{1}{2}; |z| > 1 \}}$, with the left and right sides again glued to each other, and the left and right halves of the circular boundary glued to each other. A fundamental domain for ${\Gamma_0(q) \backslash {\mathbf H}}$ can be formed by gluing together

$\displaystyle [\hbox{PSL}_2({\bf Z}) : \Gamma_0(q)] = q \prod_{p|q} (1 + \frac{1}{p}) = q^{1+o(1)}$

copies of a fundamental domain for ${\hbox{PSL}_2({\bf Z}) \backslash {\mathbf H}}$ in a rather complicated but interesting fashion.
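The index formula above is easy to tabulate. The sketch below (in Python, with hypothetical helper names) evaluates ${q \prod_{p|q} (1+\frac{1}{p})}$ in integer arithmetic and compares it with a brute-force count of the projective line over ${{\bf Z}/q{\bf Z}}$. The identification of the index with the size of that projective line is a standard fact that is not stated in the text, so it should be read as an assumption of this check.

```python
from math import gcd

def index_gamma0(q):
    """Evaluate q * prod_{p | q} (1 + 1/p) using integer arithmetic only."""
    result, remaining, p = q, q, 2
    while p * p <= remaining:
        if remaining % p == 0:
            result = result // p * (p + 1)
            while remaining % p == 0:
                remaining //= p
        p += 1
    if remaining > 1:          # one prime factor above the square root may be left
        result = result // remaining * (remaining + 1)
    return result

def projective_line_size(q):
    """Brute-force size of P^1(Z/qZ): primitive pairs (c, d) mod q, up to units."""
    pairs = sum(1 for c in range(q) for d in range(q) if gcd(gcd(c, d), q) == 1)
    units = sum(1 for u in range(q) if gcd(u, q) == 1)
    return pairs // units

for q in (1, 2, 4, 6, 12, 30):
    print(q, index_gamma0(q), projective_line_size(q))
```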

While fundamental domains can be a convenient choice of coordinates to work with for some computations (as well as for drawing appropriate pictures), it is geometrically more natural to avoid working explicitly on such domains, and instead work directly on the quotient spaces ${\Gamma \backslash {\mathbf H}}$. In order to analyse functions ${f: \Gamma \backslash {\mathbf H} \rightarrow {\bf C}}$ on such orbifolds, it is convenient to lift such functions back up to ${{\mathbf H}}$ and identify them with functions ${f: {\mathbf H} \rightarrow {\bf C}}$ which are ${\Gamma}$-automorphic in the sense that ${f( \gamma z ) = f(z)}$ for all ${z \in {\mathbf H}}$ and ${\gamma \in \Gamma}$. Such functions will be referred to as ${\Gamma}$-automorphic forms, or automorphic forms for short (we always implicitly assume all such functions to be measurable). (Strictly speaking, these are the automorphic forms with trivial factor of automorphy; one can certainly consider other factors of automorphy, particularly when working with holomorphic modular forms, which correspond to sections of a more non-trivial line bundle over ${\Gamma \backslash {\mathbf H}}$ than the trivial bundle ${(\Gamma \backslash {\mathbf H}) \times {\bf C}}$ that is implicitly present when analysing scalar functions ${f: {\mathbf H} \rightarrow {\bf C}}$. However, we will not discuss this (important) more general situation here.)

An important way to create a ${\Gamma}$-automorphic form is to start with a non-automorphic function ${f: {\mathbf H} \rightarrow {\bf C}}$ obeying suitable decay conditions (e.g. bounded with compact support will suffice) and form the Poincaré series ${P_\Gamma[f]: {\mathbf H} \rightarrow {\bf C}}$ defined by

$\displaystyle P_{\Gamma}[f](z) = \sum_{\gamma \in \Gamma} f(\gamma z),$

which is clearly ${\Gamma}$-automorphic. (One could equivalently write ${f(\gamma^{-1} z)}$ in place of ${f(\gamma z)}$ here; there are good arguments for both conventions, but I have ultimately decided to use the ${f(\gamma z)}$ convention, which makes explicit computations a little neater at the cost of making the group actions work in the opposite order.) Thus we naturally see sums over ${\Gamma}$ associated with ${\Gamma}$-automorphic forms. A little more generally, given a subgroup ${\Gamma_\infty}$ of ${\Gamma}$ and a ${\Gamma_\infty}$-automorphic function ${f: {\mathbf H} \rightarrow {\bf C}}$ of suitable decay, we can form a relative Poincaré series ${P_{\Gamma_\infty \backslash \Gamma}[f]: {\mathbf H} \rightarrow {\bf C}}$ by

$\displaystyle P_{\Gamma_\infty \backslash \Gamma}[f](z) = \sum_{\gamma \in \hbox{Fund}(\Gamma_\infty \backslash \Gamma)} f(\gamma z)$

where ${\hbox{Fund}(\Gamma_\infty \backslash \Gamma)}$ is any fundamental domain for ${\Gamma_\infty \backslash \Gamma}$, that is to say a subset of ${\Gamma}$ consisting of exactly one representative for each right coset of ${\Gamma_\infty}$. As ${f}$ is ${\Gamma_\infty}$-automorphic, we see (if ${f}$ has suitable decay) that ${P_{\Gamma_\infty \backslash \Gamma}[f]}$ does not depend on the precise choice of fundamental domain, and is ${\Gamma}$-automorphic. These operations are all compatible with each other, for instance ${P_\Gamma = P_{\Gamma_\infty \backslash \Gamma} \circ P_{\Gamma_\infty}}$. A key example of Poincaré series are the Eisenstein series, although there are of course many other Poincaré series one can consider by varying the test function ${f}$.

For future reference we record the basic but fundamental unfolding identities

$\displaystyle \int_{\Gamma \backslash {\mathbf H}} P_\Gamma[f] g\ d\mu_{\Gamma \backslash {\mathbf H}} = \int_{\mathbf H} f g\ d\mu_{\mathbf H} \ \ \ \ \ (5)$

for any function ${f: {\mathbf H} \rightarrow {\bf C}}$ with sufficient decay, and any ${\Gamma}$-automorphic function ${g}$ of reasonable growth (e.g. ${f}$ bounded and compact support, and ${g}$ bounded, will suffice). Note that ${g}$ is viewed as a function on ${\Gamma \backslash {\mathbf H}}$ on the left-hand side, and as a ${\Gamma}$-automorphic function on ${{\mathbf H}}$ on the right-hand side. More generally, one has

$\displaystyle \int_{\Gamma \backslash {\mathbf H}} P_{\Gamma_\infty \backslash \Gamma}[f] g\ d\mu_{\Gamma \backslash {\mathbf H}} = \int_{\Gamma_\infty \backslash {\mathbf H}} f g\ d\mu_{\Gamma_\infty \backslash {\mathbf H}} \ \ \ \ \ (6)$

whenever ${\Gamma_\infty \leq \Gamma}$ are discrete subgroups of ${\hbox{PSL}_2({\bf R})}$, ${f}$ is a ${\Gamma_\infty}$-automorphic function with sufficient decay on ${\Gamma_\infty \backslash {\mathbf H}}$, and ${g}$ is a ${\Gamma}$-automorphic (and thus also ${\Gamma_\infty}$-automorphic) function of reasonable growth. These identities will allow us to move fairly freely between the three domains ${{\mathbf H}}$, ${\Gamma_\infty \backslash {\mathbf H}}$, and ${\Gamma \backslash {\mathbf H}}$ in our analysis.

When computing various statistics of a Poincaré series ${P_\Gamma[f]}$, such as its values ${P_\Gamma[f](z)}$ at special points ${z}$, or the ${L^2}$ quantity ${\int_{\Gamma \backslash {\mathbf H}} |P_\Gamma[f]|^2\ d\mu}$, expressions of interest to analytic number theory naturally emerge. We list three basic examples of this below, discussed somewhat informally in order to highlight the main ideas rather than the technical details.

The first example we will give concerns the problem of estimating the sum

$\displaystyle \sum_{n \leq x} \tau(n) \tau(n+1), \ \ \ \ \ (7)$

where ${\tau(n) := \sum_{d|n} 1}$ is the divisor function. This can be rewritten (by factoring ${n=bc}$ and ${n+1=ad}$) as

$\displaystyle \sum_{ a,b,c,d \in {\bf N}: ad-bc = 1} 1_{bc \leq x} \ \ \ \ \ (8)$

which is basically a sum over the full modular group ${\hbox{PSL}_2({\bf Z})}$. At this point we will “cheat” a little by moving to the related, but different, sum

$\displaystyle \sum_{a,b,c,d \in {\bf Z}: ad-bc = 1} 1_{a^2+b^2+c^2+d^2 \leq x}. \ \ \ \ \ (9)$

This sum is not exactly the same as (8), but will be a little easier to handle, and it is plausible that the methods used to handle this sum can be modified to handle (8). Observe from (2) and some calculation that the distance between ${i}$ and ${\begin{pmatrix} a & b \\ c & d \end{pmatrix} i = \frac{ai+b}{ci+d}}$ is given by the formula

$\displaystyle 2(\cosh(d(i,\begin{pmatrix} a & b \\ c & d \end{pmatrix} i))-1) = a^2+b^2+c^2+d^2 - 2$

and so one can express the above sum as

$\displaystyle 2 \sum_{\gamma \in \hbox{PSL}_2({\bf Z})} 1_{d(i,\gamma i) \leq \hbox{cosh}^{-1}(x/2)}$

(the factor of ${2}$ coming from the quotient by ${\{\pm 1\}}$ in the projective special linear group); one can express this as ${P_\Gamma[f](i)}$, where ${\Gamma = \hbox{PSL}_2({\bf Z})}$ and ${f}$ is the indicator function of the ball ${B(i, \hbox{cosh}^{-1}(x/2))}$. Thus we see that expressions such as (7) are related to evaluations of Poincaré series. (In practice, it is much better to use smoothed out versions of indicator functions in order to obtain good control on sums such as (7) or (9), but we gloss over this technical detail here.)
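Two of the steps in this first example can be checked by brute force for small parameters: the rewriting of (7) as (8), and the distance identity relating ${d(i, \gamma i)}$ to ${a^2+b^2+c^2+d^2}$. The following Python sketch does this; the function names and the choice ${x = 12}$ are purely illustrative, and indicator functions are used without any smoothing.

```python
def tau(n):
    """Divisor function."""
    return sum(1 for d in range(1, n + 1) if n % d == 0)

def divisor_correlation(x):
    """The sum (7)."""
    return sum(tau(n) * tau(n + 1) for n in range(1, x + 1))

def determinant_one_count(x):
    """The sum (8), by brute force over natural numbers a, b, c, d."""
    count = 0
    for b in range(1, x + 1):
        for c in range(1, x // b + 1):            # enforces bc <= x
            for a in range(1, b * c + 2):         # ad = bc + 1 forces a <= bc + 1
                for d in range(1, (b * c + 1) // a + 1):
                    if a * d - b * c == 1:
                        count += 1
    return count

def cosh_distance_from_i(a, b, c, d):
    """cosh d(i, gamma i), read off from formula (2)."""
    z = complex(b, a) / complex(d, c)             # (ai + b) / (ci + d)
    return 1.0 + abs(z - 1j) ** 2 / (2.0 * z.imag)

x = 12
print(divisor_correlation(x), determinant_one_count(x))
a, b, c, d = 2, 1, 1, 1                           # ad - bc = 1
print(cosh_distance_from_i(a, b, c, d), (a * a + b * b + c * c + d * d) / 2)
```

Each printed pair should agree, the second one up to rounding.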

The second example concerns the relative

$\displaystyle \sum_{n \leq x} \tau(n^2+1) \ \ \ \ \ (10)$

of the sum (7). Note from multiplicativity that (7) can be written as ${\sum_{n \leq x} \tau(n^2+n)}$, which is superficially very similar to (10), but with the key difference that the polynomial ${n^2+1}$ is irreducible over the integers.

As with (7), we may expand (10) as

$\displaystyle \sum_{A,B,C \in {\bf N}: B^2 - AC = -1} 1_{B \leq x}.$

At first glance this does not look like a sum over a modular group, but one can manipulate this expression into such a form in one of two (closely related) ways. First, observe that any factorisation ${B + i = (a-bi) (c+di)}$ of ${B+i}$ into Gaussian integers ${a-bi, c+di}$ gives rise (upon taking norms) to an identity of the form ${B^2 - AC = -1}$, where ${A = a^2+b^2}$ and ${C = c^2+d^2}$. Conversely, by using the unique factorisation of the Gaussian integers, every identity of the form ${B^2-AC=-1}$ gives rise to a factorisation of the form ${B+i = (a-bi) (c+di)}$, essentially uniquely up to units. Now note that ${(a-bi)(c+di)}$ is of the form ${B+i}$ if and only if ${ad-bc=1}$, in which case ${B = ac+bd}$. Thus we can essentially write the above sum as something like

$\displaystyle \sum_{a,b,c,d: ad-bc = 1} 1_{|ac+bd| \leq x} \ \ \ \ \ (11)$

and the modular group ${\hbox{PSL}_2({\bf Z})}$ is now manifest. An equivalent way to see these manipulations is as follows. A triple ${A,B,C}$ of natural numbers with ${B^2-AC=-1}$ gives rise to a positive quadratic form ${Ax^2+2Bxy+Cy^2}$ of normalised discriminant ${B^2-AC}$ equal to ${-1}$ with integer coefficients (it is natural here to allow ${B}$ to take integer values rather than just natural number values by essentially doubling the sum). The group ${\hbox{PSL}_2({\bf Z})}$ acts on the space of such quadratic forms in a natural fashion (by composing the quadratic form with the inverse ${\begin{pmatrix} d & -b \\ -c & a \end{pmatrix}}$ of an element ${\begin{pmatrix} a & b \\ c & d \end{pmatrix}}$ of ${\hbox{SL}_2({\bf Z})}$). Because the discriminant ${-1}$ has class number one (this fact is equivalent to the unique factorisation of the Gaussian integers, as discussed in this previous post), every form ${Ax^2 + 2Bxy + Cy^2}$ in this space is equivalent (under the action of some element of ${\hbox{PSL}_2({\bf Z})}$) to the standard quadratic form ${x^2+y^2}$. In other words, one has

$\displaystyle Ax^2 + 2Bxy + Cy^2 = (dx-by)^2 + (-cx+ay)^2$

which (up to a harmless sign) is exactly the representation ${B = ac+bd}$, ${A = c^2+d^2}$, ${C = a^2+b^2}$ introduced earlier, and leads to the same reformulation of the sum (10) in terms of expressions like (11). Similar considerations also apply if the quadratic polynomial ${n^2+1}$ is replaced by another quadratic, although one has to account for the fact that the class number may now exceed one (so that unique factorisation in the associated quadratic ring of integers breaks down), and in the positive discriminant case the fact that the group of units might be infinite presents another significant technical problem.
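The class number one statement used above can be observed experimentally. The sketch below runs a Gauss-style reduction (a standard technique swapped in for illustration, not the argument of the text) on every positive form ${Ax^2+2Bxy+Cy^2}$ with ${B^2-AC=-1}$ and ${|B| \leq 40}$, using only substitutions of determinant one, and records where each form ends up. The name `reduce_form` and the range of ${B}$ are hypothetical choices.

```python
def reduce_form(A, B, C):
    """Reduce the positive form A x^2 + 2B xy + C y^2 with B^2 - AC = -1.

    Only two kinds of SL_2(Z) substitutions are used:
    x -> x + k y (a translation) and (x, y) -> (y, -x) (an inversion).
    """
    assert A > 0 and C > 0 and B * B - A * C == -1
    while True:
        k = -((2 * B + A) // (2 * A))             # an integer nearest to -B/A
        # Translation: the right-hand side uses the old value of B throughout.
        B, C = B + k * A, C + 2 * k * B + k * k * A
        if A <= C:
            return A, B, C
        A, B, C = C, -B, A                        # inversion swaps the outer coefficients

forms = [(A, B, (B * B + 1) // A)
         for B in range(-40, 41)
         for A in range(1, B * B + 2) if (B * B + 1) % A == 0]
print(len(forms), {reduce_form(*form) for form in forms})
```

If the set printed at the end contains only `(1, 0, 1)`, then every form in the tested range is equivalent to ${x^2+y^2}$, as expected.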

Note that ${\begin{pmatrix} a & b \\ c & d \end{pmatrix} i = \frac{ai+b}{ci+d}}$ has real part ${\frac{ac+bd}{c^2+d^2}}$ and imaginary part ${\frac{1}{c^2+d^2}}$. Thus (11) is (up to a factor of two) the Poincaré series ${P_\Gamma[f](i)}$ as in the preceding example, except that ${f}$ is now the indicator of the sector ${\{ z: |\hbox{Re} z| \leq x |\hbox{Im} z| \}}$.
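To make the phrase "up to a factor" concrete, one can enumerate the integer solutions of ${ad-bc=1}$, ${ac+bd=B}$ for small ${B}$ and compare with ${\tau(B^2+1)}$. In the Python sketch below (hypothetical names, brute force only) the counts come out as ${4\tau(B^2+1)}$ in the tested range, the factor of four matching the four Gaussian units; this is an observation about the small cases tried here rather than a claim taken from the text. The real and imaginary parts of ${\gamma i}$ are checked along the way.

```python
def tau(n):
    """Divisor function."""
    return sum(1 for d in range(1, n + 1) if n % d == 0)

def representations(B):
    """All integer (a, b, c, d) with ad - bc = 1 and ac + bd = B, for B >= 0."""
    # (a^2 + b^2)(c^2 + d^2) = B^2 + 1, so every entry is at most B + 1 in size.
    box = range(-(B + 1), B + 2)
    return [(a, b, c, d) for a in box for b in box for c in box for d in box
            if a * d - b * c == 1 and a * c + b * d == B]

for B in range(5):
    reps = representations(B)
    for a, b, c, d in reps:
        z = complex(b, a) / complex(d, c)         # gamma i = (ai + b)/(ci + d)
        norm = c * c + d * d
        assert abs(z.real - B / norm) < 1e-12     # real part (ac + bd)/(c^2 + d^2)
        assert abs(z.imag - 1 / norm) < 1e-12     # imaginary part 1/(c^2 + d^2)
    print(B, len(reps), 4 * tau(B * B + 1))
```

In particular ${|\hbox{Re}(\gamma i)| \leq x\, \hbox{Im}(\gamma i)}$ holds exactly when ${|ac+bd| \leq x}$, which is the sector condition.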

Sums involving subgroups of the full modular group, such as ${\Gamma_0(q)}$, often arise when imposing congruence conditions on sums such as (10), for instance when trying to estimate the expression ${\sum_{n \leq x: q|n} \tau(n^2+1)}$ when ${q}$ and ${x}$ are large. As before, one then soon arrives at the problem of evaluating a Poincaré series at one or more special points, where the series is now over ${\Gamma_0(q)}$ rather than ${\hbox{PSL}_2({\bf Z})}$.

The third and final example concerns averages of Kloosterman sums

$\displaystyle S(m,n;c) := \sum_{x \in ({\bf Z}/c{\bf Z})^\times} e( \frac{mx + n\overline{x}}{c} ) \ \ \ \ \ (12)$

where ${e(\theta) := e^{2\pi i\theta}}$ and ${\overline{x}}$ is the inverse of ${x}$ in the multiplicative group ${({\bf Z}/c{\bf Z})^\times}$. It turns out that the ${L^2}$ norms of Poincaré series ${P_\Gamma[f]}$ or ${P_{\Gamma_\infty \backslash \Gamma}[f]}$ are closely tied to such averages. Consider for instance the quantity

$\displaystyle \int_{\Gamma_0(q) \backslash {\mathbf H}} |P_{\Gamma_\infty \backslash \Gamma_0(q)}[f]|^2\ d\mu_{\Gamma_0(q) \backslash {\mathbf H}} \ \ \ \ \ (13)$

where ${q}$ is a natural number and ${f}$ is a ${\Gamma_\infty}$-automorphic form that is of the form

$\displaystyle f(x+iy) = F(my) e(m x)$

for some integer ${m}$ and some test function ${F: (0,+\infty) \rightarrow {\bf C}}$, which for the sake of discussion we will take to be smooth and compactly supported. Using the unfolding formula (6), we may rewrite (13) as

$\displaystyle \int_{\Gamma_\infty \backslash {\mathbf H}} \overline{f} P_{\Gamma_\infty \backslash \Gamma_0(q)}[f]\ d\mu_{\Gamma_\infty \backslash {\mathbf H}}.$

To compute this, we use the double coset decomposition

$\displaystyle \Gamma_0(q) = \Gamma_\infty \cup \bigcup_{c \in {\mathbf N}: q|c} \bigcup_{1 \leq d \leq c: (d,c)=1} \Gamma_\infty \begin{pmatrix} a & b \\ c & d \end{pmatrix} \Gamma_\infty,$

where for each ${c,d}$, ${a,b}$ are arbitrarily chosen integers such that ${ad-bc=1}$. To see this decomposition, observe that every element ${\begin{pmatrix} a & b \\ c & d \end{pmatrix}}$ in ${\Gamma_0(q)}$ outside of ${\Gamma_\infty}$ can be assumed to have ${c>0}$ by applying a sign ${\pm}$, and then using the row and column operations coming from left and right multiplication by ${\Gamma_\infty}$ (that is, shifting the top row by an integer multiple of the bottom row, and shifting the right column by an integer multiple of the left column) one can place ${d}$ in the interval ${[1,c]}$ and ${(a,b)}$ to be any specified integer pair with ${ad-bc=1}$. From this we see that

$\displaystyle P_{\Gamma_\infty \backslash \Gamma_0(q)}[f] = f + \sum_{c \in {\mathbf N}: q|c} \sum_{1 \leq d \leq c: (d,c)=1} P_{\Gamma_\infty}[ f( \begin{pmatrix} a & b \\ c & d \end{pmatrix} \cdot ) ]$

and so from further use of the unfolding formula (5) we may expand (13) as

$\displaystyle \int_{\Gamma_\infty \backslash {\mathbf H}} |f|^2\ d\mu_{\Gamma_\infty \backslash {\mathbf H}}$

$\displaystyle + \sum_{c \in {\mathbf N}: q|c} \sum_{1 \leq d \leq c: (d,c)=1} \int_{\mathbf H} \overline{f}(z) f( \begin{pmatrix} a & b \\ c & d \end{pmatrix} z)\ d\mu_{\mathbf H}.$

The first integral is just ${m \int_0^\infty |F(y)|^2 \frac{dy}{y^2}}$. The second expression is more interesting. We have

$\displaystyle \begin{pmatrix} a & b \\ c & d \end{pmatrix} z = \frac{az+b}{cz+d} = \frac{a}{c} - \frac{1}{c(cz+d)}$

$\displaystyle = \frac{a}{c} - \frac{cx+d}{c((cx+d)^2+c^2y^2)} + \frac{iy}{(cx+d)^2 + c^2y^2}$

so we can write

$\displaystyle \int_{\mathbf H} \overline{f}(z) f( \begin{pmatrix} a & b \\ c & d \end{pmatrix} z)\ d\mu_{\mathbf H}$

as

$\displaystyle \int_0^\infty \int_{\bf R} \overline{F}(my) F(\frac{my}{(cx+d)^2 + c^2y^2}) e( -mx + \frac{ma}{c} - m \frac{cx+d}{c((cx+d)^2+c^2y^2)} )$

$\displaystyle \frac{dx dy}{y^2}$

which on shifting ${x}$ by ${d/c}$ simplifies a little to

$\displaystyle e( \frac{ma}{c} + \frac{md}{c} ) \int_0^\infty \int_{\bf R} \overline{F}(my) F(\frac{my}{c^2(x^2 + y^2)}) e(- mx - m \frac{x}{c^2(x^2+y^2)} )$

$\displaystyle \frac{dx dy}{y^2}$

and then on scaling ${x,y}$ by ${m}$ simplifies a little further to

$\displaystyle e( \frac{ma}{c} + \frac{md}{c} ) \int_0^\infty \int_{\bf R} \overline{F}(y) F(\frac{m^2}{c^2} \frac{y}{x^2 + y^2}) e(- x - \frac{m^2}{c^2} \frac{x}{x^2+y^2} )\ \frac{dx dy}{y^2}.$

Note that as ${ad-bc=1}$, we have ${a = \overline{d}}$ modulo ${c}$. Comparing the above calculations with (12), we can thus write (13) as

$\displaystyle m (\int_0^\infty |F(y)|^2 \frac{dy}{y^2} + \sum_{q|c} \frac{S(m,m;c)}{c} V(\frac{m}{c})) \ \ \ \ \ (14)$

where

$\displaystyle V(u) := \frac{1}{u} \int_0^\infty \int_{\bf R} \overline{F}(y) F(u^2 \frac{y}{x^2 + y^2}) e(- x - u^2 \frac{x}{x^2+y^2} )\ \frac{dx dy}{y^2}$

is a certain integral involving ${F}$ and a parameter ${u}$, but which does not depend explicitly on parameters such as ${m,c,d}$. Thus we have indeed expressed the ${L^2}$ expression (13) in terms of Kloosterman sums. It is possible to invert this analysis and express various weighted sums of Kloosterman sums in terms of ${L^2}$ expressions (possibly involving inner products instead of norms) of Poincaré series, but we will not do so here; see Chapter 16 of Iwaniec and Kowalski for further details.
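For readers who wish to experiment, here is a short Python sketch of the Kloosterman sum (12), together with the phase sum over ${d}$ that arose in the computation above. The function names are hypothetical, the modular inverse uses the three-argument `pow` with exponent `-1` (available from Python 3.8), and the code assumes ${c \geq 2}$. The final loop compares ${|S(1,1;p)|}$ with ${2\sqrt{p}}$ for a few primes ${p}$, which is the Weil bound in its simplest case.

```python
import cmath
from math import gcd, sqrt

def e(theta):
    """e(theta) = exp(2 pi i theta)."""
    return cmath.exp(2j * cmath.pi * theta)

def kloosterman(m, n, c):
    """The Kloosterman sum S(m, n; c) of (12), for c >= 2."""
    total = 0j
    for x in range(1, c):
        if gcd(x, c) == 1:
            x_bar = pow(x, -1, c)                 # modular inverse (Python 3.8+)
            total += e((m * x + n * x_bar) / c)
    return total

def phase_sum(m, c):
    """Sum over d of e(ma/c + md/c), with (a, b) chosen so that ad - bc = 1."""
    total = 0j
    for d in range(1, c):
        if gcd(d, c) == 1:
            a = pow(d, -1, c)
            b = (a * d - 1) // c                  # exact, since ad = 1 mod c
            assert a * d - b * c == 1
            total += e(m * a / c + m * d / c)
    return total

print(kloosterman(2, 2, 12), phase_sum(2, 12))
for p in (5, 7, 11, 13, 101):
    print(p, abs(kloosterman(1, 1, p)), 2 * sqrt(p))
```

The imaginary parts are zero up to rounding, since replacing ${x}$ by ${-x}$ conjugates each term of the sum.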

Traditionally, automorphic forms have been analysed using the spectral theory of the Laplace-Beltrami operator ${-\Delta}$ on spaces such as ${\Gamma\backslash {\mathbf H}}$ or ${\Gamma_\infty \backslash {\mathbf H}}$, so that a Poincaré series such as ${P_\Gamma[f]}$ might be expanded out using inner products of ${P_\Gamma[f]}$ (or, by the unfolding identities, ${f}$) with various generalised eigenfunctions of ${-\Delta}$ (such as cuspidal eigenforms, or Eisenstein series). With this approach, special functions, and specifically the modified Bessel functions ${K_{it}}$ of the second kind, play a prominent role, basically because the ${\Gamma_\infty}$-automorphic functions

$\displaystyle x+iy \mapsto y^{1/2} K_{it}(2\pi |m| y) e(mx)$

for ${t \in {\bf R}}$ and ${m \in {\bf Z}}$ non-zero are generalised eigenfunctions of ${-\Delta}$ (with eigenvalue ${\frac{1}{4}+t^2}$), and are almost square-integrable on ${\Gamma_\infty \backslash {\mathbf H}}$ (the ${L^2}$ norm diverges only logarithmically at one end ${y \rightarrow 0^+}$ of the cylinder ${\Gamma_\infty \backslash {\mathbf H}}$, while decaying exponentially fast at the other end ${y \rightarrow +\infty}$).
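The eigenfunction property can be tested numerically without a special-function library. The sketch below assumes the standard integral representation ${K_{it}(r) = \int_0^\infty e^{-r \cosh s} \cos(ts)\ ds}$ for ${r > 0}$ (not stated in the text), evaluates it with the trapezoid rule, and replaces ${\partial_{yy}}$ by a central difference. All function names, step sizes and sample points are illustrative choices, and the result is only accurate to a few digits.

```python
import math

def bessel_k_imaginary_order(t, r, steps=4000, s_max=8.0):
    """K_{it}(r) for r > 0, from K_{it}(r) = int_0^infty exp(-r cosh s) cos(t s) ds."""
    h = s_max / steps
    total = 0.5 * math.exp(-r)                    # endpoint s = 0 of the trapezoid rule
    for j in range(1, steps + 1):                 # the far endpoint is negligible
        s = j * h
        total += math.exp(-r * math.cosh(s)) * math.cos(t * s)
    return h * total

def radial_profile(t, m, y):
    """g(y) = y^{1/2} K_{it}(2 pi |m| y), so that f(x + iy) = g(y) e(mx)."""
    return math.sqrt(y) * bessel_k_imaginary_order(t, 2 * math.pi * abs(m) * y)

def eigenvalue_estimate(t, m, y, h=1e-3):
    """(-Delta f)/f at height y, with d^2/dy^2 replaced by a central difference."""
    g_minus, g, g_plus = (radial_profile(t, m, y + s * h) for s in (-1, 0, 1))
    g_yy = (g_plus - 2 * g + g_minus) / (h * h)
    f_xx_over_f = -(2 * math.pi * m) ** 2         # from the factor e(mx)
    return -y * y * (g_yy / g + f_xx_over_f)

for t, m, y in [(1.0, 1, 0.5), (2.0, 1, 1.0), (0.5, 2, 0.4)]:
    print(t, m, y, eigenvalue_estimate(t, m, y), 0.25 + t * t)
```

The last two columns should agree to within the discretisation error, independently of ${m}$ and ${y}$.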

However, as discussed in this previous post, the spectral theory of an essentially self-adjoint operator such as ${-\Delta}$ is basically equivalent to the theory of various solution operators associated to partial differential equations involving that operator, such as the Helmholtz equation ${(-\Delta + k^2) u = f}$, the heat equation ${\partial_t u = \Delta u}$, the Schrödinger equation ${i\partial_t u + \Delta u = 0}$, or the wave equation ${\partial_{tt} u = \Delta u}$. Thus, one can hope to rephrase many arguments that involve spectral data of ${-\Delta}$ into arguments that instead involve resolvents ${(-\Delta + k^2)^{-1}}$, heat kernels ${e^{t\Delta}}$, Schrödinger propagators ${e^{it\Delta}}$, or wave propagators ${e^{\pm it\sqrt{-\Delta}}}$, or involve the PDE more directly (e.g. applying integration by parts and energy methods to solutions of such PDE). This is certainly done to some extent in the existing literature; resolvents and heat kernels, for instance, are often utilised. In this post, I would like to explore the possibility of reformulating spectral arguments instead using the inhomogeneous wave equation

$\displaystyle \partial_{tt} u - \Delta u = F.$

Actually it will be a bit more convenient to normalise the Laplacian by ${\frac{1}{4}}$, and look instead at the automorphic wave equation

$\displaystyle \partial_{tt} u + (-\Delta - \frac{1}{4}) u = F. \ \ \ \ \ (15)$

This equation somewhat resembles a “Klein-Gordon” type equation, except that the mass is imaginary! This would lead to pathological behaviour were it not for the negative curvature, which in principle creates a spectral gap of ${\frac{1}{4}}$ that cancels out this factor.

The point is that the wave equation approach gives access to some nice PDE techniques, such as energy methods, Sobolev inequalities and finite speed of propagation, which are somewhat submerged in the spectral framework. The wave equation also interacts well with Poincaré series; if for instance ${u}$ and ${F}$ are ${\Gamma_\infty}$-automorphic solutions to (15) obeying suitable decay conditions, then their Poincaré series ${P_{\Gamma_\infty \backslash \Gamma}[u]}$ and ${P_{\Gamma_\infty \backslash \Gamma}[F]}$ will be ${\Gamma}$-automorphic solutions to the same equation (15), basically because the Laplace-Beltrami operator commutes with translations. Because of these facts, it is possible to replicate several standard spectral theory arguments in the wave equation framework, without having to deal directly with things like the asymptotics of modified Bessel functions. The wave equation approach to automorphic theory was introduced by Faddeev and Pavlov (using the Lax-Phillips scattering theory), and developed further by Lax and Phillips, to recover many spectral facts about the Laplacian on modular curves, such as the Weyl law and the Selberg trace formula. Here, I will illustrate this by deriving three basic applications of automorphic methods in a wave equation framework, namely

• Using the Weil bound on Kloosterman sums to derive Selberg’s 3/16 theorem on the least non-trivial eigenvalue for ${-\Delta}$ on ${\Gamma_0(q) \backslash {\mathbf H}}$ (discussed previously here);
• Conversely, showing that Selberg’s eigenvalue conjecture (improving Selberg’s ${3/16}$ bound to the optimal ${1/4}$) implies an optimal bound on (smoothed) sums of Kloosterman sums; and
• Using the same bound to obtain pointwise bounds on Poincaré series similar to the ones discussed above. (Actually, the argument here does not use the wave equation, instead it just uses the Sobolev inequality.)

This post originated from an attempt to finally learn this part of analytic number theory properly, and to see if I could use a PDE-based perspective to understand it better. Ultimately, this is not that dramatic a departure from the standard approach to this subject, but I found it useful to think of things in this fashion, probably due to my existing background in PDE.

I thank Bill Duke and Ben Green for helpful discussions. My primary reference for this theory was Chapters 15, 16, and 21 of Iwaniec and Kowalski.

The Euler equations for three-dimensional incompressible inviscid fluid flow are

$\displaystyle \partial_t u + (u \cdot \nabla) u = - \nabla p \ \ \ \ \ (1)$

$\displaystyle \nabla \cdot u = 0$

where ${u: {\bf R} \times {\bf R}^3 \rightarrow {\bf R}^3}$ is the velocity field, and ${p: {\bf R} \times {\bf R}^3 \rightarrow {\bf R}}$ is the pressure field. For the purposes of this post, we will ignore all issues of decay or regularity of the fields in question, assuming that they are as smooth and rapidly decreasing as needed to justify all the formal calculations here; in particular, we will apply inverse operators such as ${(-\Delta)^{-1}}$ or ${|\nabla|^{-1} := (-\Delta)^{-1/2}}$ formally, assuming that these inverses are well defined on the functions they are applied to.

Meanwhile, the surface quasi-geostrophic (SQG) equation is given by

$\displaystyle \partial_t \theta + (u \cdot \nabla) \theta = 0 \ \ \ \ \ (2)$

$\displaystyle u = ( -\partial_y |\nabla|^{-1}, \partial_x |\nabla|^{-1} ) \theta \ \ \ \ \ (3)$

where ${\theta: {\bf R} \times {\bf R}^2 \rightarrow {\bf R}}$ is the active scalar, and ${u: {\bf R} \times {\bf R}^2 \rightarrow {\bf R}^2}$ is the velocity field. The SQG equations are often used as a toy model for the 3D Euler equations, as they share many of the same features (e.g. vortex stretching); see this paper of Constantin, Majda, and Tabak for more discussion (or this previous blog post).
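On a single Fourier mode the relation (3) becomes multiplication by an explicit symbol, which makes two of its features easy to see: the velocity is divergence-free, and the map from ${\theta}$ to ${u}$ is of order zero. The Python sketch below assumes the convention that ${\partial_x, \partial_y}$ act on the mode ${e^{i(kx+ly)}}$ as ${ik, il}$; the function names are hypothetical.

```python
import math

def sqg_velocity_symbol(k, l):
    """Symbol of theta -> u in (3) on the mode exp(i(kx + ly)), (k, l) != (0, 0)."""
    norm = math.hypot(k, l)                       # symbol of |nabla|
    return (-1j * l / norm, 1j * k / norm)        # (-d_y, d_x) |nabla|^{-1}

def divergence_symbol(k, l, u_hat):
    """Symbol of d_x u_x + d_y u_y on the same mode."""
    return 1j * k * u_hat[0] + 1j * l * u_hat[1]

for k, l in [(1.0, 0.0), (3.0, -4.0), (0.5, 2.5)]:
    u_hat = sqg_velocity_symbol(k, l)
    size = math.hypot(abs(u_hat[0]), abs(u_hat[1]))
    print((k, l), abs(divergence_symbol(k, l, u_hat)), size)
```

The divergence vanishes and the symbol has size one at every frequency, so it is homogeneous of degree zero.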

I recently found a more direct way to connect the two equations. We first recall that the Euler equations can be placed in vorticity-stream form by focusing on the vorticity ${\omega := \nabla \times u}$. Indeed, taking the curl of (1), we obtain the vorticity equation

$\displaystyle \partial_t \omega + (u \cdot \nabla) \omega = (\omega \cdot \nabla) u \ \ \ \ \ (4)$

while the velocity ${u}$ can be recovered from the vorticity via the Biot-Savart law

$\displaystyle u = (-\Delta)^{-1} \nabla \times \omega. \ \ \ \ \ (5)$
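The Biot-Savart law (5) can likewise be checked mode by mode. In the sketch below (Python, hypothetical names, and the same convention that ${\nabla}$ acts on ${e^{i \xi \cdot x}}$ as ${i\xi}$) a divergence-free amplitude ${\hat \omega}$ is built by projection, the velocity is recovered from (5), and its curl is compared with ${\hat \omega}$.

```python
def cross(u, v):
    """Cross product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    """Bilinear (unconjugated) dot product."""
    return sum(a * b for a, b in zip(u, v))

def curl_symbol(xi, v_hat):
    """nabla x v on the mode exp(i xi . x) becomes i xi x v_hat."""
    return tuple(1j * comp for comp in cross(xi, v_hat))

def biot_savart_symbol(xi, omega_hat):
    """The law (5) on a single mode: (-Delta)^{-1} is division by |xi|^2."""
    return tuple(comp / dot(xi, xi) for comp in curl_symbol(xi, omega_hat))

xi = (1.0, 2.0, -1.0)
raw = (0.3 + 1.0j, -0.7j, 2.0)                    # arbitrary amplitudes
# Remove the component along xi so that omega_hat is divergence-free.
scale = dot(xi, raw) / dot(xi, xi)
omega_hat = tuple(w - scale * x for w, x in zip(raw, xi))
u_hat = biot_savart_symbol(xi, omega_hat)

print(abs(dot(xi, u_hat)))                        # u is divergence-free
print(max(abs(c - w) for c, w in zip(curl_symbol(xi, u_hat), omega_hat)))
```

Both printed numbers should vanish up to rounding; the second relies on ${\omega}$ being divergence-free.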

The system (4), (5) has some features in common with the system (2), (3); in (2) it is a scalar field ${\theta}$ that is being transported by a divergence-free vector field ${u}$, which is a linear function of the scalar field as per (3), whereas in (4) it is a vector field ${\omega}$ that is being transported (in the Lie derivative sense) by a divergence-free vector field ${u}$, which is a linear function of the vector field as per (5). However, the system (4), (5) is in three dimensions whilst (2), (3) is in two spatial dimensions, the dynamical field is a scalar field ${\theta}$ for SQG and a vector field ${\omega}$ for Euler, and the relationship between the velocity field and the dynamical field is given by a zeroth order Fourier multiplier in (3) and a ${-1^{th}}$ order operator in (5).

However, we can make the two equations more closely resemble each other as follows. We first consider the generalisation

$\displaystyle \partial_t \omega + (u \cdot \nabla) \omega = (\omega \cdot \nabla) u \ \ \ \ \ (6)$

$\displaystyle u = T (-\Delta)^{-1} \nabla \times \omega \ \ \ \ \ (7)$

where ${T}$ is an invertible, self-adjoint, positive-definite zeroth order Fourier multiplier that maps divergence-free vector fields to divergence-free vector fields. The Euler equations then correspond to the case when ${T}$ is the identity operator. As discussed in this previous blog post (which used ${A}$ to denote the inverse of the operator denoted here as ${T}$), this generalised Euler system has many of the same features as the original Euler equation, such as a conserved Hamiltonian

$\displaystyle \frac{1}{2} \int_{{\bf R}^3} u \cdot T^{-1} u,$

the Kelvin circulation theorem, and conservation of helicity

$\displaystyle \int_{{\bf R}^3} \omega \cdot T^{-1} u.$

Also, if we require ${\omega}$ to be divergence-free at time zero, it remains divergence-free at all later times.

Let us consider “two-and-a-half-dimensional” solutions to the system (6), (7), in which ${u,\omega}$ do not depend on the vertical coordinate ${z}$, thus

$\displaystyle \omega(t,x,y,z) = \omega(t,x,y)$

and

$\displaystyle u(t,x,y,z) = u(t,x,y)$

but we allow the vertical components ${u_z, \omega_z}$ to be non-zero. For this to be consistent, we also require ${T}$ to commute with translations in the ${z}$ direction. As all derivatives in the ${z}$ direction now vanish, we can simplify (6) to

$\displaystyle D_t \omega = (\omega_x \partial_x + \omega_y \partial_y) u \ \ \ \ \ (8)$

where ${D_t}$ is the two-dimensional material derivative

$\displaystyle D_t := \partial_t + u_x \partial_x + u_y \partial_y.$

Also, the divergence-free nature of ${\omega,u}$ then becomes

$\displaystyle \partial_x \omega_x + \partial_y \omega_y = 0$

and

$\displaystyle \partial_x u_x + \partial_y u_y = 0. \ \ \ \ \ (9)$

In particular, we may (formally, at least) write

$\displaystyle (\omega_x, \omega_y) = (\partial_y \theta, -\partial_x \theta)$

for some scalar field ${\theta(t,x,y,z) = \theta(t,x,y)}$, so that (7) becomes

$\displaystyle u = T ( (- \Delta)^{-1} \partial_y \omega_z, - (-\Delta)^{-1} \partial_x \omega_z, \theta ). \ \ \ \ \ (10)$

The first two components of (8) become

$\displaystyle D_t \partial_y \theta = \partial_y \theta \partial_x u_x - \partial_x \theta \partial_y u_x$

$\displaystyle - D_t \partial_x \theta = \partial_y \theta \partial_x u_y - \partial_x \theta \partial_y u_y$

which rearranges using (9) to

$\displaystyle \partial_y D_t \theta = \partial_x D_t \theta = 0.$

Formally, we may integrate this system to obtain the transport equation

$\displaystyle D_t \theta = 0. \ \ \ \ \ (11)$

Finally, the last component of (8) is

$\displaystyle D_t \omega_z = \partial_y \theta \partial_x u_z - \partial_x \theta \partial_y u_z. \ \ \ \ \ (12)$

At this point, we make the following choice for ${T}$:

$\displaystyle T ( U_x, U_y, \theta ) = \alpha (U_x, U_y, \theta) + (-\partial_y |\nabla|^{-1} \theta, \partial_x |\nabla|^{-1} \theta, 0) \ \ \ \ \ (13)$

$\displaystyle + P( 0, 0, |\nabla|^{-1} (\partial_y U_x - \partial_x U_y) )$

where ${\alpha > 0}$ is a real constant and ${Pu := (-\Delta)^{-1} (\nabla \times (\nabla \times u))}$ is the Leray projection onto divergence-free vector fields. One can verify that for large enough ${\alpha}$, ${T}$ is a self-adjoint positive definite zeroth order Fourier multiplier from divergence-free vector fields to divergence-free vector fields. With this choice, we see from (10) that

$\displaystyle u_z = \alpha \theta - |\nabla|^{-1} \omega_z$

so that (12) simplifies to

$\displaystyle D_t \omega_z = - \partial_y \theta \partial_x |\nabla|^{-1} \omega_z + \partial_x \theta \partial_y |\nabla|^{-1} \omega_z.$

This implies (formally at least) that if ${\omega_z}$ vanishes at time zero, then it vanishes for all time. Setting ${\omega_z=0}$, we then have from (10) that

$\displaystyle (u_x,u_y,u_z) = (-\partial_y |\nabla|^{-1} \theta, \partial_x |\nabla|^{-1} \theta, \alpha \theta )$

and from (11) we then recover the SQG system (2), (3). To put it another way, if ${\theta(t,x,y)}$ and ${u(t,x,y)}$ solve the SQG system, then by setting

$\displaystyle \omega(t,x,y,z) := ( \partial_y \theta(t,x,y), -\partial_x \theta(t,x,y), 0 )$

$\displaystyle \tilde u(t,x,y,z) := ( u_x(t,x,y), u_y(t,x,y), \alpha \theta(t,x,y) )$

then ${\omega,\tilde u}$ solve the modified Euler system (6), (7) with ${T}$ given by (13).

We have ${T^{-1} \tilde u = (0, 0, \theta)}$, so the Hamiltonian ${\frac{1}{2} \int_{{\bf R}^3} \tilde u \cdot T^{-1} \tilde u}$ for the modified Euler system in this case is formally a scalar multiple of the conserved quantity ${\int_{{\bf R}^2} \theta^2}$. The momentum ${\int_{{\bf R}^3} x \cdot \tilde u}$ for the modified Euler system is formally a scalar multiple of the conserved quantity ${\int_{{\bf R}^2} \theta}$, while the vortex stream lines that are preserved by the modified Euler flow become the level sets of the active scalar that are preserved by the SQG flow. On the other hand, the helicity ${\int_{{\bf R}^3} \omega \cdot T^{-1} \tilde u}$ vanishes, and other conserved quantities for SQG (such as the Hamiltonian ${\int_{{\bf R}^2} \theta |\nabla|^{-1} \theta}$) do not seem to correspond to conserved quantities of the modified Euler system. This is not terribly surprising; a low-dimensional flow may well have a richer family of conservation laws than the higher-dimensional system that it is embedded in.
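As a quick check of the identity ${T^{-1} \tilde u = (0,0,\theta)}$ (a sketch, at the same formal level as the rest of this discussion), one can apply (13) directly to the field ${(0,0,\theta)}$. The first two components of this field vanish, so the term involving ${P}$ drops out, leaving

$\displaystyle T(0,0,\theta) = \alpha (0,0,\theta) + (-\partial_y |\nabla|^{-1} \theta, \partial_x |\nabla|^{-1} \theta, 0) = (u_x, u_y, \alpha \theta) = \tilde u,$

and hence the Hamiltonian density is

$\displaystyle \tilde u \cdot T^{-1} \tilde u = \alpha \theta^2.$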

The wave equation is usually expressed in the form

$\displaystyle \partial_{tt} u - \Delta u = 0$

where ${u \colon {\bf R} \times {\bf R}^d \rightarrow {\bf C}}$ is a function of both time ${t \in {\bf R}}$ and space ${x \in {\bf R}^d}$, with ${\Delta}$ being the Laplacian operator. One can generalise this equation in a number of ways, for instance by replacing the spatial domain ${{\bf R}^d}$ with some other manifold and replacing the Laplacian ${\Delta}$ with the Laplace-Beltrami operator or adding lower order terms (such as a potential, or a coupling with a magnetic field). But for the sake of discussion let us work with the classical wave equation on ${{\bf R}^d}$. We will work formally in this post, being unconcerned with issues of convergence, justifying interchange of integrals, derivatives, or limits, etc. One then has a conserved energy

$\displaystyle \int_{{\bf R}^d} \frac{1}{2} |\nabla u(t,x)|^2 + \frac{1}{2} |\partial_t u(t,x)|^2\ dx$

which we can rewrite using integration by parts and the ${L^2}$ inner product ${\langle, \rangle}$ on ${{\bf R}^d}$ as

$\displaystyle \frac{1}{2} \langle -\Delta u(t), u(t) \rangle + \frac{1}{2} \langle \partial_t u(t), \partial_t u(t) \rangle.$

A key feature of the wave equation is finite speed of propagation: if, at time ${t=0}$ (say), the initial position ${u(0)}$ and initial velocity ${\partial_t u(0)}$ are both supported in a ball ${B(x_0,R) := \{ x \in {\bf R}^d: |x-x_0| \leq R \}}$, then at any later time ${t>0}$, the position ${u(t)}$ and velocity ${\partial_t u(t)}$ are supported in the larger ball ${B(x_0,R+t)}$. This can be seen for instance (formally, at least) by inspecting the exterior energy

$\displaystyle \int_{|x-x_0| > R+t} \frac{1}{2} |\nabla u(t,x)|^2 + \frac{1}{2} |\partial_t u(t,x)|^2\ dx$

and observing (after some integration by parts and differentiation under the integral sign) that it is non-increasing in time, non-negative, and vanishing at time ${t=0}$.

The wave equation is second order in time, but one can turn it into a first order system by working with the pair ${(u(t),v(t))}$ rather than just the single field ${u(t)}$, where ${v(t) := \partial_t u(t)}$ is the velocity field. The system is then

$\displaystyle \partial_t u(t) = v(t)$

$\displaystyle \partial_t v(t) = \Delta u(t)$

and the conserved energy is now

$\displaystyle \frac{1}{2} \langle -\Delta u(t), u(t) \rangle + \frac{1}{2} \langle v(t), v(t) \rangle. \ \ \ \ \ (1)$

Finite speed of propagation then tells us that if ${u(0),v(0)}$ are both supported on ${B(x_0,R)}$, then ${u(t),v(t)}$ are supported on ${B(x_0,R+t)}$ for all ${t>0}$. One also has time reversal symmetry: if ${t \mapsto (u(t),v(t))}$ is a solution, then ${t \mapsto (u(-t), -v(-t))}$ is a solution also, thus for instance one can establish an analogue of finite speed of propagation for negative times ${t<0}$ using this symmetry.

If one has an eigenfunction

$\displaystyle -\Delta \phi = \lambda^2 \phi$

of the Laplacian, then we have the explicit solutions

$\displaystyle u(t) = e^{\pm it \lambda} \phi$

$\displaystyle v(t) = \pm i \lambda e^{\pm it \lambda} \phi$

of the wave equation, which formally can be used to construct all other solutions via the principle of superposition.

When one has vanishing initial velocity ${v(0)=0}$, the solution ${u(t)}$ is given via functional calculus by

$\displaystyle u(t) = \cos(t \sqrt{-\Delta}) u(0)$

and the propagator ${\cos(t \sqrt{-\Delta})}$ can be expressed as the average of half-wave operators:

$\displaystyle \cos(t \sqrt{-\Delta}) = \frac{1}{2} ( e^{it\sqrt{-\Delta}} + e^{-it\sqrt{-\Delta}} ).$

One can view ${\cos(t \sqrt{-\Delta} )}$ as a minor of the full wave propagator

$\displaystyle U(t) := \exp \begin{pmatrix} 0 & t \\ t\Delta & 0 \end{pmatrix}$

$\displaystyle = \begin{pmatrix} \cos(t \sqrt{-\Delta}) & \frac{\sin(t\sqrt{-\Delta})}{\sqrt{-\Delta}} \\ -\sin(t\sqrt{-\Delta}) \sqrt{-\Delta} & \cos(t \sqrt{-\Delta} ) \end{pmatrix}$

which is unitary with respect to the energy form (1), and is the fundamental solution to the wave equation in the sense that

$\displaystyle \begin{pmatrix} u(t) \\ v(t) \end{pmatrix} = U(t) \begin{pmatrix} u(0) \\ v(0) \end{pmatrix}. \ \ \ \ \ (2)$

Viewing the contraction ${\cos(t\sqrt{-\Delta})}$ as a minor of a unitary operator is an instance of the “dilation trick”.
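The entries of the propagator ${U(t)}$ can be sanity-checked on a single eigenmode ${-\Delta \phi = \lambda^2 \phi}$, where ${\Delta}$ acts as the scalar ${-\lambda^2}$ and ${U(t)}$ becomes a ${2 \times 2}$ matrix. The following is only a minimal sketch: the helper names (`mat_exp`, `propagator_series`, `propagator_closed_form`, `energy`) are hypothetical, and the matrix exponential is approximated by a truncated Taylor series rather than computed exactly.

```python
import math

def mat_mul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_exp(m, terms=60):
    """Truncated Taylor series for exp(m), with m a 2x2 matrix."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for n in range(1, terms):
        # term now holds m^n / n!
        term = [[x / n for x in row] for row in mat_mul(term, m)]
        result = [[result[i][j] + term[i][j] for j in range(2)]
                  for i in range(2)]
    return result

def propagator_series(t, lam):
    """exp of [[0, t], [t*Delta, 0]] on a mode where Delta acts as -lam**2."""
    return mat_exp([[0.0, t], [-t * lam ** 2, 0.0]])

def propagator_closed_form(t, lam):
    """Harmonic-oscillator solution of u'' = -lam**2 * u, written as a matrix."""
    c, s = math.cos(t * lam), math.sin(t * lam)
    return [[c, s / lam], [-lam * s, c]]

def energy(lam, u, v):
    """Energy form (1) restricted to the mode: (lam^2 u^2 + v^2) / 2."""
    return 0.5 * lam ** 2 * u ** 2 + 0.5 * v ** 2
```

Comparing `propagator_series(t, lam)` with `propagator_closed_form(t, lam)` entry by entry, and evaluating `energy` before and after applying the matrix to a pair ${(u(0),v(0))}$, illustrates both the closed form and the unitarity with respect to (1) on this one mode.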

It turns out (as I learned from Yuval Peres) that there is a useful discrete analogue of the wave equation (and of all of the above facts), in which the time variable ${t}$ now lives on the integers ${{\bf Z}}$ rather than on ${{\bf R}}$, and the spatial domain can be replaced by discrete domains also (such as graphs). Formally, the system is now of the form

$\displaystyle u(t+1) = P u(t) + v(t) \ \ \ \ \ (3)$

$\displaystyle v(t+1) = P v(t) - (1-P^2) u(t)$

where ${t}$ is now an integer, ${u(t), v(t)}$ take values in some Hilbert space (e.g. ${\ell^2}$ functions on a graph ${G}$), and ${P}$ is some operator on that Hilbert space (which in applications will usually be a self-adjoint contraction). To connect this with the classical wave equation, let us first consider a rescaling of this system

$\displaystyle u(t+\varepsilon) = P_\varepsilon u(t) + \varepsilon v(t)$

$\displaystyle v(t+\varepsilon) = P_\varepsilon v(t) - \frac{1}{\varepsilon} (1-P_\varepsilon^2) u(t)$

where ${\varepsilon>0}$ is a small parameter (representing the discretised time step), ${t}$ now takes values in the integer multiples ${\varepsilon {\bf Z}}$ of ${\varepsilon}$, and ${P_\varepsilon}$ is the wave propagator operator ${P_\varepsilon := \cos( \varepsilon \sqrt{-\Delta} )}$ or the heat propagator ${P_\varepsilon := \exp( \varepsilon^2 \Delta/2 )}$ (the two operators are different, but agree up to errors of fourth order in ${\varepsilon}$). One can then formally verify that the wave equation emerges from this rescaled system in the limit ${\varepsilon \rightarrow 0}$. (Thus, ${P}$ is not exactly the direct analogue of the Laplacian ${\Delta}$, but can be viewed as something like ${P_\varepsilon = 1 + \frac{\varepsilon^2}{2} \Delta + O( \varepsilon^4 )}$ in the case of small ${\varepsilon}$, or ${P = 1 + \frac{1}{2}\Delta + O(\Delta^2)}$ if we are not rescaling to the small ${\varepsilon}$ case. The operator ${P}$ is sometimes known as the diffusion operator.)
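The emergence of the wave equation in the limit ${\varepsilon \rightarrow 0}$ can be illustrated numerically on a single eigenmode ${-\Delta \phi = \lambda^2 \phi}$, on which the wave-type choice of ${P_\varepsilon}$ acts as the scalar ${\cos(\varepsilon \lambda)}$ and the heat-type choice acts as ${\exp(-\varepsilon^2 \lambda^2/2)}$. The sketch below (with hypothetical helper names `evolve` and `error_at_time`, and with initial data ${u(0)=1}$, ${v(0)=0}$ chosen purely for illustration) iterates the rescaled system and compares against the exact solution ${\cos(t\lambda)}$.

```python
import math

def evolve(p_eps, eps, steps, u0=1.0, v0=0.0):
    """Iterate the rescaled system on one mode, where P_eps is the scalar p_eps."""
    u, v = u0, v0
    for _ in range(steps):
        u, v = (p_eps * u + eps * v,
                p_eps * v - (1.0 - p_eps ** 2) * u / eps)
    return u, v

def error_at_time(lam, t_final, eps, kind):
    """|u(t_final) - cos(t_final * lam)| for the scheme with time step eps."""
    if kind == "wave":
        p_eps = math.cos(eps * lam)
    elif kind == "heat":
        p_eps = math.exp(-(eps * lam) ** 2 / 2.0)
    else:
        raise ValueError("kind must be 'wave' or 'heat'")
    steps = round(t_final / eps)
    u, _ = evolve(p_eps, eps, steps)
    return abs(u - math.cos(steps * eps * lam))
```

With the wave-type multiplier the scheme reproduces ${\cos(t\lambda)}$ up to rounding error, while with the heat-type multiplier the error should shrink as ${\varepsilon}$ decreases, in line with the formal limit.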

Assuming ${P}$ is self-adjoint, solutions to the system (3) formally conserve the energy

$\displaystyle \frac{1}{2} \langle (1-P^2) u(t), u(t) \rangle + \frac{1}{2} \langle v(t), v(t) \rangle. \ \ \ \ \ (4)$

This energy is positive semi-definite if ${P}$ is a contraction. We have the same time reversal symmetry as before: if ${t \mapsto (u(t),v(t))}$ solves the system (3), then so does ${t \mapsto (u(-t), -v(-t))}$. If one has an eigenfunction

$\displaystyle P \phi = \cos(\lambda) \phi$

to the operator ${P}$, then one has an explicit solution

$\displaystyle u(t) = e^{\pm it \lambda} \phi$

$\displaystyle v(t) = \pm i \sin(\lambda) e^{\pm it \lambda} \phi$

to (3), and (in principle at least) this generates all other solutions via the principle of superposition.
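The explicit solution and the conservation of the energy (4) can both be checked on a single eigenmode, on which ${P}$ acts as the real scalar ${\cos(\lambda)}$. This is a minimal sketch only; the function names are hypothetical and the test values of ${\lambda}$ and of the initial data are arbitrary.

```python
import cmath
import math

def step(p, u, v):
    """One step of system (3) when P acts as the real scalar p."""
    return p * u + v, p * v - (1.0 - p * p) * u

def energy(p, u, v):
    """Energy (4) on a single mode."""
    return 0.5 * (1.0 - p * p) * abs(u) ** 2 + 0.5 * abs(v) ** 2

def max_deviation_from_plane_wave(lam, steps):
    """Largest gap between the iterated system and the explicit solution."""
    p = math.cos(lam)
    u, v = 1.0 + 0.0j, 1j * math.sin(lam)
    worst = 0.0
    for t in range(1, steps + 1):
        u, v = step(p, u, v)
        phase = cmath.exp(1j * t * lam)
        worst = max(worst, abs(u - phase),
                    abs(v - 1j * math.sin(lam) * phase))
    return worst

def energy_drift(p, u, v, steps):
    """Largest change in the energy (4) along a trajectory."""
    e0 = energy(p, u, v)
    worst = 0.0
    for _ in range(steps):
        u, v = step(p, u, v)
        worst = max(worst, abs(energy(p, u, v) - e0))
    return worst
```

Both `max_deviation_from_plane_wave` and `energy_drift` should return values at the level of floating-point rounding error.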

Finite speed of propagation is a lot easier in the discrete setting, though one has to offset the support of the “velocity” field ${v}$ by one unit. Suppose we know that ${P}$ has unit speed in the sense that whenever ${f}$ is supported in a ball ${B(x,R)}$, then ${Pf}$ is supported in the ball ${B(x,R+1)}$. Then an easy induction shows that if ${u(0), v(0)}$ are supported in ${B(x_0,R), B(x_0,R+1)}$ respectively, then ${u(t), v(t)}$ are supported in ${B(x_0,R+t), B(x_0, R+t+1)}$.
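The induction can be watched in action on the integers ${{\bf Z}}$, taking ${P}$ to be the nearest-neighbour averaging operator (an assumed example of a unit speed operator, not fixed by the text). The sketch below stores finitely supported functions as dictionaries with exact rational values, and records the radius of the supports of ${u(t)}$ and ${v(t)}$ starting from ${u(0) = \delta_0}$, ${v(0) = 0}$; all names are hypothetical.

```python
from fractions import Fraction

def apply_P(f):
    """Averaging operator on Z: (Pf)(x) = (f(x-1) + f(x+1)) / 2."""
    g = {}
    for x, val in f.items():
        for y in (x - 1, x + 1):
            g[y] = g.get(y, Fraction(0)) + val / 2
    return g

def combine(*weighted):
    """Linear combination of functions given as (coefficient, dict) pairs."""
    out = {}
    for coeff, f in weighted:
        for x, val in f.items():
            out[x] = out.get(x, Fraction(0)) + coeff * val
    return out

def step(u, v):
    """One step of system (3): u' = Pu + v, v' = Pv - (1 - P^2)u."""
    Pu = apply_P(u)
    return (combine((1, Pu), (1, v)),
            combine((1, apply_P(v)), (-1, u), (1, apply_P(Pu))))

def radius(f):
    """Largest |x| with f(x) != 0, or -1 if f vanishes identically."""
    return max((abs(x) for x, val in f.items() if val != 0), default=-1)

def support_radii(steps):
    """Radii of supp u(t), supp v(t) for u(0) = delta_0, v(0) = 0."""
    u, v = {0: Fraction(1)}, {}
    radii = [(radius(u), radius(v))]
    for _ in range(steps):
        u, v = step(u, v)
        radii.append((radius(u), radius(v)))
    return radii
```

Each entry of `support_radii(steps)` at index ${t}$ should be bounded by ${(t, t+1)}$, matching the offset by one unit for the “velocity” field.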

The fundamental solution ${U(t) = U^t}$ to the discretised wave equation (3), in the sense of (2), is given by the formula

$\displaystyle U(t) = U^t = \begin{pmatrix} P & 1 \\ P^2-1 & P \end{pmatrix}^t$

$\displaystyle = \begin{pmatrix} T_t(P) & U_{t-1}(P) \\ (P^2-1) U_{t-1}(P) & T_t(P) \end{pmatrix}$

where ${T_t}$ and ${U_t}$ are the Chebyshev polynomials of the first and second kind, thus

$\displaystyle T_t( \cos \theta ) = \cos(t\theta)$

and

$\displaystyle U_t( \cos \theta ) = \frac{\sin((t+1)\theta)}{\sin \theta}.$

In particular, ${P}$ is now a minor of ${U(1) = U}$, and can also be viewed as an average of ${U}$ with its inverse ${U^{-1}}$:

$\displaystyle \begin{pmatrix} P & 0 \\ 0 & P \end{pmatrix} = \frac{1}{2} (U + U^{-1}). \ \ \ \ \ (5)$

As before, ${U}$ is unitary with respect to the energy form (4), so this is another instance of the dilation trick in action. The powers ${P^n}$ and ${U^n}$ are discrete analogues of the heat propagators ${e^{t\Delta/2}}$ and wave propagators ${U(t)}$ respectively.
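The Chebyshev formula for ${U^t}$ and the averaging identity (5) can be checked in the scalar case, replacing the operator ${P}$ by a number ${p = \cos \theta}$ with ${|p| < 1}$. This is a sketch under that simplifying assumption, with hypothetical helper names.

```python
import math

def mat_mul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_pow(m, t):
    """m raised to the non-negative integer power t."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(t):
        result = mat_mul(result, m)
    return result

def discrete_propagator(p):
    """The matrix U from the text, with the operator P replaced by the scalar p."""
    return [[p, 1.0], [p * p - 1.0, p]]

def chebyshev_form(p, t):
    """[[T_t(p), U_{t-1}(p)], [(p^2-1) U_{t-1}(p), T_t(p)]] for |p| < 1, t >= 1."""
    theta = math.acos(p)
    T_t = math.cos(t * theta)
    U_prev = math.sin(t * theta) / math.sin(theta)
    return [[T_t, U_prev], [(p * p - 1.0) * U_prev, T_t]]

def average_with_inverse(p):
    """(U + U^{-1}) / 2, using det U = 1 to write down the inverse."""
    U = discrete_propagator(p)
    U_inv = [[p, -1.0], [1.0 - p * p, p]]
    return [[(U[i][j] + U_inv[i][j]) / 2.0 for j in range(2)] for i in range(2)]
```

Here `mat_pow(discrete_propagator(p), t)` should agree with `chebyshev_form(p, t)`, and `average_with_inverse(p)` should be ${p}$ times the identity matrix.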

One nice application of all this formalism, which I learned from Yuval Peres, is the Varopoulos-Carne inequality:

Theorem 1 (Varopoulos-Carne inequality) Let ${G}$ be a (possibly infinite) regular graph, let ${n \geq 1}$, and let ${x, y}$ be vertices in ${G}$. Then the probability that the simple random walk at ${x}$ lands at ${y}$ at time ${n}$ is at most ${2 \exp( - d(x,y)^2 / 2n )}$, where ${d}$ is the graph distance.

This general inequality is quite sharp, as one can see using the standard Cayley graph on the integers ${{\bf Z}}$. Very roughly speaking, it asserts that on a regular graph of reasonably controlled growth (e.g. polynomial growth), random walks of length ${n}$ concentrate on the ball of radius ${O(\sqrt{n})}$ or so centred at the origin of the random walk.
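On the integers ${{\bf Z}}$ the random walk probabilities are explicit binomial expressions, so the inequality can be compared directly with the truth. The following sketch (hypothetical function names; the range of ${n}$ tested is arbitrary) computes the largest ratio between the exact probability and the Varopoulos-Carne bound.

```python
import math

def walk_probability(n, d):
    """P(S_n = d) for the simple random walk on Z started at 0."""
    if abs(d) > n or (n + d) % 2:
        return 0.0
    return math.comb(n, (n + d) // 2) / 2 ** n

def varopoulos_carne_bound(n, d):
    """The upper bound 2 exp(-d^2 / 2n) from Theorem 1."""
    return 2.0 * math.exp(-d * d / (2.0 * n))

def worst_ratio(max_n):
    """Largest value of probability / bound over 1 <= n <= max_n, 0 <= d <= n."""
    return max(walk_probability(n, d) / varopoulos_carne_bound(n, d)
               for n in range(1, max_n + 1) for d in range(n + 1))
```

By Theorem 1 the value of `worst_ratio(max_n)` can never exceed one; printing the individual ratios for ${d}$ of size comparable to ${\sqrt{n}}$ gives a feel for how much room is left in the bound.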

Proof: Let ${P \colon \ell^2(G) \rightarrow \ell^2(G)}$ be the diffusion operator of the simple random walk (the normalised adjacency operator of ${G}$), thus

$\displaystyle Pf(x) = \frac{1}{D} \sum_{y \sim x} f(y)$

for any ${f \in \ell^2(G)}$, where ${D}$ is the degree of the regular graph and the sum is over the ${D}$ vertices ${y}$ that are adjacent to ${x}$. This is a contraction of unit speed, and the probability that the random walk at ${x}$ lands at ${y}$ at time ${n}$ is

$\displaystyle \langle P^n \delta_x, \delta_y \rangle$

where ${\delta_x, \delta_y}$ are the Dirac deltas at ${x,y}$. Using (5), we can rewrite this as

$\displaystyle \langle (\frac{1}{2} (U + U^{-1}))^n \begin{pmatrix} 0 \\ \delta_x\end{pmatrix}, \begin{pmatrix} 0 \\ \delta_y\end{pmatrix} \rangle$

where we are now using the energy form (4). We can write

$\displaystyle (\frac{1}{2} (U + U^{-1}))^n = {\bf E} U^{S_n}$

where ${S_n}$ is the simple random walk of length ${n}$ on the integers, that is to say ${S_n = \xi_1 + \dots + \xi_n}$ where ${\xi_1,\dots,\xi_n = \pm 1}$ are independent uniform Bernoulli signs. Thus we wish to show that

$\displaystyle {\bf E} \langle U^{S_n} \begin{pmatrix} 0 \\ \delta_x\end{pmatrix}, \begin{pmatrix} 0 \\ \delta_y\end{pmatrix} \rangle \leq 2 \exp(-d(x,y)^2 / 2n ).$

By finite speed of propagation, the inner product here vanishes if ${|S_n| < d(x,y)}$. For ${|S_n| \geq d(x,y)}$ we can use Cauchy-Schwarz and the unitary nature of ${U}$ to bound the inner product by ${1}$. Thus the left-hand side may be upper bounded by

$\displaystyle {\bf P}( |S_n| \geq d(x,y) )$

and the claim now follows from the Chernoff inequality. $\Box$
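Two steps of this proof can be sanity-checked numerically. On a single eigenmode, ${U}$ acts as a phase ${e^{i\theta}}$ and ${\frac{1}{2}(U+U^{-1})}$ as ${\cos \theta}$, so the identity ${(\frac{1}{2}(U+U^{-1}))^n = {\bf E} U^{S_n}}$ reduces to ${\cos^n \theta = {\bf E} \cos(S_n \theta)}$; and the final Chernoff-type step is the bound ${{\bf P}(|S_n| \geq d) \leq 2 \exp(-d^2/2n)}$. The sketch below (hypothetical names, arbitrary test ranges) evaluates both sides exactly from binomial coefficients.

```python
import math

def expected_cos(n, theta):
    """E cos(S_n * theta) for the simple random walk S_n on Z."""
    return sum(math.comb(n, k) * math.cos((2 * k - n) * theta)
               for k in range(n + 1)) / 2 ** n

def tail_probability(n, d):
    """P(|S_n| >= d), computed exactly from binomial coefficients."""
    favourable = sum(math.comb(n, k) for k in range(n + 1)
                     if abs(2 * k - n) >= d)
    return favourable / 2 ** n
```

One expects `expected_cos(n, theta)` to agree with `math.cos(theta) ** n` up to rounding, and `tail_probability(n, d)` to stay below ${2 \exp(-d^2/2n)}$.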

This inequality has many applications, particularly with regard to relating the entropy, mixing time, and concentration of random walks with volume growth of balls; see this text of Lyons and Peres for some examples.

For the sake of comparison, here is a continuous counterpart to the Varopoulos-Carne inequality:

Theorem 2 (Continuous Varopoulos-Carne inequality) Let ${t > 0}$, and let ${f,g \in L^2({\bf R}^d)}$ be supported on compact sets ${F,G}$ respectively. Then

$\displaystyle |\langle e^{t\Delta/2} f, g \rangle| \leq \sqrt{\frac{2t}{\pi d(F,G)^2}} \exp( - d(F,G)^2 / 2t ) \|f\|_{L^2} \|g\|_{L^2}$

where ${d(F,G)}$ is the Euclidean distance between ${F}$ and ${G}$.

Proof: By Fourier inversion one has

$\displaystyle e^{-t\xi^2/2} = \frac{1}{\sqrt{2\pi t}} \int_{\bf R} e^{-s^2/2t} e^{is\xi}\ ds$

$\displaystyle = \sqrt{\frac{2}{\pi t}} \int_0^\infty e^{-s^2/2t} \cos(s \xi )\ ds$

for any real ${\xi}$, and thus

$\displaystyle \langle e^{t\Delta/2} f, g\rangle = \sqrt{\frac{2}{\pi t}} \int_0^\infty e^{-s^2/2t} \langle \cos(s \sqrt{-\Delta} ) f, g \rangle\ ds.$

By finite speed of propagation, the inner product ${\langle \cos(s \sqrt{-\Delta} ) f, g \rangle}$ vanishes when ${s < d(F,G)}$; otherwise, we can use Cauchy-Schwarz and the contractive nature of ${\cos(s \sqrt{-\Delta} )}$ to bound this inner product by ${\|f\|_{L^2} \|g\|_{L^2}}$. Thus

$\displaystyle |\langle e^{t\Delta/2} f, g\rangle| \leq \sqrt{\frac{2}{\pi t}} \|f\|_{L^2} \|g\|_{L^2} \int_{d(F,G)}^\infty e^{-s^2/2t}\ ds.$

Bounding ${e^{-s^2/2t}}$ by ${e^{-d(F,G)^2/2t} e^{-d(F,G) (s-d(F,G))/t}}$, we obtain the claim. $\Box$

Observe that the argument is quite general and can be applied for instance to other Riemannian manifolds than ${{\bf R}^d}$.
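The last step of the proof of Theorem 2 is the Gaussian tail estimate ${\int_d^\infty e^{-s^2/2t}\ ds \leq \frac{t}{d} e^{-d^2/2t}}$ with ${d = d(F,G)}$. The sketch below (hypothetical function names) evaluates the tail exactly via the complementary error function and compares it with this bound and with the constant appearing in Theorem 2.

```python
import math

def gaussian_tail(d, t):
    """Integral of exp(-s^2 / 2t) over s >= d, written in terms of erfc."""
    return math.sqrt(math.pi * t / 2.0) * math.erfc(d / math.sqrt(2.0 * t))

def tail_upper_bound(d, t):
    """(t / d) exp(-d^2 / 2t), from exp(-s^2/2t) <= exp(-d^2/2t) exp(-d(s-d)/t)."""
    return (t / d) * math.exp(-d * d / (2.0 * t))

def theorem_constant(d, t):
    """sqrt(2t / (pi d^2)) exp(-d^2 / 2t), the constant in Theorem 2."""
    return math.sqrt(2.0 * t / (math.pi * d * d)) * math.exp(-d * d / (2.0 * t))
```

Multiplying `gaussian_tail(d, t)` by ${\sqrt{2/\pi t}}$ should give a quantity no larger than `theorem_constant(d, t)`, for any ${d, t > 0}$.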

Many fluid equations are expected to exhibit turbulence in their solutions, in which a significant portion of their energy ends up in high frequency modes. A typical example arises from the three-dimensional periodic Navier-Stokes equations

$\displaystyle \partial_t u + u \cdot \nabla u = \nu \Delta u + \nabla p + f$

$\displaystyle \nabla \cdot u = 0$

where ${u: {\bf R} \times {\bf R}^3/{\bf Z}^3 \rightarrow {\bf R}^3}$ is the velocity field, ${f: {\bf R} \times {\bf R}^3/{\bf Z}^3 \rightarrow {\bf R}^3}$ is a forcing term, ${p: {\bf R} \times {\bf R}^3/{\bf Z}^3 \rightarrow {\bf R}}$ is a pressure field, and ${\nu > 0}$ is the viscosity. To study the dynamics of energy for this system, we first pass to the Fourier transform

$\displaystyle \hat u(t,k) := \int_{{\bf R}^3/{\bf Z}^3} u(t,x) e^{-2\pi i k \cdot x}\ dx$

so that the system becomes

$\displaystyle \partial_t \hat u(t,k) + 2\pi \sum_{k = k_1 + k_2} (\hat u(t,k_1) \cdot ik_2) \hat u(t,k_2) =$

$\displaystyle - 4\pi^2 \nu |k|^2 \hat u(t,k) + 2\pi ik \hat p(t,k) + \hat f(t,k) \ \ \ \ \ (1)$

$\displaystyle k \cdot \hat u(t,k) = 0.$

We may normalise ${u}$ (and ${f}$) to have mean zero, so that ${\hat u(t,0)=0}$. Then we introduce the dyadic energies

$\displaystyle E_N(t) := \sum_{|k| \sim N} |\hat u(t,k)|^2$

where ${N \geq 1}$ ranges over the powers of two, and ${|k| \sim N}$ is shorthand for ${N \leq |k| < 2N}$. Taking the inner product of (1) with ${\hat u(t,k)}$, we obtain the energy flow equation

$\displaystyle \partial_t E_N = \sum_{N_1,N_2} \Pi_{N,N_1,N_2} - D_N + F_N \ \ \ \ \ (2)$

where ${N_1,N_2}$ range over powers of two, ${\Pi_{N,N_1,N_2}}$ is the energy flow rate

$\displaystyle \Pi_{N,N_1,N_2} := -2\pi \sum_{k=k_1+k_2: |k| \sim N, |k_1| \sim N_1, |k_2| \sim N_2}$

$\displaystyle (\hat u(t,k_1) \cdot ik_2) (\hat u(t,k) \cdot \hat u(t,k_2)),$

${D_N}$ is the energy dissipation rate

$\displaystyle D_N := 4\pi^2 \nu \sum_{|k| \sim N} |k|^2 |\hat u(t,k)|^2$

and ${F_N}$ is the energy injection rate

$\displaystyle F_N := \sum_{|k| \sim N} \hat u(t,k) \cdot \hat f(t,k).$
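To make the dyadic bookkeeping concrete, here is a minimal sketch of how the energies ${E_N}$ could be tabulated from a finite collection of Fourier coefficients at a fixed time. The data layout (a dictionary from integer frequency vectors ${k}$ to coefficient vectors ${\hat u(k)}$) and the function names are assumptions made for illustration.

```python
from collections import defaultdict

def dyadic_scale(k):
    """The power of two N with N <= |k| < 2N, for a non-zero integer vector k."""
    norm_sq = sum(c * c for c in k)
    N = 1
    # Compare squared lengths so that only integer arithmetic is used.
    while (2 * N) ** 2 <= norm_sq:
        N *= 2
    return N

def dyadic_energies(u_hat):
    """E_N = sum of |u_hat(k)|^2 over N <= |k| < 2N, as a dict keyed by N."""
    energies = defaultdict(float)
    for k, coeff in u_hat.items():
        if not any(k):
            continue  # the zero mode is dropped by the mean-zero normalisation
        energies[dyadic_scale(k)] += sum(abs(c) ** 2 for c in coeff)
    return dict(energies)
```

For instance, a coefficient at ${k = (0,3,0)}$ contributes to ${E_2}$, since ${2 \leq |k| < 4}$.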

The Navier-Stokes equations are notoriously difficult to solve in general. Despite this, Kolmogorov in 1941 was able to give a convincing heuristic argument for what the distribution of the dyadic energies ${E_N}$ should become over long times, assuming that some sort of distributional steady state is reached. It is common to present this argument in the form of dimensional analysis, but one can also give a more “first principles” form of Kolmogorov’s argument, which I will do here. Heuristically, one can divide the frequency scales ${N}$ into three regimes:

• The injection regime in which the energy injection rate ${F_N}$ dominates the right-hand side of (2);
• The energy flow regime in which the flow rates ${\Pi_{N,N_1,N_2}}$ dominate the right-hand side of (2); and
• The dissipation regime in which the dissipation ${D_N}$ dominates the right-hand side of (2).

If we assume a fairly steady and smooth forcing term ${f}$, then ${\hat f}$ will be supported on the low frequency modes ${k=O(1)}$, and so we heuristically expect the injection regime to consist of the low scales ${N=O(1)}$. Conversely, if we take the viscosity ${\nu}$ to be small, we expect the dissipation regime to only occur for very large frequencies ${N}$, with the energy flow regime occupying the intermediate frequencies.

We can heuristically predict the dividing line between the energy flow regime and the dissipation regime. Of all the flow rates ${\Pi_{N,N_1,N_2}}$, it turns out in practice that the terms in which ${N_1,N_2 = N+O(1)}$ (i.e., interactions between comparable scales, rather than widely separated scales) will dominate the other flow rates, so we will focus just on these terms. It is convenient to return to physical space, decomposing the velocity field ${u}$ into Littlewood-Paley components

$\displaystyle u_N(t,x) := \sum_{|k| \sim N} \hat u(t,k) e^{2\pi i k \cdot x}$

of the velocity field ${u(t,x)}$ at frequency ${N}$. By Plancherel’s theorem, this field will have an ${L^2}$ norm of ${E_N(t)^{1/2}}$, and as a naive model of turbulence we expect this field to be spread out more or less uniformly on the torus, so we have the heuristic

$\displaystyle |u_N(t,x)| = O( E_N(t)^{1/2} ),$

and a similar heuristic applied to ${\nabla u_N}$ gives

$\displaystyle |\nabla u_N(t,x)| = O( N E_N(t)^{1/2} ).$

(One can consider modifications of the Kolmogorov model in which ${u_N}$ is concentrated on a lower-dimensional subset of the three-dimensional torus, leading to some changes in the numerology below, but we will not consider such variants here.) Since

$\displaystyle \Pi_{N,N_1,N_2} = - \int_{{\bf R}^3/{\bf Z}^3} u_N \cdot ( (u_{N_1} \cdot \nabla) u_{N_2} )\ dx$

we thus arrive at the heuristic

$\displaystyle \Pi_{N,N_1,N_2} = O( N_2 E_N^{1/2} E_{N_1}^{1/2} E_{N_2}^{1/2} ).$

Of course, there is the possibility that due to significant cancellation, the energy flow is significantly less than ${O( N E_N(t)^{3/2} )}$, but we will assume that cancellation effects are not that significant, so that we typically have

$\displaystyle \Pi_{N,N_1,N_2} \sim N_2 E_N^{1/2} E_{N_1}^{1/2} E_{N_2}^{1/2} \ \ \ \ \ (3)$

or (assuming that ${E_N}$ does not oscillate too much in ${N}$, and ${N_1,N_2}$ are close to ${N}$)

$\displaystyle \Pi_{N,N_1,N_2} \sim N E_N^{3/2}.$

On the other hand, we clearly have

$\displaystyle D_N \sim \nu N^2 E_N.$

We thus expect to be in the dissipation regime when

$\displaystyle N \gtrsim \nu^{-1} E_N^{1/2} \ \ \ \ \ (4)$

and in the energy flow regime when

$\displaystyle 1 \lesssim N \lesssim \nu^{-1} E_N^{1/2}. \ \ \ \ \ (5)$

Now we study the energy flow regime further. We assume a “statistically scale-invariant” dynamics in this regime, in particular assuming a power law

$\displaystyle E_N \sim A N^{-\alpha} \ \ \ \ \ (6)$

for some ${A,\alpha > 0}$. From (3), we then expect an average asymptotic of the form

$\displaystyle \Pi_{N,N_1,N_2} \approx A^{3/2} c_{N,N_1,N_2} (N N_1 N_2)^{1/3 - \alpha/2} \ \ \ \ \ (7)$

for some structure constants ${c_{N,N_1,N_2} \sim 1}$ that depend on the exact nature of the turbulence; here we have replaced the factor ${N_2}$ by the comparable term ${(N N_1 N_2)^{1/3}}$ to make things more symmetric. In order to attain a steady state in the energy flow regime, we thus need a cancellation in the structure constants:

$\displaystyle \sum_{N_1,N_2} c_{N,N_1,N_2} (N N_1 N_2)^{1/3 - \alpha/2} \approx 0. \ \ \ \ \ (8)$

On the other hand, if one is assuming statistical scale invariance, we expect the structure constants to be scale-invariant (in the energy flow regime), in that

$\displaystyle c_{\lambda N, \lambda N_1, \lambda N_2} = c_{N,N_1,N_2} \ \ \ \ \ (9)$

for dyadic ${\lambda > 0}$. Also, since the Euler equations conserve energy, the energy flows ${\Pi_{N,N_1,N_2}}$ symmetrise to zero,

$\displaystyle \Pi_{N,N_1,N_2} + \Pi_{N,N_2,N_1} + \Pi_{N_1,N,N_2} + \Pi_{N_1,N_2,N} + \Pi_{N_2,N,N_1} + \Pi_{N_2,N_1,N} = 0,$

which from (7) suggests a similar cancellation among the structure constants

$\displaystyle c_{N,N_1,N_2} + c_{N,N_2,N_1} + c_{N_1,N,N_2} + c_{N_1,N_2,N} + c_{N_2,N,N_1} + c_{N_2,N_1,N} \approx 0.$

Combining this with the scale-invariance (9), we see that for fixed ${N}$, we may organise the structure constants ${c_{N,N_1,N_2}}$ for dyadic ${N_1,N_2}$ into sextuples which sum to zero (including some degenerate tuples of order less than six). This will automatically guarantee the cancellation (8) required for a steady state energy distribution, provided that

$\displaystyle \frac{1}{3} - \frac{\alpha}{2} = 0$

or in other words

$\displaystyle \alpha = \frac{2}{3};$

for any other value of ${\alpha}$, there is no particular reason to expect this cancellation (8) to hold. Thus we are led to the heuristic conclusion that the most stable power law distribution for the energies ${E_N}$ is the ${2/3}$ law

$\displaystyle E_N \sim A N^{-2/3} \ \ \ \ \ (10)$

or in terms of shell energies, we have the famous Kolmogorov 5/3 law

$\displaystyle \sum_{|k| = k_0 + O(1)} |\hat u(t,k)|^2 \sim A k_0^{-5/3}.$

Given that frequency interactions tend to cascade from low frequencies to high (if only because there are so many more high frequencies than low ones), the above analysis predicts a stabilising effect around this power law: scales at which a law (6) holds for some ${\alpha > 2/3}$ are likely to lose energy in the near-term, while scales at which a law (6) holds for some ${\alpha< 2/3}$ are conversely expected to gain energy, thus nudging the exponent of the power law towards ${2/3}$.

We can solve for ${A}$ in terms of energy dissipation as follows. If we let ${N_*}$ be the frequency scale demarcating the transition from the energy flow regime (5) to the dissipation regime (4), we have

$\displaystyle N_* \sim \nu^{-1} E_{N_*}^{1/2}$

and hence by (10)

$\displaystyle N_* \sim \nu^{-1} A^{1/2} N_*^{-1/3}.$

On the other hand, if we let ${\epsilon := D_{N_*}}$ be the energy dissipation at this scale ${N_*}$ (which we expect to be the dominant scale of energy dissipation), we have

$\displaystyle \epsilon \sim \nu N_*^2 E_{N_*} \sim \nu N_*^2 A N_*^{-2/3}.$

Some simple algebra then lets us solve for ${A}$ and ${N_*}$ as

$\displaystyle N_* \sim (\frac{\epsilon}{\nu^3})^{1/4}$

and

$\displaystyle A \sim \epsilon^{2/3}.$

Thus, we have the Kolmogorov prediction

$\displaystyle \sum_{|k| = k_0 + O(1)} |\hat u(t,k)|^2 \sim \epsilon^{2/3} k_0^{-5/3}$

for

$\displaystyle 1 \lesssim k_0 \lesssim (\frac{\epsilon}{\nu^3})^{1/4}$

with energy dissipation occurring at the high end ${k_0 \sim (\frac{\epsilon}{\nu^3})^{1/4}}$ of this scale, which is counterbalanced by the energy injection at the low end ${k_0 \sim 1}$ of the scale.
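The “simple algebra” behind these formulae can be checked numerically. Setting all implied constants to one (an assumption made purely for illustration, since the heuristic only determines quantities up to constants), the transition scale obeys ${N_* = \nu^{-1} E_{N_*}^{1/2}}$ and the dissipation obeys ${\epsilon = \nu N_*^2 E_{N_*}}$, with ${E_N = A N^{-2/3}}$. The sketch below (hypothetical function names) verifies that the stated values of ${N_*}$ and ${A}$ satisfy both relations.

```python
def kolmogorov_scales(eps, nu):
    """Return (N_star, A) from the dissipation rate eps and viscosity nu."""
    n_star = (eps / nu ** 3) ** 0.25
    amplitude = eps ** (2.0 / 3.0)
    return n_star, amplitude

def balance_ratios(eps, nu):
    """Ratios that equal one when both balance relations hold exactly."""
    n_star, amplitude = kolmogorov_scales(eps, nu)
    energy = amplitude * n_star ** (-2.0 / 3.0)   # E_N at N = N_star, by (10)
    transition = n_star / (energy ** 0.5 / nu)    # N_star vs nu^{-1} E^{1/2}
    dissipation = eps / (nu * n_star ** 2 * energy)  # eps vs nu N_star^2 E
    return transition, dissipation
```

Both entries of `balance_ratios(eps, nu)` should be equal to one up to rounding, for any positive `eps` and `nu`.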

As in the previous post, all computations here are at the formal level only.

In the previous blog post, the Euler equations for inviscid incompressible fluid flow were interpreted in a Lagrangian fashion, and then Noether’s theorem invoked to derive the known conservation laws for these equations. In a bit more detail: starting with Lagrangian space ${{\cal L} = ({\bf R}^n, \hbox{vol})}$ and Eulerian space ${{\cal E} = ({\bf R}^n, \eta, \hbox{vol})}$, we let ${M}$ be the space of volume-preserving, orientation-preserving maps ${\Phi: {\cal L} \rightarrow {\cal E}}$ from Lagrangian space to Eulerian space. Given a curve ${\Phi: {\bf R} \rightarrow M}$, we can define the Lagrangian velocity field ${\dot \Phi: {\bf R} \times {\cal L} \rightarrow T{\cal E}}$ as the time derivative of ${\Phi}$, and the Eulerian velocity field ${u := \dot \Phi \circ \Phi^{-1}: {\bf R} \times {\cal E} \rightarrow T{\cal E}}$. The volume-preserving nature of ${\Phi}$ ensures that ${u}$ is a divergence-free vector field:

$\displaystyle \nabla \cdot u = 0. \ \ \ \ \ (1)$

If we formally define the functional

$\displaystyle J[\Phi] := \frac{1}{2} \int_{\bf R} \int_{{\cal E}} |u(t,x)|^2\ dx dt = \frac{1}{2} \int_{\bf R} \int_{{\cal L}} |\dot \Phi(t,x)|^2\ dx dt$

then one can show that the critical points of this functional (with appropriate boundary conditions) obey the Euler equations

$\displaystyle [\partial_t + u \cdot \nabla] u = - \nabla p$

$\displaystyle \nabla \cdot u = 0$

for some pressure field ${p: {\bf R} \times {\cal E} \rightarrow {\bf R}}$. As discussed in the previous post, the time translation symmetry of this functional yields conservation of the Hamiltonian

$\displaystyle \frac{1}{2} \int_{{\cal E}} |u(t,x)|^2\ dx = \frac{1}{2} \int_{{\cal L}} |\dot \Phi(t,x)|^2\ dx;$

the rigid motion symmetries of Eulerian space give conservation of the total momentum

$\displaystyle \int_{{\cal E}} u(t,x)\ dx$

and total angular momentum

$\displaystyle \int_{{\cal E}} x \wedge u(t,x)\ dx;$

and the diffeomorphism symmetries of Lagrangian space give conservation of circulation

$\displaystyle \int_{\Phi(\gamma)} u^*$

for any closed loop ${\gamma}$ in ${{\cal L}}$, or equivalently pointwise conservation of the Lagrangian vorticity ${\Phi^* \omega = \Phi^* du^*}$, where ${u^*}$ is the ${1}$-form associated with the vector field ${u}$ using the Euclidean metric ${\eta}$ on ${{\cal E}}$, with ${\Phi^*}$ denoting pullback by ${\Phi}$.

It turns out that one can generalise the above calculations. Given any self-adjoint operator ${A}$ on divergence-free vector fields ${u: {\cal E} \rightarrow {\bf R}^n}$, we can define the functional

$\displaystyle J_A[\Phi] := \frac{1}{2} \int_{\bf R} \int_{{\cal E}} u(t,x) \cdot A u(t,x)\ dx dt;$

as we shall see below the fold, critical points of this functional (with appropriate boundary conditions) obey the generalised Euler equations

$\displaystyle [\partial_t + u \cdot \nabla] Au + (\nabla u) \cdot Au= - \nabla \tilde p \ \ \ \ \ (2)$

$\displaystyle \nabla \cdot u = 0$

for some pressure field ${\tilde p: {\bf R} \times {\cal E} \rightarrow {\bf R}}$, where ${(\nabla u) \cdot Au}$ in coordinates is ${\partial_i u_j Au_j}$ with the usual summation conventions. (When ${A=1}$, ${(\nabla u) \cdot Au = \nabla(\frac{1}{2} |u|^2)}$, and this term can be absorbed into the pressure ${\tilde p}$, and we recover the usual Euler equations.) Time translation symmetry then gives conservation of the Hamiltonian

$\displaystyle \frac{1}{2} \int_{{\cal E}} u(t,x) \cdot A u(t,x)\ dx.$

If the operator ${A}$ commutes with rigid motions on ${{\cal E}}$, then we have conservation of total momentum

$\displaystyle \int_{{\cal E}} Au(t,x)\ dx$

and total angular momentum

$\displaystyle \int_{{\cal E}} x \wedge Au(t,x)\ dx,$

and the diffeomorphism symmetries of Lagrangian space give conservation of circulation

$\displaystyle \int_{\Phi(\gamma)} (Au)^*$

or pointwise conservation of the Lagrangian vorticity ${\Phi^* \theta := \Phi^* d(Au)^*}$. These applications of Noether’s theorem proceed exactly as in the previous post; we leave the details to the interested reader.

One particular special case of interest arises in two dimensions ${n=2}$, when ${A}$ is the inverse derivative ${A = |\nabla|^{-1} = (-\Delta)^{-1/2}}$. The vorticity ${\theta = d(Au)^*}$ is a ${2}$-form, which in the two-dimensional setting may be identified with a scalar. In coordinates, if we write ${u = (u_1,u_2)}$, then

$\displaystyle \theta = \partial_{x_1} |\nabla|^{-1} u_2 - \partial_{x_2} |\nabla|^{-1} u_1.$

Since ${u}$ is also divergence-free, we may therefore write

$\displaystyle u = (- \partial_{x_2} \psi, \partial_{x_1} \psi )$

where the stream function ${\psi}$ is given by the formula

$\displaystyle \psi = |\nabla|^{-1} \theta.$

If we take the curl of the generalised Euler equation (2), we obtain (after some computation) the surface quasi-geostrophic equation

$\displaystyle [\partial_t + u \cdot \nabla] \theta = 0 \ \ \ \ \ (3)$

$\displaystyle u = (-\partial_{x_2} |\nabla|^{-1} \theta, \partial_{x_1} |\nabla|^{-1} \theta).$

This equation has strong analogies with the three-dimensional incompressible Euler equations, and can be viewed as a simplified model for that system; see this paper of Constantin, Majda, and Tabak for details.
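To illustrate (3) with a concrete (and purely formal) example, consider a single plane wave ${\theta(x) = \cos(k \cdot x)}$ with a fixed non-zero frequency ${k = (k_1,k_2)}$; this choice of ${\theta}$ is an assumption made here for illustration only. Since ${|\nabla|^{-1}}$ acts on this mode as multiplication by ${1/|k|}$, the velocity field in (3) is

$\displaystyle u = \frac{\sin(k \cdot x)}{|k|} (k_2, -k_1),$

which is orthogonal to ${\nabla \theta = - \sin(k \cdot x) (k_1,k_2)}$. Thus ${u \cdot \nabla \theta = 0}$, and the time-independent field ${\theta}$ formally solves (3).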

Now we can specialise the general conservation laws derived previously to this setting. The conserved Hamiltonian is

$\displaystyle \frac{1}{2} \int_{{\bf R}^2} u\cdot |\nabla|^{-1} u\ dx = \frac{1}{2} \int_{{\bf R}^2} \theta \psi\ dx = \frac{1}{2} \int_{{\bf R}^2} \theta |\nabla|^{-1} \theta\ dx$

(a law previously observed for this equation in the abovementioned paper of Constantin, Majda, and Tabak). As ${A}$ commutes with rigid motions, we also have (formally, at least) conservation of momentum

$\displaystyle \int_{{\bf R}^2} Au\ dx$

(which up to trivial transformations is also expressible in impulse form as ${\int_{{\bf R}^2} \theta x\ dx}$, after integration by parts), and conservation of angular momentum

$\displaystyle \int_{{\bf R}^2} x \wedge Au\ dx$

(which up to trivial transformations is ${\int_{{\bf R}^2} \theta |x|^2\ dx}$). Finally, diffeomorphism invariance gives pointwise conservation of Lagrangian vorticity ${\Phi^* \theta}$, thus ${\theta}$ is transported by the flow (which is also evident from (3)). In particular, all integrals of the form ${\int F(\theta)\ dx}$ for a fixed function ${F}$ are conserved by the flow.

Throughout this post, we will work only at the formal level of analysis, ignoring issues of convergence of integrals, justifying differentiation under the integral sign, and so forth. (Rigorous justification of the conservation laws and other identities arising from the formal manipulations below can usually be established in an a posteriori fashion once the identities are in hand, without the need to rigorously justify the manipulations used to come up with these identities.)

It is a remarkable fact in the theory of differential equations that many of the ordinary and partial differential equations that are of interest (particularly in geometric PDE, or PDE arising from mathematical physics) admit a variational formulation; thus, a collection ${\Phi: \Omega \rightarrow M}$ of one or more fields on a domain ${\Omega}$ taking values in a space ${M}$ will solve the differential equation of interest if and only if ${\Phi}$ is a critical point of the functional

$\displaystyle J[\Phi] := \int_\Omega L( x, \Phi(x), D\Phi(x) )\ dx \ \ \ \ \ (1)$

involving the fields ${\Phi}$ and their first derivatives ${D\Phi}$, where the Lagrangian ${L: \Sigma \rightarrow {\bf R}}$ is a function on the vector bundle ${\Sigma}$ over ${\Omega \times M}$ consisting of triples ${(x, q, \dot q)}$ with ${x \in \Omega}$, ${q \in M}$, and ${\dot q: T_x \Omega \rightarrow T_q M}$ a linear transformation; we also usually keep the boundary data of ${\Phi}$ fixed in case ${\Omega}$ has a non-trivial boundary, although we will ignore these issues here. (We also ignore the possibility of having additional constraints imposed on ${\Phi}$ and ${D\Phi}$, which require the machinery of Lagrange multipliers to deal with, but which will only serve as a distraction for the current discussion.) It is common to use local coordinates to parameterise ${\Omega}$ as ${{\bf R}^d}$ and ${M}$ as ${{\bf R}^n}$, in which case ${\Sigma}$ can be identified locally with ${{\bf R}^d \times {\bf R}^n \times {\bf R}^{dn}}$, and ${L}$ can be viewed as a function on this space.

Example 1 (Geodesic flow) Take ${\Omega = [0,1]}$ and ${M = (M,g)}$ to be a Riemannian manifold, which we will write locally in coordinates as ${{\bf R}^n}$ with metric ${g_{ij}(q)}$ for ${i,j=1,\dots,n}$. A geodesic ${\gamma: [0,1] \rightarrow M}$ is then a critical point (keeping ${\gamma(0),\gamma(1)}$ fixed) of the energy functional

$\displaystyle J[\gamma] := \frac{1}{2} \int_0^1 g_{\gamma(t)}( D\gamma(t), D\gamma(t) )\ dt$

or in coordinates (ignoring coordinate patch issues, and using the usual summation conventions)

$\displaystyle J[\gamma] = \frac{1}{2} \int_0^1 g_{ij}(\gamma(t)) \dot \gamma^i(t) \dot \gamma^j(t)\ dt.$

As discussed in this previous post, both the Euler equations for rigid body motion, and the Euler equations for incompressible inviscid flow, can be interpreted as geodesic flow (though in the latter case, one has to work really formally, as the manifold ${M}$ is now infinite dimensional).

More generally, if ${\Omega = (\Omega,h)}$ is itself a Riemannian manifold, which we write locally in coordinates as ${{\bf R}^d}$ with metric ${h_{ab}(x)}$ for ${a,b=1,\dots,d}$, then a harmonic map ${\Phi: \Omega \rightarrow M}$ is a critical point of the energy functional

$\displaystyle J[\Phi] := \frac{1}{2} \int_\Omega h(x) \otimes g_{\Phi(x)}( D\Phi(x), D\Phi(x) )\ dh(x)$

or in coordinates (again ignoring coordinate patch issues, and writing ${h^{ab}}$ for the inverse of the metric ${h_{ab}}$)

$\displaystyle J[\Phi] = \frac{1}{2} \int_{{\bf R}^d} h^{ab}(x) g_{ij}(\Phi(x)) (\partial_a \Phi^i(x)) (\partial_b \Phi^j(x))\ \sqrt{\det(h(x))}\ dx.$

If we replace the Riemannian manifold ${\Omega}$ by a Lorentzian manifold, such as Minkowski space ${{\bf R}^{1+3}}$, then the notion of a harmonic map is replaced by that of a wave map, which generalises the scalar wave equation (which corresponds to the case ${M={\bf R}}$).

Example 2 (${N}$-particle interactions) Take ${\Omega = {\bf R}}$ and ${M = {\bf R}^3 \otimes {\bf R}^N}$; then a function ${\Phi: \Omega \rightarrow M}$ can be interpreted as a collection of ${N}$ trajectories ${q_1,\dots,q_N: {\bf R} \rightarrow {\bf R}^3}$ in space, which we give a physical interpretation as the trajectories of ${N}$ particles. If we assign each particle a positive mass ${m_1,\dots,m_N > 0}$, and also introduce a potential energy function ${V: M \rightarrow {\bf R}}$, then it turns out that Newton’s laws of motion ${F=ma}$ in this context (with the force ${F_i}$ on the ${i^{th}}$ particle being given by the conservative force ${-\nabla_{q_i} V}$) are equivalent to the trajectories ${q_1,\dots,q_N}$ being a critical point of the action functional

$\displaystyle J[\Phi] := \int_{\bf R} \sum_{i=1}^N \frac{1}{2} m_i |\dot q_i(t)|^2 - V( q_1(t),\dots,q_N(t) )\ dt.$

Formally, if ${\Phi = \Phi_0}$ is a critical point of a functional ${J[\Phi]}$, this means that

$\displaystyle \frac{d}{ds} J[ \Phi[s] ]|_{s=0} = 0$

whenever ${s \mapsto \Phi[s]}$ is a (smooth) deformation with ${\Phi[0]=\Phi_0}$ (and with ${\Phi[s]}$ respecting whatever boundary conditions are appropriate). Interchanging the derivative and integral, we (formally, at least) arrive at

$\displaystyle \int_\Omega \frac{d}{ds} L( x, \Phi[s](x), D\Phi[s](x) )|_{s=0}\ dx = 0. \ \ \ \ \ (2)$

Write ${\delta \Phi := \frac{d}{ds} \Phi[s]|_{s=0}}$ for the infinitesimal deformation of ${\Phi_0}$. By the chain rule, ${\frac{d}{ds} L( x, \Phi[s](x), D\Phi[s](x) )|_{s=0}}$ can be expressed in terms of ${x, \Phi_0(x), \delta \Phi(x), D\Phi_0(x), D \delta \Phi(x)}$. In coordinates, we have

$\displaystyle \frac{d}{ds} L( x, \Phi[s](x), D\Phi[s](x) )|_{s=0} = \delta \Phi^i(x) L_{q^i}(x,\Phi_0(x), D\Phi_0(x)) \ \ \ \ \ (3)$

$\displaystyle + \partial_{x^a} \delta \Phi^i(x) L_{\partial_{x^a} q^i} (x,\Phi_0(x), D\Phi_0(x)),$

where we parameterise ${\Sigma}$ by ${x, (q^i)_{i=1,\dots,n}, (\partial_{x^a} q^i)_{a=1,\dots,d; i=1,\dots,n}}$, and we use subscripts on ${L}$ to denote partial derivatives in the various coefficients. (One can of course work in a coordinate-free manner here if one really wants to, but the notation becomes a little cumbersome due to the need to carefully split up the tangent space of ${\Sigma}$, and we will not do so here.) Thus we can view (2) as an integral identity that asserts the vanishing of a certain integral, whose integrand involves ${x, \Phi_0(x), \delta \Phi(x), D\Phi_0(x), D \delta \Phi(x)}$, where ${\delta \Phi}$ vanishes at the boundary but is otherwise unconstrained.

A general rule of thumb in PDE and calculus of variations is that whenever one has an integral identity of the form ${\int_\Omega F(x)\ dx = 0}$ for some class of functions ${F}$ that vanish on the boundary, then there must be an associated differential identity ${F = \hbox{div} X}$ that justifies this integral identity through Stokes’ theorem. This rule of thumb helps explain why integration by parts is used so frequently in PDE to justify integral identities. The rule of thumb can fail when one is dealing with “global” or “cohomologically non-trivial” integral identities of a topological nature, such as the Gauss-Bonnet or Kazhdan-Warner identities, but is quite reliable for “local” or “cohomologically trivial” identities, such as those arising from calculus of variations.

In any case, if we apply this rule to (2), we expect that the integrand ${\frac{d}{ds} L( x, \Phi[s](x), D\Phi[s](x) )|_{s=0}}$ should be expressible as a spatial divergence. This is indeed the case:

Proposition 1 (Formal) Let ${\Phi = \Phi_0}$ be a critical point of the functional ${J[\Phi]}$ defined in (1). Then for any deformation ${s \mapsto \Phi[s]}$ with ${\Phi[0] = \Phi_0}$, we have

$\displaystyle \frac{d}{ds} L( x, \Phi[s](x), D\Phi[s](x) )|_{s=0} = \hbox{div} X \ \ \ \ \ (4)$

where ${X}$ is the vector field that is expressible in coordinates as

$\displaystyle X^a := \delta \Phi^i(x) L_{\partial_{x^a} q^i}(x,\Phi_0(x), D\Phi_0(x)). \ \ \ \ \ (5)$

Proof: Comparing (4) with (3), we see that the claim is equivalent to the Euler-Lagrange equation

$\displaystyle L_{q^i}(x,\Phi_0(x), D\Phi_0(x)) - \partial_{x^a} L_{\partial_{x^a} q^i}(x,\Phi_0(x), D\Phi_0(x)) = 0. \ \ \ \ \ (6)$

The same computation, together with an integration by parts, shows that (2) may be rewritten as

$\displaystyle \int_\Omega ( L_{q^i}(x,\Phi_0(x), D\Phi_0(x)) - \partial_{x^a} L_{\partial_{x^a} q^i}(x,\Phi_0(x), D\Phi_0(x)) ) \delta \Phi^i(x)\ dx = 0.$

Since ${\delta \Phi^i(x)}$ is unconstrained on the interior of ${\Omega}$, the claim (6) follows (at a formal level, at least). $\Box$
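As a quick sanity check of (6) (a worked instance that uses only the data of Example 2), take the ${N}$-particle Lagrangian ${L = \sum_{i=1}^N \frac{1}{2} m_i |\dot q_i|^2 - V(q_1,\dots,q_N)}$. Then ${L_{q_i} = -\nabla_{q_i} V}$ and ${L_{\dot q_i} = m_i \dot q_i}$, so the Euler-Lagrange equation (6) reads

$\displaystyle -\nabla_{q_i} V(q_1(t),\dots,q_N(t)) - \frac{d}{dt} ( m_i \dot q_i(t) ) = 0,$

which is Newton’s law ${m_i \ddot q_i = -\nabla_{q_i} V}$ for the ${i^{th}}$ particle, as asserted in Example 2.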

Many variational problems also enjoy one-parameter continuous symmetries: given any field ${\Phi_0}$ (not necessarily a critical point), one can place that field in a one-parameter family ${s \mapsto \Phi[s]}$ with ${\Phi[0] = \Phi_0}$, such that

$\displaystyle J[ \Phi[s] ] = J[ \Phi[0] ]$

for all ${s}$; in particular,

$\displaystyle \frac{d}{ds} J[ \Phi[s] ]|_{s=0} = 0,$

which can be written as (2) as before. Applying the previous rule of thumb, we thus expect another divergence identity

$\displaystyle \frac{d}{ds} L( x, \Phi[s](x), D\Phi[s](x) )|_{s=0} = \hbox{div} Y \ \ \ \ \ (7)$

whenever ${s \mapsto \Phi[s]}$ arises from a continuous one-parameter symmetry. This expectation is indeed the case in many examples. For instance, if the spatial domain ${\Omega}$ is the Euclidean space ${{\bf R}^d}$, and the Lagrangian (when expressed in coordinates) has no direct dependence on the spatial variable ${x}$, thus

$\displaystyle L( x, \Phi(x), D\Phi(x) ) = L( \Phi(x), D\Phi(x) ), \ \ \ \ \ (8)$

then we obtain ${d}$ translation symmetries

$\displaystyle \Phi[s](x) := \Phi(x - s e^a )$

for ${a=1,\dots,d}$, where ${e^1,\dots,e^d}$ is the standard basis for ${{\bf R}^d}$. For a fixed ${a}$, the left-hand side of (7) then becomes

$\displaystyle \frac{d}{ds} L( \Phi(x-se^a), D\Phi(x-se^a) )|_{s=0} = -\partial_{x^a} [ L( \Phi(x), D\Phi(x) ) ]$

$\displaystyle = \hbox{div} Y$

where ${Y(x) = - L(\Phi(x), D\Phi(x)) e^a}$. Another common type of symmetry is a pointwise symmetry, in which

$\displaystyle L( x, \Phi[s](x), D\Phi[s](x) ) = L( x, \Phi[0](x), D\Phi[0](x) ) \ \ \ \ \ (9)$

for all ${x}$, in which case (7) clearly holds with ${Y=0}$.

If we subtract (4) from (7), we obtain the celebrated theorem of Noether linking symmetries with conservation laws:

Theorem 2 (Noether’s theorem) Suppose that ${\Phi_0}$ is a critical point of the functional (1), and let ${\Phi[s]}$ be a one-parameter continuous symmetry with ${\Phi[0] = \Phi_0}$. Let ${X}$ be the vector field in (5), and let ${Y}$ be the vector field in (7). Then we have the pointwise conservation law

$\displaystyle \hbox{div}(X-Y) = 0.$

In particular, for one-dimensional variational problems, in which ${\Omega \subset {\bf R}}$, we have the conservation law ${(X-Y)(t) = (X-Y)(0)}$ for all ${t \in \Omega}$ (assuming of course that ${\Omega}$ is connected and contains ${0}$).

Noether’s theorem gives a systematic way to locate conservation laws for solutions to variational problems. For instance, if ${\Omega \subset {\bf R}}$ and the Lagrangian has no explicit time dependence, thus

$\displaystyle L(t, \Phi(t), \dot \Phi(t)) = L(\Phi(t), \dot \Phi(t)),$

then by using the time translation symmetry ${\Phi[s](t) := \Phi(t-s)}$, we have

$\displaystyle Y(t) = - L( \Phi(t), \dot\Phi(t) )$

as discussed previously, whereas we have ${\delta \Phi(t) = - \dot \Phi(t)}$, and hence by (5)

$\displaystyle X(t) := - \dot \Phi^i(t) L_{\dot q^i}(\Phi(t), \dot \Phi(t)),$

and so Noether’s theorem gives conservation of the Hamiltonian

$\displaystyle H(t) := \dot \Phi^i(t) L_{\dot q^i}(\Phi(t), \dot \Phi(t))- L(\Phi(t), \dot \Phi(t)). \ \ \ \ \ (10)$

For instance, for geodesic flow, the Hamiltonian works out to be

$\displaystyle H(t) = \frac{1}{2} g_{ij}(\gamma(t)) \dot \gamma^i(t) \dot \gamma^j(t),$

so we see that the speed of the geodesic is conserved over time.
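The conservation of ${H}$ along geodesics can be observed numerically. The following is a minimal sketch, not part of the original argument: it assumes, purely for illustration, the half-plane metric ${(dx^2+dy^2)/y^2}$, for which the geodesic equations are ${\ddot x = 2 \dot x \dot y / y}$ and ${\ddot y = (\dot y^2 - \dot x^2)/y}$, and integrates them with a classical Runge-Kutta scheme. The function names, step size, and initial data are all hypothetical choices.

```python
import math

def geodesic_rhs(state):
    """Geodesic equations for the metric (dx^2 + dy^2) / y^2 on {y > 0}."""
    x, y, vx, vy = state
    ax = 2.0 * vx * vy / y
    ay = (vy * vy - vx * vx) / y
    return (vx, vy, ax, ay)

def rk4_step(state, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    def shifted(s, k, c):
        return tuple(si + c * ki for si, ki in zip(s, k))
    k1 = geodesic_rhs(state)
    k2 = geodesic_rhs(shifted(state, k1, h / 2))
    k3 = geodesic_rhs(shifted(state, k2, h / 2))
    k4 = geodesic_rhs(shifted(state, k3, h))
    return tuple(
        s + h / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )

def hamiltonian(state):
    """H = (1/2) g_ij(gamma) gamma'^i gamma'^j for the half-plane metric."""
    x, y, vx, vy = state
    return 0.5 * (vx * vx + vy * vy) / (y * y)

# Illustrative initial data: start at the point i, moving horizontally.
state = (0.0, 1.0, 1.0, 0.0)
h_start = hamiltonian(state)
for _ in range(1000):            # integrate up to time t = 1
    state = rk4_step(state, 1e-3)
drift = abs(hamiltonian(state) - h_start)
print("Hamiltonian drift:", drift)
```

The integrator is not designed to conserve ${H}$ exactly, so the drift is only small rather than zero; it shrinks as the step size is reduced, which is consistent with ${H}$ being conserved by the exact flow.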

For pointwise symmetries (9), ${Y}$ vanishes, and so Noether’s theorem simplifies to ${\hbox{div} X = 0}$; in the one-dimensional case ${\Omega \subset {\bf R}}$, we thus see from (5) that the quantity

$\displaystyle \delta \Phi^i(t) L_{\dot q^i}(t,\Phi_0(t), \dot \Phi_0(t)) \ \ \ \ \ (11)$

is conserved in time. For instance, for the ${N}$-particle system in Example 2, if we have the translation invariance

$\displaystyle V( q_1 + h, \dots, q_N + h ) = V( q_1, \dots, q_N )$

for all ${q_1,\dots,q_N,h \in {\bf R}^3}$, then we have the pointwise translation symmetry

$\displaystyle q_i[s](t) := q_i(t) + s e^j$

for all ${i=1,\dots,N}$, ${s \in{\bf R}}$ and some ${j=1,\dots,3}$, in which case ${\delta q_i(t) = e^j}$, and the conserved quantity (11) becomes

$\displaystyle \sum_{i=1}^N m_i \dot q_i^j(t);$

as ${j=1,\dots,3}$ was arbitrary, this establishes conservation of the total momentum

$\displaystyle \sum_{i=1}^N m_i \dot q_i(t).$
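Momentum conservation for a translation-invariant potential is easy to observe in a simulation. The sketch below is only illustrative: the spring potential ${V = \frac{k}{2} \sum_{i<j} |q_i - q_j|^2}$, the masses, the initial data, and the velocity Verlet integrator are all assumptions made for the example, and none of them come from the discussion above.

```python
def forces(q, k=1.0):
    """Forces -grad V for V = (k/2) * sum_{i<j} |q_i - q_j|^2 (translation invariant)."""
    n = len(q)
    f = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                for c in range(3):
                    f[i][c] -= k * (q[i][c] - q[j][c])
    return f

def total_momentum(m, v):
    """Total momentum sum_i m_i v_i, returned as a list of three components."""
    return [sum(m[i] * v[i][c] for i in range(len(m))) for c in range(3)]

def verlet_step(m, q, v, h):
    """One velocity Verlet step for m_i q_i'' = F_i(q)."""
    n = len(m)
    f = forces(q)
    v_half = [[v[i][c] + 0.5 * h * f[i][c] / m[i] for c in range(3)] for i in range(n)]
    q_new = [[q[i][c] + h * v_half[i][c] for c in range(3)] for i in range(n)]
    f_new = forces(q_new)
    v_new = [[v_half[i][c] + 0.5 * h * f_new[i][c] / m[i] for c in range(3)] for i in range(n)]
    return q_new, v_new

# Hypothetical three-particle configuration.
m = [1.0, 2.0, 3.0]
q = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.5]]
v = [[0.1, 0.0, 0.0], [0.0, -0.2, 0.3], [0.05, 0.1, -0.1]]

p_start = total_momentum(m, v)
for _ in range(2000):
    q, v = verlet_step(m, q, v, 0.01)
p_end = total_momentum(m, v)
momentum_drift = max(abs(a - b) for a, b in zip(p_start, p_end))
print("momentum drift:", momentum_drift)
```

Because the pairwise forces cancel in the sum over particles, this particular integrator preserves the total momentum up to rounding error, so the drift printed here reflects floating-point arithmetic rather than discretisation error.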

Similarly, if we have the rotation invariance

$\displaystyle V( R q_1, \dots, Rq_N ) = V( q_1, \dots, q_N )$

for any ${q_1,\dots,q_N \in {\bf R}^3}$ and ${R \in SO(3)}$, then we have the pointwise rotation symmetry

$\displaystyle q_i[s](t) := \exp( s A ) q_i(t)$

for any skew-symmetric real ${3 \times 3}$ matrix ${A}$, in which case ${\delta q_i(t) = A q_i(t)}$, and the conserved quantity (11) becomes

$\displaystyle \sum_{i=1}^N m_i \langle A q_i(t), \dot q_i(t) \rangle;$

since ${A}$ is an arbitrary skew-symmetric matrix, this establishes conservation of the total angular momentum

$\displaystyle \sum_{i=1}^N m_i q_i(t) \wedge \dot q_i(t).$
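The last step uses the fact that every skew-symmetric ${3 \times 3}$ matrix acts as ${q \mapsto \omega \times q}$ for some ${\omega \in {\bf R}^3}$, together with the triple product identity ${\langle \omega \times q, v \rangle = \langle \omega, q \times v \rangle}$. The following sketch checks this identity numerically, identifying the wedge product with the cross product as is usual in three dimensions; the helper names and sample data are hypothetical.

```python
def cross(a, b):
    """Cross product of two vectors in R^3."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def dot(a, b):
    """Euclidean inner product."""
    return sum(x * y for x, y in zip(a, b))

def skew(w):
    """Skew-symmetric matrix A with A q = w x q."""
    return [[0.0, -w[2], w[1]],
            [w[2], 0.0, -w[0]],
            [-w[1], w[0], 0.0]]

def matvec(A, q):
    """Matrix-vector product for a 3 x 3 matrix given as a list of rows."""
    return [dot(row, q) for row in A]

# Hypothetical masses, positions, velocities, and rotation axis.
m = [1.0, 2.0, 3.0]
q = [[0.3, -1.0, 0.5], [1.2, 0.4, -0.7], [-0.6, 0.9, 0.1]]
v = [[0.1, 0.2, -0.3], [-0.4, 0.0, 0.5], [0.7, -0.2, 0.3]]
w = [0.5, -1.5, 2.0]
A = skew(w)

# Conserved scalar from (11) with delta q_i = A q_i.
noether_quantity = sum(m[i] * dot(matvec(A, q[i]), v[i]) for i in range(3))

# Total angular momentum sum_i m_i q_i x v_i.
angular_momentum = [0.0, 0.0, 0.0]
for i in range(3):
    c = cross(q[i], v[i])
    angular_momentum = [angular_momentum[k] + m[i] * c[k] for k in range(3)]

gap = abs(noether_quantity - dot(w, angular_momentum))
print("gap:", gap)
```

Since the scalar quantity equals ${\langle \omega, \sum_i m_i q_i \times \dot q_i \rangle}$ for every choice of ${\omega}$, conservation of the scalar for all skew-symmetric ${A}$ amounts to conservation of each component of the angular momentum vector.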

Below the fold, I will describe how Noether’s theorem can be used to locate all of the conserved quantities for the Euler equations of inviscid fluid flow, discussed in this previous post, by interpreting that flow as geodesic flow in an infinite dimensional manifold.

The Euler equations for incompressible inviscid fluids may be written as

$\displaystyle \partial_t u + (u \cdot \nabla) u = -\nabla p$

$\displaystyle \nabla \cdot u = 0$

where ${u: [0,T] \times {\bf R}^n \rightarrow {\bf R}^n}$ is the velocity field, and ${p: [0,T] \times {\bf R}^n \rightarrow {\bf R}}$ is the pressure field. To avoid technicalities we will assume that both fields are smooth, and that ${u}$ is bounded. We will take the dimension ${n}$ to be at least two, with the three-dimensional case ${n=3}$ being of course especially interesting.

The Euler equations are the inviscid limit of the Navier-Stokes equations; as discussed in my previous post, one potential route to establishing finite time blowup for the latter equations when ${n=3}$ is to be able to construct “computers” solving the Euler equations, which generate smaller replicas of themselves in a noise-tolerant manner (as the viscosity term in the Navier-Stokes equation is to be viewed as perturbative noise).

Perhaps the most prominent obstacles to this route are the conservation laws for the Euler equations, which limit the types of final states that a putative computer could reach from a given initial state. Most famously, we have the conservation of energy

$\displaystyle \int_{{\bf R}^n} |u|^2\ dx \ \ \ \ \ (1)$

(assuming sufficient decay of the velocity field at infinity); thus for instance it would not be possible for a computer to generate a replica of itself which had greater total energy than the initial computer. This by itself is not a fatal obstruction (in this paper of mine, I constructed such a “computer” for an averaged Euler equation that still obeyed energy conservation). However, there are other conservation laws as well; for instance, in three dimensions one also has conservation of helicity

$\displaystyle \int_{{\bf R}^3} u \cdot (\nabla \times u)\ dx \ \ \ \ \ (2)$

and (formally, at least) one has conservation of momentum

$\displaystyle \int_{{\bf R}^3} u\ dx$

and angular momentum

$\displaystyle \int_{{\bf R}^3} x \times u\ dx$

(although, as we shall discuss below, due to the slow decay of ${u}$ at infinity, these integrals have to either be interpreted in a principal value sense, or else replaced with their vorticity-based formulations, namely impulse and moment of impulse). Total vorticity

$\displaystyle \int_{{\bf R}^3} \nabla \times u\ dx$

is also conserved, although it turns out in three dimensions that this quantity vanishes when one assumes sufficient decay at infinity. Then there are the pointwise conservation laws: the vorticity and the volume form are both transported by the fluid flow, while the velocity field (when viewed as a covector) is transported up to a gradient; among other things, this gives the transport of vortex lines as well as Kelvin’s circulation theorem, and can also be used to deduce the helicity conservation law mentioned above. In my opinion, none of these laws actually prohibits a self-replicating computer from existing within the laws of ideal fluid flow, but they do significantly complicate the task of actually designing such a computer, or of the basic “gates” that such a computer would consist of.

Below the fold I would like to record and derive all the conservation laws mentioned above, which to my knowledge essentially form the complete set of known conserved quantities for the Euler equations. The material here (although not the notation) is drawn from this text of Majda and Bertozzi.

I’ve just uploaded to the arXiv the paper “Finite time blowup for an averaged three-dimensional Navier-Stokes equation”, submitted to J. Amer. Math. Soc. The main purpose of this paper is to formalise the “supercriticality barrier” for the global regularity problem for the Navier-Stokes equation, which roughly speaking asserts that it is not possible to establish global regularity by any “abstract” approach which only uses upper bound function space estimates on the nonlinear part of the equation, combined with the energy identity. This is done by constructing a modification of the Navier-Stokes equations with a nonlinearity that obeys essentially all of the function space estimates that the true Navier-Stokes nonlinearity does, and which also obeys the energy identity, but for which one can construct solutions that blow up in finite time. Results of this type had been previously established by Montgomery-Smith, Gallagher-Paicu, and Li-Sinai for variants of the Navier-Stokes equation without the energy identity, and by Katz-Pavlovic and by Cheskidov for dyadic analogues of the Navier-Stokes equations in five and higher dimensions that obeyed the energy identity (see also the work of Plechac and Sverak and of Hou and Lei that also suggest blowup for other Navier-Stokes type models obeying the energy identity in five and higher dimensions), but to my knowledge this is the first blowup result for a Navier-Stokes type equation in three dimensions that also obeys the energy identity. Intriguingly, the method of proof in fact hints at a possible route to establishing blowup for the true Navier-Stokes equations, which I am now increasingly inclined to believe is the case (albeit for a very small set of initial data).

To state the results more precisely, recall that the Navier-Stokes equations can be written in the form

$\displaystyle \partial_t u + (u \cdot \nabla) u = \nu \Delta u - \nabla p$

for a divergence-free velocity field ${u}$ and a pressure field ${p}$, where ${\nu>0}$ is the viscosity, which we will normalise to be one. We will work in the non-periodic setting, so the spatial domain is ${{\bf R}^3}$, and for sake of exposition I will not discuss matters of regularity or decay of the solution (but we will always be working with strong notions of solution here rather than weak ones). Applying the Leray projection ${P}$ to divergence-free vector fields to this equation, we can eliminate the pressure, and obtain an evolution equation

$\displaystyle \partial_t u = \Delta u + B(u,u) \ \ \ \ \ (1)$

purely for the velocity field, where ${B}$ is a certain bilinear operator on divergence-free vector fields (specifically, ${B(u,v) = -\frac{1}{2} P( (u \cdot \nabla) v + (v \cdot \nabla) u)}$). The global regularity problem for Navier-Stokes is then equivalent to the global regularity problem for the evolution equation (1).

An important feature of the bilinear operator ${B}$ appearing in (1) is the cancellation law

$\displaystyle \langle B(u,u), u \rangle = 0$

(using the ${L^2}$ inner product on divergence-free vector fields), which leads in particular to the fundamental energy identity

$\displaystyle \frac{1}{2} \int_{{\bf R}^3} |u(T,x)|^2\ dx + \int_0^T \int_{{\bf R}^3} |\nabla u(t,x)|^2\ dx dt = \frac{1}{2} \int_{{\bf R}^3} |u(0,x)|^2\ dx.$

This identity (and its consequences) provide essentially the only known a priori bound on solutions to the Navier-Stokes equations from large data and arbitrary times. Unfortunately, as discussed in this previous post, the quantities controlled by the energy identity are supercritical with respect to scaling, which is the fundamental obstacle that has defeated all attempts to solve the global regularity problem for Navier-Stokes without any additional assumptions on the data or solution (e.g. perturbative hypotheses, or a priori control on a critical norm such as the ${L^\infty_t L^3_x}$ norm).

Our main result is then (slightly informally stated) as follows:

Theorem 1 There exists an averaged version ${\tilde B}$ of the bilinear operator ${B}$, of the form

$\displaystyle \tilde B(u,v) := \int_\Omega m_{3,\omega}(D) Rot_{3,\omega}$

$\displaystyle B( m_{1,\omega}(D) Rot_{1,\omega} u, m_{2,\omega}(D) Rot_{2,\omega} v )\ d\mu(\omega)$

for some probability space ${(\Omega, \mu)}$, some spatial rotation operators ${Rot_{i,\omega}}$ for ${i=1,2,3}$, and some Fourier multipliers ${m_{i,\omega}}$ of order ${0}$, for which one still has the cancellation law

$\displaystyle \langle \tilde B(u,u), u \rangle = 0$

and for which the averaged Navier-Stokes equation

$\displaystyle \partial_t u = \Delta u + \tilde B(u,u) \ \ \ \ \ (2)$

admits solutions that blow up in finite time.

(There are some integrability conditions on the Fourier multipliers ${m_{i,\omega}}$ required in the above theorem in order for the conclusion to be non-trivial, but I am omitting them here for sake of exposition.)

Because spatial rotations and Fourier multipliers of order ${0}$ are bounded on most function spaces, ${\tilde B}$ automatically obeys almost all of the upper bound estimates that ${B}$ does. Thus, this theorem blocks any attempt to prove global regularity for the true Navier-Stokes equations which relies purely on the energy identity and on upper bound estimates for the nonlinearity; one must use some additional structure of the nonlinear operator ${B}$ which is not shared by an averaged version ${\tilde B}$. Such additional structure certainly exists – for instance, the Navier-Stokes equation has a vorticity formulation involving only differential operators rather than pseudodifferential ones, whereas a general equation of the form (2) does not. However, “abstract” approaches to global regularity generally do not exploit such structure, and thus cannot be used to affirmatively answer the Navier-Stokes problem.

It turns out that the particular averaged bilinear operator ${\tilde B}$ that we will use will be a finite linear combination of local cascade operators, which take the form

$\displaystyle C(u,v) := \sum_{n \in {\bf Z}} (1+\epsilon_0)^{5n/2} \langle u, \psi_{1,n} \rangle \langle v, \psi_{2,n} \rangle \psi_{3,n}$

where ${\epsilon_0>0}$ is a small parameter, ${\psi_1,\psi_2,\psi_3}$ are Schwartz vector fields whose Fourier transform is supported on an annulus, and ${\psi_{i,n}(x) := (1+\epsilon_0)^{3n/2} \psi_i( (1+\epsilon_0)^n x)}$ is an ${L^2}$-rescaled version of ${\psi_i}$ (basically a “wavelet” of wavelength about ${(1+\epsilon_0)^{-n}}$ centred at the origin). Such operators were essentially introduced by Katz and Pavlovic as dyadic models for ${B}$; they have essentially the same scaling property as ${B}$ (except that one can only scale along powers of ${1+\epsilon_0}$, rather than over all positive reals), and in fact they can be expressed as an average of ${B}$ in the sense of the above theorem, as can be shown after a somewhat tedious amount of Fourier-analytic symbol manipulations.

If we consider nonlinearities ${\tilde B}$ which are a finite linear combination of local cascade operators, then the equation (2) more or less collapses to a system of ODE in certain “wavelet coefficients” of ${u}$. The precise ODE that shows up depends on what precise combination of local cascade operators one is using. Katz and Pavlovic essentially considered a single cascade operator together with its “adjoint” (needed to preserve the energy identity), and arrived (more or less) at the system of ODE

$\displaystyle \partial_t X_n = - (1+\epsilon_0)^{2n} X_n + (1+\epsilon_0)^{\frac{5}{2}(n-1)} X_{n-1}^2 - (1+\epsilon_0)^{\frac{5}{2} n} X_n X_{n+1} \ \ \ \ \ (3)$

where ${X_n: [0,T] \rightarrow {\bf R}}$ are scalar fields for each integer ${n}$. (Actually, Katz-Pavlovic worked with a technical variant of this particular equation, but the differences are not so important for this current discussion.) Note that the quadratic terms on the RHS carry a higher exponent of ${1+\epsilon_0}$ than the dissipation term; this reflects the supercritical nature of this evolution (the energy ${\frac{1}{2} \sum_n X_n^2}$ is monotone decreasing in this flow, so the natural size of ${X_n}$ given the control on the energy is ${O(1)}$). There is a slight technical issue with the dissipation if one wishes to embed (3) into an equation of the form (2), but it is minor and I will not discuss it further here.
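The monotonicity of the energy for (3) comes from a telescoping cancellation between the two quadratic terms, which leaves only the dissipation: formally ${\frac{d}{dt} \frac{1}{2} \sum_n X_n^2 = - \sum_n (1+\epsilon_0)^{2n} X_n^2}$. The following sketch checks this identity on a finite truncation of the system. The truncation to modes ${n=0,\dots,N-1}$ with ${X_{-1} = X_N = 0}$, the sample values, and the function name are assumptions made for illustration only.

```python
def kp_rhs(X, eps0=1.0):
    """Right-hand side of the dyadic system (3), truncated to modes n = 0..len(X)-1."""
    lam = 1.0 + eps0
    N = len(X)
    out = []
    for n in range(N):
        prev = X[n - 1] if n >= 1 else 0.0      # X_{-1} = 0 (truncation assumption)
        nxt = X[n + 1] if n + 1 < N else 0.0    # X_N = 0 (truncation assumption)
        out.append(
            -lam ** (2 * n) * X[n]
            + lam ** (2.5 * (n - 1)) * prev ** 2
            - lam ** (2.5 * n) * X[n] * nxt
        )
    return out

eps0 = 1.0                                   # dyadic case, chosen for illustration
X = [0.7, -0.3, 0.5, 0.2, -0.1, 0.05]        # arbitrary sample mode amplitudes
rhs = kp_rhs(X, eps0)

# d/dt of (1/2) sum_n X_n^2 along the flow is sum_n X_n * (dX_n/dt).
energy_rate = sum(x * f for x, f in zip(X, rhs))

# Contribution of the dissipation term alone.
dissipation = -sum((1.0 + eps0) ** (2 * n) * x * x for n, x in enumerate(X))

print("energy rate:", energy_rate, "dissipation:", dissipation)
```

The two printed numbers agree up to rounding, reflecting the fact that the term ${(1+\epsilon_0)^{\frac{5}{2}(n-1)} X_{n-1}^2 X_n}$ coming from mode ${n}$ cancels the term ${-(1+\epsilon_0)^{\frac{5}{2}(n-1)} X_{n-1}^2 X_n}$ coming from mode ${n-1}$.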

In principle, if the ${X_n}$ mode has size comparable to ${1}$ at some time ${t_n}$, then energy should flow from ${X_n}$ to ${X_{n+1}}$ at a rate comparable to ${(1+\epsilon_0)^{\frac{5}{2} n}}$, so that by time ${t_{n+1} \approx t_n + (1+\epsilon_0)^{-\frac{5}{2} n}}$ or so, most of the energy of ${X_n}$ should have drained into the ${X_{n+1}}$ mode (with hardly any energy dissipated). Since the series ${\sum_{n \geq 1} (1+\epsilon_0)^{-\frac{5}{2} n}}$ is summable, this suggests finite time blowup for this ODE as the energy races ever more quickly to higher and higher modes. Such a scenario was indeed established by Katz and Pavlovic (and refined by Cheskidov) if the dissipation strength ${(1+\epsilon_0)^{2n}}$ was weakened somewhat (the exponent ${2}$ has to be lowered to be less than ${\frac{5}{3}}$). As mentioned above, this is enough to give a version of Theorem 1 in five and higher dimensions.
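The summability invoked in this heuristic is just that of a geometric series; as a worked instance (the value ${\epsilon_0 = 1}$ below is chosen purely for illustration),

$\displaystyle \sum_{n \geq 1} (1+\epsilon_0)^{-\frac{5}{2} n} = \frac{1}{(1+\epsilon_0)^{5/2} - 1},$

which for ${\epsilon_0 = 1}$ is ${1/(2^{5/2}-1) \approx 0.215}$. Thus, in this heuristic picture (and ignoring implied constants), the cascade times ${t_n}$ would accumulate at a finite time roughly equal to ${t_1}$ plus this sum.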

On the other hand, it was shown a few years ago by Barbato, Morandin, and Romito that (3) in fact admits global smooth solutions (at least in the dyadic case ${\epsilon_0=1}$, and assuming non-negative initial data). Roughly speaking, the problem is that as energy is being transferred from ${X_n}$ to ${X_{n+1}}$, energy is also simultaneously being transferred from ${X_{n+1}}$ to ${X_{n+2}}$, and as such the solution races off to higher modes a bit too prematurely, without absorbing all of the energy from lower modes. This weakens the strength of the blowup to the point where the moderately strong dissipation in (3) is enough to kill the high frequency cascade before a true singularity occurs. Because of this, the original Katz-Pavlovic model cannot quite be used to establish Theorem 1 in three dimensions. (Actually, the original Katz-Pavlovic model had some additional dispersive features which allowed for another proof of global smooth solutions, which is an unpublished result of Nazarov.)

To get around this, I had to “engineer” an ODE system with similar features to (3) (namely, a quadratic nonlinearity, a monotone total energy, and the indicated exponents of ${(1+\epsilon_0)}$ for both the dissipation term and the quadratic terms), but for which the cascade of energy from scale ${n}$ to scale ${n+1}$ was not interrupted by the cascade of energy from scale ${n+1}$ to scale ${n+2}$. To do this, I needed to insert a delay in the cascade process (so that after energy was dumped into scale ${n}$, it would take some time before the energy would start to transfer to scale ${n+1}$), but the process also needed to be abrupt (once the process of energy transfer started, it needed to conclude very quickly, before the delayed transfer for the next scale kicked in). It turned out that one could build a “quadratic circuit” out of some basic “quadratic gates” (analogous to how an electrical circuit could be built out of basic gates such as amplifiers or resistors) that achieved this task, leading to an ODE system essentially of the form

$\displaystyle \partial_t X_{1,n} = - (1+\epsilon_0)^{2n} X_{1,n}$

$\displaystyle + (1+\epsilon_0)^{5n/2} (- \epsilon^{-2} X_{3,n} X_{4,n} - \epsilon X_{1,n} X_{2,n} - \epsilon^2 \exp(-K^{10}) X_{1,n} X_{3,n}$

$\displaystyle + K X_{4,n-1}^2)$

$\displaystyle \partial_t X_{2,n} = - (1+\epsilon_0)^{2n} X_{2,n} + (1+\epsilon_0)^{5n/2} (\epsilon X_{1,n}^2 - \epsilon^{-1} K^{10} X_{3,n}^2)$

$\displaystyle \partial_t X_{3,n} = - (1+\epsilon_0)^{2n} X_{3,n} + (1+\epsilon_0)^{5n/2} (\epsilon^2 \exp(-K^{10}) X_{1,n}^2$

$\displaystyle + \epsilon^{-1} K^{10} X_{2,n} X_{3,n} )$

$\displaystyle \partial_t X_{4,n} =- (1+\epsilon_0)^{2n} X_{4,n} + (1+\epsilon_0)^{5n/2} (\epsilon^{-2} X_{3,n} X_{1,n}$

$\displaystyle - (1+\epsilon_0)^{5/2} K X_{4,n} X_{1,n+1})$

where ${K \geq 1}$ is a suitable large parameter and ${\epsilon > 0}$ is a suitable small parameter (much smaller than ${1/K}$). To visualise the dynamics of such a system, I found it useful to describe this system graphically by a “circuit diagram” that is analogous (but not identical) to the circuit diagrams arising in electrical engineering:

The coupling constants here range widely from being very large to very small; in practice, this makes the ${X_{2,n}}$ and ${X_{3,n}}$ modes absorb very little energy, but exert a sizeable influence on the remaining modes. If a lot of energy is suddenly dumped into ${X_{1,n}}$, what happens next is roughly as follows: for a moderate period of time, nothing much happens other than a trickle of energy into ${X_{2,n}}$, which in turn causes a rapid exponential growth of ${X_{3,n}}$ (from a very low base). After this delay, ${X_{3,n}}$ suddenly crosses a certain threshold, at which point it causes ${X_{1,n}}$ and ${X_{4,n}}$ to exchange energy back and forth with extreme speed. The energy from ${X_{4,n}}$ then rapidly drains into ${X_{1,n+1}}$, and the process begins again (with a slight loss in energy due to the dissipation). If one plots the total energy ${E_n := \frac{1}{2} ( X_{1,n}^2 + X_{2,n}^2 + X_{3,n}^2 + X_{4,n}^2 )}$ as a function of time, it looks schematically like this:

As in the previous heuristic discussion, the time between cascades from one frequency scale to the next decays exponentially, leading to blowup at some finite time ${T}$. (One could describe the dynamics here as being similar to the famous “lighting the beacons” scene in the Lord of the Rings movies, except that (a) as each beacon gets ignited, the previous one is extinguished, as per the energy identity; (b) the time between beacon lightings decreases exponentially; and (c) there is no soundtrack.)
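One can check directly that the quadratic terms in the four-mode system displayed earlier conserve the total energy ${\sum_n E_n}$, so that only the dissipation terms change it. The sketch below does this on a finite truncation. The truncation (with ${X_{4,-1} = 0}$ and ${X_{1,N} = 0}$), the parameter values ${\epsilon_0 = 0.5}$, ${\epsilon = 0.1}$, ${K = 1.5}$, and the sample data are hypothetical and chosen only so that every term is visible in floating point; they are not the regime of large ${K}$ and very small ${\epsilon}$ used in the actual construction.

```python
import math

def circuit_rhs(X, eps0=0.5, eps=0.1, K=1.5):
    """Right-hand side of the four-mode system, truncated to scales n = 0..len(X)-1.

    X[n] = [X_{1,n}, X_{2,n}, X_{3,n}, X_{4,n}].
    """
    lam = 1.0 + eps0
    small = eps ** 2 * math.exp(-K ** 10)
    N = len(X)
    out = []
    for n in range(N):
        x1, x2, x3, x4 = X[n]
        x4_prev = X[n - 1][3] if n >= 1 else 0.0     # X_{4,-1} = 0 (truncation assumption)
        x1_next = X[n + 1][0] if n + 1 < N else 0.0  # X_{1,N} = 0 (truncation assumption)
        d = lam ** (2 * n)      # dissipation weight
        c = lam ** (2.5 * n)    # nonlinear weight
        out.append([
            -d * x1 + c * (-x3 * x4 / eps ** 2 - eps * x1 * x2
                           - small * x1 * x3 + K * x4_prev ** 2),
            -d * x2 + c * (eps * x1 ** 2 - K ** 10 / eps * x3 ** 2),
            -d * x3 + c * (small * x1 ** 2 + K ** 10 / eps * x2 * x3),
            -d * x4 + c * (x3 * x1 / eps ** 2 - lam ** 2.5 * K * x4 * x1_next),
        ])
    return out

# Arbitrary sample amplitudes on four scales.
X = [[0.9, 0.1, -0.2, 0.3],
     [0.4, -0.5, 0.05, 0.6],
     [-0.3, 0.2, 0.1, -0.7],
     [0.2, 0.3, -0.4, 0.1]]
eps0 = 0.5
rhs = circuit_rhs(X, eps0=eps0)

# d/dt of the total energy sum_n E_n along the flow.
energy_rate = sum(x * f for row, frow in zip(X, rhs) for x, f in zip(row, frow))

# Contribution of the dissipation terms alone.
dissipation = -sum((1.0 + eps0) ** (2 * n) * sum(x * x for x in row)
                   for n, row in enumerate(X))

print("energy rate:", energy_rate, "dissipation:", dissipation)
```

Within a single scale, each quadratic coupling appears twice with opposite signs once weighted by the corresponding mode, and the inter-scale term ${K X_{4,n-1}^2 X_{1,n}}$ is cancelled by the last term of the ${X_{4,n-1}}$ equation thanks to the extra factor ${(1+\epsilon_0)^{5/2}}$.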

There is a real (but remote) possibility that this sort of construction can be adapted to the true Navier-Stokes equations. The basic blowup mechanism in the averaged equation is that of a von Neumann machine, or more precisely a construct (built within the laws of the inviscid evolution ${\partial_t u = \tilde B(u,u)}$) that, after some time delay, manages to suddenly create a replica of itself at a finer scale (and to largely erase its original instantiation in the process). In principle, such a von Neumann machine could also be built out of the laws of the inviscid form of the Navier-Stokes equations (i.e. the Euler equations). In physical terms, one would have to build the machine purely out of an ideal fluid (i.e. an inviscid incompressible fluid). If one could somehow create enough “logic gates” out of ideal fluid, one could presumably build a sort of “fluid computer”, at which point the task of building a von Neumann machine appears to reduce to a software engineering exercise rather than a PDE problem (provided that the gates are suitably stable with respect to perturbations, but (as with actual computers) this can presumably be done by converting the analog signals of fluid mechanics into a more error-resistant digital form). The key thing missing in this program (in both senses of the word) to establish blowup for Navier-Stokes is a construction of the logic gates within the laws of ideal fluids. (Compare with the situation for cellular automata such as Conway’s “Game of Life”, in which Turing complete computers, universal constructors, and replicators have all been built within the laws of that game.)

The purpose of this post is to link to a short unpublished note of mine that I wrote back in 2010 but forgot to put on my web page at the time. Entitled “A physical space proof of the bilinear Strichartz and local smoothing estimates for the Schrodinger equation”, it gives a proof of two standard estimates for the free (linear) Schrödinger equation in flat Euclidean space, namely the bilinear Strichartz estimate and the local smoothing estimate, using primarily “physical space” methods such as integration by parts, instead of “frequency space” methods based on the Fourier transform, although a small amount of Fourier analysis (basically sectoral projection to make the Schrödinger waves move roughly in a given direction) is still needed. This is somewhat in the spirit of an older paper of mine with Klainerman and Rodnianski doing something similar for the wave equation, and is also very similar to a paper of Planchon and Vega from 2009. The hope was that by avoiding the finer properties of the Fourier transform, one could obtain a more robust argument which could also extend to nonlinear, non-free, or non-flat situations. These notes were cited once or twice by some people that I had privately circulated them to, so I decided to put them online here for reference.

UPDATE, July 24: Fabrice Planchon has kindly supplied another note in which he gives a particularly simple proof of local smoothing in one dimension, and discusses some other variants of the method (related to the paper of Planchon and Vega cited earlier).

Consider the free Schrödinger equation in ${d}$ spatial dimensions, which I will normalise as

$\displaystyle i u_t + \frac{1}{2} \Delta_{{\bf R}^d} u = 0 \ \ \ \ \ (1)$

where ${u: {\bf R} \times {\bf R}^d \rightarrow {\bf C}}$ is the unknown field and ${\Delta_{{\bf R}^{d}} = \sum_{j=1}^d \frac{\partial^2}{\partial x_j^2}}$ is the spatial Laplacian. To avoid irrelevant technical issues I will restrict attention to smooth (classical) solutions to this equation, and will work locally in spacetime avoiding issues of decay at infinity (or at other singularities); I will also avoid issues involving branch cuts of functions such as ${t^{d/2}}$ (if one wishes, one can restrict ${d}$ to be even in order to safely ignore all branch cut issues). The space of solutions to (1) enjoys a number of symmetries. A particularly non-obvious symmetry is the pseudoconformal symmetry: if ${u}$ solves (1), then the pseudoconformal solution ${pc(u): {\bf R} \times {\bf R}^d \rightarrow {\bf C}}$ defined by

$\displaystyle pc(u)(t,x) := \frac{1}{(it)^{d/2}} \overline{u(\frac{1}{t}, \frac{x}{t})} e^{i|x|^2/2t} \ \ \ \ \ (2)$

for ${t \neq 0}$ can be seen after some computation to also solve (1). (If ${u}$ has suitable decay at spatial infinity and one chooses a suitable branch cut for ${(it)^{d/2}}$, one can extend ${pc(u)}$ continuously to the ${t=0}$ spatial slice, whereupon it becomes essentially the spatial Fourier transform of ${u(0,\cdot)}$, but we will not need this fact for the current discussion.)
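The “some computation” above can at least be sanity-checked numerically. The following is a minimal sketch, not a proof: it assumes ${d=2}$ (so that ${(it)^{d/2} = it}$ and no branch cut arises), takes a plane wave as the test solution of (1), and estimates ${i u_t + \frac{1}{2} \Delta u}$ by central differences. The frequency `XI`, the evaluation point, and all function names are hypothetical choices made for illustration. For this test solution one can also check by hand that ${pc(u)(t,x) = \frac{1}{it} e^{i|x-\xi|^2/2t}}$.

```python
import cmath

D = 2              # even spatial dimension, so (it)^{d/2} = it needs no branch cut
XI = (0.7, -0.4)   # hypothetical frequency of the test plane wave

def u(t, x):
    """Plane wave exp(i(x.xi - t|xi|^2/2)), a solution of (1)."""
    dot = sum(xj * kj for xj, kj in zip(x, XI))
    return cmath.exp(1j * (dot - t * sum(k * k for k in XI) / 2))

def pc(f):
    """Pseudoconformal transform (2) of f, for t != 0."""
    def g(t, x):
        inner = f(1 / t, tuple(xj / t for xj in x))
        r2 = sum(xj * xj for xj in x)
        return inner.conjugate() * cmath.exp(1j * r2 / (2 * t)) / (1j * t) ** (D // 2)
    return g

def shift(x, j, h):
    """Return x with its j-th coordinate moved by h."""
    return tuple(xj + (h if i == j else 0.0) for i, xj in enumerate(x))

def schrodinger_residual(f, t, x, h=1e-3):
    """Central-difference approximation of i f_t + (1/2) Laplacian f at (t, x)."""
    f_t = (f(t + h, x) - f(t - h, x)) / (2 * h)
    lap = sum((f(t, shift(x, j, h)) - 2 * f(t, x) + f(t, shift(x, j, -h))) / h**2
              for j in range(len(x)))
    return 1j * f_t + lap / 2

point = (1.5, (0.3, -0.2))
print(abs(schrodinger_residual(u, *point)))      # residual of the original solution
print(abs(schrodinger_residual(pc(u), *point)))  # residual of its pseudoconformal transform
```

Both printed residuals should be small, limited only by the finite-difference step `h`; a small value is of course only consistent with, not a proof of, the claim that ${pc(u)}$ solves (1).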

An analogous symmetry exists for the free wave equation in ${d+1}$ spatial dimensions, which I will write as

$\displaystyle u_{tt} - \Delta_{{\bf R}^{d+1}} u = 0 \ \ \ \ \ (3)$

where ${u: {\bf R} \times {\bf R}^{d+1} \rightarrow {\bf C}}$ is the unknown field. In analogy to pseudoconformal symmetry, we have conformal symmetry: if ${u: {\bf R} \times {\bf R}^{d+1} \rightarrow {\bf C}}$ solves (3), then the function ${conf(u): {\bf R} \times {\bf R}^{d+1} \rightarrow {\bf C}}$, defined in the interior ${\{ (t,x): |x| < |t| \}}$ of the light cone by the formula

$\displaystyle conf(u)(t,x) := (t^2-|x|^2)^{-d/2} u( \frac{t}{t^2-|x|^2}, \frac{x}{t^2-|x|^2} ), \ \ \ \ \ (4)$

also solves (3).
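As with the pseudoconformal transformation, the claim for (4) can be spot-checked numerically. The sketch below assumes ${d=2}$ (so the wave equation is in three spatial dimensions), uses a plane wave as the test solution of (3), and estimates ${u_{tt} - \Delta u}$ by central differences at a point inside the light cone. The wave vector `K`, the evaluation point, and the function names are hypothetical, chosen only for illustration.

```python
import cmath

D = 2                  # the wave equation (3) then lives in D + 1 = 3 spatial dimensions
K = (0.6, -0.3, 0.5)   # hypothetical wave vector of the test plane wave
OMEGA = sum(k * k for k in K) ** 0.5

def u(t, x):
    """Plane wave exp(i(x.k - |k| t)), a solution of (3)."""
    return cmath.exp(1j * (sum(xj * kj for xj, kj in zip(x, K)) - OMEGA * t))

def conf(f):
    """Conformal transform (4) of f, valid inside the light cone |x| < |t|."""
    def g(t, x):
        s = t * t - sum(xj * xj for xj in x)   # positive inside the light cone
        return s ** (-D / 2) * f(t / s, tuple(xj / s for xj in x))
    return g

def shift(x, j, h):
    """Return x with its j-th coordinate moved by h."""
    return tuple(xj + (h if i == j else 0.0) for i, xj in enumerate(x))

def wave_residual(f, t, x, h=1e-3):
    """Central-difference approximation of f_tt - Laplacian f at (t, x)."""
    f_tt = (f(t + h, x) - 2 * f(t, x) + f(t - h, x)) / h**2
    lap = sum((f(t, shift(x, j, h)) - 2 * f(t, x) + f(t, shift(x, j, -h))) / h**2
              for j in range(len(x)))
    return f_tt - lap

point = (2.0, (0.3, -0.5, 0.4))   # |x|^2 = 0.5 < t^2 = 4, so inside the light cone
print(abs(wave_residual(u, *point)))
print(abs(wave_residual(conf(u), *point)))
```

Applying `conf` to the constant solution ${u \equiv 1}$ gives ${(t^2-|x|^2)^{-d/2}}$, which for ${d=2}$ is the function ${1/(t^2-|x|^2)}$; this gives a second, hand-checkable test case.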

There are also some direct links between the Schrödinger equation in ${d}$ dimensions and the wave equation in ${d+1}$ dimensions. This can be easily seen on the spacetime Fourier side: solutions to (1) have spacetime Fourier transform (formally) supported on a ${d}$-dimensional paraboloid, while solutions to (3) have spacetime Fourier transform formally supported on a ${d+1}$-dimensional cone. To link the two, one then observes that the ${d}$-dimensional paraboloid can be viewed as a conic section (i.e. hyperplane slice) of the ${d+1}$-dimensional cone. In physical space, this link is manifested as follows: if ${u: {\bf R} \times {\bf R}^d \rightarrow {\bf C}}$ solves (1), then the function ${\iota_{1}(u): {\bf R} \times {\bf R}^{d+1} \rightarrow {\bf C}}$ defined by

$\displaystyle \iota_{1}(u)(t,x_1,\ldots,x_{d+1}) := e^{-i(t+x_{d+1})} u( \frac{t-x_{d+1}}{2}, x_1,\ldots,x_d)$

solves (3). More generally, for any non-zero scaling parameter ${\lambda}$, the function ${\iota_{\lambda}(u): {\bf R} \times {\bf R}^{d+1} \rightarrow {\bf C}}$ defined by

$\displaystyle \iota_{\lambda}(u)(t,x_1,\ldots,x_{d+1}) :=$

$\displaystyle \lambda^{d/2} e^{-i\lambda(t+x_{d+1})} u( \lambda \frac{t-x_{d+1}}{2}, \lambda x_1,\ldots,\lambda x_d) \ \ \ \ \ (5)$

solves (3).
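The embedding (5) can be checked in the same numerical spirit. The sketch below assumes ${d=2}$, takes a plane-wave solution of (1) with a hypothetical frequency `XI`, applies ${\iota_\lambda}$ for a hypothetical value `LAM` of ${\lambda}$, and estimates ${u_{tt} - \Delta u}$ by central differences. For a plane wave with frequency ${\xi}$, the embedded function has temporal frequency ${-\lambda(1+|\xi|^2/4)}$ and spatial frequency ${(\lambda \xi, -\lambda(1-|\xi|^2/4))}$, and the latter has magnitude ${\lambda(1+|\xi|^2/4)}$, which is consistent with (3).

```python
import cmath

D = 2              # Schrodinger dimension; the wave equation has D + 1 = 3 spatial dimensions
XI = (0.7, -0.4)   # hypothetical frequency of the test plane wave
LAM = 1.7          # hypothetical non-zero scaling parameter lambda

def u(t, x):
    """Plane wave exp(i(x.xi - t|xi|^2/2)), a solution of (1)."""
    dot = sum(xj * kj for xj, kj in zip(x, XI))
    return cmath.exp(1j * (dot - t * sum(k * k for k in XI) / 2))

def iota(lam, f):
    """Embedding (5): maps a function on R x R^D to a function on R x R^(D+1)."""
    def g(t, x):
        *head, last = x                       # last plays the role of x_{d+1}
        phase = cmath.exp(-1j * lam * (t + last))
        inner = f(lam * (t - last) / 2, tuple(lam * xj for xj in head))
        return lam ** (D / 2) * phase * inner
    return g

def shift(x, j, h):
    """Return x with its j-th coordinate moved by h."""
    return tuple(xj + (h if i == j else 0.0) for i, xj in enumerate(x))

def wave_residual(f, t, x, h=1e-3):
    """Central-difference approximation of f_tt - Laplacian f at (t, x)."""
    f_tt = (f(t + h, x) - 2 * f(t, x) + f(t - h, x)) / h**2
    lap = sum((f(t, shift(x, j, h)) - 2 * f(t, x) + f(t, shift(x, j, -h))) / h**2
              for j in range(len(x)))
    return f_tt - lap

point = (0.8, (0.3, -0.5, 0.4))
print(abs(wave_residual(iota(1.0, u), *point)))   # the embedding iota_1
print(abs(wave_residual(iota(LAM, u), *point)))   # the rescaled embedding iota_lambda
```

Both residuals should be small, up to the discretisation error of the step `h`.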

As an “extra challenge” posed in an exercise in one of my books (Exercise 2.28, to be precise), I asked the reader to use the embeddings ${\iota_1}$ (or more generally ${\iota_\lambda}$) to explicitly connect together the pseudoconformal transformation ${pc}$ and the conformal transformation ${conf}$. It turns out that this connection is a little bit unusual, with the “obvious” guess (namely, that the embeddings ${\iota_\lambda}$ intertwine ${pc}$ and ${conf}$) being incorrect, and as such this particular task was perhaps too difficult even for a challenge question. I’ve been asked a couple times to provide the connection more explicitly, so I will do so below the fold.