
The Poincaré upper half-plane ${{\mathbf H} := \{ z: \hbox{Im}(z) > 0 \}}$ (with a boundary consisting of the real line ${{\bf R}}$ together with the point at infinity ${\infty}$) carries an action of the projective special linear group

$\displaystyle \hbox{PSL}_2({\bf R}) := \{ \begin{pmatrix} a & b \\ c & d \end{pmatrix}: a,b,c,d \in {\bf R}: ad-bc = 1 \} / \{\pm 1\}$

via fractional linear transformations:

$\displaystyle \begin{pmatrix} a & b \\ c & d \end{pmatrix} z := \frac{az+b}{cz+d}. \ \ \ \ \ (1)$

Here and in the rest of the post we will abuse notation by identifying elements ${\begin{pmatrix} a & b \\ c & d \end{pmatrix}}$ of the special linear group ${\hbox{SL}_2({\bf R})}$ with their equivalence class ${\{ \pm \begin{pmatrix} a & b \\ c & d \end{pmatrix} \}}$ in ${\hbox{PSL}_2({\bf R})}$; this will occasionally create or remove a factor of two in our formulae, but otherwise has very little effect, though one has to check that various definitions and expressions (such as (1)) are unaffected if one replaces a matrix ${\begin{pmatrix} a & b \\ c & d \end{pmatrix}}$ by its negation ${\begin{pmatrix} -a & -b \\ -c & -d \end{pmatrix}}$. In particular, we recommend that the reader ignore the signs ${\pm}$ that appear from time to time in the discussion below.

As the action of ${\hbox{PSL}_2({\bf R})}$ on ${{\mathbf H}}$ is transitive, and any given point in ${{\mathbf H}}$ (e.g. ${i}$) has a stabiliser isomorphic to the projective rotation group ${\hbox{PSO}_2({\bf R})}$, we can view the Poincaré upper half-plane ${{\mathbf H}}$ as a homogeneous space for ${\hbox{PSL}_2({\bf R})}$, and more specifically the quotient space of ${\hbox{PSL}_2({\bf R})}$ by a maximal compact subgroup ${\hbox{PSO}_2({\bf R})}$. In fact, we can make the half-plane a symmetric space for ${\hbox{PSL}_2({\bf R})}$, by endowing ${{\mathbf H}}$ with the Riemannian metric

$\displaystyle dg^2 := \frac{dx^2 + dy^2}{y^2}$

(using Cartesian coordinates ${z=x+iy}$), which is invariant with respect to the ${\hbox{PSL}_2({\bf R})}$ action. Like any other Riemannian metric, the metric on ${{\mathbf H}}$ generates a number of other important geometric objects on ${{\mathbf H}}$, such as the distance function ${d(z_1,z_2)}$, which is given by the formula

$\displaystyle 2(\cosh(d(z_1,z_2))-1) = \frac{|z_1-z_2|^2}{\hbox{Im}(z_1) \hbox{Im}(z_2)}, \ \ \ \ \ (2)$

the volume measure ${\mu = \mu_{\mathbf H}}$, which can be computed to be

$\displaystyle d\mu = \frac{dx dy}{y^2},$

and the Laplace-Beltrami operator, which can be computed to be ${\Delta = y^2 (\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2})}$ (here we use the negative definite sign convention for ${\Delta}$). As the metric ${dg}$ was ${\hbox{PSL}_2({\bf R})}$-invariant, all of these quantities arising from the metric are similarly ${\hbox{PSL}_2({\bf R})}$-invariant in the appropriate sense.
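As a quick numerical sanity check of the invariance claim, the following minimal Python sketch applies the action (1) to two points and compares the distance (2) before and after. The matrix `gamma` and the sample points are arbitrary choices made for illustration, and the helper names `mobius` and `hyperbolic_distance` are not from any library.

```python
import math

def mobius(matrix, z):
    """Apply the fractional linear transformation (1) to a point z of the upper half-plane."""
    (a, b), (c, d) = matrix
    return (a * z + b) / (c * z + d)

def hyperbolic_distance(z1, z2):
    """Solve formula (2) for the distance d(z1, z2)."""
    ratio = abs(z1 - z2) ** 2 / (z1.imag * z2.imag)
    return math.acosh(1.0 + ratio / 2.0)

# Hypothetical sample data: a real matrix with ad - bc = 1 and two points of H.
gamma = ((2.0, 3.0), (0.5, 1.25))
determinant = gamma[0][0] * gamma[1][1] - gamma[0][1] * gamma[1][0]
z1, z2 = complex(0.3, 1.7), complex(-1.2, 0.4)

w1, w2 = mobius(gamma, z1), mobius(gamma, z2)
distance_before = hyperbolic_distance(z1, z2)
distance_after = hyperbolic_distance(w1, w2)

# The two distances should agree up to rounding error.
print(determinant, distance_before, distance_after)
```

Replacing `gamma` by its negation leaves `mobius` unchanged, which is the well-definedness on ${\hbox{PSL}_2({\bf R})}$ mentioned earlier.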

The Gauss curvature of the Poincaré half-plane can be computed to be the constant ${-1}$, thus ${{\mathbf H}}$ is a model for two-dimensional hyperbolic geometry, in much the same way that the unit sphere ${S^2}$ in ${{\bf R}^3}$ is a model for two-dimensional spherical geometry (or ${{\bf R}^2}$ is a model for two-dimensional Euclidean geometry). (Indeed, ${{\mathbf H}}$ is isomorphic (via projection to a null hyperplane) to the upper unit hyperboloid ${\{ (x,t) \in {\bf R}^{2+1}: t = \sqrt{1+|x|^2}\}}$ in the Minkowski spacetime ${{\bf R}^{2+1}}$, which is the direct analogue of the unit sphere in Euclidean spacetime ${{\bf R}^3}$ or the plane ${{\bf R}^2}$ in Galilean spacetime ${{\bf R}^2 \times {\bf R}}$.)

One can inject arithmetic into this geometric structure by passing from the Lie group ${\hbox{PSL}_2({\bf R})}$ to the full modular group

$\displaystyle \hbox{PSL}_2({\bf Z}) := \{ \begin{pmatrix} a & b \\ c & d \end{pmatrix}: a,b,c,d \in {\bf Z}: ad-bc = 1 \} / \{\pm 1\}$

or congruence subgroups such as

$\displaystyle \Gamma_0(q) := \{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \hbox{PSL}_2({\bf Z}): c = 0\ (q) \} / \{ \pm 1 \} \ \ \ \ \ (3)$

for a natural number ${q}$, or to the discrete stabiliser ${\Gamma_\infty}$ of the point at infinity:

$\displaystyle \Gamma_\infty := \{ \pm \begin{pmatrix} 1 & b \\ 0 & 1 \end{pmatrix}: b \in {\bf Z} \} / \{\pm 1\}. \ \ \ \ \ (4)$

These are discrete subgroups of ${\hbox{PSL}_2({\bf R})}$, nested by the subgroup inclusions

$\displaystyle \Gamma_\infty \leq \Gamma_0(q) \leq \Gamma_0(1)=\hbox{PSL}_2({\bf Z}) \leq \hbox{PSL}_2({\bf R}).$

There are many further discrete subgroups of ${\hbox{PSL}_2({\bf R})}$ (known collectively as Fuchsian groups) that one could consider, but we will focus attention on these three groups in this post.

Any discrete subgroup ${\Gamma}$ of ${\hbox{PSL}_2({\bf R})}$ generates a quotient space ${\Gamma \backslash {\mathbf H}}$, which in general will be a non-compact two-dimensional orbifold. One can understand such a quotient space by working with a fundamental domain ${\hbox{Fund}( \Gamma \backslash {\mathbf H})}$ – a set consisting of a single representative of each of the orbits ${\Gamma z}$ of ${\Gamma}$ in ${{\mathbf H}}$. This fundamental domain is by no means uniquely defined, but if the fundamental domain is chosen with some reasonable amount of regularity, one can view ${\Gamma \backslash {\mathbf H}}$ as the fundamental domain with the boundaries glued together in an appropriate sense. Among other things, fundamental domains can be used to induce a volume measure ${\mu = \mu_{\Gamma \backslash {\mathbf H}}}$ on ${\Gamma \backslash {\mathbf H}}$ from the volume measure ${\mu = \mu_{\mathbf H}}$ on ${{\mathbf H}}$ (restricted to a fundamental domain). By abuse of notation we will refer to both measures simply as ${\mu}$ when there is no chance of confusion.

For instance, a fundamental domain for ${\Gamma_\infty \backslash {\mathbf H}}$ is given (up to null sets) by the strip ${\{ z \in {\mathbf H}: |\hbox{Re}(z)| < \frac{1}{2} \}}$, with ${\Gamma_\infty \backslash {\mathbf H}}$ identifiable with the cylinder formed by gluing together the two sides of the strip. A fundamental domain for ${\hbox{PSL}_2({\bf Z}) \backslash {\mathbf H}}$ is famously given (again up to null sets) by an upper portion ${\{ z \in {\mathbf H}: |\hbox{Re}(z)| < \frac{1}{2}; |z| > 1 \}}$, with the left and right sides again glued to each other, and the left and right halves of the circular boundary glued to itself. A fundamental domain for ${\Gamma_0(q) \backslash {\mathbf H}}$ can be formed by gluing together

$\displaystyle [\hbox{PSL}_2({\bf Z}) : \Gamma_0(q)] = q \prod_{p|q} (1 + \frac{1}{p}) = q^{1+o(1)}$

copies of a fundamental domain for ${\hbox{PSL}_2({\bf Z}) \backslash {\mathbf H}}$ in a rather complicated but interesting fashion.
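To make these statements concrete, here is a small Python sketch under stated assumptions. The function `reduce_to_fundamental_domain` uses the classical translate-and-invert procedure (generated by ${z \mapsto z+1}$ and ${z \mapsto -1/z}$) to move a point into the standard fundamental domain for ${\hbox{PSL}_2({\bf Z})}$. The function `index_gamma0` evaluates the index formula, and `count_projective_line` compares it with a brute-force count of the projective line over ${{\bf Z}/q{\bf Z}}$, relying on the standard fact (not proved here) that the cosets of ${\Gamma_0(q)}$ are parameterised by this projective line. All function names are illustrative.

```python
from math import gcd

def index_gamma0(q):
    """Evaluate q * prod_{p | q} (1 + 1/p) in exact integer arithmetic."""
    index, n, p = q, q, 2
    while p * p <= n:
        if n % p == 0:
            index = index // p * (p + 1)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:  # one prime factor larger than sqrt(q) may remain
        index = index // n * (n + 1)
    return index

def count_projective_line(q):
    """Count pairs (c, d) mod q with gcd(c, d, q) = 1, up to scaling by units mod q."""
    units = sum(1 for u in range(1, q + 1) if gcd(u, q) == 1)
    primitive_pairs = sum(
        1 for c in range(q) for d in range(q) if gcd(gcd(c, d), q) == 1
    )
    return primitive_pairs // units

def reduce_to_fundamental_domain(z, max_steps=10_000):
    """Move z into {|Re z| <= 1/2, |z| >= 1} using z -> z + n and z -> -1/z."""
    for _ in range(max_steps):
        z = complex(z.real - round(z.real), z.imag)  # translate into the strip
        if abs(z) >= 1.0:
            return z
        z = -1.0 / z  # inversion strictly increases the imaginary part here
    raise RuntimeError("did not converge; the point may be too close to the real line")

print([index_gamma0(q) for q in range(1, 13)])
print(reduce_to_fundamental_domain(complex(1.7, 0.4)))
```

Floating point arithmetic makes the reduction unreliable for points extremely close to the real line or to the boundary of the domain, so this is a sketch rather than a robust implementation.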

While fundamental domains can be a convenient choice of coordinates to work with for some computations (as well as for drawing appropriate pictures), it is geometrically more natural to avoid working explicitly on such domains, and instead work directly on the quotient spaces ${\Gamma \backslash {\mathbf H}}$. In order to analyse functions ${f: \Gamma \backslash {\mathbf H} \rightarrow {\bf C}}$ on such orbifolds, it is convenient to lift such functions back up to ${{\mathbf H}}$ and identify them with functions ${f: {\mathbf H} \rightarrow {\bf C}}$ which are ${\Gamma}$-automorphic in the sense that ${f( \gamma z ) = f(z)}$ for all ${z \in {\mathbf H}}$ and ${\gamma \in \Gamma}$. Such functions will be referred to as ${\Gamma}$-automorphic forms, or automorphic forms for short (we always implicitly assume all such functions to be measurable). (Strictly speaking, these are the automorphic forms with trivial factor of automorphy; one can certainly consider other factors of automorphy, particularly when working with holomorphic modular forms, which correspond to sections of a more non-trivial line bundle over ${\Gamma \backslash {\mathbf H}}$ than the trivial bundle ${(\Gamma \backslash {\mathbf H}) \times {\bf C}}$ that is implicitly present when analysing scalar functions ${f: {\mathbf H} \rightarrow {\bf C}}$. However, we will not discuss this (important) more general situation here.)

An important way to create a ${\Gamma}$-automorphic form is to start with a non-automorphic function ${f: {\mathbf H} \rightarrow {\bf C}}$ obeying suitable decay conditions (e.g. bounded with compact support will suffice) and form the Poincaré series ${P_\Gamma[f]: {\mathbf H} \rightarrow {\bf C}}$ defined by

$\displaystyle P_{\Gamma}[f](z) = \sum_{\gamma \in \Gamma} f(\gamma z),$

which is clearly ${\Gamma}$-automorphic. (One could equivalently write ${f(\gamma^{-1} z)}$ in place of ${f(\gamma z)}$ here; there are good arguments for both conventions, but I have ultimately decided to use the ${f(\gamma z)}$ convention, which makes explicit computations a little neater at the cost of making the group actions work in the opposite order.) Thus we naturally see sums over ${\Gamma}$ associated with ${\Gamma}$-automorphic forms. A little more generally, given a subgroup ${\Gamma_\infty}$ of ${\Gamma}$ and a ${\Gamma_\infty}$-automorphic function ${f: {\mathbf H} \rightarrow {\bf C}}$ of suitable decay, we can form a relative Poincaré series ${P_{\Gamma_\infty \backslash \Gamma}[f]: {\mathbf H} \rightarrow {\bf C}}$ by

$\displaystyle P_{\Gamma_\infty \backslash \Gamma}[f](z) = \sum_{\gamma \in \hbox{Fund}(\Gamma_\infty \backslash \Gamma)} f(\gamma z)$

where ${\hbox{Fund}(\Gamma_\infty \backslash \Gamma)}$ is any fundamental domain for ${\Gamma_\infty \backslash \Gamma}$, that is to say a subset of ${\Gamma}$ consisting of exactly one representative for each right coset of ${\Gamma_\infty}$. As ${f}$ is ${\Gamma_\infty}$-automorphic, we see (if ${f}$ has suitable decay) that ${P_{\Gamma_\infty \backslash \Gamma}[f]}$ does not depend on the precise choice of fundamental domain, and is ${\Gamma}$-automorphic. These operations are all compatible with each other, for instance ${P_\Gamma = P_{\Gamma_\infty \backslash \Gamma} \circ P_{\Gamma_\infty}}$. A key example of Poincaré series are the Eisenstein series, although there are of course many other Poincaré series one can consider by varying the test function ${f}$.
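The simplest case to experiment with is the Poincaré series over the translation group ${\Gamma_\infty}$ from (4). The Python sketch below is only an illustration: the test function `bump` is a hypothetical choice (supported in the hyperbolic ball of radius 1 about ${i}$), and the sum is truncated to shifts ${|b| \leq 20}$, which is exact for this `bump` as long as ${|\hbox{Re}(z)|}$ is well below 20.

```python
import math

def hyperbolic_distance(z1, z2):
    """Distance on the upper half-plane, obtained by solving formula (2)."""
    ratio = abs(z1 - z2) ** 2 / (z1.imag * z2.imag)
    return math.acosh(1.0 + ratio / 2.0)

def bump(z):
    """Hypothetical test function: bounded, supported in the ball of radius 1 about i."""
    return max(0.0, 1.0 - hyperbolic_distance(z, 1j))

def poincare_series_gamma_inf(f, z, shift_bound=20):
    """Sum f(z + b) over integers |b| <= shift_bound (a truncation of the sum over Gamma_infty)."""
    return sum(f(z + b) for b in range(-shift_bound, shift_bound + 1))

z = complex(0.3, 1.2)
value_at_z = poincare_series_gamma_inf(bump, z)
value_at_shift = poincare_series_gamma_inf(bump, z + 1)

# bump itself is not invariant under z -> z + 1, but its Poincare series is.
print(bump(z), bump(z + 1))
print(value_at_z, value_at_shift)
```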

For future reference we record the basic but fundamental unfolding identities

$\displaystyle \int_{\Gamma \backslash {\mathbf H}} P_\Gamma[f] g\ d\mu_{\Gamma \backslash {\mathbf H}} = \int_{\mathbf H} f g\ d\mu_{\mathbf H} \ \ \ \ \ (5)$

for any function ${f: {\mathbf H} \rightarrow {\bf C}}$ with sufficient decay, and any ${\Gamma}$-automorphic function ${g}$ of reasonable growth (e.g. ${f}$ bounded and compact support, and ${g}$ bounded, will suffice). Note that ${g}$ is viewed as a function on ${\Gamma \backslash {\mathbf H}}$ on the left-hand side, and as a ${\Gamma}$-automorphic function on ${{\mathbf H}}$ on the right-hand side. More generally, one has

$\displaystyle \int_{\Gamma \backslash {\mathbf H}} P_{\Gamma_\infty \backslash \Gamma}[f] g\ d\mu_{\Gamma \backslash {\mathbf H}} = \int_{\Gamma_\infty \backslash {\mathbf H}} f g\ d\mu_{\Gamma_\infty \backslash {\mathbf H}} \ \ \ \ \ (6)$

whenever ${\Gamma_\infty \leq \Gamma}$ are discrete subgroups of ${\hbox{PSL}_2({\bf R})}$, ${f}$ is a ${\Gamma_\infty}$-automorphic function with sufficient decay on ${\Gamma_\infty \backslash {\mathbf H}}$, and ${g}$ is a ${\Gamma}$-automorphic (and thus also ${\Gamma_\infty}$-automorphic) function of reasonable growth. These identities will allow us to move fairly freely between the three domains ${{\mathbf H}}$, ${\Gamma_\infty \backslash {\mathbf H}}$, and ${\Gamma \backslash {\mathbf H}}$ in our analysis.

When computing various statistics of a Poincaré series ${P_\Gamma[f]}$, such as its values ${P_\Gamma[f](z)}$ at special points ${z}$, or the ${L^2}$ quantity ${\int_{\Gamma \backslash {\mathbf H}} |P_\Gamma[f]|^2\ d\mu}$, expressions of interest to analytic number theory naturally emerge. We list three basic examples of this below, discussed somewhat informally in order to highlight the main ideas rather than the technical details.

The first example we will give concerns the problem of estimating the sum

$\displaystyle \sum_{n \leq x} \tau(n) \tau(n+1), \ \ \ \ \ (7)$

where ${\tau(n) := \sum_{d|n} 1}$ is the divisor function. This can be rewritten (by factoring ${n=bc}$ and ${n+1=ad}$) as

$\displaystyle \sum_{ a,b,c,d \in {\bf N}: ad-bc = 1} 1_{bc \leq x} \ \ \ \ \ (8)$

which is basically a sum over the full modular group ${\hbox{PSL}_2({\bf Z})}$. At this point we will “cheat” a little by moving to the related, but different, sum

$\displaystyle \sum_{a,b,c,d \in {\bf Z}: ad-bc = 1} 1_{a^2+b^2+c^2+d^2 \leq x}. \ \ \ \ \ (9)$

This sum is not exactly the same as (8), but will be a little easier to handle, and it is plausible that the methods used to handle this sum can be modified to handle (8). Observe from (2) and some calculation that the distance between ${i}$ and ${\begin{pmatrix} a & b \\ c & d \end{pmatrix} i = \frac{ai+b}{ci+d}}$ is given by the formula

$\displaystyle 2(\cosh(d(i,\begin{pmatrix} a & b \\ c & d \end{pmatrix} i))-1) = a^2+b^2+c^2+d^2 - 2$

and so one can express the above sum as

$\displaystyle 2 \sum_{\gamma \in \hbox{PSL}_2({\bf Z})} 1_{d(i,\gamma i) \leq \hbox{cosh}^{-1}(x/2)}$

(the factor of ${2}$ coming from the quotient by ${\{\pm 1\}}$ in the projective special linear group); one can express this as ${P_\Gamma[f](i)}$, where ${\Gamma = \hbox{PSL}_2({\bf Z})}$ and ${f}$ is the indicator function of the ball ${B(i, \hbox{cosh}^{-1}(x/2))}$. Thus we see that expressions such as (7) are related to evaluations of Poincaré series. (In practice, it is much better to use smoothed out versions of indicator functions in order to obtain good control on sums such as (7) or (9), but we gloss over this technical detail here.)
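The two reformulations in this example can be checked by brute force for small ${x}$. In the Python sketch below (function names are illustrative only), `count_positive_matrices` counts the quadruples in (8) and is compared against (7), while `count_integer_matrices` counts the quadruples in (9) and records how far ${2 + \frac{|\gamma i - i|^2}{\hbox{Im}(\gamma i)}}$ deviates from ${a^2+b^2+c^2+d^2}$, which by (2) is the displayed distance identity.

```python
import math

def tau(n):
    """Divisor function: the number of positive divisors of n."""
    return sum(1 for d in range(1, n + 1) if n % d == 0)

def divisor_correlation(x):
    """The sum (7)."""
    return sum(tau(n) * tau(n + 1) for n in range(1, x + 1))

def count_positive_matrices(x):
    """The sum (8): natural numbers a, b, c, d with ad - bc = 1 and bc <= x."""
    count = 0
    for b in range(1, x + 1):
        for c in range(1, x // b + 1):
            target = b * c + 1  # we need a * d = b * c + 1
            for a in range(1, target + 1):
                if target % a == 0:  # d = target // a is then determined
                    count += 1
    return count

def count_integer_matrices(x):
    """The sum (9), together with the largest error seen in the distance identity."""
    r = math.isqrt(x)
    entries = range(-r, r + 1)
    count, max_error = 0, 0.0
    for a in entries:
        for b in entries:
            for c in entries:
                for d in entries:
                    norm = a * a + b * b + c * c + d * d
                    if a * d - b * c != 1 or norm > x:
                        continue
                    image = (a * 1j + b) / (c * 1j + d)
                    ratio = abs(image - 1j) ** 2 / image.imag
                    max_error = max(max_error, abs(2 + ratio - norm))
                    count += 1
    return count, max_error

print(divisor_correlation(30), count_positive_matrices(30))
print(count_integer_matrices(20))
```

The count in (9) is always even, since a matrix and its negation both contribute; this is the factor of ${2}$ in the displayed sum over ${\hbox{PSL}_2({\bf Z})}$.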

The second example concerns the relative

$\displaystyle \sum_{n \leq x} \tau(n^2+1) \ \ \ \ \ (10)$

of the sum (7). Note from multiplicativity that (7) can be written as ${\sum_{n \leq x} \tau(n^2+n)}$, which is superficially very similar to (10), but with the key difference that the polynomial ${n^2+1}$ is irreducible over the integers.

As with (7), we may expand (10) as

$\displaystyle \sum_{A,B,C \in {\bf N}: B^2 - AC = -1} 1_{B \leq x}.$

At first glance this does not look like a sum over a modular group, but one can manipulate this expression into such a form in one of two (closely related) ways. First, observe that any factorisation ${B + i = (a-bi) (c+di)}$ of ${B+i}$ into Gaussian integers ${a-bi, c+di}$ gives rise (upon taking norms) to an identity of the form ${B^2 - AC = -1}$, where ${A = a^2+b^2}$ and ${C = c^2+d^2}$. Conversely, by using the unique factorisation of the Gaussian integers, every identity of the form ${B^2-AC=-1}$ gives rise to a factorisation of the form ${B+i = (a-bi) (c+di)}$, essentially uniquely up to units. Now note that ${(a-bi)(c+di)}$ is of the form ${B+i}$ if and only if ${ad-bc=1}$, in which case ${B = ac+bd}$. Thus we can essentially write the above sum as something like

$\displaystyle \sum_{a,b,c,d: ad-bc = 1} 1_{|ac+bd| \leq x} \ \ \ \ \ (11)$

and the modular group ${\hbox{PSL}_2({\bf Z})}$ is now manifest. An equivalent way to see these manipulations is as follows. A triple ${A,B,C}$ of natural numbers with ${B^2-AC=-1}$ gives rise to a positive quadratic form ${Ax^2+2Bxy+Cy^2}$ of normalised discriminant ${B^2-AC}$ equal to ${-1}$ with integer coefficients (it is natural here to allow ${B}$ to take integer values rather than just natural number values by essentially doubling the sum). The group ${\hbox{PSL}_2({\bf Z})}$ acts on the space of such quadratic forms in a natural fashion (by composing the quadratic form with the inverse ${\begin{pmatrix} d & -b \\ -c & a \end{pmatrix}}$ of an element ${\begin{pmatrix} a & b \\ c & d \end{pmatrix}}$ of ${\hbox{SL}_2({\bf Z})}$). Because the discriminant ${-1}$ has class number one (this fact is equivalent to the unique factorisation of the Gaussian integers, as discussed in this previous post), every form ${Ax^2 + 2Bxy + Cy^2}$ in this space is equivalent (under the action of some element of ${\hbox{PSL}_2({\bf Z})}$) with the standard quadratic form ${x^2+y^2}$. In other words, one has

$\displaystyle Ax^2 + 2Bxy + Cy^2 = (dx-by)^2 + (-cx+ay)^2$

which (up to a harmless sign) is exactly the representation ${B = ac+bd}$, ${A = c^2+d^2}$, ${C = a^2+b^2}$ introduced earlier, and leads to the same reformulation of the sum (10) in terms of expressions like (11). Similar considerations also apply if the quadratic polynomial ${n^2+1}$ is replaced by another quadratic, although one has to account for the fact that the class number may now exceed one (so that unique factorisation in the associated quadratic ring of integers breaks down), and in the positive discriminant case the fact that the group of units might be infinite presents another significant technical problem.

Note that ${\begin{pmatrix} a & b \\ c & d \end{pmatrix} i = \frac{ai+b}{ci+d}}$ has real part ${\frac{ac+bd}{c^2+d^2}}$ and imaginary part ${\frac{1}{c^2+d^2}}$. Thus (11) is (up to a factor of two) the Poincaré series ${P_\Gamma[f](i)}$ as in the preceding example, except that ${f}$ is now the indicator of the sector ${\{ z: |\hbox{Re} z| \leq x |\hbox{Im} z| \}}$.
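The algebraic identities used in this example are easy to confirm by brute force over small matrices. The following Python sketch (with the illustrative helper `check_identities`, and an arbitrary non-integer threshold `x` chosen to avoid boundary cases in floating point) checks, for every integer matrix with entries of size at most `bound` and ${ad-bc=1}$, the norm identity ${B^2-AC=-1}$, the Gaussian factorisation of ${B+i}$, the real and imaginary parts of ${\gamma i}$, and the equivalence between the sector condition and ${|ac+bd| \leq x}$.

```python
def check_identities(bound, x=2.5):
    """Verify the identities of the second example; return the number of matrices checked."""
    entries = range(-bound, bound + 1)
    checked = 0
    for a in entries:
        for b in entries:
            for c in entries:
                for d in entries:
                    if a * d - b * c != 1:
                        continue
                    A, B, C = a * a + b * b, a * c + b * d, c * c + d * d
                    # Norm identity B^2 - AC = -1 (exact integer arithmetic).
                    assert B * B - A * C == -1
                    # Factorisation B + i = (a - bi)(c + di); exact for small integers.
                    assert complex(a, -b) * complex(c, d) == complex(B, 1)
                    # Real and imaginary parts of gamma i.
                    z = (a * 1j + b) / (c * 1j + d)
                    assert abs(z.real - B / C) < 1e-12
                    assert abs(z.imag - 1 / C) < 1e-12
                    # gamma i lies in the sector exactly when |ac + bd| <= x.
                    assert (abs(z.real) <= x * abs(z.imag)) == (abs(B) <= x)
                    checked += 1
    return checked

print(check_identities(4))
```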

Sums involving subgroups of the full modular group, such as ${\Gamma_0(q)}$, often arise when imposing congruence conditions on sums such as (10), for instance when trying to estimate the expression ${\sum_{n \leq x: q|n} \tau(n^2+1)}$ when ${q}$ and ${x}$ are large. As before, one then soon arrives at the problem of evaluating a Poincaré series at one or more special points, where the series is now over ${\Gamma_0(q)}$ rather than ${\hbox{PSL}_2({\bf Z})}$.

The third and final example concerns averages of Kloosterman sums

$\displaystyle S(m,n;c) := \sum_{x \in ({\bf Z}/c{\bf Z})^\times} e( \frac{mx + n\overline{x}}{c} ) \ \ \ \ \ (12)$

where ${e(\theta) := e^{2\pi i\theta}}$ and ${\overline{x}}$ is the inverse of ${x}$ in the multiplicative group ${({\bf Z}/c{\bf Z})^\times}$. It turns out that the ${L^2}$ norms of Poincaré series ${P_\Gamma[f]}$ or ${P_{\Gamma_\infty \backslash \Gamma}[f]}$ are closely tied to such averages. Consider for instance the quantity

$\displaystyle \int_{\Gamma_0(q) \backslash {\mathbf H}} |P_{\Gamma_\infty \backslash \Gamma_0(q)}[f]|^2\ d\mu_{\Gamma \backslash {\mathbf H}} \ \ \ \ \ (13)$

where ${q}$ is a natural number and ${f}$ is a ${\Gamma_\infty}$-automorphic form that is of the form

$\displaystyle f(x+iy) = F(my) e(m x)$

for some integer ${m}$ and some test function ${F: (0,+\infty) \rightarrow {\bf C}}$, which for sake of discussion we will take to be smooth and compactly supported. Using the unfolding formula (6), we may rewrite (13) as

$\displaystyle \int_{\Gamma_\infty \backslash {\mathbf H}} \overline{f} P_{\Gamma_\infty \backslash \Gamma_0(q)}[f]\ d\mu_{\Gamma_\infty \backslash {\mathbf H}}.$

To compute this, we use the double coset decomposition

$\displaystyle \Gamma_0(q) = \Gamma_\infty \cup \bigcup_{c \in {\mathbf N}: q|c} \bigcup_{1 \leq d \leq c: (d,c)=1} \Gamma_\infty \begin{pmatrix} a & b \\ c & d \end{pmatrix} \Gamma_\infty,$

where for each ${c,d}$, ${a,b}$ are arbitrarily chosen integers such that ${ad-bc=1}$. To see this decomposition, observe that every element ${\begin{pmatrix} a & b \\ c & d \end{pmatrix}}$ in ${\Gamma_0(q)}$ outside of ${\Gamma_\infty}$ can be assumed to have ${c>0}$ by applying a sign ${\pm}$, and then using the row and column operations coming from left and right multiplication by ${\Gamma_\infty}$ (that is, shifting the top row by an integer multiple of the bottom row, and shifting the right column by an integer multiple of the left column) one can place ${d}$ in the interval ${[1,c]}$ and ${(a,b)}$ to be any specified integer pair with ${ad-bc=1}$. From this we see that

$\displaystyle P_{\Gamma_\infty \backslash \Gamma_0(q)}[f] = f + \sum_{c \in {\mathbf N}: q|c} \sum_{1 \leq d \leq c: (d,c)=1} P_{\Gamma_\infty}[ f( \begin{pmatrix} a & b \\ c & d \end{pmatrix} \cdot ) ]$

and so from further use of the unfolding formula (5) we may expand (13) as

$\displaystyle \int_{\Gamma_\infty \backslash {\mathbf H}} |f|^2\ d\mu_{\Gamma_\infty \backslash {\mathbf H}}$

$\displaystyle + \sum_{c \in {\mathbf N}: q|c} \sum_{1 \leq d \leq c: (d,c)=1} \int_{\mathbf H} \overline{f}(z) f( \begin{pmatrix} a & b \\ c & d \end{pmatrix} z)\ d\mu_{\mathbf H}.$

The first integral is just ${m \int_0^\infty |F(y)|^2 \frac{dy}{y^2}}$. The second expression is more interesting. We have

$\displaystyle \begin{pmatrix} a & b \\ c & d \end{pmatrix} z = \frac{az+b}{cz+d} = \frac{a}{c} - \frac{1}{c(cz+d)}$

$\displaystyle = \frac{a}{c} - \frac{cx+d}{c((cx+d)^2+c^2y^2)} + \frac{iy}{(cx+d)^2 + c^2y^2}$

so we can write

$\displaystyle \int_{\mathbf H} \overline{f}(z) f( \begin{pmatrix} a & b \\ c & d \end{pmatrix} z)\ d\mu_{\mathbf H}$

as

$\displaystyle \int_0^\infty \int_{\bf R} \overline{F}(my) F(\frac{my}{(cx+d)^2 + c^2y^2}) e( -mx + \frac{ma}{c} - m \frac{cx+d}{c((cx+d)^2+c^2y^2)} )$

$\displaystyle \frac{dx dy}{y^2}$

which on shifting ${x}$ by ${d/c}$ simplifies a little to

$\displaystyle e( \frac{ma}{c} + \frac{md}{c} ) \int_0^\infty \int_{\bf R} \bar{F}(my) F(\frac{my}{c^2(x^2 + y^2)}) e(- mx - m \frac{x}{c^2(x^2+y^2)} )$

$\displaystyle \frac{dx dy}{y^2}$

and then on scaling ${x,y}$ by ${m}$ simplifies a little further to

$\displaystyle e( \frac{ma}{c} + \frac{md}{c} ) \int_0^\infty \int_{\bf R} \bar{F}(y) F(\frac{m^2}{c^2} \frac{y}{x^2 + y^2}) e(- x - \frac{m^2}{c^2} \frac{x}{x^2+y^2} )\ \frac{dx dy}{y^2}.$

Note that as ${ad-bc=1}$, we have ${a = \overline{d}}$ modulo ${c}$. Comparing the above calculations with (12), we can thus write (13) as

$\displaystyle m (\int_0^\infty |F(y)|^2 \frac{dy}{y^2} + \sum_{q|c} \frac{S(m,m;c)}{c} V(\frac{m}{c})) \ \ \ \ \ (14)$

where

$\displaystyle V(u) := \frac{1}{u} \int_0^\infty \int_{\bf R} \bar{F}(y) F(u^2 \frac{y}{x^2 + y^2}) e(- x - u^2 \frac{x}{x^2+y^2} )\ \frac{dx dy}{y^2}$

is a certain integral involving ${F}$ and a parameter ${u}$, but which does not depend explicitly on parameters such as ${m,c,d}$. Thus we have indeed expressed the ${L^2}$ expression (13) in terms of Kloosterman sums. It is possible to invert this analysis and express various weighted sums of Kloosterman sums in terms of ${L^2}$ expressions (possibly involving inner products instead of norms) of Poincaré series, but we will not do so here; see Chapter 16 of Iwaniec and Kowalski for further details.
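For readers who want to experiment, here is a minimal Python sketch of the Kloosterman sum (12), together with the arithmetic step in the computation above: summing ${e( \frac{ma}{c} + \frac{md}{c} )}$ over bottom rows ${(c,d)}$, with ${a}$ chosen so that ${ad-bc=1}$, reproduces ${S(m,m;c)}$. The function names are illustrative, and the modular inverse uses the three-argument `pow` with exponent `-1`, which assumes Python 3.8 or later.

```python
import cmath
from math import gcd

def e(theta):
    """The additive character e(theta) = exp(2 pi i theta)."""
    return cmath.exp(2j * cmath.pi * theta)

def kloosterman(m, n, c):
    """The Kloosterman sum S(m, n; c) from (12), computed directly from the definition."""
    total = 0
    for x in range(1, c + 1):
        if gcd(x, c) == 1:
            x_inv = pow(x, -1, c)  # inverse of x modulo c (Python 3.8+)
            total += e((m * x + n * x_inv) / c)
    return total

def kloosterman_via_matrices(m, c):
    """Sum e(ma/c + md/c) over 1 <= d <= c coprime to c, with ad - bc = 1."""
    total = 0
    for d in range(1, c + 1):
        if gcd(d, c) != 1:
            continue
        a = pow(d, -1, c)     # a is the inverse of d modulo c
        b = (a * d - 1) // c  # an integer, chosen so that ad - bc = 1
        assert a * d - b * c == 1
        total += e(m * a / c + m * d / c)
    return total

print(kloosterman(1, 1, 7))
print(kloosterman_via_matrices(3, 10), kloosterman(3, 3, 10))
```

The values are real up to rounding error, since replacing ${x}$ by ${-x}$ conjugates each term of the sum.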

Traditionally, automorphic forms have been analysed using the spectral theory of the Laplace-Beltrami operator ${-\Delta}$ on spaces such as ${\Gamma\backslash {\mathbf H}}$ or ${\Gamma_\infty \backslash {\mathbf H}}$, so that a Poincaré series such as ${P_\Gamma[f]}$ might be expanded out using inner products of ${P_\Gamma[f]}$ (or, by the unfolding identities, ${f}$) with various generalised eigenfunctions of ${-\Delta}$ (such as cuspidal eigenforms, or Eisenstein series). With this approach, special functions, and specifically the modified Bessel functions ${K_{it}}$ of the second kind, play a prominent role, basically because the ${\Gamma_\infty}$-automorphic functions

$\displaystyle x+iy \mapsto y^{1/2} K_{it}(2\pi |m| y) e(mx)$

for ${t \in {\bf R}}$ and ${m \in {\bf Z}}$ non-zero are generalised eigenfunctions of ${-\Delta}$ (with eigenvalue ${\frac{1}{4}+t^2}$), and are almost square-integrable on ${\Gamma_\infty \backslash {\mathbf H}}$ (the ${L^2}$ norm diverges only logarithmically at one end ${y \rightarrow 0^+}$ of the cylinder ${\Gamma_\infty \backslash {\mathbf H}}$, while decaying exponentially fast at the other end ${y \rightarrow +\infty}$).

However, as discussed in this previous post, the spectral theory of an essentially self-adjoint operator such as ${-\Delta}$ is basically equivalent to the theory of various solution operators associated to partial differential equations involving that operator, such as the Helmholtz equation ${(-\Delta + k^2) u = f}$, the heat equation ${\partial_t u = \Delta u}$, the Schrödinger equation ${i\partial_t u + \Delta u = 0}$, or the wave equation ${\partial_{tt} u = \Delta u}$. Thus, one can hope to rephrase many arguments that involve spectral data of ${-\Delta}$ into arguments that instead involve resolvents ${(-\Delta + k^2)^{-1}}$, heat kernels ${e^{t\Delta}}$, Schrödinger propagators ${e^{it\Delta}}$, or wave propagators ${e^{\pm it\sqrt{-\Delta}}}$, or involve the PDE more directly (e.g. applying integration by parts and energy methods to solutions of such PDE). This is certainly done to some extent in the existing literature; resolvents and heat kernels, for instance, are often utilised. In this post, I would like to explore the possibility of reformulating spectral arguments instead using the inhomogeneous wave equation

$\displaystyle \partial_{tt} u - \Delta u = F.$

Actually it will be a bit more convenient to normalise the Laplacian by ${\frac{1}{4}}$, and look instead at the automorphic wave equation

$\displaystyle \partial_{tt} u + (-\Delta - \frac{1}{4}) u = F. \ \ \ \ \ (15)$

This equation somewhat resembles a “Klein-Gordon” type equation, except that the mass is imaginary! This would lead to pathological behaviour were it not for the negative curvature, which in principle creates a spectral gap of ${\frac{1}{4}}$ that cancels out this factor.
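To see the role of the spectral gap concretely, here is a short formal computation; the spectral parameter ${r}$ and the amplitude ${a(t)}$ are notation introduced only for this illustration. Suppose ${\phi}$ is an eigenfunction with ${-\Delta \phi = (\frac{1}{4} + r^2) \phi}$. Then the ansatz ${u(t) = a(t) \phi}$ reduces the homogeneous case ${F=0}$ of (15) to the ODE

$\displaystyle a''(t) + r^2 a(t) = 0,$

whose solutions are spanned by ${a(t) = e^{\pm i r t}}$ (for ${r \neq 0}$). These are bounded oscillations when ${r}$ is real, that is to say when the eigenvalue is at least ${\frac{1}{4}}$, and exhibit exponential growth only when the eigenvalue lies below ${\frac{1}{4}}$, in which case ${r}$ is imaginary.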

The point is that the wave equation approach gives access to some nice PDE techniques, such as energy methods, Sobolev inequalities and finite speed of propagation, which are somewhat submerged in the spectral framework. The wave equation also interacts well with Poincaré series; if for instance ${u}$ and ${F}$ are ${\Gamma_\infty}$-automorphic solutions to (15) obeying suitable decay conditions, then their Poincaré series ${P_{\Gamma_\infty \backslash \Gamma}[u]}$ and ${P_{\Gamma_\infty \backslash \Gamma}[F]}$ will be ${\Gamma}$-automorphic solutions to the same equation (15), basically because the Laplace-Beltrami operator commutes with translations. Because of these facts, it is possible to replicate several standard spectral theory arguments in the wave equation framework, without having to deal directly with things like the asymptotics of modified Bessel functions. The wave equation approach to automorphic theory was introduced by Faddeev and Pavlov (using the Lax-Phillips scattering theory), and developed further by Lax and Phillips, to recover many spectral facts about the Laplacian on modular curves, such as the Weyl law and the Selberg trace formula. Here, I will illustrate this by deriving three basic applications of automorphic methods in a wave equation framework, namely

• Using the Weil bound on Kloosterman sums to derive Selberg’s 3/16 theorem on the least non-trivial eigenvalue for ${-\Delta}$ on ${\Gamma_0(q) \backslash {\mathbf H}}$ (discussed previously here);
• Conversely, showing that Selberg’s eigenvalue conjecture (improving Selberg’s ${3/16}$ bound to the optimal ${1/4}$) implies an optimal bound on (smoothed) sums of Kloosterman sums; and
• Using the same bound to obtain pointwise bounds on Poincaré series similar to the ones discussed above. (Actually, the argument here does not use the wave equation, instead it just uses the Sobolev inequality.)

This post originated from an attempt to finally learn this part of analytic number theory properly, and to see if I could use a PDE-based perspective to understand it better. Ultimately, this is not that dramatic a departure from the standard approach to this subject, but I found it useful to think of things in this fashion, probably due to my existing background in PDE.

I thank Bill Duke and Ben Green for helpful discussions. My primary reference for this theory was Chapters 15, 16, and 21 of Iwaniec and Kowalski.

The wave equation is usually expressed in the form

$\displaystyle \partial_{tt} u - \Delta u = 0$

where ${u \colon {\bf R} \times {\bf R}^d \rightarrow {\bf C}}$ is a function of both time ${t \in {\bf R}}$ and space ${x \in {\bf R}^d}$, with ${\Delta}$ being the Laplacian operator. One can generalise this equation in a number of ways, for instance by replacing the spatial domain ${{\bf R}^d}$ with some other manifold and replacing the Laplacian ${\Delta}$ with the Laplace-Beltrami operator or adding lower order terms (such as a potential, or a coupling with a magnetic field). But for sake of discussion let us work with the classical wave equation on ${{\bf R}^d}$. We will work formally in this post, being unconcerned with issues of convergence, justifying interchange of integrals, derivatives, or limits, etc. One then has a conserved energy

$\displaystyle \int_{{\bf R}^d} \frac{1}{2} |\nabla u(t,x)|^2 + \frac{1}{2} |\partial_t u(t,x)|^2\ dx$

which we can rewrite using integration by parts and the ${L^2}$ inner product ${\langle, \rangle}$ on ${{\bf R}^d}$ as

$\displaystyle \frac{1}{2} \langle -\Delta u(t), u(t) \rangle + \frac{1}{2} \langle \partial_t u(t), \partial_t u(t) \rangle.$

A key feature of the wave equation is finite speed of propagation: if, at time ${t=0}$ (say), the initial position ${u(0)}$ and initial velocity ${\partial_t u(0)}$ are both supported in a ball ${B(x_0,R) := \{ x \in {\bf R}^d: |x-x_0| \leq R \}}$, then at any later time ${t>0}$, the position ${u(t)}$ and velocity ${\partial_t u(t)}$ are supported in the larger ball ${B(x_0,R+t)}$. This can be seen for instance (formally, at least) by inspecting the exterior energy

$\displaystyle \int_{|x-x_0| > R+t} \frac{1}{2} |\nabla u(t,x)|^2 + \frac{1}{2} |\partial_t u(t,x)|^2\ dx$

and observing (after some integration by parts and differentiation under the integral sign) that it is non-increasing in time, non-negative, and vanishing at time ${t=0}$.

The wave equation is second order in time, but one can turn it into a first order system by working with the pair ${(u(t),v(t))}$ rather than just the single field ${u(t)}$, where ${v(t) := \partial_t u(t)}$ is the velocity field. The system is then

$\displaystyle \partial_t u(t) = v(t)$

$\displaystyle \partial_t v(t) = \Delta u(t)$

and the conserved energy is now

$\displaystyle \frac{1}{2} \langle -\Delta u(t), u(t) \rangle + \frac{1}{2} \langle v(t), v(t) \rangle. \ \ \ \ \ (1)$

Finite speed of propagation then tells us that if ${u(0),v(0)}$ are both supported on ${B(x_0,R)}$, then ${u(t),v(t)}$ are supported on ${B(x_0,R+t)}$ for all ${t>0}$. One also has time reversal symmetry: if ${t \mapsto (u(t),v(t))}$ is a solution, then ${t \mapsto (u(-t), -v(-t))}$ is a solution also, thus for instance one can establish an analogue of finite speed of propagation for negative times ${t<0}$ using this symmetry.

If one has an eigenfunction

$\displaystyle -\Delta \phi = \lambda^2 \phi$

of the Laplacian, then we have the explicit solutions

$\displaystyle u(t) = e^{\pm it \lambda} \phi$

$\displaystyle v(t) = \pm i \lambda e^{\pm it \lambda} \phi$

of the wave equation, which formally can be used to construct all other solutions via the principle of superposition.

When one has vanishing initial velocity ${v(0)=0}$, the solution ${u(t)}$ is given via functional calculus by

$\displaystyle u(t) = \cos(t \sqrt{-\Delta}) u(0)$

and the propagator ${\cos(t \sqrt{-\Delta})}$ can be expressed as the average of half-wave operators:

$\displaystyle \cos(t \sqrt{-\Delta}) = \frac{1}{2} ( e^{it\sqrt{-\Delta}} + e^{-it\sqrt{-\Delta}} ).$

One can view ${\cos(t \sqrt{-\Delta} )}$ as a minor of the full wave propagator

$\displaystyle U(t) := \exp \begin{pmatrix} 0 & t \\ t\Delta & 0 \end{pmatrix}$

$\displaystyle = \begin{pmatrix} \cos(t \sqrt{-\Delta}) & \frac{\sin(t\sqrt{-\Delta})}{\sqrt{-\Delta}} \\ -\sin(t\sqrt{-\Delta}) \sqrt{-\Delta} & \cos(t \sqrt{-\Delta} ) \end{pmatrix}$

which is unitary with respect to the energy form (1), and is the fundamental solution to the wave equation in the sense that

$\displaystyle \begin{pmatrix} u(t) \\ v(t) \end{pmatrix} = U(t) \begin{pmatrix} u(0) \\ v(0) \end{pmatrix}. \ \ \ \ \ (2)$

Viewing the contraction ${\cos(t\sqrt{-\Delta})}$ as a minor of a unitary operator is an instance of the “dilation trick”.

It turns out (as I learned from Yuval Peres) that there is a useful discrete analogue of the wave equation (and of all of the above facts), in which the time variable ${t}$ now lives on the integers ${{\bf Z}}$ rather than on ${{\bf R}}$, and the spatial domain can be replaced by discrete domains also (such as graphs). Formally, the system is now of the form

$\displaystyle u(t+1) = P u(t) + v(t) \ \ \ \ \ (3)$

$\displaystyle v(t+1) = P v(t) - (1-P^2) u(t)$

where ${t}$ is now an integer, ${u(t), v(t)}$ take values in some Hilbert space (e.g. ${\ell^2}$ functions on a graph ${G}$), and ${P}$ is some operator on that Hilbert space (which in applications will usually be a self-adjoint contraction). To connect this with the classical wave equation, let us first consider a rescaling of this system

$\displaystyle u(t+\varepsilon) = P_\varepsilon u(t) + \varepsilon v(t)$

$\displaystyle v(t+\varepsilon) = P_\varepsilon v(t) - \frac{1}{\varepsilon} (1-P_\varepsilon^2) u(t)$

where ${\varepsilon>0}$ is a small parameter (representing the discretised time step), ${t}$ now takes values in the integer multiples ${\varepsilon {\bf Z}}$ of ${\varepsilon}$, and ${P_\varepsilon}$ is the wave propagator operator ${P_\varepsilon := \cos( \varepsilon \sqrt{-\Delta} )}$ or the heat propagator ${P_\varepsilon := \exp( \varepsilon^2 \Delta/2 )}$ (the two operators are different, but agree up to errors of fourth order in ${\varepsilon}$). One can then formally verify that the wave equation emerges from this rescaled system in the limit ${\varepsilon \rightarrow 0}$. (Thus, ${P}$ is not exactly the direct analogue of the Laplacian ${\Delta}$, but can be viewed as something like ${P_\varepsilon = 1 + \frac{\varepsilon^2}{2} \Delta + O( \varepsilon^4 )}$ in the case of small ${\varepsilon}$, or ${P = 1 + \frac{1}{2}\Delta + O(\Delta^2)}$ if we are not rescaling to the small ${\varepsilon}$ case. The operator ${P}$ is sometimes known as the diffusion operator.)
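As a rough sketch of this formal verification (taking ${P_\varepsilon = \cos(\varepsilon\sqrt{-\Delta})}$ for concreteness, and ignoring all questions of domains and convergence), one can subtract the current state from both sides of the rescaled system, divide by ${\varepsilon}$, and use the identity ${1 - P_\varepsilon^2 = \sin^2(\varepsilon\sqrt{-\Delta})}$:

```latex
\begin{align*}
\frac{u(t+\varepsilon) - u(t)}{\varepsilon}
  &= v(t) + \frac{P_\varepsilon - 1}{\varepsilon}\, u(t)
   = v(t) + O(\varepsilon), \\
\frac{v(t+\varepsilon) - v(t)}{\varepsilon}
  &= \frac{P_\varepsilon - 1}{\varepsilon}\, v(t)
     - \frac{\sin^2(\varepsilon\sqrt{-\Delta})}{\varepsilon^2}\, u(t)
   = \Delta u(t) + O(\varepsilon).
\end{align*}
```

Here the error terms come from ${P_\varepsilon - 1 = O(\varepsilon^2)}$ and ${\sin^2(\varepsilon\sqrt{-\Delta}) = -\varepsilon^2 \Delta + O(\varepsilon^4)}$, both understood formally when applied to sufficiently smooth ${u(t), v(t)}$. Sending ${\varepsilon \rightarrow 0}$ recovers ${\partial_t u = v}$ and ${\partial_t v = \Delta u}$.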

Assuming ${P}$ is self-adjoint, solutions to the system (3) formally conserve the energy

$\displaystyle \frac{1}{2} \langle (1-P^2) u(t), u(t) \rangle + \frac{1}{2} \langle v(t), v(t) \rangle. \ \ \ \ \ (4)$

This energy is positive semi-definite if ${P}$ is a contraction. We have the same time reversal symmetry as before: if ${t \mapsto (u(t),v(t))}$ solves the system (3), then so does ${t \mapsto (u(-t), -v(-t))}$. If one has an eigenfunction

$\displaystyle P \phi = \cos(\lambda) \phi$

to the operator ${P}$, then one has an explicit solution

$\displaystyle u(t) = e^{\pm it \lambda} \phi$

$\displaystyle v(t) = \pm i \sin(\lambda) e^{\pm it \lambda} \phi$

to (3), and (in principle at least) this generates all other solutions via the principle of superposition.
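To make the conservation law and the explicit solutions concrete, here is a minimal numerical sketch. It assumes, purely for illustration, that the Hilbert space is the real-valued functions on the cycle graph ${{\bf Z}/N{\bf Z}}$ and that ${P}$ is the nearest-neighbour averaging operator; the function names `apply_P`, `step`, `energy` and `evolve` are hypothetical and not taken from any library.

```python
import math

def apply_P(f):
    """Averaging operator on the cycle Z/N: (Pf)(x) = (f(x-1) + f(x+1)) / 2."""
    n = len(f)
    return [(f[(x - 1) % n] + f[(x + 1) % n]) / 2 for x in range(n)]

def step(u, v):
    """One time step of system (3): u' = P u + v, v' = P v - (1 - P^2) u."""
    Pu, Pv = apply_P(u), apply_P(v)
    PPu = apply_P(Pu)
    u_next = [pu + y for pu, y in zip(Pu, v)]
    v_next = [pv - (x - ppu) for pv, x, ppu in zip(Pv, u, PPu)]
    return u_next, v_next

def energy(u, v):
    """Energy (4) for real-valued u, v: (1/2) <(1 - P^2) u, u> + (1/2) <v, v>."""
    PPu = apply_P(apply_P(u))
    potential = sum((x - ppu) * x for x, ppu in zip(u, PPu))
    kinetic = sum(y * y for y in v)
    return 0.5 * potential + 0.5 * kinetic

def evolve(u, v, steps):
    """Apply `step` repeatedly, returning the state at time `steps`."""
    for _ in range(steps):
        u, v = step(u, v)
    return u, v
```

With ${\phi(x) = \cos(\lambda x)}$ and ${\lambda = 2\pi k/N}$ one has ${P\phi = \cos(\lambda)\phi}$, so `evolve(phi, [0.0] * N, t)` should return ${u(t) = \cos(t\lambda)\phi}$ up to rounding, which is the real part of the explicit solution above; and `energy` should stay constant along any trajectory.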

Finite speed of propagation is a lot easier in the discrete setting, though one has to offset the support of the “velocity” field ${v}$ by one unit. Suppose we know that ${P}$ has unit speed in the sense that whenever ${f}$ is supported in a ball ${B(x,R)}$, then ${Pf}$ is supported in the ball ${B(x,R+1)}$. Then an easy induction shows that if ${u(0), v(0)}$ are supported in ${B(x_0,R), B(x_0,R+1)}$ respectively, then ${u(t), v(t)}$ are supported in ${B(x_0,R+t), B(x_0, R+t+1)}$.

The fundamental solution ${U(t) = U^t}$ to the discretised wave equation (3), in the sense of (2), is given by the formula

$\displaystyle U(t) = U^t = \begin{pmatrix} P & 1 \\ P^2-1 & P \end{pmatrix}^t$

$\displaystyle = \begin{pmatrix} T_t(P) & U_{t-1}(P) \\ (P^2-1) U_{t-1}(P) & T_t(P) \end{pmatrix}$

where ${T_t}$ and ${U_t}$ are the Chebyshev polynomials of the first and second kind, thus

$\displaystyle T_t( \cos \theta ) = \cos(t\theta)$

and

$\displaystyle U_t( \cos \theta ) = \frac{\sin((t+1)\theta)}{\sin \theta}.$

In particular, ${P}$ is now a minor of ${U(1) = U}$, and can also be viewed as an average of ${U}$ with its inverse ${U^{-1}}$:

$\displaystyle \begin{pmatrix} P & 0 \\ 0 & P \end{pmatrix} = \frac{1}{2} (U + U^{-1}). \ \ \ \ \ (5)$

As before, ${U}$ is unitary with respect to the energy form (4), so this is another instance of the dilation trick in action. The powers ${P^n}$ and ${U^n}$ are discrete analogues of the heat propagators ${e^{t\Delta/2}}$ and wave propagators ${U(t)}$ respectively.
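The Chebyshev formula and the identity (5) can be sanity-checked on a single eigenmode, where ${P}$ acts as the scalar ${\cos \theta}$ and ${U}$ becomes an ordinary ${2 \times 2}$ matrix. The following is only an illustrative sketch under that scalar assumption; the helper names are hypothetical.

```python
import math

def mat_mul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def propagator_power(p, t):
    """U^t for U = [[p, 1], [p^2 - 1, p]], by repeated multiplication (t >= 0)."""
    u = [[p, 1.0], [p * p - 1.0, p]]
    result = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(t):
        result = mat_mul(result, u)
    return result

def chebyshev_form(theta, t):
    """Closed form using T_t(cos θ) = cos(tθ) and U_{t-1}(cos θ) = sin(tθ)/sin θ."""
    p = math.cos(theta)
    first_kind = math.cos(t * theta)
    second_kind = math.sin(t * theta) / math.sin(theta)
    return [[first_kind, second_kind],
            [(p * p - 1.0) * second_kind, first_kind]]

def average_with_inverse(p):
    """(U + U^{-1}) / 2, using det U = 1 so that U^{-1} = [[p, -1], [1 - p^2, p]]."""
    u = [[p, 1.0], [p * p - 1.0, p]]
    u_inv = [[p, -1.0], [1.0 - p * p, p]]
    return [[(u[i][j] + u_inv[i][j]) / 2 for j in range(2)] for i in range(2)]
```

For any angle ${\theta}$ with ${\sin\theta \neq 0}$, `propagator_power(math.cos(theta), t)` and `chebyshev_form(theta, t)` should agree entrywise up to rounding, and `average_with_inverse(p)` should return the diagonal matrix with entries ${p}$.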

One nice application of all this formalism, which I learned from Yuval Peres, is the Varopoulos-Carne inequality:

Theorem 1 (Varopoulos-Carne inequality) Let ${G}$ be a (possibly infinite) regular graph, let ${n \geq 1}$, and let ${x, y}$ be vertices in ${G}$. Then the probability that the simple random walk at ${x}$ lands at ${y}$ at time ${n}$ is at most ${2 \exp( - d(x,y)^2 / 2n )}$, where ${d}$ is the graph distance.

This general inequality is quite sharp, as one can see using the standard Cayley graph on the integers ${{\bf Z}}$. Very roughly speaking, it asserts that on a regular graph of reasonably controlled growth (e.g. polynomial growth), random walks of length ${n}$ concentrate on the ball of radius ${O(\sqrt{n})}$ or so centred at the origin of the random walk.

Proof: Let ${P \colon \ell^2(G) \rightarrow \ell^2(G)}$ be the diffusion operator (the transition operator of the simple random walk), thus

$\displaystyle Pf(x) = \frac{1}{D} \sum_{y \sim x} f(y)$

for any ${f \in \ell^2(G)}$, where ${D}$ is the degree of the regular graph and the sum is over the ${D}$ vertices ${y}$ that are adjacent to ${x}$. This is a contraction of unit speed, and the probability that the random walk at ${x}$ lands at ${y}$ at time ${n}$ is

$\displaystyle \langle P^n \delta_x, \delta_y \rangle$

where ${\delta_x, \delta_y}$ are the Dirac deltas at ${x,y}$. Using (5), we can rewrite this as

$\displaystyle \langle (\frac{1}{2} (U + U^{-1}))^n \begin{pmatrix} 0 \\ \delta_x\end{pmatrix}, \begin{pmatrix} 0 \\ \delta_y\end{pmatrix} \rangle$

where we are now using the energy form (4). We can write

$\displaystyle (\frac{1}{2} (U + U^{-1}))^n = {\bf E} U^{S_n}$

where ${S_n}$ is the simple random walk of length ${n}$ on the integers, that is to say ${S_n = \xi_1 + \dots + \xi_n}$ where ${\xi_1,\dots,\xi_n = \pm 1}$ are independent uniform Bernoulli signs. Thus we wish to show that

$\displaystyle {\bf E} \langle U^{S_n} \begin{pmatrix} 0 \\ \delta_x\end{pmatrix}, \begin{pmatrix} 0 \\ \delta_y\end{pmatrix} \rangle \leq 2 \exp(-d(x,y)^2 / 2n ).$

By finite speed of propagation, the inner product here vanishes if ${|S_n| < d(x,y)}$. For ${|S_n| \geq d(x,y)}$ we can use Cauchy-Schwarz and the unitary nature of ${U}$ to bound the inner product by ${1}$. Thus the left-hand side may be upper bounded by

$\displaystyle {\bf P}( |S_n| \geq d(x,y) )$

and the claim now follows from the Chernoff inequality. $\Box$

This inequality has many applications, particularly with regard to relating the entropy, mixing time, and concentration of random walks with volume growth of balls; see this text of Lyons and Peres for some examples.
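As a small numerical illustration of Theorem 1 (not a substitute for the proof), one can compute the exact distribution of the simple random walk on a cycle graph, which is ${2}$-regular, and compare it with the bound ${2 \exp(-d(x,y)^2/2n)}$. The choice of the cycle and the helper names below are assumptions made for this sketch.

```python
import math

def walk_distribution(n_vertices, start, steps):
    """Distribution of the simple random walk on the cycle Z/n_vertices after `steps` steps."""
    dist = [0.0] * n_vertices
    dist[start] = 1.0
    for _ in range(steps):
        dist = [(dist[(x - 1) % n_vertices] + dist[(x + 1) % n_vertices]) / 2
                for x in range(n_vertices)]
    return dist

def cycle_distance(n_vertices, x, y):
    """Graph distance between x and y on the cycle."""
    k = abs(x - y) % n_vertices
    return min(k, n_vertices - k)

def carne_bound(d, n):
    """Right-hand side of the Varopoulos-Carne inequality."""
    return 2.0 * math.exp(-d * d / (2.0 * n))

def max_excess(n_vertices, steps):
    """Largest value of (probability - bound) over all endpoints, starting from vertex 0.

    The inequality predicts that this is never positive.
    """
    dist = walk_distribution(n_vertices, 0, steps)
    return max(dist[y] - carne_bound(cycle_distance(n_vertices, 0, y), steps)
               for y in range(n_vertices))
```

Comparing the difference rather than the ratio avoids dividing by a bound that underflows to zero for distant vertices and small ${n}$.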

For the sake of comparison, here is a continuous counterpart to the Varopoulos-Carne inequality:

Theorem 2 (Continuous Varopoulos-Carne inequality) Let ${t > 0}$, and let ${f,g \in L^2({\bf R}^d)}$ be supported on compact sets ${F,G}$ respectively. Then

$\displaystyle |\langle e^{t\Delta/2} f, g \rangle| \leq \sqrt{\frac{2t}{\pi d(F,G)^2}} \exp( - d(F,G)^2 / 2t ) \|f\|_{L^2} \|g\|_{L^2}$

where ${d(F,G)}$ is the Euclidean distance between ${F}$ and ${G}$.

Proof: By Fourier inversion one has

$\displaystyle e^{-t\xi^2/2} = \frac{1}{\sqrt{2\pi t}} \int_{\bf R} e^{-s^2/2t} e^{is\xi}\ ds$

$\displaystyle = \sqrt{\frac{2}{\pi t}} \int_0^\infty e^{-s^2/2t} \cos(s \xi )\ ds$

for any real ${\xi}$, and thus

$\displaystyle \langle e^{t\Delta/2} f, g\rangle = \sqrt{\frac{2}{\pi t}} \int_0^\infty e^{-s^2/2t} \langle \cos(s \sqrt{-\Delta} ) f, g \rangle\ ds.$

By finite speed of propagation, the inner product ${\langle \cos(s \sqrt{-\Delta} ) f, g \rangle}$ vanishes when ${s < d(F,G)}$; otherwise, we can use Cauchy-Schwarz and the contractive nature of ${\cos(s \sqrt{-\Delta} )}$ to bound this inner product by ${\|f\|_{L^2} \|g\|_{L^2}}$. Thus

$\displaystyle |\langle e^{t\Delta/2} f, g\rangle| \leq \sqrt{\frac{2}{\pi t}} \|f\|_{L^2} \|g\|_{L^2} \int_{d(F,G)}^\infty e^{-s^2/2t}\ ds.$

Bounding ${e^{-s^2/2t}}$ by ${e^{-d(F,G)^2/2t} e^{-d(F,G) (s-d(F,G))/t}}$, we obtain the claim. $\Box$

Observe that the argument is quite general and can be applied for instance to other Riemannian manifolds than ${{\bf R}^d}$.
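The final step of the proof of Theorem 2 is a Gaussian tail estimate, which can be checked numerically. The normalised tail ${\sqrt{\frac{2}{\pi t}} \int_{d}^\infty e^{-s^2/2t}\ ds}$ equals ${\hbox{erfc}(d/\sqrt{2t})}$ after the substitution ${s = \sqrt{2t}\, r}$, so it can be compared directly with the constant in the theorem. This is only a sketch; the function names and the quadrature parameters are arbitrary choices.

```python
import math

def heat_kernel_tail(d, t):
    """sqrt(2/(pi t)) * integral from d to infinity of exp(-s^2/(2t)) ds, via erfc."""
    return math.erfc(d / math.sqrt(2.0 * t))

def theorem_bound(d, t):
    """sqrt(2t / (pi d^2)) * exp(-d^2 / (2t)), the constant appearing in Theorem 2."""
    return math.sqrt(2.0 * t / (math.pi * d * d)) * math.exp(-d * d / (2.0 * t))

def tail_by_quadrature(d, t, n=20000):
    """Midpoint-rule approximation of the same tail, truncated at d + 12 sqrt(t)."""
    width = 12.0 * math.sqrt(t)
    h = width / n
    total = sum(math.exp(-((d + (k + 0.5) * h) ** 2) / (2.0 * t)) for k in range(n))
    return math.sqrt(2.0 / (math.pi * t)) * total * h
```

For every ${d, t > 0}$ one expects `heat_kernel_tail(d, t) <= theorem_bound(d, t)`, with the two becoming comparable once ${d}$ is much larger than ${\sqrt{t}}$.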

Consider the free Schrödinger equation in ${d}$ spatial dimensions, which I will normalise as

$\displaystyle i u_t + \frac{1}{2} \Delta_{{\bf R}^d} u = 0 \ \ \ \ \ (1)$

where ${u: {\bf R} \times {\bf R}^d \rightarrow {\bf C}}$ is the unknown field and ${\Delta_{{\bf R}^{d}} = \sum_{j=1}^d \frac{\partial^2}{\partial x_j^2}}$ is the spatial Laplacian. To avoid irrelevant technical issues I will restrict attention to smooth (classical) solutions to this equation, and will work locally in spacetime avoiding issues of decay at infinity (or at other singularities); I will also avoid issues involving branch cuts of functions such as ${t^{d/2}}$ (if one wishes, one can restrict ${d}$ to be even in order to safely ignore all branch cut issues). The space of solutions to (1) enjoys a number of symmetries. A particularly non-obvious symmetry is the pseudoconformal symmetry: if ${u}$ solves (1), then the pseudoconformal solution ${pc(u): {\bf R} \times {\bf R}^d \rightarrow {\bf C}}$ defined by

$\displaystyle pc(u)(t,x) := \frac{1}{(it)^{d/2}} \overline{u(\frac{1}{t}, \frac{x}{t})} e^{i|x|^2/2t} \ \ \ \ \ (2)$

for ${t \neq 0}$ can be seen after some computation to also solve (1). (If ${u}$ has suitable decay at spatial infinity and one chooses a suitable branch cut for ${(it)^{d/2}}$, one can extend ${pc(u)}$ continuously to the ${t=0}$ spatial slice, whereupon it becomes essentially the spatial Fourier transform of ${u(0,\cdot)}$, but we will not need this fact for the current discussion.)

An analogous symmetry exists for the free wave equation in ${d+1}$ spatial dimensions, which I will write as

$\displaystyle u_{tt} - \Delta_{{\bf R}^{d+1}} u = 0 \ \ \ \ \ (3)$

where ${u: {\bf R} \times {\bf R}^{d+1} \rightarrow {\bf C}}$ is the unknown field. In analogy to pseudoconformal symmetry, we have conformal symmetry: if ${u: {\bf R} \times {\bf R}^{d+1} \rightarrow {\bf C}}$ solves (3), then the function ${conf(u): {\bf R} \times {\bf R}^{d+1} \rightarrow {\bf C}}$, defined in the interior ${\{ (t,x): |x| < |t| \}}$ of the light cone by the formula

$\displaystyle conf(u)(t,x) := (t^2-|x|^2)^{-d/2} u( \frac{t}{t^2-|x|^2}, \frac{x}{t^2-|x|^2} ), \ \ \ \ \ (4)$

also solves (3).

There are also some direct links between the Schrödinger equation in ${d}$ dimensions and the wave equation in ${d+1}$ dimensions. This can be easily seen on the spacetime Fourier side: solutions to (1) have spacetime Fourier transform (formally) supported on a ${d}$-dimensional paraboloid, while solutions to (3) have spacetime Fourier transform formally supported on a ${d+1}$-dimensional cone. To link the two, one then observes that the ${d}$-dimensional paraboloid can be viewed as a conic section (i.e. hyperplane slice) of the ${d+1}$-dimensional cone. In physical space, this link is manifested as follows: if ${u: {\bf R} \times {\bf R}^d \rightarrow {\bf C}}$ solves (1), then the function ${\iota_{1}(u): {\bf R} \times {\bf R}^{d+1} \rightarrow {\bf C}}$ defined by

$\displaystyle \iota_{1}(u)(t,x_1,\ldots,x_{d+1}) := e^{-i(t+x_{d+1})} u( \frac{t-x_{d+1}}{2}, x_1,\ldots,x_d)$

solves (3). More generally, for any non-zero scaling parameter ${\lambda}$, the function ${\iota_{\lambda}(u): {\bf R} \times {\bf R}^{d+1} \rightarrow {\bf C}}$ defined by

$\displaystyle \iota_{\lambda}(u)(t,x_1,\ldots,x_{d+1}) :=$

$\displaystyle \lambda^{d/2} e^{-i\lambda(t+x_{d+1})} u( \lambda \frac{t-x_{d+1}}{2}, \lambda x_1,\ldots,\lambda x_d) \ \ \ \ \ (5)$

solves (3).
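For the reader who wishes to see why ${\iota_\lambda(u)}$ solves (3), here is a sketch of the computation in null coordinates. The symbols ${a := t + x_{d+1}}$, ${b := t - x_{d+1}}$ and ${w := \iota_\lambda(u)}$ are introduced only for this verification; ${u_t}$ and ${\Delta_{{\bf R}^d} u}$ are evaluated at ${(\lambda b/2, \lambda x_1, \ldots, \lambda x_d)}$.

```latex
\begin{align*}
\partial_t^2 - \partial_{x_{d+1}}^2
  &= (\partial_a + \partial_b)^2 - (\partial_a - \partial_b)^2
   = 4\, \partial_a \partial_b, \\
4\, \partial_a \partial_b w
  &= 4 \cdot (-i\lambda) \cdot \frac{\lambda}{2} \cdot \lambda^{d/2} e^{-i\lambda a}\, u_t
   = -2 i \lambda^{2 + d/2} e^{-i\lambda a}\, u_t, \\
\sum_{j=1}^{d} \partial_{x_j}^2 w
  &= \lambda^{2 + d/2} e^{-i\lambda a}\, \Delta_{{\bf R}^d} u, \\
w_{tt} - \Delta_{{\bf R}^{d+1}} w
  &= -2 \lambda^{2 + d/2} e^{-i\lambda a}
     \left( i u_t + \tfrac{1}{2} \Delta_{{\bf R}^d} u \right) = 0.
\end{align*}
```

The last line vanishes precisely because ${u}$ solves (1); setting ${\lambda = 1}$ gives the claim for ${\iota_1}$.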

As an “extra challenge” posed in an exercise in one of my books (Exercise 2.28, to be precise), I asked the reader to use the embeddings ${\iota_1}$ (or more generally ${\iota_\lambda}$) to explicitly connect together the pseudoconformal transformation ${pc}$ and the conformal transformation ${conf}$. It turns out that this connection is a little bit unusual, with the “obvious” guess (namely, that the embeddings ${\iota_\lambda}$ intertwine ${pc}$ and ${conf}$) being incorrect, and as such this particular task was perhaps too difficult even for a challenge question. I’ve been asked a couple of times to provide the connection more explicitly, so I will do so below the fold.

Let ${L: H \rightarrow H}$ be a self-adjoint operator on a finite-dimensional Hilbert space ${H}$. The behaviour of this operator can be completely described by the spectral theorem for finite-dimensional self-adjoint operators (i.e. Hermitian matrices, when viewed in coordinates), which provides a sequence ${\lambda_1,\ldots,\lambda_n \in {\bf R}}$ of eigenvalues and an orthonormal basis ${e_1,\ldots,e_n}$ of eigenfunctions such that ${L e_i = \lambda_i e_i}$ for all ${i=1,\ldots,n}$. In particular, given any function ${m: \sigma(L) \rightarrow {\bf C}}$ on the spectrum ${\sigma(L) := \{ \lambda_1,\ldots,\lambda_n\}}$ of ${L}$, one can then define the linear operator ${m(L): H \rightarrow H}$ by the formula

$\displaystyle m(L) e_i := m(\lambda_i) e_i,$

which then gives a functional calculus, in the sense that the map ${m \mapsto m(L)}$ is a ${C^*}$-algebra isometric homomorphism from the algebra ${BC(\sigma(L) \rightarrow {\bf C})}$ of bounded continuous functions from ${\sigma(L)}$ to ${{\bf C}}$, to the algebra ${B(H \rightarrow H)}$ of bounded linear operators on ${H}$. Thus, for instance, one can define heat operators ${e^{-tL}}$ for ${t>0}$, Schrödinger operators ${e^{itL}}$ for ${t \in {\bf R}}$, resolvents ${\frac{1}{L-z}}$ for ${z \not \in \sigma(L)}$, and (if ${L}$ is positive) wave operators ${e^{it\sqrt{L}}}$ for ${t \in {\bf R}}$. These will be bounded operators (and, in the case of the Schrödinger and wave operators, unitary operators, and in the case of the heat operators with ${L}$ positive, they will be contractions). Among other things, this functional calculus can then be used to solve differential equations such as the heat equation

$\displaystyle u_t + Lu = 0; \quad u(0) = f \ \ \ \ \ (1)$

the Schrödinger equation

$\displaystyle u_t + iLu = 0; \quad u(0) = f \ \ \ \ \ (2)$

the wave equation

$\displaystyle u_{tt} + Lu = 0; \quad u(0) = f; \quad u_t(0) = g \ \ \ \ \ (3)$

or the Helmholtz equation

$\displaystyle (L-z) u = f. \ \ \ \ \ (4)$

The functional calculus can also be associated to a spectral measure. Indeed, for any vectors ${f, g \in H}$, there is a complex measure ${\mu_{f,g}}$ on ${\sigma(L)}$ with the property that

$\displaystyle \langle m(L) f, g \rangle_H = \int_{\sigma(L)} m(x) d\mu_{f,g}(x);$

indeed, one can set ${\mu_{f,g}}$ to be the discrete measure on ${\sigma(L)}$ defined by the formula

$\displaystyle \mu_{f,g}(E) := \sum_{i: \lambda_i \in E} \langle f, e_i \rangle_H \langle e_i, g \rangle_H.$

One can also view this complex measure as a coefficient

$\displaystyle \mu_{f,g} = \langle \mu f, g \rangle_H$

of a projection-valued measure ${\mu}$ on ${\sigma(L)}$, defined by setting

$\displaystyle \mu(E) f := \sum_{i: \lambda_i \in E} \langle f, e_i \rangle_H e_i.$

Finally, one can view ${L}$ as unitarily equivalent to a multiplication operator ${M: f \mapsto g f}$ on ${\ell^2(\{1,\ldots,n\})}$, where ${g}$ is the real-valued function ${g(i) := \lambda_i}$, and the intertwining map ${U: \ell^2(\{1,\ldots,n\}) \rightarrow H}$ is given by

$\displaystyle U ( (c_i)_{i=1}^n ) := \sum_{i=1}^n c_i e_i,$

so that ${L = U M U^{-1}}$.
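To illustrate the finite-dimensional functional calculus, here is a minimal sketch for one hypothetical example, the symmetric matrix ${L = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}}$ on ${{\bf C}^2}$, whose eigenvalues ${1, 3}$ and orthonormal eigenvectors are written out by hand. The names `EIGENVALUES`, `EIGENVECTORS`, `apply_function` and `norm` are illustrative only.

```python
import cmath
import math

# Spectral data of the example matrix L = [[2, 1], [1, 2]] (chosen for illustration).
_S = 1.0 / math.sqrt(2.0)
EIGENVALUES = [1.0, 3.0]
EIGENVECTORS = [[_S, -_S], [_S, _S]]  # real orthonormal eigenvectors e_1, e_2

def apply_function(m, f):
    """Compute m(L) f = sum_i m(lambda_i) <f, e_i> e_i for a vector f in C^2."""
    out = [0j, 0j]
    for lam, e in zip(EIGENVALUES, EIGENVECTORS):
        coeff = sum(fk * ek for fk, ek in zip(f, e))  # <f, e_i>; e_i is real
        for k in range(2):
            out[k] += m(lam) * coeff * e[k]
    return out

def norm(f):
    """Euclidean norm on C^2."""
    return math.sqrt(sum(abs(fk) ** 2 for fk in f))

def heat(t):
    """Multiplier for the heat operator e^{-tL}."""
    return lambda lam: math.exp(-t * lam)

def schrodinger(t):
    """Multiplier for the Schrödinger operator e^{itL}."""
    return lambda lam: cmath.exp(1j * t * lam)
```

For example, `apply_function(lambda x: x, f)` reproduces ${Lf}$, applying `heat(s)` and then `heat(t)` agrees with `heat(s + t)`, and `schrodinger(t)` preserves the norm, in line with the homomorphism and unitarity properties described above.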

It is an important fact in analysis that many of these above assertions extend to operators on an infinite-dimensional Hilbert space ${H}$, so long as one is careful about what “self-adjoint operator” means; these facts are collectively referred to as the spectral theorem. For instance, it turns out that most of the above claims have analogues for bounded self-adjoint operators ${L: H \rightarrow H}$. However, in the theory of partial differential equations, one often needs to apply the spectral theorem to unbounded, densely defined linear operators ${L: D \rightarrow H}$, which (initially, at least), are only defined on a dense subspace ${D}$ of the Hilbert space ${H}$. A very typical situation arises when ${H = L^2(\Omega)}$ is the square-integrable functions on some domain or manifold ${\Omega}$ (which may have a boundary or be otherwise “incomplete”), and ${D = C^\infty_c(\Omega)}$ are the smooth compactly supported functions on ${\Omega}$, and ${L}$ is some linear differential operator. It is then of interest to obtain the spectral theorem for such operators, so that one can build operators such as ${e^{-tL}, e^{itL}, \frac{1}{L-z}, e^{it\sqrt{L}}}$ or solve equations such as (1), (2), (3), (4).

In order to do this, some necessary conditions on the densely defined operator ${L: D \rightarrow H}$ must be imposed. The most obvious is that of symmetry, which asserts that

$\displaystyle \langle Lf, g \rangle_H = \langle f, Lg \rangle_H \ \ \ \ \ (5)$

for all ${f, g \in D}$. In some applications, one also wants to impose positive definiteness, which asserts that

$\displaystyle \langle Lf, f \rangle_H \geq 0 \ \ \ \ \ (6)$

for all ${f \in D}$. These hypotheses are sufficient in the case when ${L}$ is bounded, and in particular when ${H}$ is finite dimensional. However, as it turns out, for unbounded operators these conditions are not, by themselves, enough to obtain a good spectral theory. For instance, one consequence of the spectral theorem should be that the resolvents ${(L-z)^{-1}}$ are well-defined for any strictly complex ${z}$, which by duality implies that the image of ${L-z}$ should be dense in ${H}$. However, this can fail if one just assumes symmetry, or symmetry and positive definiteness. A well-known example occurs when ${H}$ is the Hilbert space ${H := L^2((0,1))}$, ${D := C^\infty_c((0,1))}$ is the space of test functions, and ${L}$ is the one-dimensional Laplacian ${L := -\frac{d^2}{dx^2}}$. Then ${L}$ is symmetric and positive, but the operator ${L-k^2}$ does not have dense image for any complex ${k}$, since

$\displaystyle \langle (L-k^2) f, e^{i\overline{k}x} \rangle_H = 0$

for all test functions ${f \in C^\infty_c((0,1))}$, as can be seen from a routine integration by parts. As such, the resolvent map is not everywhere uniquely defined. There is also a lack of uniqueness for the wave, heat, and Schrödinger equations for this operator (note that there are no spatial boundary conditions specified in these equations).

Another example occurs when ${H := L^2((0,+\infty))}$, ${D := C^\infty_c((0,+\infty))}$, and ${L}$ is the momentum operator ${L := -i \frac{d}{dx}}$. Then the resolvent ${(L-z)^{-1}}$ can be uniquely defined for ${z}$ in the upper half-plane, but not in the lower half-plane, due to the obstruction

$\displaystyle \langle (L-z) f, e^{i \bar{z} x} \rangle_H = 0$

for all test functions ${f}$ (note that the function ${e^{i\bar{z} x}}$ lies in ${L^2((0,+\infty))}$ when ${z}$ is in the lower half-plane). For related reasons, the translation operators ${e^{itL}}$ have a problem with either uniqueness or existence (depending on whether ${t}$ is positive or negative), due to the unspecified boundary behaviour at the origin.

The key property that lets one avoid this bad behaviour is that of essential self-adjointness. Once ${L}$ is essentially self-adjoint, then the spectral theorem becomes applicable again, leading to all the expected behaviour (e.g. existence and uniqueness for the various PDE given above).

Unfortunately, the concept of essential self-adjointness is defined rather abstractly, and is difficult to verify directly; unlike the symmetry condition (5) or the positive condition (6), it is not a “local” condition that can be easily verified just by testing ${L}$ on various inputs, but is instead a more “global” condition. In practice, to verify this property, one needs to invoke one of a number of partial converses to the spectral theorem, which roughly speaking assert that if at least one of the expected consequences of the spectral theorem is true for some symmetric densely defined operator ${L}$, then ${L}$ is self-adjoint. Examples of “expected consequences” include:

• Existence of resolvents ${(L-z)^{-1}}$ (or equivalently, dense image for ${L-z}$);
• Existence of a contractive heat propagator semigroup ${e^{-tL}}$ (in the positive case);
• Existence of a unitary Schrödinger propagator group ${e^{itL}}$;
• Existence of a unitary wave propagator group ${e^{it\sqrt{L}}}$ (in the positive case);
• Existence of a “reasonable” functional calculus;
• Unitary equivalence with a multiplication operator.

Thus, to actually verify essential self-adjointness of a differential operator, one typically has to first solve a PDE (such as the wave, Schrödinger, heat, or Helmholtz equation) by some non-spectral method (e.g. by a contraction mapping argument, or a perturbation argument based on an operator already known to be essentially self-adjoint). Once one can solve one of the PDEs, then one can apply one of the known converse spectral theorems to obtain essential self-adjointness, and then by the forward spectral theorem one can then solve all the other PDEs as well. But there is no getting out of that first step, which requires some input (typically of an ODE, PDE, or geometric nature) that is external to what abstract spectral theory can provide. For instance, if one wants to establish essential self-adjointness of the Laplace-Beltrami operator ${L = -\Delta_g}$ on a smooth Riemannian manifold ${(M,g)}$ (using ${C^\infty_c(M)}$ as the domain space), it turns out (under reasonable regularity hypotheses) that essential self-adjointness is equivalent to geodesic completeness of the manifold, which is a global ODE condition rather than a local one: one needs geodesics to continue indefinitely in order to be able to (unitarily) solve PDEs such as the wave equation, which in turn leads to essential self-adjointness. (Note that the domains ${(0,1)}$ and ${(0,+\infty)}$ in the previous examples were not geodesically complete.) For this reason, essential self-adjointness of a differential operator is sometimes referred to as quantum completeness (with the completeness of the associated Hamilton-Jacobi flow then being the analogous classical completeness).

In these notes, I wanted to record (mostly for my own benefit) the forward and converse spectral theorems, and to verify essential self-adjointness of the Laplace-Beltrami operator on geodesically complete manifolds. This is extremely standard analysis (covered, for instance, in the texts of Reed and Simon), but I wanted to write it down myself to make sure that I really understood this foundational material properly.

Hans Lindblad and I have just uploaded to the arXiv our joint paper “Asymptotic decay for a one-dimensional nonlinear wave equation”, submitted to Analysis & PDE.  This paper, to our knowledge, is the first paper to analyse the asymptotic behaviour of the one-dimensional defocusing nonlinear wave equation

${}-u_{tt}+u_{xx} = |u|^{p-1} u$ (1)

where $u: {\bf R} \times {\bf R} \to {\bf R}$ is the solution and $p>1$ is a fixed exponent.  Nowadays, this type of equation is considered a very simple example of a non-linear wave equation (there is only one spatial dimension, the equation is semilinear, the conserved energy is positive definite and coercive, and there are no derivatives in the nonlinear term), and indeed it is not difficult to show that any solution whose conserved energy

$E[u] := \int_{{\bf R}} \frac{1}{2} |u_t|^2 + \frac{1}{2} |u_x|^2 + \frac{1}{p+1} |u|^{p+1}\ dx$

is finite, will exist globally for all time (and remain finite energy, of course).  In particular, from the one-dimensional Gagliardo-Nirenberg inequality (a variant of the Sobolev embedding theorem), such solutions will remain uniformly bounded in $L^\infty_x({\bf R})$ for all time.

However, this leaves open the question of the asymptotic behaviour of such solutions in the limit as $t \to \infty$.  In higher dimensions, there are a variety of scattering and asymptotic completeness results which show that solutions to nonlinear wave equations such as (1) decay asymptotically in various senses, at least if one is in the perturbative regime in which the solution is assumed small in some sense (e.g. small energy).  For instance, a typical result might be that spatial norms such as $\|u(t)\|_{L^q({\bf R})}$ might go to zero (in an average sense, at least).   In general, such results for nonlinear wave equations are ultimately based on the fact that the linear wave equation in higher dimensions also enjoys an analogous decay as $t \to +\infty$, as linear waves in higher dimensions spread out and disperse over time.  (This can be formalised by decay estimates on the fundamental solution of the linear wave equation, or by basic estimates such as the (long-time) Strichartz estimates and their relatives.)  The idea is then to view the nonlinear wave equation as a perturbation of the linear one.

On the other hand, the solution to the linear one-dimensional wave equation

$-u_{tt} + u_{xx} = 0$ (2)

does not exhibit any decay in time; as one learns in an undergraduate PDE class, the general (finite energy) solution to such an equation is given by the superposition of two travelling waves,

$u(t,x) = f(x+t) + g(x-t)$ (3)

where $f$ and $g$ also have finite energy, so in particular norms such as $\|u(t)\|_{L^\infty_x({\bf R})}$ cannot decay to zero as $t \to \infty$ unless the solution is completely trivial.
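The lack of decay can be seen in a small numerical sketch. Here the profiles $f$ and $g$ are both taken to be the Gaussian bump $e^{-x^2}$, which is an arbitrary choice for illustration, and the sup norm is approximated on a finite grid; all names below are hypothetical.

```python
import math

def bump(x):
    """Gaussian profile used for both travelling waves in this illustration."""
    return math.exp(-x * x)

def travelling_wave_solution(t, x, f=bump, g=bump):
    """The solution u(t, x) = f(x + t) + g(x - t) of the linear wave equation (2)."""
    return f(x + t) + g(x - t)

def sup_norm(t, step=0.01, margin=5.0):
    """Approximate sup over x of |u(t, x)| on a grid covering both wave packets."""
    left = -abs(t) - margin
    count = int(round((2.0 * abs(t) + 2.0 * margin) / step)) + 1
    return max(abs(travelling_wave_solution(t, left + k * step)) for k in range(count))
```

At time $0$ the two bumps overlap and the sup norm is about $2$; for large $t$ they separate and the sup norm settles near $1$ rather than decaying to zero.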

Nevertheless, we were able to establish a nonlinear decay effect for equation (1), caused more by the nonlinear right-hand side of (1) than by the linear left-hand side, to obtain $L^\infty_x({\bf R})$ decay on the average:

Theorem 1. (Average $L^\infty_x$ decay) If $u$ is a finite energy solution to (1), then $\frac{1}{2T} \int_{-T}^T \|u(t)\|_{L^\infty_x({\bf R})}\ dt$ tends to zero as $T \to \infty$.

Actually we prove a slightly stronger statement than Theorem 1, in that the decay is uniform among all solutions with a given energy bound, but I will stick to the above formulation of the main result for simplicity.

Informally, the reason for the nonlinear decay is as follows.  The linear evolution tries to force waves to move at constant velocity (indeed, from (3) we see that linear waves move at the speed of light $c=1$).  But the defocusing nature of the nonlinearity will spread out any wave that is propagating along a constant velocity worldline.  This intuition can be formalised by a Morawetz-type energy estimate that shows that the nonlinear potential energy must decay along any rectangular slab of spacetime (that represents the neighbourhood of a constant velocity worldline).

Now, just because the linear wave equation propagates along constant velocity worldlines, this does not mean that the nonlinear wave equation does too; one could imagine that a wave packet could propagate along a more complicated trajectory $t \mapsto x(t)$ in which the velocity $x'(t)$ is not constant.  However, energy methods still force the solution of the nonlinear wave equation to obey finite speed of propagation, which in the wave packet context means (roughly speaking) that the nonlinear trajectory $t \mapsto x(t)$ is a Lipschitz continuous function (with Lipschitz constant at most $1$).

And now we deploy a trick which appears to be new to the field of nonlinear wave equations: we invoke the Rademacher differentiation theorem (or Lebesgue differentiation theorem), which asserts that Lipschitz continuous functions are almost everywhere differentiable.  (By coincidence, I am teaching this theorem in my current course, both in one dimension (which is the case of interest here) and in higher dimensions.)  A compactness argument allows one to extract a quantitative estimate from this theorem (cf. this earlier blog post of mine) which, roughly speaking, tells us that there are large portions of the trajectory $t \mapsto x(t)$ which behave approximately linearly at an appropriate scale.  This turns out to be a good enough control on the trajectory that one can apply the Morawetz inequality and rule out the existence of persistent wave packets over long periods of time, which is what leads to Theorem 1.

There is still scope for further work to be done on the asymptotics.  In particular, we still do not have a good understanding of what the asymptotic profile of the solution should be, even in the perturbative regime; standard nonlinear geometric optics methods do not appear to work very well due to the extremely weak decay.