
The wave equation is usually expressed in the form

\displaystyle  \partial_{tt} u - \Delta u = 0

where {u \colon {\bf R} \times {\bf R}^d \rightarrow {\bf C}} is a function of both time {t \in {\bf R}} and space {x \in {\bf R}^d}, with {\Delta} being the Laplacian operator. One can generalise this equation in a number of ways, for instance by replacing the spatial domain {{\bf R}^d} with some other manifold and replacing the Laplacian {\Delta} with the Laplace-Beltrami operator or adding lower order terms (such as a potential, or a coupling with a magnetic field). But for the sake of discussion let us work with the classical wave equation on {{\bf R}^d}. We will work formally in this post, being unconcerned with issues of convergence, justifying interchange of integrals, derivatives, or limits, etc. One then has a conserved energy

\displaystyle  \int_{{\bf R}^d} \frac{1}{2} |\nabla u(t,x)|^2 + \frac{1}{2} |\partial_t u(t,x)|^2\ dx

which we can rewrite using integration by parts and the {L^2} inner product {\langle, \rangle} on {{\bf R}^d} as

\displaystyle  \frac{1}{2} \langle -\Delta u(t), u(t) \rangle + \frac{1}{2} \langle \partial_t u(t), \partial_t u(t) \rangle.

A key feature of the wave equation is finite speed of propagation: if, at time {t=0} (say), the initial position {u(0)} and initial velocity {\partial_t u(0)} are both supported in a ball {B(x_0,R) := \{ x \in {\bf R}^d: |x-x_0| \leq R \}}, then at any later time {t>0}, the position {u(t)} and velocity {\partial_t u(t)} are supported in the larger ball {B(x_0,R+t)}. This can be seen for instance (formally, at least) by inspecting the exterior energy

\displaystyle  \int_{|x-x_0| > R+t} \frac{1}{2} |\nabla u(t,x)|^2 + \frac{1}{2} |\partial_t u(t,x)|^2\ dx

and observing (after some integration by parts and differentiation under the integral sign) that it is non-increasing in time, non-negative, and vanishing at time {t=0}.

The wave equation is second order in time, but one can turn it into a first order system by working with the pair {(u(t),v(t))} rather than just the single field {u(t)}, where {v(t) := \partial_t u(t)} is the velocity field. The system is then

\displaystyle  \partial_t u(t) = v(t)

\displaystyle  \partial_t v(t) = \Delta u(t)

and the conserved energy is now

\displaystyle  \frac{1}{2} \langle -\Delta u(t), u(t) \rangle + \frac{1}{2} \langle v(t), v(t) \rangle. \ \ \ \ \ (1)

Finite speed of propagation then tells us that if {u(0),v(0)} are both supported on {B(x_0,R)}, then {u(t),v(t)} are supported on {B(x_0,R+t)} for all {t>0}. One also has time reversal symmetry: if {t \mapsto (u(t),v(t))} is a solution, then {t \mapsto (u(-t), -v(-t))} is a solution also, thus for instance one can establish an analogue of finite speed of propagation for negative times {t<0} using this symmetry.

If one has an eigenfunction

\displaystyle  -\Delta \phi = \lambda^2 \phi

of the Laplacian, then we have the explicit solutions

\displaystyle  u(t) = e^{\pm it \lambda} \phi

\displaystyle  v(t) = \pm i \lambda e^{\pm it \lambda} \phi

of the wave equation, which formally can be used to construct all other solutions via the principle of superposition.

When one has vanishing initial velocity {v(0)=0}, the solution {u(t)} is given via functional calculus by

\displaystyle  u(t) = \cos(t \sqrt{-\Delta}) u(0)

and the propagator {\cos(t \sqrt{-\Delta})} can be expressed as the average of half-wave operators:

\displaystyle  \cos(t \sqrt{-\Delta}) = \frac{1}{2} ( e^{it\sqrt{-\Delta}} + e^{-it\sqrt{-\Delta}} ).

One can view {\cos(t \sqrt{-\Delta} )} as a minor of the full wave propagator

\displaystyle  U(t) := \exp \begin{pmatrix} 0 & t \\ t\Delta & 0 \end{pmatrix}

\displaystyle  = \begin{pmatrix} \cos(t \sqrt{-\Delta}) & \frac{\sin(t\sqrt{-\Delta})}{\sqrt{-\Delta}} \\ -\sin(t\sqrt{-\Delta}) \sqrt{-\Delta} & \cos(t \sqrt{-\Delta} ) \end{pmatrix}

which is unitary with respect to the energy form (1), and is the fundamental solution to the wave equation in the sense that

\displaystyle  \begin{pmatrix} u(t) \\ v(t) \end{pmatrix} = U(t) \begin{pmatrix} u(0) \\ v(0) \end{pmatrix}. \ \ \ \ \ (2)

Viewing the contraction {\cos(t\sqrt{-\Delta})} as a minor of a unitary operator is an instance of the “dilation trick”.
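As a quick numerical sanity check of (2), one can restrict to a single eigenfunction {-\Delta \phi = \lambda^2 \phi}, so that {\Delta} acts as the scalar {-\lambda^2} and {U(t)} becomes a {2 \times 2} matrix. The following Python sketch sums the exponential series directly and checks that the top row agrees with {\cos(t\lambda)} and {\sin(t\lambda)/\lambda}, and that the single-mode energy {\frac{1}{2} \lambda^2 |u|^2 + \frac{1}{2} |v|^2} is conserved. The helper names `expm_2x2`, `propagate` and `mode_energy`, and the sample values of {\lambda, t, u(0), v(0)}, are illustrative assumptions and not part of the discussion above.

```python
import math

def expm_2x2(M, terms=60):
    """Exponential of a 2x2 matrix, summed from its Taylor series."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for k in range(1, terms):
        # After this update, term equals M^k / k!.
        term = [[sum(term[i][m] * M[m][j] for m in range(2)) / k
                 for j in range(2)] for i in range(2)]
        result = [[result[i][j] + term[i][j] for j in range(2)]
                  for i in range(2)]
    return result

def propagate(lam, t, u0, v0):
    """Apply exp([[0, t], [t * Delta, 0]]) with Delta replaced by -lam**2."""
    U = expm_2x2([[0.0, t], [-t * lam ** 2, 0.0]])
    u = U[0][0] * u0 + U[0][1] * v0
    v = U[1][0] * u0 + U[1][1] * v0
    return U, (u, v)

def mode_energy(lam, u, v):
    """Energy (1) restricted to one eigenmode (real amplitudes)."""
    return 0.5 * lam ** 2 * u ** 2 + 0.5 * v ** 2

# Hypothetical sample values.
lam, t, u0, v0 = 2.0, 1.5, 0.3, -0.8
U, (u, v) = propagate(lam, t, u0, v0)
top_row_error = max(abs(U[0][0] - math.cos(t * lam)),
                    abs(U[0][1] - math.sin(t * lam) / lam))
energy_drift = abs(mode_energy(lam, u, v) - mode_energy(lam, u0, v0))
print(top_row_error, energy_drift)
```

Both printed quantities should be at the level of rounding error.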

It turns out (as I learned from Yuval Peres) that there is a useful discrete analogue of the wave equation (and of all of the above facts), in which the time variable {t} now lives on the integers {{\bf Z}} rather than on {{\bf R}}, and the spatial domain can be replaced by discrete domains also (such as graphs). Formally, the system is now of the form

\displaystyle  u(t+1) = P u(t) + v(t) \ \ \ \ \ (3)

\displaystyle  v(t+1) = P v(t) - (1-P^2) u(t)

where {t} is now an integer, {u(t), v(t)} take values in some Hilbert space (e.g. {\ell^2} functions on a graph {G}), and {P} is some operator on that Hilbert space (which in applications will usually be a self-adjoint contraction). To connect this with the classical wave equation, let us first consider a rescaling of this system

\displaystyle  u(t+\varepsilon) = P_\varepsilon u(t) + \varepsilon v(t)

\displaystyle  v(t+\varepsilon) = P_\varepsilon v(t) - \frac{1}{\varepsilon} (1-P_\varepsilon^2) u(t)

where {\varepsilon>0} is a small parameter (representing the discretised time step), {t} now takes values in the integer multiples {\varepsilon {\bf Z}} of {\varepsilon}, and {P_\varepsilon} is the wave propagator operator {P_\varepsilon := \cos( \varepsilon \sqrt{-\Delta} )} or the heat propagator {P_\varepsilon := \exp( \varepsilon^2 \Delta/2 )} (the two operators are different, but agree up to errors of fourth order in {\varepsilon}). One can then formally verify that the wave equation emerges from this rescaled system in the limit {\varepsilon \rightarrow 0}. (Thus, {P} is not exactly the direct analogue of the Laplacian {\Delta}, but can be viewed as something like {P_\varepsilon = 1 + \frac{\varepsilon^2}{2} \Delta + O( \varepsilon^4 )} in the case of small {\varepsilon}, or {P = 1 + \frac{1}{2}\Delta + O(\Delta^2)} if we are not rescaling to the small {\varepsilon} case. The operator {P} is sometimes known as the diffusion operator.)

Assuming {P} is self-adjoint, solutions to the system (3) formally conserve the energy

\displaystyle  \frac{1}{2} \langle (1-P^2) u(t), u(t) \rangle + \frac{1}{2} \langle v(t), v(t) \rangle. \ \ \ \ \ (4)

This energy is positive semi-definite if {P} is a contraction. We have the same time reversal symmetry as before: if {t \mapsto (u(t),v(t))} solves the system (3), then so does {t \mapsto (u(-t), -v(-t))}. If one has an eigenfunction

\displaystyle  P \phi = \cos(\lambda) \phi

to the operator {P}, then one has an explicit solution

\displaystyle  u(t) = e^{\pm it \lambda} \phi

\displaystyle  v(t) = \pm i \sin(\lambda) e^{\pm it \lambda} \phi

to (3), and (in principle at least) this generates all other solutions via the principle of superposition.

Finite speed of propagation is a lot easier in the discrete setting, though one has to offset the support of the “velocity” field {v} by one unit. Suppose we know that {P} has unit speed in the sense that whenever {f} is supported in a ball {B(x,R)}, then {Pf} is supported in the ball {B(x,R+1)}. Then an easy induction shows that if {u(0), v(0)} are supported in {B(x_0,R), B(x_0,R+1)} respectively, then {u(t), v(t)} are supported in {B(x_0,R+t), B(x_0, R+t+1)}.
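The energy conservation and finite speed of propagation claims for the system (3) are easy to test numerically. The following Python sketch does so on the cycle graph {{\bf Z}/n{\bf Z}}, taking {P} to be the nearest-neighbour averaging operator (a self-adjoint contraction of unit speed). The choice of graph, the size {n=41}, the initial data, and all function names are illustrative assumptions.

```python
import random

def apply_P(f):
    """Averaging operator on the cycle Z/nZ: (Pf)(x) = (f(x-1) + f(x+1)) / 2."""
    n = len(f)
    return [(f[(x - 1) % n] + f[(x + 1) % n]) / 2 for x in range(n)]

def step(u, v):
    """One step of system (3): u' = P u + v and v' = P v - (1 - P^2) u."""
    Pu, Pv = apply_P(u), apply_P(v)
    PPu = apply_P(Pu)
    u_next = [a + b for a, b in zip(Pu, v)]
    v_next = [a - (b - c) for a, b, c in zip(Pv, u, PPu)]
    return u_next, v_next

def energy(u, v):
    """Energy (4): (1/2) <(1 - P^2) u, u> + (1/2) <v, v>."""
    PPu = apply_P(apply_P(u))
    potential = sum((a - b) * a for a, b in zip(u, PPu))
    kinetic = sum(b * b for b in v)
    return 0.5 * (potential + kinetic)

def support_radius(f, centre):
    """Largest cycle distance from centre to a nonzero entry (-1 if f = 0)."""
    n = len(f)
    dists = [min((x - centre) % n, (centre - x) % n)
             for x in range(n) if f[x] != 0.0]
    return max(dists, default=-1)

# Energy conservation for random real initial data.
random.seed(0)
n = 41
u = [random.uniform(-1, 1) for _ in range(n)]
v = [random.uniform(-1, 1) for _ in range(n)]
e0 = energy(u, v)
max_drift = 0.0
for _ in range(15):
    u, v = step(u, v)
    max_drift = max(max_drift, abs(energy(u, v) - e0))

# Finite speed: u(0) supported in B(x0, 0), v(0) supported in B(x0, 1).
x0 = 20
u = [0.0] * n
v = [0.0] * n
u[x0] = 1.0
v[x0 - 1], v[x0], v[x0 + 1] = 0.5, -1.0, 0.25
radii = []
for t in range(1, 16):
    u, v = step(u, v)
    radii.append((t, support_radius(u, x0), support_radius(v, x0)))
print(max_drift, radii[-1])
```

The number of steps is kept below {n/2} so that the supports do not wrap around the cycle.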

The fundamental solution {U(t) = U^t} to the discretised wave equation (3), in the sense of (2), is given by the formula

\displaystyle  U(t) = U^t = \begin{pmatrix} P & 1 \\ P^2-1 & P \end{pmatrix}^t

\displaystyle  = \begin{pmatrix} T_t(P) & U_{t-1}(P) \\ (P^2-1) U_{t-1}(P) & T_t(P) \end{pmatrix}

where {T_t} and {U_t} are the Chebyshev polynomials of the first and second kind, thus

\displaystyle  T_t( \cos \theta ) = \cos(t\theta)

and

\displaystyle  U_t( \cos \theta ) = \frac{\sin((t+1)\theta)}{\sin \theta}.

In particular, {P} is now a minor of {U(1) = U}, and can also be viewed as an average of {U} with its inverse {U^{-1}}:

\displaystyle  \begin{pmatrix} P & 0 \\ 0 & P \end{pmatrix} = \frac{1}{2} (U + U^{-1}). \ \ \ \ \ (5)

As before, {U} is unitary with respect to the energy form (4), so this is another instance of the dilation trick in action. The powers {P^n} and {U^n} are discrete analogues of the heat propagators {e^{t\Delta/2}} and wave propagators {U(t)} respectively.
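The Chebyshev formula for {U^t} and the averaging identity (5) can be checked one eigenvalue at a time: on an eigenfunction with {P\phi = \cos(\theta) \phi}, the operator {U} acts as a {2 \times 2} matrix with {P} replaced by the scalar {p = \cos \theta}. The Python sketch below compares repeated matrix multiplication against the closed form; the value {\theta = 0.7} and the function names are illustrative assumptions.

```python
import math

def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def discrete_propagator(p, t):
    """U^t for U = [[p, 1], [p^2 - 1, p]], by repeated multiplication."""
    U = [[p, 1.0], [p * p - 1.0, p]]
    result = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(t):
        result = matmul(result, U)
    return result

def chebyshev_form(theta, t):
    """Closed form using T_t(cos x) = cos(tx), U_{t-1}(cos x) = sin(tx)/sin(x)."""
    p = math.cos(theta)
    T_t = math.cos(t * theta)
    U_tm1 = math.sin(t * theta) / math.sin(theta)
    return [[T_t, U_tm1], [(p * p - 1.0) * U_tm1, T_t]]

theta = 0.7  # hypothetical sample angle
p = math.cos(theta)
max_err = 0.0
for t in range(0, 21):
    A = discrete_propagator(p, t)
    B = chebyshev_form(theta, t)
    max_err = max(max_err, max(abs(A[i][j] - B[i][j])
                               for i in range(2) for j in range(2)))

# Identity (5): the average of U and its inverse is p times the identity.
U = discrete_propagator(p, 1)
U_inv = [[p, -1.0], [1.0 - p * p, p]]  # inverse of U, since det U = 1
avg = [[(U[i][j] + U_inv[i][j]) / 2 for j in range(2)] for i in range(2)]
print(max_err, avg)
```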

One nice application of all this formalism, which I learned from Yuval Peres, is the Varopoulos-Carne inequality:

Theorem 1 (Varopoulos-Carne inequality) Let {G} be a (possibly infinite) regular graph, let {n \geq 1}, and let {x, y} be vertices in {G}. Then the probability that the simple random walk at {x} lands at {y} at time {n} is at most {2 \exp( - d(x,y)^2 / 2n )}, where {d} is the graph distance.

This general inequality is quite sharp, as one can see using the standard Cayley graph on the integers {{\bf Z}}. Very roughly speaking, it asserts that on a regular graph of reasonably controlled growth (e.g. polynomial growth), random walks of length {n} concentrate on the ball of radius {O(\sqrt{n})} or so centred at the origin of the random walk.

Proof: Let {P \colon \ell^2(G) \rightarrow \ell^2(G)} be the diffusion operator on the graph, thus

\displaystyle  Pf(x) = \frac{1}{D} \sum_{y \sim x} f(y)

for any {f \in \ell^2(G)}, where {D} is the degree of the regular graph and sum is over the {D} vertices {y} that are adjacent to {x}. This is a contraction of unit speed, and the probability that the random walk at {x} lands at {y} at time {n} is

\displaystyle  \langle P^n \delta_x, \delta_y \rangle

where {\delta_x, \delta_y} are the Dirac deltas at {x,y}. Using (5), we can rewrite this as

\displaystyle  \langle (\frac{1}{2} (U + U^{-1}))^n \begin{pmatrix} 0 \\ \delta_x\end{pmatrix}, \begin{pmatrix} 0 \\ \delta_y\end{pmatrix} \rangle

where we are now using the energy form (4). We can write

\displaystyle  (\frac{1}{2} (U + U^{-1}))^n = {\bf E} U^{S_n}

where {S_n} is the simple random walk of length {n} on the integers, that is to say {S_n = \xi_1 + \dots + \xi_n} where {\xi_1,\dots,\xi_n = \pm 1} are independent uniform Bernoulli signs. Thus we wish to show that

\displaystyle  {\bf E} \langle U^{S_n} \begin{pmatrix} 0 \\ \delta_x\end{pmatrix}, \begin{pmatrix} 0 \\ \delta_y\end{pmatrix} \rangle \leq 2 \exp(-d(x,y)^2 / 2n ).

By finite speed of propagation, the inner product here vanishes if {|S_n| < d(x,y)}. For {|S_n| \geq d(x,y)} we can use Cauchy-Schwarz and the unitary nature of {U} to bound the inner product by {1}. Thus the left-hand side may be upper bounded by

\displaystyle  {\bf P}( |S_n| \geq d(x,y) )

and the claim now follows from the Chernoff inequality. \Box
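On the integers {{\bf Z}} (a {2}-regular graph) the transition probabilities are exact binomial expressions, so Theorem 1 can be tested directly. The Python sketch below does this for walks of length up to {80}, and also checks the scalar form of the identity {(\frac{1}{2}(U+U^{-1}))^n = {\bf E} U^{S_n}} on an eigenvalue pair {e^{\pm i\theta}} of {U}, namely {\cos^n \theta = {\bf E} \cos(S_n \theta)}. The range of {n}, the value of {\theta}, and the function names are illustrative assumptions.

```python
import math

def walk_probability(n, y):
    """P(S_n = y) for the simple random walk on Z started at 0."""
    if abs(y) > n or (n + y) % 2 != 0:
        return 0.0
    return math.comb(n, (n + y) // 2) / 2 ** n

def varopoulos_carne_bound(n, dist):
    """Right-hand side of Theorem 1: 2 exp(-d^2 / 2n)."""
    return 2 * math.exp(-dist ** 2 / (2 * n))

# Collect every (n, y) at which the bound fails; the list should be empty.
violations = [(n, y) for n in range(1, 81) for y in range(-n, n + 1)
              if walk_probability(n, y) > varopoulos_carne_bound(n, abs(y))]

# Scalar version of the averaging identity used in the proof.
theta, n = 0.9, 25  # hypothetical sample values
lhs = math.cos(theta) ** n
rhs = sum(walk_probability(n, y) * math.cos(y * theta)
          for y in range(-n, n + 1))
print(len(violations), abs(lhs - rhs))
```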

This inequality has many applications, particularly with regard to relating the entropy, mixing time, and concentration of random walks with volume growth of balls; see this text of Lyons and Peres for some examples.

For the sake of comparison, here is a continuous counterpart to the Varopoulos-Carne inequality:

Theorem 2 (Continuous Varopoulos-Carne inequality) Let {t > 0}, and let {f,g \in L^2({\bf R}^d)} be supported on compact sets {F,G} respectively. Then

\displaystyle  |\langle e^{t\Delta/2} f, g \rangle| \leq \sqrt{\frac{2t}{\pi d(F,G)^2}} \exp( - d(F,G)^2 / 2t ) \|f\|_{L^2} \|g\|_{L^2}

where {d(F,G)} is the Euclidean distance between {F} and {G}.

Proof: By Fourier inversion one has

\displaystyle  e^{-t\xi^2/2} = \frac{1}{\sqrt{2\pi t}} \int_{\bf R} e^{-s^2/2t} e^{is\xi}\ ds

\displaystyle  = \sqrt{\frac{2}{\pi t}} \int_0^\infty e^{-s^2/2t} \cos(s \xi )\ ds

for any real {\xi}, and thus

\displaystyle  \langle e^{t\Delta/2} f, g\rangle = \sqrt{\frac{2}{\pi t}} \int_0^\infty e^{-s^2/2t} \langle \cos(s \sqrt{-\Delta} ) f, g \rangle\ ds.

By finite speed of propagation, the inner product {\langle \cos(s \sqrt{-\Delta} ) f, g \rangle} vanishes when {s < d(F,G)}; otherwise, we can use Cauchy-Schwarz and the contractive nature of {\cos(s \sqrt{-\Delta} )} to bound this inner product by {\|f\|_{L^2} \|g\|_{L^2}}. Thus

\displaystyle  |\langle e^{t\Delta/2} f, g\rangle| \leq \sqrt{\frac{2}{\pi t}} \|f\|_{L^2} \|g\|_{L^2} \int_{d(F,G)}^\infty e^{-s^2/2t}\ ds.

Bounding {e^{-s^2/2t}} by {e^{-d(F,G)^2/2t} e^{-d(F,G) (s-d(F,G))/t}}, we obtain the claim. \Box
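The last step can be checked in closed form: the Gaussian tail integral equals {\sqrt{\pi t/2}\, \hbox{erfc}( d(F,G)/\sqrt{2t} )}, so the final inequality of the proof amounts to the scalar bound {\hbox{erfc}(d/\sqrt{2t}) \leq \sqrt{\frac{2t}{\pi d^2}} \exp(-d^2/2t)}. The Python sketch below evaluates both sides on a small grid of sample values of {t} and {d} (the grid and the function names are illustrative assumptions).

```python
import math

def tail_integral_factor(t, d):
    """sqrt(2/(pi t)) * integral from d to infinity of exp(-s^2/(2t)) ds."""
    return math.erfc(d / math.sqrt(2 * t))

def theorem_bound_factor(t, d):
    """sqrt(2t/(pi d^2)) * exp(-d^2/(2t)), the constant in Theorem 2."""
    return math.sqrt(2 * t / (math.pi * d ** 2)) * math.exp(-d ** 2 / (2 * t))

# Hypothetical sample grid of times t and distances d.
samples = [(t, d) for t in (0.1, 0.5, 1.0, 4.0, 25.0)
           for d in (0.1, 0.5, 1.0, 3.0, 10.0)]
ratios = [tail_integral_factor(t, d) / theorem_bound_factor(t, d)
          for t, d in samples]
print(min(ratios), max(ratios))
```

Each ratio should lie in {(0,1]}, approaching {1} when {d} is large compared with {\sqrt{t}}.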

Observe that the argument is quite general and can be applied for instance to other Riemannian manifolds than {{\bf R}^d}.

Many fluid equations are expected to exhibit turbulence in their solutions, in which a significant portion of their energy ends up in high frequency modes. A typical example arises from the three-dimensional periodic Navier-Stokes equations

\displaystyle  \partial_t u + u \cdot \nabla u = \nu \Delta u + \nabla p + f

\displaystyle  \nabla \cdot u = 0

where {u: {\bf R} \times {\bf R}^3/{\bf Z}^3 \rightarrow {\bf R}^3} is the velocity field, {f: {\bf R} \times {\bf R}^3/{\bf Z}^3 \rightarrow {\bf R}^3} is a forcing term, {p: {\bf R} \times {\bf R}^3/{\bf Z}^3 \rightarrow {\bf R}} is a pressure field, and {\nu > 0} is the viscosity. To study the dynamics of energy for this system, we first pass to the Fourier transform

\displaystyle  \hat u(t,k) := \int_{{\bf R}^3/{\bf Z}^3} u(t,x) e^{-2\pi i k \cdot x}\ dx

so that the system becomes

\displaystyle  \partial_t \hat u(t,k) + 2\pi \sum_{k = k_1 + k_2} (\hat u(t,k_1) \cdot ik_2) \hat u(t,k_2) =

\displaystyle  - 4\pi^2 \nu |k|^2 \hat u(t,k) + 2\pi ik \hat p(t,k) + \hat f(t,k) \ \ \ \ \ (1)

\displaystyle  k \cdot \hat u(t,k) = 0.

We may normalise {u} (and {f}) to have mean zero, so that {\hat u(t,0)=0}. Then we introduce the dyadic energies

\displaystyle  E_N(t) := \sum_{|k| \sim N} |\hat u(t,k)|^2

where {N \geq 1} ranges over the powers of two, and {|k| \sim N} is shorthand for {N \leq |k| < 2N}. Taking the inner product of (1) with {\hat u(t,k)}, we obtain the energy flow equation

\displaystyle  \partial_t E_N = \sum_{N_1,N_2} \Pi_{N,N_1,N_2} - D_N + F_N \ \ \ \ \ (2)

where {N_1,N_2} range over powers of two, {\Pi_{N,N_1,N_2}} is the energy flow rate

\displaystyle  \Pi_{N,N_1,N_2} := -2\pi \sum_{k=k_1+k_2: |k| \sim N, |k_1| \sim N_1, |k_2| \sim N_2}

\displaystyle  (\hat u(t,k_1) \cdot ik_2) (\hat u(t,k) \cdot \hat u(t,k_2)),

{D_N} is the energy dissipation rate

\displaystyle  D_N := 4\pi^2 \nu \sum_{|k| \sim N} |k|^2 |\hat u(t,k)|^2

and {F_N} is the energy injection rate

\displaystyle  F_N := \sum_{|k| \sim N} \hat u(t,k) \cdot \hat f(t,k).

The Navier-Stokes equations are notoriously difficult to solve in general. Despite this, Kolmogorov in 1941 was able to give a convincing heuristic argument for what the distribution of the dyadic energies {E_N} should become over long times, assuming that some sort of distributional steady state is reached. It is common to present this argument in the form of dimensional analysis, but one can also give a more “first principles” form of Kolmogorov’s argument, which I will do here. Heuristically, one can divide the frequency scales {N} into three regimes:

  • The injection regime in which the energy injection rate {F_N} dominates the right-hand side of (2);
  • The energy flow regime in which the flow rates {\Pi_{N,N_1,N_2}} dominate the right-hand side of (2); and
  • The dissipation regime in which the dissipation {D_N} dominates the right-hand side of (2).

If we assume a fairly steady and smooth forcing term {f}, then {\hat f} will be supported on the low frequency modes {k=O(1)}, and so we heuristically expect the injection regime to consist of the low scales {N=O(1)}. Conversely, if we take the viscosity {\nu} to be small, we expect the dissipation regime to only occur for very large frequencies {N}, with the energy flow regime occupying the intermediate frequencies.

We can heuristically predict the dividing line between the energy flow regime and the dissipation regime. Of all the flow rates {\Pi_{N,N_1,N_2}}, it turns out in practice that the terms in which {N_1,N_2 = N+O(1)} (i.e., interactions between comparable scales, rather than widely separated scales) will dominate the other flow rates, so we will focus just on these terms. It is convenient to return to physical space, decomposing the velocity field {u} into Littlewood-Paley components

\displaystyle  u_N(t,x) := \sum_{|k| \sim N} \hat u(t,k) e^{2\pi i k \cdot x}

of the velocity field {u(t,x)} at frequency {N}. By Plancherel’s theorem, this field will have an {L^2} norm of {E_N(t)^{1/2}}, and as a naive model of turbulence we expect this field to be spread out more or less uniformly on the torus, so we have the heuristic

\displaystyle  |u_N(t,x)| = O( E_N(t)^{1/2} ),

and a similar heuristic applied to {\nabla u_N} gives

\displaystyle  |\nabla u_N(t,x)| = O( N E_N(t)^{1/2} ).

(One can consider modifications of the Kolmogorov model in which {u_N} is concentrated on a lower-dimensional subset of the three-dimensional torus, leading to some changes in the numerology below, but we will not consider such variants here.) Since

\displaystyle  \Pi_{N,N_1,N_2} = - \int_{{\bf R}^3/{\bf Z}^3} u_N \cdot ( (u_{N_1} \cdot \nabla) u_{N_2} )\ dx

we thus arrive at the heuristic

\displaystyle  \Pi_{N,N_1,N_2} = O( N_2 E_N^{1/2} E_{N_1}^{1/2} E_{N_2}^{1/2} ).

Of course, there is the possibility that due to significant cancellation, the energy flow is significantly less than {O( N E_N(t)^{3/2} )}, but we will assume that cancellation effects are not that significant, so that we typically have

\displaystyle  \Pi_{N,N_1,N_2} \sim N_2 E_N^{1/2} E_{N_1}^{1/2} E_{N_2}^{1/2} \ \ \ \ \ (3)

or (assuming that {E_N} does not oscillate too much in {N}, and {N_1,N_2} are close to {N})

\displaystyle  \Pi_{N,N_1,N_2} \sim N E_N^{3/2}.

On the other hand, we clearly have

\displaystyle  D_N \sim \nu N^2 E_N.

We thus expect to be in the dissipation regime when

\displaystyle  N \gtrsim \nu^{-1} E_N^{1/2} \ \ \ \ \ (4)

and in the energy flow regime when

\displaystyle  1 \lesssim N \lesssim \nu^{-1} E_N^{1/2}. \ \ \ \ \ (5)

Now we study the energy flow regime further. We assume a “statistically scale-invariant” dynamics in this regime, in particular assuming a power law

\displaystyle  E_N \sim A N^{-\alpha} \ \ \ \ \ (6)

for some {A,\alpha > 0}. From (3), we then expect an average asymptotic of the form

\displaystyle  \Pi_{N,N_1,N_2} \approx A^{3/2} c_{N,N_1,N_2} (N N_1 N_2)^{1/3 - \alpha/2} \ \ \ \ \ (7)

for some structure constants {c_{N,N_1,N_2} \sim 1} that depend on the exact nature of the turbulence; here we have replaced the factor {N_2} by the comparable term {(N N_1 N_2)^{1/3}} to make things more symmetric. In order to attain a steady state in the energy flow regime, we thus need a cancellation in the structure constants:

\displaystyle  \sum_{N_1,N_2} c_{N,N_1,N_2} (N N_1 N_2)^{1/3 - \alpha/2} \approx 0. \ \ \ \ \ (8)

On the other hand, if one is assuming statistical scale invariance, we expect the structure constants to be scale-invariant (in the energy flow regime), in that

\displaystyle  c_{\lambda N, \lambda N_1, \lambda N_2} = c_{N,N_1,N_2} \ \ \ \ \ (9)

for dyadic {\lambda > 0}. Also, since the Euler equations conserve energy, the energy flows {\Pi_{N,N_1,N_2}} symmetrise to zero,

\displaystyle  \Pi_{N,N_1,N_2} + \Pi_{N,N_2,N_1} + \Pi_{N_1,N,N_2} + \Pi_{N_1,N_2,N} + \Pi_{N_2,N,N_1} + \Pi_{N_2,N_1,N} = 0,

which from (7) suggests a similar cancellation among the structure constants

\displaystyle  c_{N,N_1,N_2} + c_{N,N_2,N_1} + c_{N_1,N,N_2} + c_{N_1,N_2,N} + c_{N_2,N,N_1} + c_{N_2,N_1,N} \approx 0.

Combining this with the scale-invariance (9), we see that for fixed {N}, we may organise the structure constants {c_{N,N_1,N_2}} for dyadic {N_1,N_2} into sextuples which sum to zero (including some degenerate tuples of order less than six). This will automatically guarantee the cancellation (8) required for a steady state energy distribution, provided that

\displaystyle  \frac{1}{3} - \frac{\alpha}{2} = 0

or in other words

\displaystyle  \alpha = \frac{2}{3};

for any other value of {\alpha}, there is no particular reason to expect this cancellation (8) to hold. Thus we are led to the heuristic conclusion that the most stable power law distribution for the energies {E_N} is the {2/3} law

\displaystyle  E_N \sim A N^{-2/3} \ \ \ \ \ (10)

or in terms of shell energies, we have the famous Kolmogorov 5/3 law

\displaystyle  \sum_{|k| = k_0 + O(1)} |\hat u(t,k)|^2 \sim A k_0^{-5/3}.
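The passage between the dyadic {2/3} law (10) and the shell {5/3} law is just a matter of summing over the roughly {N} unit-width shells in {N \leq k_0 < 2N}. As a rough consistency check, the Python sketch below assumes the shell energies are exactly {A k_0^{-5/3}} for integer {k_0} (an idealisation; the text only asserts this up to constants), sums them over a dyadic block, and compares with {N^{-2/3}}. The function name and the range of {N} are illustrative assumptions.

```python
def dyadic_energy_from_shells(N, A=1.0):
    """Sum the idealised shell law A * k0^(-5/3) over unit shells N <= k0 < 2N."""
    return sum(A * k0 ** (-5.0 / 3.0) for k0 in range(N, 2 * N))

# Integral of k^(-5/3) over [1, 2], the expected limiting constant.
limit = 1.5 * (1 - 2 ** (-2.0 / 3.0))

# Ratio of the dyadic sum to N^(-2/3) for N = 2^6, ..., 2^12.
ratios = {N: dyadic_energy_from_shells(N) / N ** (-2.0 / 3.0)
          for N in (2 ** j for j in range(6, 13))}
for N, r in ratios.items():
    print(N, r, limit)
```

The ratios stay close to a fixed constant as {N} varies, which is the content of {E_N \sim A N^{-2/3}}.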

Given that frequency interactions tend to cascade from low frequencies to high (if only because there are so many more high frequencies than low ones), the above analysis predicts a stabilising effect around this power law: scales at which a law (6) holds for some {\alpha > 2/3} are likely to lose energy in the near-term, while scales at which a law (6) holds for some {\alpha< 2/3} are conversely expected to gain energy, thus nudging the exponent of the power law towards {2/3}.

We can solve for {A} in terms of energy dissipation as follows. If we let {N_*} be the frequency scale demarcating the transition from the energy flow regime (5) to the dissipation regime (4), we have

\displaystyle  N_* \sim \nu^{-1} E_{N_*}^{1/2}

and hence by (10)

\displaystyle  N_* \sim \nu^{-1} A^{1/2} N_*^{-1/3}.

On the other hand, if we let {\epsilon := D_{N_*}} be the energy dissipation at this scale {N_*} (which we expect to be the dominant scale of energy dissipation), we have

\displaystyle  \epsilon \sim \nu N_*^2 E_{N_*} \sim \nu N_*^2 A N_*^{-2/3}.

Some simple algebra then lets us solve for {A} and {N_*} as

\displaystyle  N_* \sim (\frac{\epsilon}{\nu^3})^{1/4}

and

\displaystyle  A \sim \epsilon^{2/3}.

Thus, we have the Kolmogorov prediction

\displaystyle  \sum_{|k| = k_0 + O(1)} |\hat u(t,k)|^2 \sim \epsilon^{2/3} k_0^{-5/3}

for

\displaystyle  1 \lesssim k_0 \lesssim (\frac{\epsilon}{\nu^3})^{1/4}

with energy dissipation occurring at the high end {k_0 \sim (\frac{\epsilon}{\nu^3})^{1/4}} of this scale, which is counterbalanced by the energy injection at the low end {k_0 \sim 1} of the scale.
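The "simple algebra" can be double-checked numerically. Setting all implied constants to {1} (an assumption made purely for illustration), the Python sketch below plugs {N_* = (\epsilon/\nu^3)^{1/4}} and {A = \epsilon^{2/3}} into the two relations they are meant to solve: the boundary {N_* = \nu^{-1} E_{N_*}^{1/2}} between the regimes (4) and (5), and the dissipation rate {\epsilon = \nu N_*^2 E_{N_*}}, with {E_N = A N^{-2/3}} as in (10). The sample values of {\epsilon, \nu} and the function names are hypothetical.

```python
def kolmogorov_scales(eps, nu):
    """Return (N_star, A) with all implied constants set to 1."""
    N_star = (eps / nu ** 3) ** 0.25
    A = eps ** (2.0 / 3.0)
    return N_star, A

def dyadic_energy(A, N):
    """The 2/3 law (10): E_N = A * N^(-2/3)."""
    return A * N ** (-2.0 / 3.0)

eps, nu = 0.7, 1e-4  # hypothetical dissipation rate and viscosity
N_star, A = kolmogorov_scales(eps, nu)
E_star = dyadic_energy(A, N_star)

# Relative residuals of the two defining relations.
transition_residual = abs(N_star - E_star ** 0.5 / nu) / N_star
dissipation_residual = abs(eps - nu * N_star ** 2 * E_star) / eps
print(N_star, A, transition_residual, dissipation_residual)
```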

As in the previous post, all computations here are at the formal level only.

In the previous blog post, the Euler equations for inviscid incompressible fluid flow were interpreted in a Lagrangian fashion, and then Noether’s theorem invoked to derive the known conservation laws for these equations. In a bit more detail: starting with Lagrangian space {{\cal L} = ({\bf R}^n, \hbox{vol})} and Eulerian space {{\cal E} = ({\bf R}^n, \eta, \hbox{vol})}, we let {M} be the space of volume-preserving, orientation-preserving maps {\Phi: {\cal L} \rightarrow {\cal E}} from Lagrangian space to Eulerian space. Given a curve {\Phi: {\bf R} \rightarrow M}, we can define the Lagrangian velocity field {\dot \Phi: {\bf R} \times {\cal L} \rightarrow T{\cal E}} as the time derivative of {\Phi}, and the Eulerian velocity field {u := \dot \Phi \circ \Phi^{-1}: {\bf R} \times {\cal E} \rightarrow T{\cal E}}. The volume-preserving nature of {\Phi} ensures that {u} is a divergence-free vector field:

\displaystyle  \nabla \cdot u = 0. \ \ \ \ \ (1)

If we formally define the functional

\displaystyle  J[\Phi] := \frac{1}{2} \int_{\bf R} \int_{{\cal E}} |u(t,x)|^2\ dx dt = \frac{1}{2} \int_{\bf R} \int_{{\cal L}} |\dot \Phi(t,x)|^2\ dx dt

then one can show that the critical points of this functional (with appropriate boundary conditions) obey the Euler equations

\displaystyle  [\partial_t + u \cdot \nabla] u = - \nabla p

\displaystyle  \nabla \cdot u = 0

for some pressure field {p: {\bf R} \times {\cal E} \rightarrow {\bf R}}. As discussed in the previous post, the time translation symmetry of this functional yields conservation of the Hamiltonian

\displaystyle  \frac{1}{2} \int_{{\cal E}} |u(t,x)|^2\ dx = \frac{1}{2} \int_{{\cal L}} |\dot \Phi(t,x)|^2\ dx;

the rigid motion symmetries of Eulerian space give conservation of the total momentum

\displaystyle  \int_{{\cal E}} u(t,x)\ dx

and total angular momentum

\displaystyle  \int_{{\cal E}} x \wedge u(t,x)\ dx;

and the diffeomorphism symmetries of Lagrangian space give conservation of circulation

\displaystyle  \int_{\Phi(\gamma)} u^*

for any closed loop {\gamma} in {{\cal L}}, or equivalently pointwise conservation of the Lagrangian vorticity {\Phi^* \omega = \Phi^* du^*}, where {u^*} is the {1}-form associated with the vector field {u} using the Euclidean metric {\eta} on {{\cal E}}, with {\Phi^*} denoting pullback by {\Phi}.

It turns out that one can generalise the above calculations. Given any self-adjoint operator {A} on divergence-free vector fields {u: {\cal E} \rightarrow {\bf R}^n}, we can define the functional

\displaystyle  J_A[\Phi] := \frac{1}{2} \int_{\bf R} \int_{{\cal E}} u(t,x) \cdot A u(t,x)\ dx dt;

as we shall see below the fold, critical points of this functional (with appropriate boundary conditions) obey the generalised Euler equations

\displaystyle  [\partial_t + u \cdot \nabla] Au + (\nabla u) \cdot Au= - \nabla \tilde p \ \ \ \ \ (2)

\displaystyle  \nabla \cdot u = 0

for some pressure field {\tilde p: {\bf R} \times {\cal E} \rightarrow {\bf R}}, where {(\nabla u) \cdot Au} in coordinates is {\partial_i u_j Au_j} with the usual summation conventions. (When {A=1}, {(\nabla u) \cdot Au = \nabla(\frac{1}{2} |u|^2)}, and this term can be absorbed into the pressure {\tilde p}, and we recover the usual Euler equations.) Time translation symmetry then gives conservation of the Hamiltonian

\displaystyle  \frac{1}{2} \int_{{\cal E}} u(t,x) \cdot A u(t,x)\ dx.

If the operator {A} commutes with rigid motions on {{\cal E}}, then we have conservation of total momentum

\displaystyle  \int_{{\cal E}} Au(t,x)\ dx

and total angular momentum

\displaystyle  \int_{{\cal E}} x \wedge Au(t,x)\ dx,

and the diffeomorphism symmetries of Lagrangian space give conservation of circulation

\displaystyle  \int_{\Phi(\gamma)} (Au)^*

or pointwise conservation of the Lagrangian vorticity {\Phi^* \theta := \Phi^* d(Au)^*}. These applications of Noether’s theorem proceed exactly as in the previous post; we leave the details to the interested reader.

One particular special case of interest arises in two dimensions {n=2}, when {A} is the inverse derivative {A = |\nabla|^{-1} = (-\Delta)^{-1/2}}. The vorticity {\theta = d(Au)^*} is a {2}-form, which in the two-dimensional setting may be identified with a scalar. In coordinates, if we write {u = (u_1,u_2)}, then

\displaystyle  \theta = \partial_{x_1} |\nabla|^{-1} u_2 - \partial_{x_2} |\nabla|^{-1} u_1.

Since {u} is also divergence-free, we may therefore write

\displaystyle  u = (- \partial_{x_2} \psi, \partial_{x_1} \psi )

where the stream function {\psi} is given by the formula

\displaystyle  \psi = |\nabla|^{-1} \theta.

If we take the curl of the generalised Euler equation (2), we obtain (after some computation) the surface quasi-geostrophic equation

\displaystyle  [\partial_t + u \cdot \nabla] \theta = 0 \ \ \ \ \ (3)

\displaystyle  u = (-\partial_{x_2} |\nabla|^{-1} \theta, \partial_{x_1} |\nabla|^{-1} \theta).

This equation has strong analogies with the three-dimensional incompressible Euler equations, and can be viewed as a simplified model for that system; see this paper of Constantin, Majda, and Tabak for details.

Now we can specialise the general conservation laws derived previously to this setting. The conserved Hamiltonian is

\displaystyle  \frac{1}{2} \int_{{\bf R}^2} u\cdot |\nabla|^{-1} u\ dx = \frac{1}{2} \int_{{\bf R}^2} \theta \psi\ dx = \frac{1}{2} \int_{{\bf R}^2} \theta |\nabla|^{-1} \theta\ dx

(a law previously observed for this equation in the abovementioned paper of Constantin, Majda, and Tabak). As {A} commutes with rigid motions, we also have (formally, at least) conservation of momentum

\displaystyle  \int_{{\bf R}^2} Au\ dx

(which up to trivial transformations is also expressible in impulse form as {\int_{{\bf R}^2} \theta x\ dx}, after integration by parts), and conservation of angular momentum

\displaystyle  \int_{{\bf R}^2} x \wedge Au\ dx

(which up to trivial transformations is {\int_{{\bf R}^2} \theta |x|^2\ dx}). Finally, diffeomorphism invariance gives pointwise conservation of Lagrangian vorticity {\Phi^* \theta}, thus {\theta} is transported by the flow (which is also evident from (3)). In particular, all integrals of the form {\int F(\theta)\ dx} for a fixed function {F} are conserved by the flow.


Throughout this post, we will work only at the formal level of analysis, ignoring issues of convergence of integrals, justifying differentiation under the integral sign, and so forth. (Rigorous justification of the conservation laws and other identities arising from the formal manipulations below can usually be established in an a posteriori fashion once the identities are in hand, without the need to rigorously justify the manipulations used to come up with these identities).

It is a remarkable fact in the theory of differential equations that many of the ordinary and partial differential equations that are of interest (particularly in geometric PDE, or PDE arising from mathematical physics) admit a variational formulation; thus, a collection {\Phi: \Omega \rightarrow M} of one or more fields on a domain {\Omega} taking values in a space {M} will solve the differential equation of interest if and only if {\Phi} is a critical point of the functional

\displaystyle  J[\Phi] := \int_\Omega L( x, \Phi(x), D\Phi(x) )\ dx \ \ \ \ \ (1)

involving the fields {\Phi} and their first derivatives {D\Phi}, where the Lagrangian {L: \Sigma \rightarrow {\bf R}} is a function on the vector bundle {\Sigma} over {\Omega \times M} consisting of triples {(x, q, \dot q)} with {x \in \Omega}, {q \in M}, and {\dot q: T_x \Omega \rightarrow T_q M} a linear transformation; we also usually keep the boundary data of {\Phi} fixed in case {\Omega} has a non-trivial boundary, although we will ignore these issues here. (We also ignore the possibility of having additional constraints imposed on {\Phi} and {D\Phi}, which require the machinery of Lagrange multipliers to deal with, but which will only serve as a distraction for the current discussion.) It is common to use local coordinates to parameterise {\Omega} as {{\bf R}^d} and {M} as {{\bf R}^n}, in which case {L} can be viewed locally as a function on {{\bf R}^d \times {\bf R}^n \times {\bf R}^{dn}}.

Example 1 (Geodesic flow) Take {\Omega = [0,1]} and {M = (M,g)} to be a Riemannian manifold, which we will write locally in coordinates as {{\bf R}^n} with metric {g_{ij}(q)} for {i,j=1,\dots,n}. A geodesic {\gamma: [0,1] \rightarrow M} is then a critical point (keeping {\gamma(0),\gamma(1)} fixed) of the energy functional

\displaystyle  J[\gamma] := \frac{1}{2} \int_0^1 g_{\gamma(t)}( D\gamma(t), D\gamma(t) )\ dt

or in coordinates (ignoring coordinate patch issues, and using the usual summation conventions)

\displaystyle  J[\gamma] = \frac{1}{2} \int_0^1 g_{ij}(\gamma(t)) \dot \gamma^i(t) \dot \gamma^j(t)\ dt.

As discussed in this previous post, both the Euler equations for rigid body motion, and the Euler equations for incompressible inviscid flow, can be interpreted as geodesic flow (though in the latter case, one has to work really formally, as the manifold {M} is now infinite dimensional).

More generally, if {\Omega = (\Omega,h)} is itself a Riemannian manifold, which we write locally in coordinates as {{\bf R}^d} with metric {h_{ab}(x)} for {a,b=1,\dots,d}, then a harmonic map {\Phi: \Omega \rightarrow M} is a critical point of the energy functional

\displaystyle  J[\Phi] := \frac{1}{2} \int_\Omega h(x) \otimes g_{\Phi(x)}( D\Phi(x), D\Phi(x) )\ dh(x)

or in coordinates (again ignoring coordinate patch issues)

\displaystyle  J[\Phi] = \frac{1}{2} \int_{{\bf R}^d} h^{ab}(x) g_{ij}(\Phi(x)) (\partial_a \Phi^i(x)) (\partial_b \Phi^j(x))\ \sqrt{\det(h(x))}\ dx.

If we replace the Riemannian manifold {\Omega} by a Lorentzian manifold, such as Minkowski space {{\bf R}^{1+3}}, then the notion of a harmonic map is replaced by that of a wave map, which generalises the scalar wave equation (which corresponds to the case {M={\bf R}}).

Example 2 ({N}-particle interactions) Take {\Omega = {\bf R}} and {M = {\bf R}^3 \otimes {\bf R}^N}; then a function {\Phi: \Omega \rightarrow M} can be interpreted as a collection of {N} trajectories {q_1,\dots,q_N: {\bf R} \rightarrow {\bf R}^3} in space, which we give a physical interpretation as the trajectories of {N} particles. If we assign each particle a positive mass {m_1,\dots,m_N > 0}, and also introduce a potential energy function {V: M \rightarrow {\bf R}}, then it turns out that Newton’s laws of motion {F=ma} in this context (with the force {F_i} on the {i^{th}} particle being given by the conservative force {-\nabla_{q_i} V}) are equivalent to the trajectories {q_1,\dots,q_N} being a critical point of the action functional

\displaystyle  J[\Phi] := \int_{\bf R} \sum_{i=1}^N \frac{1}{2} m_i |\dot q_i(t)|^2 - V( q_1(t),\dots,q_N(t) )\ dt.

Formally, if {\Phi = \Phi_0} is a critical point of a functional {J[\Phi]}, this means that

\displaystyle  \frac{d}{ds} J[ \Phi[s] ]|_{s=0} = 0

whenever {s \mapsto \Phi[s]} is a (smooth) deformation with {\Phi[0]=\Phi_0} (and with {\Phi[s]} respecting whatever boundary conditions are appropriate). Interchanging the derivative and integral, we (formally, at least) arrive at

\displaystyle  \int_\Omega \frac{d}{ds} L( x, \Phi[s](x), D\Phi[s](x) )|_{s=0}\ dx = 0. \ \ \ \ \ (2)

Write {\delta \Phi := \frac{d}{ds} \Phi[s]|_{s=0}} for the infinitesimal deformation of {\Phi_0}. By the chain rule, {\frac{d}{ds} L( x, \Phi[s](x), D\Phi[s](x) )|_{s=0}} can be expressed in terms of {x, \Phi_0(x), \delta \Phi(x), D\Phi_0(x), D \delta \Phi(x)}. In coordinates, we have

\displaystyle  \frac{d}{ds} L( x, \Phi[s](x), D\Phi[s](x) )|_{s=0} = \delta \Phi^i(x) L_{q^i}(x,\Phi_0(x), D\Phi_0(x)) \ \ \ \ \ (3)

\displaystyle  + \partial_{x^a} \delta \Phi^i(x) L_{\partial_{x^a} q^i} (x,\Phi_0(x), D\Phi_0(x)),

where we parameterise {\Sigma} by {x, (q^i)_{i=1,\dots,n}, (\partial_{x^a} q^i)_{a=1,\dots,d; i=1,\dots,n}}, and we use subscripts on {L} to denote partial derivatives in the various coefficients. (One can of course work in a coordinate-free manner here if one really wants to, but the notation becomes a little cumbersome due to the need to carefully split up the tangent space of {\Sigma}, and we will not do so here.) Thus we can view (2) as an integral identity that asserts the vanishing of a certain integral, whose integrand involves {x, \Phi_0(x), \delta \Phi(x), D\Phi_0(x), D \delta \Phi(x)}, where {\delta \Phi} vanishes at the boundary but is otherwise unconstrained.

A general rule of thumb in PDE and calculus of variations is that whenever one has an integral identity of the form {\int_\Omega F(x)\ dx = 0} for some class of functions {F} that vanish on the boundary, then there must be an associated differential identity {F = \hbox{div} X} that justifies this integral identity through Stokes’ theorem. This rule of thumb helps explain why integration by parts is used so frequently in PDE to justify integral identities. The rule of thumb can fail when one is dealing with “global” or “cohomologically non-trivial” integral identities of a topological nature, such as the Gauss-Bonnet or Kazhdan-Warner identities, but is quite reliable for “local” or “cohomologically trivial” identities, such as those arising from calculus of variations.

In any case, if we apply this rule to (2), we expect that the integrand {\frac{d}{ds} L( x, \Phi[s](x), D\Phi[s](x) )|_{s=0}} should be expressible as a spatial divergence. This is indeed the case:

Proposition 1 (Formal) Let {\Phi = \Phi_0} be a critical point of the functional {J[\Phi]} defined in (1). Then for any deformation {s \mapsto \Phi[s]} with {\Phi[0] = \Phi_0}, we have

\displaystyle  \frac{d}{ds} L( x, \Phi[s](x), D\Phi[s](x) )|_{s=0} = \hbox{div} X \ \ \ \ \ (4)

where {X} is the vector field that is expressible in coordinates as

\displaystyle  X^a := \delta \Phi^i(x) L_{\partial_{x^a} q^i}(x,\Phi_0(x), D\Phi_0(x)). \ \ \ \ \ (5)

Proof: Comparing (4) with (3), we see that the claim is equivalent to the Euler-Lagrange equation

\displaystyle  L_{q^i}(x,\Phi_0(x), D\Phi_0(x)) - \partial_{x^a} L_{\partial_{x^a} q^i}(x,\Phi_0(x), D\Phi_0(x)) = 0. \ \ \ \ \ (6)

The same computation, together with an integration by parts, shows that (2) may be rewritten as

\displaystyle  \int_\Omega ( L_{q^i}(x,\Phi_0(x), D\Phi_0(x)) - \partial_{x^a} L_{\partial_{x^a} q^i}(x,\Phi_0(x), D\Phi_0(x)) ) \delta \Phi^i(x)\ dx = 0.

Since {\delta \Phi^i(x)} is unconstrained on the interior of {\Omega}, the claim (6) follows (at a formal level, at least). \Box

Many variational problems also enjoy one-parameter continuous symmetries: given any field {\Phi_0} (not necessarily a critical point), one can place that field in a one-parameter family {s \mapsto \Phi[s]} with {\Phi[0] = \Phi_0}, such that

\displaystyle  J[ \Phi[s] ] = J[ \Phi[0] ]

for all {s}; in particular,

\displaystyle  \frac{d}{ds} J[ \Phi[s] ]|_{s=0} = 0,

which can be written as (2) as before. Applying the previous rule of thumb, we thus expect another divergence identity

\displaystyle  \frac{d}{ds} L( x, \Phi[s](x), D\Phi[s](x) )|_{s=0} = \hbox{div} Y \ \ \ \ \ (7)

whenever {s \mapsto \Phi[s]} arises from a continuous one-parameter symmetry. This expectation is indeed the case in many examples. For instance, if the spatial domain {\Omega} is the Euclidean space {{\bf R}^d}, and the Lagrangian (when expressed in coordinates) has no direct dependence on the spatial variable {x}, thus

\displaystyle  L( x, \Phi(x), D\Phi(x) ) = L( \Phi(x), D\Phi(x) ), \ \ \ \ \ (8)

then we obtain {d} translation symmetries

\displaystyle  \Phi[s](x) := \Phi(x - s e^a )

for {a=1,\dots,d}, where {e^1,\dots,e^d} is the standard basis for {{\bf R}^d}. For a fixed {a}, the left-hand side of (7) then becomes

\displaystyle  \frac{d}{ds} L( \Phi(x-se^a), D\Phi(x-se^a) )|_{s=0} = -\partial_{x^a} [ L( \Phi(x), D\Phi(x) ) ]

\displaystyle  = \hbox{div} Y

where {Y(x) = - L(\Phi(x), D\Phi(x)) e^a}. Another common type of symmetry is a pointwise symmetry, in which

\displaystyle  L( x, \Phi[s](x), D\Phi[s](x) ) = L( x, \Phi[0](x), D\Phi[0](x) ) \ \ \ \ \ (9)

for all {x}, in which case (7) clearly holds with {Y=0}.

If we subtract (4) from (7), we obtain the celebrated theorem of Noether linking symmetries with conservation laws:

Theorem 2 (Noether’s theorem) Suppose that {\Phi_0} is a critical point of the functional (1), and let {\Phi[s]} be a one-parameter continuous symmetry with {\Phi[0] = \Phi_0}. Let {X} be the vector field in (5), and let {Y} be the vector field in (7). Then we have the pointwise conservation law

\displaystyle  \hbox{div}(X-Y) = 0.

In particular, for one-dimensional variational problems, in which {\Omega \subset {\bf R}}, we have the conservation law {(X-Y)(t) = (X-Y)(0)} for all {t \in \Omega} (assuming of course that {\Omega} is connected and contains {0}).

Noether’s theorem gives a systematic way to locate conservation laws for solutions to variational problems. For instance, if {\Omega \subset {\bf R}} and the Lagrangian has no explicit time dependence, thus

\displaystyle  L(t, \Phi(t), \dot \Phi(t)) = L(\Phi(t), \dot \Phi(t)),

then by using the time translation symmetry {\Phi[s](t) := \Phi(t-s)}, we have

\displaystyle  Y(t) = - L( \Phi(t), \dot\Phi(t) )

as discussed previously, whereas we have {\delta \Phi(t) = - \dot \Phi(t)}, and hence by (5)

\displaystyle  X(t) := - \dot \Phi^i(t) L_{\dot q^i}(\Phi(t), \dot \Phi(t)),

and so Noether’s theorem gives conservation of the Hamiltonian

\displaystyle  H(t) := \dot \Phi^i(t) L_{\dot q^i}(\Phi(t), \dot \Phi(t))- L(\Phi(t), \dot \Phi(t)). \ \ \ \ \ (10)

For instance, for geodesic flow, the Hamiltonian works out to be

\displaystyle  H(t) = \frac{1}{2} g_{ij}(\gamma(t)) \dot \gamma^i(t) \dot \gamma^j(t),

so we see that the speed of the geodesic is conserved over time.
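As a quick numerical sanity check of (10), here is a minimal Python sketch for a one-dimensional example that is not taken from the discussion above: the unit-mass pendulum Lagrangian {L(q,\dot q) = \frac{1}{2} \dot q^2 + \cos q}, whose Euler-Lagrange equation is {\ddot q = -\sin q}. The choice of Lagrangian, the classical RK4 integrator, the step size, and the initial data are all illustrative assumptions; the point is only that the quantity (10) stays constant along the flow up to integration error.

```python
import math

def lagrangian(q, qdot):
    # Illustrative choice (an assumption, not from the text): unit-mass pendulum.
    return 0.5 * qdot * qdot + math.cos(q)

def hamiltonian(q, qdot):
    # Formula (10): H = qdot * L_{qdot} - L, where L_{qdot} = qdot for this Lagrangian.
    return qdot * qdot - lagrangian(q, qdot)

def rhs(state):
    # Euler-Lagrange equation (6) for this Lagrangian, as a first-order system.
    q, qdot = state
    return (qdot, -math.sin(q))

def rk4_step(state, dt):
    # One step of the classical fourth-order Runge-Kutta scheme.
    def shift(s, k, c):
        return (s[0] + c * k[0], s[1] + c * k[1])
    k1 = rhs(state)
    k2 = rhs(shift(state, k1, dt / 2))
    k3 = rhs(shift(state, k2, dt / 2))
    k4 = rhs(shift(state, k3, dt))
    return (state[0] + dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6,
            state[1] + dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6)

def hamiltonian_drift(q0=1.0, qdot0=0.5, dt=1e-3, steps=5000):
    # Largest deviation of H from its initial value along the computed trajectory.
    state = (q0, qdot0)
    h0 = hamiltonian(*state)
    worst = 0.0
    for _ in range(steps):
        state = rk4_step(state, dt)
        worst = max(worst, abs(hamiltonian(*state) - h0))
    return worst

print("max drift of H:", hamiltonian_drift())
```

The drift reported is purely an artefact of the time discretisation; the exact flow conserves {H} exactly.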

For pointwise symmetries (9), {Y} vanishes, and so Noether’s theorem simplifies to {\hbox{div} X = 0}; in the one-dimensional case {\Omega \subset {\bf R}}, we thus see from (5) that the quantity

\displaystyle  \delta \Phi^i(t) L_{\dot q^i}(t,\Phi_0(t), \dot \Phi_0(t)) \ \ \ \ \ (11)

is conserved in time. For instance, for the {N}-particle system in Example 2, if we have the translation invariance

\displaystyle  V( q_1 + h, \dots, q_N + h ) = V( q_1, \dots, q_N )

for all {q_1,\dots,q_N,h \in {\bf R}^3}, then we have the pointwise translation symmetry

\displaystyle  q_i[s](t) := q_i(t) + s e^j

for all {i=1,\dots,N}, {s \in{\bf R}} and some {j=1,\dots,3}, in which case {\delta q_i(t) = e^j}, and the conserved quantity (11) becomes

\displaystyle  \sum_{i=1}^N m_i \dot q_i^j(t);

as {j=1,\dots,3} was arbitrary, this establishes conservation of the total momentum

\displaystyle  \sum_{i=1}^N m_i \dot q_i(t).

Similarly, if we have the rotation invariance

\displaystyle  V( R q_1, \dots, Rq_N ) = V( q_1, \dots, q_N )

for any {q_1,\dots,q_N \in {\bf R}^3} and {R \in SO(3)}, then we have the pointwise rotation symmetry

\displaystyle  q_i[s](t) := \exp( s A ) q_i(t)

for any skew-symmetric real {3 \times 3} matrix {A}, in which case {\delta q_i(t) = A q_i(t)}, and the conserved quantity (11) becomes

\displaystyle  \sum_{i=1}^N m_i \langle A q_i(t), \dot q_i(t) \rangle;

since {A} is an arbitrary skew-symmetric matrix, this establishes conservation of the total angular momentum

\displaystyle  \sum_{i=1}^N m_i q_i(t) \wedge \dot q_i(t).
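The momentum and angular momentum conservation laws just derived can be illustrated with a small Python sketch. Everything specific in it is an assumption made for illustration: three particles, the pairwise spring potential {V = \frac{k}{2} \sum_{i<j} |q_i - q_j|^2} (which is invariant under both translations and rotations), and a velocity Verlet time-stepper, which happens to preserve both quantities up to rounding error for forces of this type.

```python
def cross(a, b):
    # Cross product in R^3, used to represent q ^ qdot as a vector.
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def forces(q, k=1.0):
    # F_i = -grad_{q_i} V for the assumed potential V = (k/2) sum_{i<j} |q_i - q_j|^2.
    n = len(q)
    return [[-k * sum(q[i][c] - q[j][c] for j in range(n)) for c in range(3)]
            for i in range(n)]

def momentum(m, v):
    # Total momentum: sum_i m_i qdot_i.
    return [sum(m[i] * v[i][c] for i in range(len(m))) for c in range(3)]

def angular_momentum(m, q, v):
    # Total angular momentum: sum_i m_i q_i x qdot_i.
    total = [0.0, 0.0, 0.0]
    for mi, qi, vi in zip(m, q, v):
        w = cross(qi, vi)
        total = [total[c] + mi * w[c] for c in range(3)]
    return total

def simulate(dt=1e-3, steps=2000):
    # Masses and initial data are arbitrary illustrative values.
    m = [1.0, 2.0, 3.0]
    q = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.5], [-0.5, 0.2, 1.0]]
    v = [[0.0, 0.3, 0.0], [0.1, 0.0, -0.2], [0.0, -0.1, 0.4]]
    p0, l0 = momentum(m, v), angular_momentum(m, q, v)
    f = forces(q)
    for _ in range(steps):
        # Velocity Verlet: half kick and drift, recompute forces, half kick.
        for i in range(3):
            for c in range(3):
                v[i][c] += 0.5 * dt * f[i][c] / m[i]
                q[i][c] += dt * v[i][c]
        f = forces(q)
        for i in range(3):
            for c in range(3):
                v[i][c] += 0.5 * dt * f[i][c] / m[i]
    return p0, momentum(m, v), l0, angular_momentum(m, q, v)

p_start, p_end, l_start, l_end = simulate()
print("momentum:", p_start, "->", p_end)
print("angular momentum:", l_start, "->", l_end)
```

Replacing the potential by one that is not translation or rotation invariant (for instance, adding a term depending on {q_1} alone) should visibly break the corresponding conservation law in this experiment.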

Below the fold, I will describe how Noether’s theorem can be used to locate all of the conserved quantities for the Euler equations of inviscid fluid flow, discussed in this previous post, by interpreting that flow as geodesic flow in an infinite dimensional manifold.


The Euler equations for incompressible inviscid fluids may be written as

\displaystyle \partial_t u + (u \cdot \nabla) u = -\nabla p

\displaystyle \nabla \cdot u = 0

where {u: [0,T] \times {\bf R}^n \rightarrow {\bf R}^n} is the velocity field, and {p: [0,T] \times {\bf R}^n \rightarrow {\bf R}} is the pressure field. To avoid technicalities we will assume that both fields are smooth, and that {u} is bounded. We will take the dimension {n} to be at least two, with the three-dimensional case {n=3} being of course especially interesting.

The Euler equations are the inviscid limit of the Navier-Stokes equations; as discussed in my previous post, one potential route to establishing finite time blowup for the latter equations when {n=3} is to be able to construct “computers” solving the Euler equations, which generate smaller replicas of themselves in a noise-tolerant manner (as the viscosity term in the Navier-Stokes equation is to be viewed as perturbative noise).

Perhaps the most prominent obstacles to this route are the conservation laws for the Euler equations, which limit the types of final states that a putative computer could reach from a given initial state. Most famously, we have the conservation of energy

\displaystyle \int_{{\bf R}^n} |u|^2\ dx \ \ \ \ \ (1)

 

(assuming sufficient decay of the velocity field at infinity); thus for instance it would not be possible for a computer to generate a replica of itself which had greater total energy than the initial computer. This by itself is not a fatal obstruction (in this paper of mine, I constructed such a “computer” for an averaged Euler equation that still obeyed energy conservation). However, there are other conservation laws also, for instance in three dimensions one also has conservation of helicity

\displaystyle \int_{{\bf R}^3} u \cdot (\nabla \times u)\ dx \ \ \ \ \ (2)

 

and (formally, at least) one has conservation of momentum

\displaystyle \int_{{\bf R}^3} u\ dx

and angular momentum

\displaystyle \int_{{\bf R}^3} x \times u\ dx

(although, as we shall discuss below, due to the slow decay of {u} at infinity, these integrals have to either be interpreted in a principal value sense, or else replaced with their vorticity-based formulations, namely impulse and moment of impulse). Total vorticity

\displaystyle \int_{{\bf R}^3} \nabla \times u\ dx

is also conserved, although it turns out in three dimensions that this quantity vanishes when one assumes sufficient decay at infinity. Then there are the pointwise conservation laws: the vorticity and the volume form are both transported by the fluid flow, while the velocity field (when viewed as a covector) is transported up to a gradient; among other things, this gives the transport of vortex lines as well as Kelvin’s circulation theorem, and can also be used to deduce the helicity conservation law mentioned above. In my opinion, none of these laws actually prohibits a self-replicating computer from existing within the laws of ideal fluid flow, but they do significantly complicate the task of actually designing such a computer, or of the basic “gates” that such a computer would consist of.

Below the fold I would like to record and derive all the conservation laws mentioned above, which to my knowledge essentially form the complete set of known conserved quantities for the Euler equations. The material here (although not the notation) is drawn from this text of Majda and Bertozzi.


I’ve just uploaded to the arXiv the paper “Finite time blowup for an averaged three-dimensional Navier-Stokes equation“, submitted to J. Amer. Math. Soc.. The main purpose of this paper is to formalise the “supercriticality barrier” for the global regularity problem for the Navier-Stokes equation, which roughly speaking asserts that it is not possible to establish global regularity by any “abstract” approach which only uses upper bound function space estimates on the nonlinear part of the equation, combined with the energy identity. This is done by constructing a modification of the Navier-Stokes equations with a nonlinearity that obeys essentially all of the function space estimates that the true Navier-Stokes nonlinearity does, and which also obeys the energy identity, but for which one can construct solutions that blow up in finite time. Results of this type had been previously established by Montgomery-Smith, Gallagher-Paicu, and Li-Sinai for variants of the Navier-Stokes equation without the energy identity, and by Katz-Pavlovic and by Cheskidov for dyadic analogues of the Navier-Stokes equations in five and higher dimensions that obeyed the energy identity (see also the work of Plechac and Sverak and of Hou and Lei that also suggest blowup for other Navier-Stokes type models obeying the energy identity in five and higher dimensions), but to my knowledge this is the first blowup result for a Navier-Stokes type equation in three dimensions that also obeys the energy identity. Intriguingly, the method of proof in fact hints at a possible route to establishing blowup for the true Navier-Stokes equations, which I am now increasingly inclined to believe is the case (albeit for a very small set of initial data).

To state the results more precisely, recall that the Navier-Stokes equations can be written in the form

\displaystyle  \partial_t u + (u \cdot \nabla) u = \nu \Delta u + \nabla p

for a divergence-free velocity field {u} and a pressure field {p}, where {\nu>0} is the viscosity, which we will normalise to be one. We will work in the non-periodic setting, so the spatial domain is {{\bf R}^3}, and for sake of exposition I will not discuss matters of regularity or decay of the solution (but we will always be working with strong notions of solution here rather than weak ones). Applying the Leray projection {P} to divergence-free vector fields to this equation, we can eliminate the pressure, and obtain an evolution equation

\displaystyle  \partial_t u = \Delta u + B(u,u) \ \ \ \ \ (1)

purely for the velocity field, where {B} is a certain bilinear operator on divergence-free vector fields (specifically, {B(u,v) = -\frac{1}{2} P( (u \cdot \nabla) v + (v \cdot \nabla) u)}). The global regularity problem for Navier-Stokes is then equivalent to the global regularity problem for the evolution equation (1).

An important feature of the bilinear operator {B} appearing in (1) is the cancellation law

\displaystyle  \langle B(u,u), u \rangle = 0

(using the {L^2} inner product on divergence-free vector fields), which leads in particular to the fundamental energy identity

\displaystyle  \frac{1}{2} \int_{{\bf R}^3} |u(T,x)|^2\ dx + \int_0^T \int_{{\bf R}^3} |\nabla u(t,x)|^2\ dx dt = \frac{1}{2} \int_{{\bf R}^3} |u(0,x)|^2\ dx.

This identity (and its consequences) provide essentially the only known a priori bound on solutions to the Navier-Stokes equations from large data and arbitrary times. Unfortunately, as discussed in this previous post, the quantities controlled by the energy identity are supercritical with respect to scaling, which is the fundamental obstacle that has defeated all attempts to solve the global regularity problem for Navier-Stokes without any additional assumptions on the data or solution (e.g. perturbative hypotheses, or a priori control on a critical norm such as the {L^\infty_t L^3_x} norm).

Our main result is then (slightly informally stated) as follows:

Theorem 1 There exists an averaged version {\tilde B} of the bilinear operator {B}, of the form

\displaystyle  \tilde B(u,v) := \int_\Omega m_{3,\omega}(D) Rot_{3,\omega}

\displaystyle B( m_{1,\omega}(D) Rot_{1,\omega} u, m_{2,\omega}(D) Rot_{2,\omega} v )\ d\mu(\omega)

for some probability space {(\Omega, \mu)}, some spatial rotation operators {Rot_{i,\omega}} for {i=1,2,3}, and some Fourier multipliers {m_{i,\omega}} of order {0}, for which one still has the cancellation law

\displaystyle  \langle \tilde B(u,u), u \rangle = 0

and for which the averaged Navier-Stokes equation

\displaystyle  \partial_t u = \Delta u + \tilde B(u,u) \ \ \ \ \ (2)

admits solutions that blow up in finite time.

(There are some integrability conditions on the Fourier multipliers {m_{i,\omega}} required in the above theorem in order for the conclusion to be non-trivial, but I am omitting them here for sake of exposition.)

Because spatial rotations and Fourier multipliers of order {0} are bounded on most function spaces, {\tilde B} automatically obeys almost all of the upper bound estimates that {B} does. Thus, this theorem blocks any attempt to prove global regularity for the true Navier-Stokes equations which relies purely on the energy identity and on upper bound estimates for the nonlinearity; one must use some additional structure of the nonlinear operator {B} which is not shared by an averaged version {\tilde B}. Such additional structure certainly exists – for instance, the Navier-Stokes equation has a vorticity formulation involving only differential operators rather than pseudodifferential ones, whereas a general equation of the form (2) does not. However, “abstract” approaches to global regularity generally do not exploit such structure, and thus cannot be used to affirmatively answer the Navier-Stokes problem.

It turns out that the particular averaged bilinear operator {\tilde B} that we will use will be a finite linear combination of local cascade operators, which take the form

\displaystyle  C(u,v) := \sum_{n \in {\bf Z}} (1+\epsilon_0)^{5n/2} \langle u, \psi_{1,n} \rangle \langle v, \psi_{2,n} \rangle \psi_{3,n}

where {\epsilon_0>0} is a small parameter, {\psi_1,\psi_2,\psi_3} are Schwartz vector fields whose Fourier transform is supported on an annulus, and {\psi_{i,n}(x) := (1+\epsilon_0)^{3n/2} \psi_i( (1+\epsilon_0)^n x)} is an {L^2}-rescaled version of {\psi_i} (basically a “wavelet” of wavelength about {(1+\epsilon_0)^{-n}} centred at the origin). Such operators were essentially introduced by Katz and Pavlovic as dyadic models for {B}; they have essentially the same scaling property as {B} (except that one can only scale along powers of {1+\epsilon_0}, rather than over all positive reals), and in fact they can be expressed as an average of {B} in the sense of the above theorem, as can be shown after a somewhat tedious amount of Fourier-analytic symbol manipulations.

If we consider nonlinearities {\tilde B} which are a finite linear combination of local cascade operators, then the equation (2) more or less collapses to a system of ODE in certain “wavelet coefficients” of {u}. The precise ODE that shows up depends on what precise combination of local cascade operators one is using. Katz and Pavlovic essentially considered a single cascade operator together with its “adjoint” (needed to preserve the energy identity), and arrived (more or less) at the system of ODE

\displaystyle  \partial_t X_n = - (1+\epsilon_0)^{2n} X_n + (1+\epsilon_0)^{\frac{5}{2}(n-1)} X_{n-1}^2 - (1+\epsilon_0)^{\frac{5}{2} n} X_n X_{n+1} \ \ \ \ \ (3)

where {X_n: [0,T] \rightarrow {\bf R}} are scalar fields for each integer {n}. (Actually, Katz-Pavlovic worked with a technical variant of this particular equation, but the differences are not so important for this current discussion.) Note that the quadratic terms on the RHS carry a higher exponent of {1+\epsilon_0} than the dissipation term; this reflects the supercritical nature of this evolution (the energy {\frac{1}{2} \sum_n X_n^2} is monotone decreasing in this flow, so the natural size of {X_n} given the control on the energy is {O(1)}). There is a slight technical issue with the dissipation if one wishes to embed (3) into an equation of the form (2), but it is minor and I will not discuss it further here.

In principle, if the {X_n} mode has size comparable to {1} at some time {t_n}, then energy should flow from {X_n} to {X_{n+1}} at a rate comparable to {(1+\epsilon_0)^{\frac{5}{2} n}}, so that by time {t_{n+1} \approx t_n + (1+\epsilon_0)^{-\frac{5}{2} n}} or so, most of the energy of {X_n} should have drained into the {X_{n+1}} mode (with hardly any energy dissipated). Since the series {\sum_{n \geq 1} (1+\epsilon_0)^{-\frac{5}{2} n}} is summable, this suggests finite time blowup for this ODE as the energy races ever more quickly to higher and higher modes. Such a scenario was indeed established by Katz and Pavlovic (and refined by Cheskidov) if the dissipation strength {(1+\epsilon_0)^{2n}} was weakened somewhat (the exponent {2} has to be lowered to be less than {\frac{5}{3}}). As mentioned above, this is enough to give a version of Theorem 1 in five and higher dimensions.
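The two structural facts used in this heuristic, namely that the quadratic terms in (3) conserve the energy {\frac{1}{2} \sum_n X_n^2} and that the cascade times form a summable geometric series, can be checked with a short Python sketch. The truncation to finitely many modes (with all modes outside the range set to zero), the sample values, and the function names are assumptions made for illustration only.

```python
def nonlinearity(X, lam):
    # Quadratic terms of (3) with lam = 1 + epsilon_0, truncated to modes
    # n = 0, ..., len(X)-1; modes outside this range are taken to be zero
    # (an assumption of this sketch, not part of the original system).
    N = len(X)
    out = []
    for n in range(N):
        inflow = lam ** (2.5 * (n - 1)) * X[n - 1] ** 2 if n >= 1 else 0.0
        outflow = lam ** (2.5 * n) * X[n] * X[n + 1] if n + 1 < N else 0.0
        out.append(inflow - outflow)
    return out

def energy_rate(X, lam, dissipation=True):
    # Time derivative of (1/2) sum_n X_n^2 along the truncated version of (3).
    rate = sum(x * b for x, b in zip(X, nonlinearity(X, lam)))
    if dissipation:
        rate -= sum(lam ** (2 * n) * x * x for n, x in enumerate(X))
    return rate

def cascade_time_bound(lam):
    # Closed form of the geometric series sum_{n >= 1} lam^(-5n/2).
    r = lam ** -2.5
    return r / (1 - r)

X = [0.3, -1.2, 0.7, 2.0, -0.5]   # arbitrary sample state
print("energy rate, no dissipation:", energy_rate(X, 2.0, dissipation=False))
print("energy rate, with dissipation:", energy_rate(X, 2.0))
print("sum of cascade times:", cascade_time_bound(2.0))
```

The first printed number should vanish up to rounding (the quadratic terms telescope), and the second should be negative. This says nothing about whether blowup actually occurs, which is the delicate issue discussed next.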

On the other hand, it was shown a few years ago by Barbato, Morandin, and Romito that (3) in fact admits global smooth solutions (at least in the dyadic case {\epsilon_0=1}, and assuming non-negative initial data). Roughly speaking, the problem is that as energy is being transferred from {X_n} to {X_{n+1}}, energy is also simultaneously being transferred from {X_{n+1}} to {X_{n+2}}, and as such the solution races off to higher modes a bit too prematurely, without absorbing all of the energy from lower modes. This weakens the strength of the blowup to the point where the moderately strong dissipation in (3) is enough to kill the high frequency cascade before a true singularity occurs. Because of this, the original Katz-Pavlovic model cannot quite be used to establish Theorem 1 in three dimensions. (Actually, the original Katz-Pavlovic model had some additional dispersive features which allowed for another proof of global smooth solutions, which is an unpublished result of Nazarov.)

To get around this, I had to “engineer” an ODE system with similar features to (3) (namely, a quadratic nonlinearity, a monotone total energy, and the indicated exponents of {(1+\epsilon_0)} for both the dissipation term and the quadratic terms), but for which the cascade of energy from scale {n} to scale {n+1} was not interrupted by the cascade of energy from scale {n+1} to scale {n+2}. To do this, I needed to insert a delay in the cascade process (so that after energy was dumped into scale {n}, it would take some time before the energy would start to transfer to scale {n+1}), but the process also needed to be abrupt (once the process of energy transfer started, it needed to conclude very quickly, before the delayed transfer for the next scale kicked in). It turned out that one could build a “quadratic circuit” out of some basic “quadratic gates” (analogous to how an electrical circuit could be built out of basic gates such as amplifiers or resistors) that achieved this task, leading to an ODE system essentially of the form

\displaystyle \partial_t X_{1,n} = - (1+\epsilon_0)^{2n} X_{1,n}

\displaystyle  + (1+\epsilon_0)^{5n/2} (- \epsilon^{-2} X_{3,n} X_{4,n} - \epsilon X_{1,n} X_{2,n} - \epsilon^2 \exp(-K^{10}) X_{1,n} X_{3,n}

\displaystyle  + K X_{4,n-1}^2)

\displaystyle  \partial_t X_{2,n} = - (1+\epsilon_0)^{2n} X_{2,n} + (1+\epsilon_0)^{5n/2} (\epsilon X_{1,n}^2 - \epsilon^{-1} K^{10} X_{3,n}^2)

\displaystyle  \partial_t X_{3,n} = - (1+\epsilon_0)^{2n} X_{3,n} + (1+\epsilon_0)^{5n/2} (\epsilon^2 \exp(-K^{10}) X_{1,n}^2

\displaystyle + \epsilon^{-1} K^{10} X_{2,n} X_{3,n} )

\displaystyle  \partial_t X_{4,n} =- (1+\epsilon_0)^{2n} X_{4,n} + (1+\epsilon_0)^{5n/2} (\epsilon^{-2} X_{3,n} X_{1,n}

\displaystyle - (1+\epsilon_0)^{5/2} K X_{4,n} X_{1,n+1})

where {K \geq 1} is a suitable large parameter and {\epsilon > 0} is a suitable small parameter (much smaller than {1/K}). To visualise the dynamics of such a system, I found it useful to describe this system graphically by a “circuit diagram” that is analogous (but not identical) to the circuit diagrams arising in electrical engineering:

[Figure: circuit diagram for the ODE system above]

The coupling constants here range widely from being very large to very small; in practice, this makes the {X_{2,n}} and {X_{3,n}} modes absorb very little energy, but exert a sizeable influence on the remaining modes. If a lot of energy is suddenly dumped into {X_{1,n}}, what happens next is roughly as follows: for a moderate period of time, nothing much happens other than a trickle of energy into {X_{2,n}}, which in turn causes a rapid exponential growth of {X_{3,n}} (from a very low base). After this delay, {X_{3,n}} suddenly crosses a certain threshold, at which point it causes {X_{1,n}} and {X_{4,n}} to exchange energy back and forth with extreme speed. The energy from {X_{4,n}} then rapidly drains into {X_{1,n+1}}, and the process begins again (with a slight loss in energy due to the dissipation). If one plots the total energy {E_n := \frac{1}{2} ( X_{1,n}^2 + X_{2,n}^2 + X_{3,n}^2 + X_{4,n}^2 )} as a function of time, it looks schematically like this:

[Figure: schematic plot of the energies {E_n} as functions of time]

As in the previous heuristic discussion, the time between cascades from one frequency scale to the next decays exponentially, leading to blowup at some finite time {T}. (One could describe the dynamics here as being similar to the famous “lighting the beacons” scene in the Lord of the Rings movies, except that (a) as each beacon gets ignited, the previous one is extinguished, as per the energy identity; (b) the time between beacon lightings decreases exponentially; and (c) there is no soundtrack.)
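The energy identity invoked here can be checked directly for the four-mode system displayed above: the quadratic terms should contribute nothing to the time derivative of {\sum_n E_n}. The following Python sketch evaluates that contribution at an arbitrary state. The truncation to three scales (with {X_{4,-1}} and {X_{1,3}} set to zero) and the parameter values are assumptions chosen only so that the arithmetic is well-conditioned; in particular they are not in the regime of very large {K} and very small {\epsilon} needed for the actual blowup argument.

```python
import math

def quadratic_terms(X, lam, K, eps):
    # X[n] = [X_{1,n}, X_{2,n}, X_{3,n}, X_{4,n}]; lam stands for 1 + epsilon_0.
    # Modes outside the range of scales are taken to be zero (an assumption).
    tiny = eps ** 2 * math.exp(-K ** 10)
    big = K ** 10 / eps
    out = []
    for n, (x1, x2, x3, x4) in enumerate(X):
        x4_prev = X[n - 1][3] if n >= 1 else 0.0
        x1_next = X[n + 1][0] if n + 1 < len(X) else 0.0
        s = lam ** (2.5 * n)
        out.append([
            s * (-x3 * x4 / eps ** 2 - eps * x1 * x2 - tiny * x1 * x3 + K * x4_prev ** 2),
            s * (eps * x1 ** 2 - big * x3 ** 2),
            s * (tiny * x1 ** 2 + big * x2 * x3),
            s * (x3 * x1 / eps ** 2 - lam ** 2.5 * K * x4 * x1_next),
        ])
    return out

def energy_rate_without_dissipation(X, lam, K, eps):
    # Contribution of the quadratic terms to d/dt of sum_n E_n.
    dX = quadratic_terms(X, lam, K, eps)
    return sum(x * dx for row, drow in zip(X, dX) for x, dx in zip(row, drow))

X = [[0.9, -0.3, 0.2, 0.5], [0.1, 0.4, -0.6, 0.7], [-0.8, 0.2, 0.3, -0.1]]
print(energy_rate_without_dissipation(X, lam=1.1, K=1.5, eps=0.1))
```

The printed value should be zero up to rounding: each coupling appears twice with opposite signs, and the {K X_{4,n-1}^2} and {K X_{4,n} X_{1,n+1}} terms telescope across scales.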

There is a real (but remote) possibility that this sort of construction can be adapted to the true Navier-Stokes equations. The basic blowup mechanism in the averaged equation is that of a von Neumann machine, or more precisely a construct (built within the laws of the inviscid evolution {\partial_t u = \tilde B(u,u)}) that, after some time delay, manages to suddenly create a replica of itself at a finer scale (and to largely erase its original instantiation in the process). In principle, such a von Neumann machine could also be built out of the laws of the inviscid form of the Navier-Stokes equations (i.e. the Euler equations). In physical terms, one would have to build the machine purely out of an ideal fluid (i.e. an inviscid incompressible fluid). If one could somehow create enough “logic gates” out of ideal fluid, one could presumably build a sort of “fluid computer”, at which point the task of building a von Neumann machine appears to reduce to a software engineering exercise rather than a PDE problem (providing that the gates are suitably stable with respect to perturbations, but (as with actual computers) this can presumably be done by converting the analog signals of fluid mechanics into a more error-resistant digital form). The key thing missing in this program (in both senses of the word) to establish blowup for Navier-Stokes is to construct the logic gates within the laws of ideal fluids. (Compare with the situation for cellular automata such as Conway’s “Game of Life“, in which Turing complete computers, universal constructors, and replicators have all been built within the laws of that game.)

The purpose of this post is to link to a short unpublished note of mine that I wrote back in 2010 but forgot to put on my web page at the time. Entitled “A physical space proof of the bilinear Strichartz and local smoothing estimates for the Schrodinger equation“, it gives a proof of two standard estimates for the free (linear) Schrodinger equation in flat Euclidean space, namely the bilinear Strichartz estimate and the local smoothing estimate, using primarily “physical space” methods such as integration by parts, instead of “frequency space” methods based on the Fourier transform, although a small amount of Fourier analysis (basically sectoral projection to make the Schrodinger waves move roughly in a given direction) is still needed.  This is somewhat in the spirit of an older paper of mine with Klainerman and Rodnianski doing something similar for the wave equation, and is also very similar to a paper of Planchon and Vega from 2009.  The hope was that by avoiding the finer properties of the Fourier transform, one could obtain a more robust argument which could also extend to nonlinear, non-free, or non-flat situations.   These notes were cited once or twice by some people that I had privately circulated them to, so I decided to put them online here for reference.

UPDATE, July 24: Fabrice Planchon has kindly supplied another note in which he gives a particularly simple proof of local smoothing in one dimension, and discusses some other variants of the method (related to the paper of Planchon and Vega cited earlier).

Consider the free Schrödinger equation in {d} spatial dimensions, which I will normalise as

\displaystyle  i u_t + \frac{1}{2} \Delta_{{\bf R}^d} u = 0 \ \ \ \ \ (1)

where {u: {\bf R} \times {\bf R}^d \rightarrow {\bf C}} is the unknown field and {\Delta_{{\bf R}^d} = \sum_{j=1}^d \frac{\partial^2}{\partial x_j^2}} is the spatial Laplacian. To avoid irrelevant technical issues I will restrict attention to smooth (classical) solutions to this equation, and will work locally in spacetime avoiding issues of decay at infinity (or at other singularities); I will also avoid issues involving branch cuts of functions such as {t^{d/2}} (if one wishes, one can restrict {d} to be even in order to safely ignore all branch cut issues). The space of solutions to (1) enjoys a number of symmetries. A particularly non-obvious symmetry is the pseudoconformal symmetry: if {u} solves (1), then the pseudoconformal solution {pc(u): {\bf R} \times {\bf R}^d \rightarrow {\bf C}} defined by

\displaystyle  pc(u)(t,x) := \frac{1}{(it)^{d/2}} \overline{u(\frac{1}{t}, \frac{x}{t})} e^{i|x|^2/2t} \ \ \ \ \ (2)

for {t \neq 0} can be seen after some computation to also solve (1). (If {u} has suitable decay at spatial infinity and one chooses a suitable branch cut for {(it)^{d/2}}, one can extend {pc(u)} continuously to the {t=0} spatial slice, whereupon it becomes essentially the spatial Fourier transform of {u(0,\cdot)}, but we will not need this fact for the current discussion.)
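The “some computation” can at least be spot-checked numerically. The Python sketch below takes {d=2} (to avoid branch cuts) and the plane wave {u(t,x) = e^{i(\xi \cdot x - |\xi|^2 t/2)}}, applies (2), and evaluates the residual {i v_t + \frac{1}{2} \Delta v} of (1) by central finite differences. The choice of test solution, the frequency {\xi}, the sample point, and the step size are all illustrative assumptions. For this particular {u}, the transform {pc(u)(t,x)} works out to {(it)^{-1} e^{i|x-\xi|^2/2t}}, a multiple of the fundamental solution centred at {\xi}.

```python
import cmath

D = 2               # even dimension, so (it)^{d/2} involves no branch cut
XI = (0.7, -0.4)    # arbitrary frequency for the test plane wave

def u(t, x):
    # Plane-wave solution of (1): exp(i(xi . x - |xi|^2 t / 2)).
    phase = XI[0] * x[0] + XI[1] * x[1] - 0.5 * (XI[0] ** 2 + XI[1] ** 2) * t
    return cmath.exp(1j * phase)

def pc(f):
    # Pseudoconformal transform (2) in dimension D, defined for t != 0.
    def g(t, x):
        r2 = x[0] ** 2 + x[1] ** 2
        inner = f(1.0 / t, (x[0] / t, x[1] / t))
        return inner.conjugate() * cmath.exp(1j * r2 / (2.0 * t)) / (1j * t) ** (D // 2)
    return g

def schrodinger_residual(f, t, x, h=1e-3):
    # Central-difference approximation of i f_t + (1/2) Laplacian f.
    f_t = (f(t + h, x) - f(t - h, x)) / (2 * h)
    lap = 0.0
    for k in range(D):
        xp, xm = list(x), list(x)
        xp[k] += h
        xm[k] -= h
        lap += (f(t, tuple(xp)) - 2 * f(t, x) + f(t, tuple(xm))) / h ** 2
    return 1j * f_t + 0.5 * lap

point = (1.0, (0.3, -0.2))
print("residual for u:    ", abs(schrodinger_residual(u, *point)))
print("residual for pc(u):", abs(schrodinger_residual(pc(u), *point)))
```

Both residuals should be small, of the order of the finite-difference error; this is of course only a consistency check at one point, not a proof.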

An analogous symmetry exists for the free wave equation in {d+1} spatial dimensions, which I will write as

\displaystyle  u_{tt} - \Delta_{{\bf R}^{d+1}} u = 0 \ \ \ \ \ (3)

where {u: {\bf R} \times {\bf R}^{d+1} \rightarrow {\bf C}} is the unknown field. In analogy to pseudoconformal symmetry, we have conformal symmetry: if {u: {\bf R} \times {\bf R}^{d+1} \rightarrow {\bf C}} solves (3), then the function {conf(u): {\bf R} \times {\bf R}^{d+1} \rightarrow {\bf C}}, defined in the interior {\{ (t,x): |x| < |t| \}} of the light cone by the formula

\displaystyle  conf(u)(t,x) := (t^2-|x|^2)^{-d/2} u( \frac{t}{t^2-|x|^2}, \frac{x}{t^2-|x|^2} ), \ \ \ \ \ (4)

also solves (3).
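
As with the pseudoconformal transform, formula (4) can be tested numerically. This sketch assumes {d=2} (three spatial dimensions), a plane-wave test solution, and a sample point inside the light cone; `conf`, `wave_residual`, `K` and the sample point are hypothetical names and values chosen for illustration.

```python
import cmath
import math

K = (0.6, -0.3, 0.2)                      # spatial frequency (arbitrary choice)
OMEGA = math.sqrt(sum(k * k for k in K))  # dispersion relation omega = |k|

def u(t, x):
    """Plane-wave solution of u_tt - Laplacian(u) = 0 on R x R^3."""
    return cmath.exp(1j * (sum(k * xj for k, xj in zip(K, x)) - OMEGA * t))

def conf(f, d=2):
    """Conformal transform (4); only meaningful where t^2 - |x|^2 > 0."""
    def transformed(t, x):
        s = t * t - sum(xj * xj for xj in x)
        return s ** (-d / 2) * f(t / s, tuple(xj / s for xj in x))
    return transformed

def wave_residual(f, t, x, h=1e-3):
    """Central-difference approximation of f_tt - Laplacian(f) at (t, x)."""
    f_tt = (f(t + h, x) - 2 * f(t, x) + f(t - h, x)) / h ** 2
    lap = 0.0
    for j in range(len(x)):
        xp = tuple(xj + (h if i == j else 0.0) for i, xj in enumerate(x))
        xm = tuple(xj - (h if i == j else 0.0) for i, xj in enumerate(x))
        lap += (f(t, xp) - 2 * f(t, x) + f(t, xm)) / h ** 2
    return f_tt - lap

point = (2.0, (0.3, 0.1, -0.2))  # inside the light cone: |x| < |t|
res_u = abs(wave_residual(u, *point))
res_conf = abs(wave_residual(conf(u), *point))
print(res_u, res_conf)
```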

There are also some direct links between the Schrödinger equation in {d} dimensions and the wave equation in {d+1} dimensions. This can be easily seen on the spacetime Fourier side: solutions to (1) have spacetime Fourier transform (formally) supported on a {d}-dimensional paraboloid, while solutions to (3) have spacetime Fourier transform formally supported on a {d+1}-dimensional cone. To link the two, one then observes that the {d}-dimensional paraboloid can be viewed as a conic section (i.e. hyperplane slice) of the {d+1}-dimensional cone. In physical space, this link is manifested as follows: if {u: {\bf R} \times {\bf R}^d \rightarrow {\bf C}} solves (1), then the function {\iota_{1}(u): {\bf R} \times {\bf R}^{d+1} \rightarrow {\bf C}} defined by

\displaystyle  \iota_{1}(u)(t,x_1,\ldots,x_{d+1}) := e^{-i(t+x_{d+1})} u( \frac{t-x_{d+1}}{2}, x_1,\ldots,x_d)

solves (3). More generally, for any non-zero scaling parameter {\lambda}, the function {\iota_{\lambda}(u): {\bf R} \times {\bf R}^{d+1} \rightarrow {\bf C}} defined by

\displaystyle  \iota_{\lambda}(u)(t,x_1,\ldots,x_{d+1}) :=

\displaystyle  \lambda^{d/2} e^{-i\lambda(t+x_{d+1})} u( \lambda \frac{t-x_{d+1}}{2}, \lambda x_1,\ldots,\lambda x_d) \ \ \ \ \ (5)

solves (3).
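
The embedding (5) can likewise be checked numerically. The sketch below assumes {d=1}, positive {\lambda} (to avoid choosing a branch of {\lambda^{d/2}}), and a sum of two plane waves as the Schrödinger solution; the helper names `iota` and `wave_residual` are illustrative.

```python
import cmath

def u(t, x):
    """Sum of two plane waves solving i u_t + (1/2) u_xx = 0 (d = 1)."""
    return sum(c * cmath.exp(1j * (k * x - 0.5 * k * k * t))
               for c, k in ((1.0, 0.7), (0.5, -1.3)))

def iota(f, lam, d=1):
    """Embedding (5) for d = 1 and lam > 0 (positive lam is an assumption)."""
    def embedded(t, x1, x2):
        return (lam ** (d / 2) * cmath.exp(-1j * lam * (t + x2))
                * f(lam * (t - x2) / 2, lam * x1))
    return embedded

def wave_residual(f, t, x1, x2, h=1e-3):
    """Central-difference approximation of f_tt - f_{x1 x1} - f_{x2 x2}."""
    centre = f(t, x1, x2)
    f_tt = (f(t + h, x1, x2) - 2 * centre + f(t - h, x1, x2)) / h ** 2
    f_11 = (f(t, x1 + h, x2) - 2 * centre + f(t, x1 - h, x2)) / h ** 2
    f_22 = (f(t, x1, x2 + h) - 2 * centre + f(t, x1, x2 - h)) / h ** 2
    return f_tt - f_11 - f_22

residuals = [abs(wave_residual(iota(u, lam), 0.4, 0.1, -0.3)) for lam in (1.0, 2.0)]
print(residuals)  # both should be at the level of the discretisation error
```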

As an “extra challenge” posed in an exercise in one of my books (Exercise 2.28, to be precise), I asked the reader to use the embeddings {\iota_1} (or more generally {\iota_\lambda}) to explicitly connect together the pseudoconformal transformation {pc} and the conformal transformation {conf}. It turns out that this connection is a little bit unusual, with the “obvious” guess (namely, that the embeddings {\iota_\lambda} intertwine {pc} and {conf}) being incorrect, and as such this particular task was perhaps too difficult even for a challenge question. I’ve been asked a couple times to provide the connection more explicitly, so I will do so below the fold.

Read the rest of this entry »

[These are notes intended mostly for myself, as these topics are useful in random matrix theory, but may be of interest to some readers also. -T.]

One of the most fundamental partial differential equations in mathematics is the heat equation

\displaystyle  \partial_t f = L f \ \ \ \ \ (1)

where {f: [0,+\infty) \times {\bf R}^n \rightarrow {\bf R}} is a scalar function {(t,x) \mapsto f(t,x)} of both time and space, and {L} is the Laplacian {L := \frac{1}{2} \Delta = \frac{1}{2} \sum_{i=1}^n \frac{\partial^2}{\partial x_i^2}}. For the purposes of this post, we will ignore all technical issues of regularity and decay, and always assume that the solutions to equations such as (1) have all the regularity and decay in order to justify all formal operations such as the chain rule, integration by parts, or differentiation under the integral sign. The factor of {\frac{1}{2}} in the definition of the heat propagator {L} is of course an arbitrary normalisation, chosen for some minor technical reasons; one can certainly continue the discussion below with other choices of normalisations if desired.

In probability theory, this equation takes on particular significance when {f} is restricted to be non-negative, and furthermore to be a probability measure at each time, in the sense that

\displaystyle  \int_{{\bf R}^n} f(t,x)\ dx = 1

for all {t}. (Actually, it suffices to verify this constraint at time {t=0}, as the heat equation (1) will then preserve this constraint.) Indeed, in this case, one can interpret {f(t,x)\ dx} as the probability distribution of a Brownian motion

\displaystyle  dx = dB(t) \ \ \ \ \ (2)

where {x = x(t) \in {\bf R}^n} is a stochastic process with initial probability distribution {f(0,x)\ dx}; see for instance this previous blog post for more discussion.

A model example of a solution to the heat equation to keep in mind is that of the fundamental solution

\displaystyle  G(t,x) = \frac{1}{(2\pi t)^{n/2}} e^{-|x|^2/2t} \ \ \ \ \ (3)

defined for any {t>0}, which represents the distribution of Brownian motion of a particle starting at the origin {x=0} at time {t=0}. At time {t}, {G(t,x)} represents an {{\bf R}^n}-valued random variable, each coefficient of which is an independent random variable of mean zero and variance {t}. (As {t \rightarrow 0^+}, {G(t)} converges in the sense of distributions to a Dirac mass at the origin.)
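
A quick numerical sanity check of (3) in dimension {n=1} is sketched below: it tests that {G} satisfies {\partial_t G = \frac{1}{2} \partial_{xx} G} at a sample point, and that {G(t,x)\ dx} has total mass {1} and variance {t}. The function names, step sizes and the midpoint-rule quadrature are assumptions made for this illustration.

```python
import math

def G(t, x):
    """Fundamental solution (3) in dimension n = 1."""
    return math.exp(-x * x / (2 * t)) / math.sqrt(2 * math.pi * t)

def heat_residual(t, x, h=1e-3):
    """Central-difference approximation of G_t - (1/2) G_xx."""
    g_t = (G(t + h, x) - G(t - h, x)) / (2 * h)
    g_xx = (G(t, x + h) - 2 * G(t, x) + G(t, x - h)) / h ** 2
    return g_t - 0.5 * g_xx

def moments(t, half_width=12.0, steps=4800):
    """Total mass and variance of G(t, x) dx by the midpoint rule."""
    dx = 2 * half_width / steps
    xs = [-half_width + (i + 0.5) * dx for i in range(steps)]
    mass = sum(G(t, x) for x in xs) * dx
    var = sum(x * x * G(t, x) for x in xs) * dx
    return mass, var

t = 1.5
mass, var = moments(t)
res = abs(heat_residual(t, 0.5))
print(mass, var, res)
```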

The heat equation can also be viewed as the gradient flow for the Dirichlet form

\displaystyle  D(f,g) := \frac{1}{2} \int_{{\bf R}^n} \nabla f \cdot \nabla g\ dx \ \ \ \ \ (4)

since one has the integration by parts identity

\displaystyle  \int_{{\bf R}^n} Lf(x) g(x)\ dx = \int_{{\bf R}^n} f(x) Lg(x)\ dx = - D(f,g) \ \ \ \ \ (5)

for all smooth, rapidly decreasing {f,g}, which formally implies that {L f} is (half of) the negative gradient of the Dirichlet energy {D(f,f) = \frac{1}{2} \int_{{\bf R}^n} |\nabla f|^2\ dx} with respect to the {L^2({\bf R}^n,dx)} inner product. Among other things, this implies that the Dirichlet energy decreases in time:

\displaystyle  \partial_t D(f,f) = - 2 \int_{{\bf R}^n} |Lf|^2\ dx. \ \ \ \ \ (6)

For instance, for the fundamental solution (3), one can verify for any time {t>0} that

\displaystyle  D(G,G) = \frac{n}{2^{n+2} \pi^{n/2}} \frac{1}{t^{(n+2)/2}} \ \ \ \ \ (7)

(assuming I have not made a mistake in the calculation). In a similar spirit we have

\displaystyle  \partial_t \int_{{\bf R}^n} |f|^2\ dx = - 2 D(f,f). \ \ \ \ \ (8)
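
Formula (7) can be cross-checked numerically. The sketch below uses the fact that the fundamental solution (3) is a product of one-dimensional Gaussians, so that {D(G,G)} reduces to one-dimensional integrals, which are then evaluated by the midpoint rule; the helper names `integrate`, `dirichlet_energy` and `formula_7` are illustrative assumptions.

```python
import math

def G1(t, x):
    """One-dimensional factor of the fundamental solution (3)."""
    return math.exp(-x * x / (2 * t)) / math.sqrt(2 * math.pi * t)

def integrate(func, half_width=15.0, steps=6000):
    """Midpoint rule on [-half_width, half_width]."""
    dx = 2 * half_width / steps
    return sum(func(-half_width + (i + 0.5) * dx) for i in range(steps)) * dx

def dirichlet_energy(n, t):
    """D(G,G) in dimension n, using the product structure of G."""
    i0 = integrate(lambda x: G1(t, x) ** 2)            # integral of G1^2
    i1 = integrate(lambda x: (x / t * G1(t, x)) ** 2)  # integral of |G1'|^2
    return 0.5 * n * i1 * i0 ** (n - 1)

def formula_7(n, t):
    """Right-hand side of (7)."""
    return n / (2 ** (n + 2) * math.pi ** (n / 2)) / t ** ((n + 2) / 2)

checks = {(n, t): (dirichlet_energy(n, t), formula_7(n, t))
          for n in (1, 2, 3) for t in (0.5, 2.0)}
for key, (numeric, closed_form) in checks.items():
    print(key, numeric, closed_form)
```

In these sample cases the quadrature and the closed form should agree up to the (very small) error of the midpoint rule.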

Since {D(f,f)} is non-negative, the formula (6) implies that {\int_{{\bf R}^n} |Lf|^2\ dx} is integrable in time, and in particular we see that {Lf} converges to zero as {t \rightarrow \infty}, in some averaged {L^2} sense at least; similarly, (8) suggests that {D(f,f)} also converges to zero. This suggests that {f} converges to a constant function; but as {f} is also supposed to decay to zero at spatial infinity, we thus expect solutions to the heat equation in {{\bf R}^n} to decay to zero in some sense as {t \rightarrow \infty}. However, the decay is only expected to be polynomial in nature rather than exponential; for instance, the solution (3) decays in the {L^\infty} norm like {O(t^{-n/2})}.

Since {L1=0}, we also observe the basic cancellation property

\displaystyle  \int_{{\bf R}^n} Lf(x) \ dx = 0 \ \ \ \ \ (9)

for any function {f}.

There are other quantities relating to {f} that also decrease in time under heat flow, particularly in the important case when {f} is a probability measure. In this case, it is natural to introduce the entropy

\displaystyle  S(f) := \int_{{\bf R}^n} f(x) \log f(x)\ dx.

Thus, for instance, if {f(x)\ dx} is the uniform distribution on some measurable subset {E} of {{\bf R}^n} of finite measure {|E|}, the entropy would be {-\log |E|}. Intuitively, as the entropy decreases, the probability distribution gets wider and flatter. For instance, in the case of the fundamental solution (3), one has {S(G) = -\frac{n}{2} \log( 2 \pi e t )} for any {t>0}, reflecting the fact that {G(t)} is approximately uniformly distributed on a ball of radius {O(\sqrt{t})} (and thus of measure {O(t^{n/2})}).
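
Both entropy computations above can be reproduced numerically in dimension {n=1}. This is a sketch only: the quadrature rule, the truncation of the integral to a finite interval, and the names `entropy` and `integrate` are assumptions made for illustration.

```python
import math

def G(t, x):
    """Fundamental solution (3) in dimension n = 1."""
    return math.exp(-x * x / (2 * t)) / math.sqrt(2 * math.pi * t)

def integrate(func, a, b, steps=6000):
    """Midpoint rule on [a, b]."""
    dx = (b - a) / steps
    return sum(func(a + (i + 0.5) * dx) for i in range(steps)) * dx

def entropy(density, a, b):
    """S(f) = integral of f log f over [a, b]; assumes density > 0 there."""
    return integrate(lambda x: density(x) * math.log(density(x)), a, b)

t = 0.8
s_gauss = entropy(lambda x: G(t, x), -12.0, 12.0)
s_gauss_formula = -0.5 * math.log(2 * math.pi * math.e * t)

length = 3.0  # uniform distribution on an interval E of measure |E| = 3
s_uniform = entropy(lambda x: 1.0 / length, 0.0, length)

print(s_gauss, s_gauss_formula)
print(s_uniform, -math.log(length))
```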

A short formal computation shows (if one assumes for simplicity that {f} is strictly positive, which is not an unreasonable hypothesis, particularly in view of the strong maximum principle) using (9), (5) that

\displaystyle  \partial_t S(f) = \int_{{\bf R}^n} (Lf) \log f + f \frac{Lf}{f}\ dx

\displaystyle  = \int_{{\bf R}^n} (Lf) \log f\ dx

\displaystyle  = - D( f, \log f )

\displaystyle  = - \frac{1}{2} \int_{{\bf R}^n} \frac{|\nabla f|^2}{f}\ dx

\displaystyle  = - 4D( g, g )

where {g := \sqrt{f}} is the square root of {f}. For instance, if {f} is the fundamental solution (3), one can check that {D(g,g) = \frac{n}{8t}} (note that this is a significantly cleaner formula than (7)!).
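
For the fundamental solution in dimension {n=1}, both the value {D(g,g) = \frac{1}{8t}} and the identity {\partial_t S(f) = -4 D(g,g)} can be checked numerically. The sketch below uses midpoint quadrature and a central difference in time; the helper names and step sizes are illustrative assumptions.

```python
import math

def G(t, x):
    """Fundamental solution (3) in dimension n = 1."""
    return math.exp(-x * x / (2 * t)) / math.sqrt(2 * math.pi * t)

def integrate(func, a=-15.0, b=15.0, steps=6000):
    """Midpoint rule on [a, b]."""
    dx = (b - a) / steps
    return sum(func(a + (i + 0.5) * dx) for i in range(steps)) * dx

def entropy(t):
    """S(G(t)) = integral of G log G."""
    return integrate(lambda x: G(t, x) * math.log(G(t, x)))

def dirichlet_sqrt(t):
    """D(g,g) with g = sqrt(G); here g' = -(x / 2t) g, so |g'|^2 = (x / 2t)^2 G."""
    return 0.5 * integrate(lambda x: (x / (2 * t)) ** 2 * G(t, x))

t, dt = 1.3, 1e-3
entropy_rate = (entropy(t + dt) - entropy(t - dt)) / (2 * dt)
d_gg = dirichlet_sqrt(t)
print(d_gg, 1 / (8 * t))
print(entropy_rate, -4 * d_gg)
```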

In particular, the entropy is decreasing, which corresponds well to one’s intuition that the heat equation (or Brownian motion) should serve to spread out a probability distribution over time.

Actually, one can say more: the rate of decrease {4D(g,g)} of the entropy is itself decreasing, or in other words the entropy is convex. I do not have a satisfactorily intuitive reason for this phenomenon, but it can be proved by straightforward application of basic several variable calculus tools (such as the chain rule, product rule, quotient rule, and integration by parts), and completing the square. Namely, by using the chain rule we have

\displaystyle  L \phi(f) = \phi'(f) Lf + \frac{1}{2} \phi''(f) |\nabla f|^2, \ \ \ \ \ (10)

valid for any smooth function {\phi: {\bf R} \rightarrow {\bf R}}, applied to {g} with {\phi(s) = s^2} (so that {\phi(g) = f}), we see from (1) that

\displaystyle  2 g \partial_t g = 2 g L g + |\nabla g|^2

and thus (again assuming that {f}, and hence {g}, is strictly positive to avoid technicalities)

\displaystyle  \partial_t g = Lg + \frac{|\nabla g|^2}{2g}.

We thus have

\displaystyle  \partial_t D(g,g) = 2 D(g,Lg) + D(g, \frac{|\nabla g|^2}{g} ).

It is now convenient to compute using the Einstein summation convention to hide the summation over indices {i,j = 1,\ldots,n}. We have

\displaystyle  2 D(g,Lg) = \frac{1}{2} \int_{{\bf R}^n} (\partial_i g) (\partial_i \partial_j \partial_j g)\ dx

and

\displaystyle  D(g, \frac{|\nabla g|^2}{g} ) = \frac{1}{2} \int_{{\bf R}^n} (\partial_i g) \partial_i \frac{\partial_j g \partial_j g}{g}\ dx.

By integration by parts and interchanging partial derivatives, we may write the first integral as

\displaystyle  2 D(g,Lg) = - \frac{1}{2} \int_{{\bf R}^n} (\partial_i \partial_j g) (\partial_i \partial_j g)\ dx,

and from the quotient and product rules, we may write the second integral as

\displaystyle  D(g, \frac{|\nabla g|^2}{g} ) = \int_{{\bf R}^n} \frac{(\partial_i g) (\partial_j g) (\partial_i \partial_j g)}{g} - \frac{(\partial_i g) (\partial_j g) (\partial_i g) (\partial_j g)}{2g^2}\ dx.

Gathering terms, completing the square, and making the summations explicit again, we see that

\displaystyle  \partial_t D(g,g) =- \frac{1}{2} \int_{{\bf R}^n} \frac{\sum_{i=1}^n \sum_{j=1}^n |g \partial_i \partial_j g - (\partial_i g) (\partial_j g)|^2}{g^2}\ dx

and so in particular {D(g,g)} is always decreasing.

The above identity can also be written as

\displaystyle  \partial_t D(g,g) = - \frac{1}{2} \int_{{\bf R}^n} |\nabla^2 \log g|^2 g^2\ dx.
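
The identity can be tested on a solution that is not self-similar. The sketch below works in dimension {n=1} and takes {f} to be a mixture of two shifted heat kernels (an assumption made for this example; any positive solution with enough decay would do), then compares a finite-difference time derivative of {D(g,g)} with the right-hand side, where {g^2 = f}. All function names and step sizes are illustrative.

```python
import math

def G(t, x):
    """Fundamental solution (3) in dimension n = 1."""
    return math.exp(-x * x / (2 * t)) / math.sqrt(2 * math.pi * t)

def f(t, x):
    """A positive solution of the heat equation: a mixture of two shifted heat kernels."""
    return 0.5 * G(t + 1.0, x - 1.0) + 0.5 * G(t + 0.5, x + 1.0)

def log_g(t, x):
    """log g, where g = sqrt(f)."""
    return 0.5 * math.log(f(t, x))

def integrate(func, a=-14.0, b=14.0, steps=2800):
    """Midpoint rule on [a, b]."""
    dx = (b - a) / steps
    return sum(func(a + (i + 0.5) * dx) for i in range(steps)) * dx

def dirichlet_sqrt(t, h=1e-4):
    """D(g,g) = (1/2) * integral of |g'|^2, with g' by central differences."""
    def integrand(x):
        g_x = (math.sqrt(f(t, x + h)) - math.sqrt(f(t, x - h))) / (2 * h)
        return g_x * g_x
    return 0.5 * integrate(integrand)

def hessian_term(t, h=1e-3):
    """-(1/2) * integral of |(log g)''|^2 g^2, using g^2 = f."""
    def integrand(x):
        second = (log_g(t, x + h) - 2 * log_g(t, x) + log_g(t, x - h)) / h ** 2
        return second * second * f(t, x)
    return -0.5 * integrate(integrand)

t, dt = 0.5, 1e-3
lhs = (dirichlet_sqrt(t + dt) - dirichlet_sqrt(t - dt)) / (2 * dt)
rhs = hessian_term(t)
print(lhs, rhs)
```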

Exercise 1 Give an alternate proof of the above identity by writing {f = e^{2u}}, {g = e^u} and deriving the equation {\partial_t u = Lu + |\nabla u|^2} for {u}.

It was observed in a well known paper of Bakry and Emery that the above monotonicity properties hold for a much larger class of heat flow-type equations, and lead to a number of important relations between energy and entropy, such as the log-Sobolev inequality of Gross and of Federbush, and the hypercontractivity inequality of Nelson; we will discuss one such family of generalisations (or more precisely, variants) below the fold.

Read the rest of this entry »

Lars Hörmander, who made fundamental contributions to all areas of partial differential equations, but particularly in developing the analysis of variable-coefficient linear PDE, died last Sunday, aged 81.

I unfortunately never met Hörmander personally, but of course I encountered his work all the time while working in PDE. One of his major contributions to the subject was to systematically develop the calculus of Fourier integral operators (FIOs), which are a substantial generalisation of pseudodifferential operators and which can be used to (approximately) solve linear partial differential equations, or to transform such equations into a more convenient form. Roughly speaking, Fourier integral operators are to linear PDE as canonical transformations are to Hamiltonian mechanics (and one can in fact view FIOs as a quantisation of a canonical transformation). They form a large class of transformations: for instance, the Fourier transform, pseudodifferential operators, and smooth changes of the spatial variable are all examples of FIOs, and (as long as certain singular situations are avoided) the composition of two FIOs is again an FIO.

The full theory of FIOs is quite extensive, occupying the entire final volume of Hörmander’s famous four-volume series “The Analysis of Linear Partial Differential Operators”. I am certainly not going to attempt to summarise it here, but I thought I would try to motivate how these operators arise when trying to transform functions. For simplicity we will work with functions {f \in L^2({\bf R}^n)} on a Euclidean domain {{\bf R}^n} (although FIOs can certainly be defined on more general smooth manifolds, and there is an extension of the theory that also works on manifolds with boundary). As this will be a heuristic discussion, we will ignore all the (technical, but important) issues of smoothness or convergence with regards to the functions, integrals and limits that appear below, and be rather vague with terms such as “decaying” or “concentrated”.

A function {f \in L^2({\bf R}^n)} can be viewed from many different perspectives (reflecting the variety of bases, or approximate bases, that the Hilbert space {L^2({\bf R}^n)} offers). Most directly, we have the physical space perspective, viewing {f} as a function {x \mapsto f(x)} of the physical variable {x \in {\bf R}^n}. In many cases, this function will be concentrated in some subregion {\Omega} of physical space. For instance, a gaussian wave packet

\displaystyle  f(x) = A e^{-(x-x_0)^2/\hbar} e^{i \xi_0 \cdot x/\hbar}, \ \ \ \ \ (1)

where {\hbar > 0}, {A \in {\bf C}} and {x_0, \xi_0 \in {\bf R}^n} are parameters, would be physically concentrated in the ball {B(x_0,\sqrt{\hbar})}. Then we have the frequency space (or momentum space) perspective, viewing {f} now as a function {\xi \mapsto \hat f(\xi)} of the frequency variable {\xi \in {\bf R}^n}. For this discussion, it will be convenient to normalise the Fourier transform using a small constant {\hbar > 0} (which has the physical interpretation of Planck’s constant if one is doing quantum mechanics), thus

\displaystyle  \hat f(\xi) := \frac{1}{(2\pi \hbar)^{n/2}} \int_{{\bf R}^n} e^{-i\xi \cdot x/\hbar} f(x)\ dx.

For instance, for the gaussian wave packet (1), one has

\displaystyle  \hat f(\xi) = A e^{i\xi_0 \cdot x_0/\hbar} e^{-(\xi-\xi_0)^2/\hbar} e^{-i \xi \cdot x_0/\hbar},

and so we see that {f} is concentrated in frequency space in the ball {B(\xi_0,\sqrt{\hbar})}.
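
The frequency concentration can be observed numerically in dimension {n=1}. The sketch below evaluates the normalised Fourier transform of the wave packet (1) by the midpoint rule and tests only the location of the peak and the decay at distance {5\sqrt{\hbar}} from it (the exact constants in the Gaussian envelope depend on normalisations and are not tested here). The parameter values and the helper `fourier` are assumptions made for illustration.

```python
import cmath
import math

HBAR = 0.05                 # small semiclassical parameter (arbitrary choice)
A, X0, XI0 = 1.0, 0.4, 1.5  # amplitude, position and frequency of the packet

def f(x):
    """Gaussian wave packet (1) in dimension n = 1."""
    return A * math.exp(-(x - X0) ** 2 / HBAR) * cmath.exp(1j * XI0 * x / HBAR)

def fourier(func, xi, half_width=3.0, steps=2000):
    """Normalised Fourier transform at xi, by the midpoint rule around X0."""
    dx = 2 * half_width / steps
    total = 0j
    for i in range(steps):
        x = X0 - half_width + (i + 0.5) * dx
        total += cmath.exp(-1j * xi * x / HBAR) * func(x)
    return total * dx / math.sqrt(2 * math.pi * HBAR)

grid = [XI0 + (k - 40) * 0.05 for k in range(81)]  # frequencies around XI0
magnitudes = [abs(fourier(f, xi)) for xi in grid]
peak_xi = grid[magnitudes.index(max(magnitudes))]
tail_ratio = abs(fourier(f, XI0 + 5 * math.sqrt(HBAR))) / max(magnitudes)
print(peak_xi, tail_ratio)
```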

However, there is a third (but less rigorous) way to view a function {f} in {L^2({\bf R}^n)}, which is the phase space perspective in which one tries to view {f} as distributed simultaneously in physical space and in frequency space, thus being something like a measure on the phase space {T^* {\bf R}^n := \{ (x,\xi): x, \xi \in {\bf R}^n\}}. Thus, for instance, the function (1) should heuristically be concentrated on the region {B(x_0,\sqrt{\hbar}) \times B(\xi_0,\sqrt{\hbar})} in phase space. Unfortunately, due to the uncertainty principle, there is no completely satisfactory way to canonically and rigorously define what the “phase space portrait” of a function {f} should be. (For instance, the Wigner transform of {f} can be viewed as an attempt to describe the distribution of the {L^2} energy of {f} in phase space, except that this transform can take negative or even complex values; see Folland’s book for further discussion.) Still, it is a very useful heuristic to think of functions as having a phase space portrait, which is something like a non-negative measure on phase space that captures the distribution of functions in both space and frequency, albeit with some “quantum fuzziness” that shows up whenever one tries to inspect this measure at scales of physical space and frequency space that together violate the uncertainty principle. (The score of a piece of music is a good everyday example of a phase space portrait of a function, in this case a sound wave; here, the physical space is the time axis (the horizontal dimension of the score) and the frequency space is the vertical dimension. Here, the time and frequency scales involved are well above the uncertainty principle limit (a typical note lasts many hundreds of cycles, whereas the uncertainty principle kicks in at {O(1)} cycles) and so there is no obstruction here to musical notation being unambiguous.)
Furthermore, if one takes certain asymptotic limits, one can recover a precise notion of a phase space portrait; for instance if one takes the semiclassical limit {\hbar \rightarrow 0} then, under certain circumstances, the phase space portrait converges to a well-defined classical probability measure on phase space; closely related to this is the high frequency limit of a fixed function, which among other things defines the wave front set of that function, which can be viewed as another asymptotic realisation of the phase space portrait concept.

If functions in {L^2({\bf R}^n)} can be viewed as a sort of distribution in phase space, then linear operators {T: L^2({\bf R}^n) \rightarrow L^2({\bf R}^n)} should be viewed as various transformations on such distributions on phase space. For instance, a pseudodifferential operator {a(X,D)} should correspond (as a zeroth approximation) to multiplying a phase space distribution by the symbol {a(x,\xi)} of that operator, as discussed in this previous blog post. Note that such operators only change the amplitude of the phase space distribution, but not the support of that distribution.

Now we turn to operators that alter the support of a phase space distribution, rather than the amplitude; we will focus on unitary operators to emphasise the amplitude preservation aspect. These will eventually be key examples of Fourier integral operators. A physical translation {Tf(x) := f(x-x_0)} should correspond to pushing forward the distribution by the transformation {(x,\xi) \mapsto (x+x_0,\xi)}, as can be seen by comparing the physical and frequency space supports of {Tf} with that of {f}. Similarly, a frequency modulation {Tf(x) := e^{i \xi_0 \cdot x/\hbar} f(x)} should correspond to the transformation {(x,\xi) \mapsto (x,\xi+\xi_0)}; a linear change of variables {Tf(x) := |\hbox{det} L|^{-1/2} f(L^{-1} x)}, where {L: {\bf R}^n \rightarrow {\bf R}^n} is an invertible linear transformation, should correspond to {(x,\xi) \mapsto (Lx, (L^*)^{-1} \xi)}; and finally, the Fourier transform {Tf(x) := \hat f(x)} should correspond to the transformation {(x,\xi) \mapsto (\xi,-x)}.

Based on these examples, one may hope that given any diffeomorphism {\Phi: T^* {\bf R}^n \rightarrow T^* {\bf R}^n} of phase space, one could associate some sort of unitary (or approximately unitary) operator {T_\Phi: L^2({\bf R}^n) \rightarrow L^2({\bf R}^n)}, which (heuristically, at least) pushes the phase space portrait of a function forward by {\Phi}. However, there is an obstruction to doing so, which can be explained as follows. If {T_\Phi} pushes phase space portraits by {\Phi}, and pseudodifferential operators {a(X,D)} multiply phase space portraits by {a}, then this suggests the intertwining relationship

\displaystyle  a(X,D) T_\Phi \approx T_\Phi (a \circ \Phi)(X,D),

and thus {(a \circ \Phi)(X,D)} is approximately conjugate to {a(X,D)}:

\displaystyle  (a \circ \Phi)(X,D) \approx T_\Phi^{-1} a(X,D) T_\Phi. \ \ \ \ \ (2)

The formalisation of this fact in the theory of Fourier integral operators is known as Egorov’s theorem, due to Yu Egorov (and not to be confused with the more widely known theorem of Dmitri Egorov in measure theory).

Applying commutators, we conclude the approximate conjugacy relationship

\displaystyle  \frac{1}{i\hbar} [(a \circ \Phi)(X,D), (b \circ \Phi)(X,D)] \approx T_\Phi^{-1} \frac{1}{i\hbar} [a(X,D), b(X,D)] T_\Phi.

Now, the pseudodifferential calculus (as discussed in this previous post) tells us (heuristically, at least) that

\displaystyle  \frac{1}{i\hbar} [a(X,D), b(X,D)] \approx \{ a, b \}(X,D)

and

\displaystyle  \frac{1}{i\hbar} [(a \circ \Phi)(X,D), (b \circ \Phi)(X,D)] \approx \{ a \circ \Phi, b \circ \Phi \}(X,D)

where {\{,\}} is the Poisson bracket. Comparing this with (2), we are then led to the compatibility condition

\displaystyle  \{ a \circ \Phi, b \circ \Phi \} \approx \{ a, b \} \circ \Phi,

thus {\Phi} needs to preserve (approximately, at least) the Poisson bracket, or equivalently {\Phi} needs to be a symplectomorphism (again, approximately at least).
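
The compatibility condition can be illustrated numerically on {T^* {\bf R}} with a finite-difference Poisson bracket. In this sketch the sign convention for the bracket, the test symbols `a` and `b`, and the three sample maps are all assumptions chosen for illustration: the map {(x,\xi) \mapsto (\xi,-x)} and the shear {(x,\xi) \mapsto (x + 0.7 \xi, \xi)} preserve the bracket, while the dilation {(x,\xi) \mapsto (2x,\xi)} of the position variable alone does not.

```python
import math

def poisson(a, b, x, xi, h=1e-4):
    """Finite-difference Poisson bracket a_xi * b_x - a_x * b_xi (sign convention assumed)."""
    a_x = (a(x + h, xi) - a(x - h, xi)) / (2 * h)
    a_xi = (a(x, xi + h) - a(x, xi - h)) / (2 * h)
    b_x = (b(x + h, xi) - b(x - h, xi)) / (2 * h)
    b_xi = (b(x, xi + h) - b(x, xi - h)) / (2 * h)
    return a_xi * b_x - a_x * b_xi

def compose(a, phi):
    """The symbol a composed with the phase space map phi."""
    return lambda x, xi: a(*phi(x, xi))

def bracket_defect(phi, a, b, x, xi):
    """|{a o phi, b o phi} - {a, b} o phi| at the point (x, xi)."""
    lhs = poisson(compose(a, phi), compose(b, phi), x, xi)
    rhs = poisson(a, b, *phi(x, xi))
    return abs(lhs - rhs)

a = lambda x, xi: math.sin(x) * xi + x ** 2
b = lambda x, xi: math.cos(xi) + x * xi ** 2

rotation = lambda x, xi: (xi, -x)         # map associated to the Fourier transform
shear = lambda x, xi: (x + 0.7 * xi, xi)  # a shear in the position variable
dilation = lambda x, xi: (2 * x, xi)      # doubles phase space area: not symplectic

defects = {name: bracket_defect(phi, a, b, 0.3, 0.8)
           for name, phi in (("rotation", rotation), ("shear", shear), ("dilation", dilation))}
print(defects)
```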

Now suppose that {\Phi: T^* {\bf R}^n \rightarrow T^* {\bf R}^n} is a symplectomorphism. This is morally equivalent to the graph {\Sigma := \{ (z, \Phi(z)): z \in T^* {\bf R}^n \}} being a Lagrangian submanifold of {T^* {\bf R}^n \times T^* {\bf R}^n} (where we give the second copy of phase space the negative {-\omega} of the usual symplectic form {\omega}, thus yielding {\omega \oplus -\omega} as the full symplectic form on {T^* {\bf R}^n \times T^* {\bf R}^n}); this is another instantiation of the closed graph theorem, as mentioned in this previous post. This graph is known as the canonical relation for the (putative) FIO that is associated to {\Phi}. To understand what it means for this graph to be Lagrangian, we coordinatise {T^* {\bf R}^n \times T^* {\bf R}^n} as {(x,\xi,y,\eta)} and suppose temporarily that this graph was (locally, at least) a smooth graph in the {x} and {y} variables, thus

\displaystyle  \Sigma = \{ (x, F(x,y), y, G(x,y)): x, y \in {\bf R}^n \}

for some smooth functions {F, G: {\bf R}^n \times {\bf R}^n \rightarrow {\bf R}^n}. A brief computation shows that the Lagrangian property of {\Sigma} is then equivalent to the compatibility conditions

\displaystyle  \frac{\partial F_i}{\partial x_j} = \frac{\partial F_j}{\partial x_i}

\displaystyle  \frac{\partial G_i}{\partial y_j} = \frac{\partial G_j}{\partial y_i}

\displaystyle  \frac{\partial F_i}{\partial y_j} = - \frac{\partial G_j}{\partial x_i}

for {i,j=1,\ldots,n}, where {F_1,\ldots,F_n, G_1,\ldots,G_n} denote the components of {F,G}. Some Fourier analysis (or Hodge theory) lets us solve these equations as

\displaystyle  F_i = -\frac{\partial \phi}{\partial x_i}; \quad G_j = \frac{\partial \phi}{\partial y_j}

for some smooth potential function {\phi: {\bf R}^n \times {\bf R}^n \rightarrow {\bf R}}. Thus, we have parameterised our graph {\Sigma} as

\displaystyle  \Sigma = \{ (x, -\nabla_x \phi(x,y), y, \nabla_y \phi(x,y)): x,y \in {\bf R}^n \} \ \ \ \ \ (3)

so that {\Phi} maps {(x, -\nabla_x \phi(x,y))} to {(y, \nabla_y \phi(x,y))}.
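
To make the parameterisation (3) concrete, the sketch below recovers the map {\Phi} from a phase {\phi} in dimension {n=1} for two illustrative phases (both are assumptions chosen for this example): {\phi(x,y) = -xy}, for which {\Phi(x,\xi) = (\xi,-x)}, and {\phi(x,y) = (x-y)^2/2t}, for which {\Phi(x,\xi) = (x+t\xi,\xi)}. Partial derivatives are taken by central differences; the helper names are hypothetical.

```python
def relation_point(phi, x, y, h=1e-5):
    """The point of (3) above (x, y): returns ((x, -phi_x), (y, phi_y))."""
    phi_x = (phi(x + h, y) - phi(x - h, y)) / (2 * h)
    phi_y = (phi(x, y + h) - phi(x, y - h)) / (2 * h)
    return (x, -phi_x), (y, phi_y)

T = 0.7  # time parameter of the second example (arbitrary choice)
phi_fourier = lambda x, y: -x * y
phi_flow = lambda x, y: (x - y) ** 2 / (2 * T)

def max_defect(phi, expected_map, samples):
    """Largest discrepancy between expected_map(source) and the target given by (3)."""
    worst = 0.0
    for x, y in samples:
        source, target = relation_point(phi, x, y)
        predicted = expected_map(*source)
        worst = max(worst, abs(predicted[0] - target[0]), abs(predicted[1] - target[1]))
    return worst

samples = [(0.3, -1.1), (-0.8, 0.4), (1.5, 2.0)]
defect_fourier = max_defect(phi_fourier, lambda x, xi: (xi, -x), samples)
defect_flow = max_defect(phi_flow, lambda x, xi: (x + T * xi, xi), samples)
print(defect_fourier, defect_flow)  # both should be at rounding-error level
```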

A reasonable candidate for an operator associated to {\Phi} and {\Sigma} in this fashion is the oscillatory integral operator

\displaystyle  Tf(y) := \frac{1}{(2\pi \hbar)^{n/2}} \int_{{\bf R}^n} e^{i \phi(x,y)/\hbar} a(x,y) f(x)\ dx \ \ \ \ \ (4)

for some smooth amplitude function {a} (note that, with the normalisation of the Fourier transform used above, the Fourier transform is the special case when {a=1} and {\phi(x,y)=-x \cdot y}, which helps explain the genesis of the term “Fourier integral operator”). Indeed, if one computes an inner product {\int_{{\bf R}^n} Tf(y) \overline{g(y)}\ dy} for gaussian wave packets {f, g} of the form (1) and localised in phase space near {(x_0,\xi_0), (y_0,\eta_0)} respectively, then a Taylor expansion of {\phi} around {(x_0,y_0)}, followed by a stationary phase computation, shows (again heuristically, and assuming {\phi} is suitably non-degenerate) that {T} has (3) as its canonical relation. (Furthermore, a refinement of this stationary phase calculation suggests that if {a} is normalised to be the half-density {|\det \nabla_x \nabla_y \phi|^{1/2}}, then {T} should be approximately unitary.) As such, we view (4) as an example of a Fourier integral operator (assuming various smoothness and non-degeneracy hypotheses on the phase {\phi} and amplitude {a} which we do not detail here).

Of course, it may be the case that {\Sigma} is not a graph in the {x,y} coordinates (for instance, the key examples of translation, modulation, and dilation are not of this form), but then it is often a graph in some other pair of coordinates, such as {\xi,y}. In that case one can compose the oscillatory integral construction given above with a Fourier transform, giving another class of FIOs of the form

\displaystyle  Tf(y) := \frac{1}{(2\pi \hbar)^{n/2}} \int_{{\bf R}^n} e^{i \phi(\xi,y)/\hbar} a(\xi,y) \hat f(\xi)\ d\xi. \ \ \ \ \ (5)

This class of FIOs covers many important cases; for instance, the translation, modulation, and dilation operators considered earlier can be written in this form after some Fourier analysis. Another typical example is the half-wave propagator {T := e^{it \sqrt{-\Delta}}} for some time {t \in {\bf R}}, which can be written in the form

\displaystyle  Tf(y) = \frac{1}{(2\pi \hbar)^{n/2}} \int_{{\bf R}^n} e^{i (\xi \cdot y + t |\xi|)/\hbar} a(\xi,y) \hat f(\xi)\ d\xi.

This corresponds to the phase space transformation {(x,\xi) \mapsto (x+t\xi/|\xi|, \xi)}, which can be viewed as the classical propagator associated to the “quantum” propagator {e^{it\sqrt{-\Delta}}}. More generally, propagators for linear Hamiltonian partial differential equations can often be expressed (at least approximately) by Fourier integral operators corresponding to the propagator of the classical Hamiltonian flow associated to the symbol of the Hamiltonian operator {H}; this leads to an important mathematical formalisation of the correspondence principle between quantum mechanics and classical mechanics, which is one of the foundations of microlocal analysis and which was extensively developed in Hörmander’s work. (More recently, numerically stable versions of this theory have been developed to allow for rapid and accurate numerical solutions to various linear PDE, for instance through Emmanuel Candès’ theory of curvelets, so the theory that Hörmander built now has some quite significant practical applications in areas such as geology.)

In some cases, the canonical relation {\Sigma} may have some singularities (such as fold singularities) which prevent it from being written as graphs in the previous senses, but the theory for defining FIOs even in these cases, and in developing their calculus, is now well established, in large part due to the foundational work of Hörmander.
