You are currently browsing the tag archive for the ‘conservation laws’ tag.

Throughout this post, we will work only at the formal level of analysis, ignoring issues of convergence of integrals, justifying differentiation under the integral sign, and so forth. (Rigorous justification of the conservation laws and other identities arising from the formal manipulations below can usually be established in an a posteriori fashion once the identities are in hand, without the need to rigorously justify the manipulations used to come up with these identities).

It is a remarkable fact in the theory of differential equations that many of the ordinary and partial differential equations that are of interest (particularly in geometric PDE, or PDE arising from mathematical physics) admit a variational formulation; thus, a collection {\Phi: \Omega \rightarrow M} of one or more fields on a domain {\Omega} taking values in a space {M} will solve the differential equation of interest if and only if {\Phi} is a critical point to the functional

\displaystyle  J[\Phi] := \int_\Omega L( x, \Phi(x), D\Phi(x) )\ dx \ \ \ \ \ (1)

involving the fields {\Phi} and their first derivatives {D\Phi}, where the Lagrangian {L: \Sigma \rightarrow {\bf R}} is a function on the vector bundle {\Sigma} over {\Omega \times M} consisting of triples {(x, q, \dot q)} with {x \in \Omega}, {q \in M}, and {\dot q: T_x \Omega \rightarrow T_q M} a linear transformation; we also usually keep the boundary data of {\Phi} fixed in case {\Omega} has a non-trivial boundary, although we will ignore these issues here. (We also ignore the possibility of having additional constraints imposed on {\Phi} and {D\Phi}, which require the machinery of Lagrange multipliers to deal with, but which will only serve as a distraction for the current discussion.) It is common to use local coordinates to parameterise {\Omega} as {{\bf R}^d} and {M} as {{\bf R}^n}, in which case {\Sigma} can be viewed locally as a function on {{\bf R}^d \times {\bf R}^n \times {\bf R}^{dn}}.

Example 1 (Geodesic flow) Take {\Omega = [0,1]} and {M = (M,g)} to be a Riemannian manifold, which we will write locally in coordinates as {{\bf R}^n} with metric {g_{ij}(q)} for {i,j=1,\dots,n}. A geodesic {\gamma: [0,1] \rightarrow M} is then a critical point (keeping {\gamma(0),\gamma(1)} fixed) of the energy functional

\displaystyle  J[\gamma] := \frac{1}{2} \int_0^1 g_{\gamma(t)}( D\gamma(t), D\gamma(t) )\ dt

or in coordinates (ignoring coordinate patch issues, and using the usual summation conventions)

\displaystyle  J[\gamma] = \frac{1}{2} \int_0^1 g_{ij}(\gamma(t)) \dot \gamma^i(t) \dot \gamma^j(t)\ dt.

As discussed in this previous post, both the Euler equations for rigid body motion, and the Euler equations for incompressible inviscid flow, can be interpreted as geodesic flow (though in the latter case, one has to work really formally, as the manifold {M} is now infinite dimensional).

More generally, if {\Omega = (\Omega,h)} is itself a Riemannian manifold, which we write locally in coordinates as {{\bf R}^d} with metric {h_{ab}(x)} for {a,b=1,\dots,d}, then a harmonic map {\Phi: \Omega \rightarrow M} is a critical point of the energy functional

\displaystyle  J[\Phi] := \frac{1}{2} \int_\Omega h(x) \otimes g_{\gamma(x)}( D\gamma(x), D\gamma(x) )\ dh(x)

or in coordinates (again ignoring coordinate patch issues)

\displaystyle  J[\Phi] = \frac{1}{2} \int_{{\bf R}^d} h_{ab}(x) g_{ij}(\Phi(x)) (\partial_a \Phi^i(x)) (\partial_b \Phi^j(x))\ \sqrt{\det(h(x))}\ dx.

If we replace the Riemannian manifold {\Omega} by a Lorentzian manifold, such as Minkowski space {{\bf R}^{1+3}}, then the notion of a harmonic map is replaced by that of a wave map, which generalises the scalar wave equation (which corresponds to the case {M={\bf R}}).

Example 2 ({N}-particle interactions) Take {\Omega = {\bf R}} and {M = {\bf R}^3 \otimes {\bf R}^N}; then a function {\Phi: \Omega \rightarrow M} can be interpreted as a collection of {N} trajectories {q_1,\dots,q_N: {\bf R} \rightarrow {\bf R}^3} in space, which we give a physical interpretation as the trajectories of {N} particles. If we assign each particle a positive mass {m_1,\dots,m_N > 0}, and also introduce a potential energy function {V: M \rightarrow {\bf R}}, then it turns out that Newton’s laws of motion {F=ma} in this context (with the force {F_i} on the {i^{th}} particle being given by the conservative force {-\nabla_{q_i} V}) are equivalent to the trajectories {q_1,\dots,q_N} being a critical point of the action functional

\displaystyle  J[\Phi] := \int_{\bf R} \sum_{i=1}^N \frac{1}{2} m_i |\dot q_i(t)|^2 - V( q_1(t),\dots,q_N(t) )\ dt.

Formally, if {\Phi = \Phi_0} is a critical point of a functional {J[\Phi]}, this means that

\displaystyle  \frac{d}{ds} J[ \Phi[s] ]|_{s=0} = 0

whenever {s \mapsto \Phi[s]} is a (smooth) deformation with {\Phi[0]=\Phi_0} (and with {\Phi[s]} respecting whatever boundary conditions are appropriate). Interchanging the derivative and integral, we (formally, at least) arrive at

\displaystyle  \int_\Omega \frac{d}{ds} L( x, \Phi[s](x), D\Phi[s](x) )|_{s=0}\ dx = 0. \ \ \ \ \ (2)

Write {\delta \Phi := \frac{d}{ds} \Phi[s]|_{s=0}} for the infinitesimal deformation of {\Phi_0}. By the chain rule, {\frac{d}{ds} L( x, \Phi[s](x), D\Phi[s](x) )|_{s=0}} can be expressed in terms of {x, \Phi_0(x), \delta \Phi(x), D\Phi_0(x), D \delta \Phi(x)}. In coordinates, we have

\displaystyle  \frac{d}{ds} L( x, \Phi[s](x), D\Phi[s](x) )|_{s=0} = \delta \Phi^i(x) L_{q^i}(x,\Phi_0(x), D\Phi_0(x)) \ \ \ \ \ (3)

\displaystyle  + \partial_{x^a} \delta \Phi^i(x) L_{\partial_{x^a} q^i} (x,\Phi_0(x), D\Phi_0(x)),

where we parameterise {\Sigma} by {x, (q^i)_{i=1,\dots,n}, (\partial_{x^a} q^i)_{a=1,\dots,d; i=1,\dots,n}}, and we use subscripts on {L} to denote partial derivatives in the various coefficients. (One can of course work in a coordinate-free manner here if one really wants to, but the notation becomes a little cumbersome due to the need to carefully split up the tangent space of {\Sigma}, and we will not do so here.) Thus we can view (2) as an integral identity that asserts the vanishing of a certain integral, whose integrand involves {x, \Phi_0(x), \delta \Phi(x), D\Phi_0(x), D \delta \Phi(x)}, where {\delta \Phi} vanishes at the boundary but is otherwise unconstrained.

A general rule of thumb in PDE and calculus of variations is that whenever one has an integral identity of the form {\int_\Omega F(x)\ dx = 0} for some class of functions {F} that vanishes on the boundary, then there must be an associated differential identity {F = \hbox{div} X} that justifies this integral identity through Stokes’ theorem. This rule of thumb helps explain why integration by parts is used so frequently in PDE to justify integral identities. The rule of thumb can fail when one is dealing with “global” or “cohomologically non-trivial” integral identities of a topological nature, such as the Gauss-Bonnet or Kazhdan-Warner identities, but is quite reliable for “local” or “cohomologically trivial” identities, such as those arising from calculus of variations.

In any case, if we apply this rule to (2), we expect that the integrand {\frac{d}{ds} L( x, \Phi[s](x), D\Phi[s](x) )|_{s=0}} should be expressible as a spatial divergence. This is indeed the case:

Proposition 1 (Formal) Let {\Phi = \Phi_0} be a critical point of the functional {J[\Phi]} defined in (1). Then for any deformation {s \mapsto \Phi[s]} with {\Phi[0] = \Phi_0}, we have

\displaystyle  \frac{d}{ds} L( x, \Phi[s](x), D\Phi[s](x) )|_{s=0} = \hbox{div} X \ \ \ \ \ (4)

where {X} is the vector field that is expressible in coordinates as

\displaystyle  X^a := \delta \Phi^i(x) L_{\partial_{x^a} q^i}(x,\Phi_0(x), D\Phi_0(x)). \ \ \ \ \ (5)

Proof: Comparing (4) with (3), we see that the claim is equivalent to the Euler-Lagrange equation

\displaystyle  L_{q^i}(x,\Phi_0(x), D\Phi_0(x)) - \partial_{x^a} L_{\partial_{x^a} q^i}(x,\Phi_0(x), D\Phi_0(x)) = 0. \ \ \ \ \ (6)

The same computation, together with an integration by parts, shows that (2) may be rewritten as

\displaystyle  \int_\Omega ( L_{q^i}(x,\Phi_0(x), D\Phi_0(x)) - \partial_{x^a} L_{\partial_{x^a} q^i}(x,\Phi_0(x), D\Phi_0(x)) ) \delta \Phi^i(x)\ dx = 0.

Since {\delta \Phi^i(x)} is unconstrained on the interior of {\Omega}, the claim (6) follows (at a formal level, at least). \Box

Many variational problems also enjoy one-parameter continuous symmetries: given any field {\Phi_0} (not necessarily a critical point), one can place that field in a one-parameter family {s \mapsto \Phi[s]} with {\Phi[0] = \Phi_0}, such that

\displaystyle  J[ \Phi[s] ] = J[ \Phi[0] ]

for all {s}; in particular,

\displaystyle  \frac{d}{ds} J[ \Phi[s] ]|_{s=0} = 0,

which can be written as (2) as before. Applying the previous rule of thumb, we thus expect another divergence identity

\displaystyle  \frac{d}{ds} L( x, \Phi[s](x), D\Phi[s](x) )|_{s=0} = \hbox{div} Y \ \ \ \ \ (7)

whenever {s \mapsto \Phi[s]} arises from a continuous one-parameter symmetry. This expectation is indeed the case in many examples. For instance, if the spatial domain {\Omega} is the Euclidean space {{\bf R}^d}, and the Lagrangian (when expressed in coordinates) has no direct dependence on the spatial variable {x}, thus

\displaystyle  L( x, \Phi(x), D\Phi(x) ) = L( \Phi(x), D\Phi(x) ), \ \ \ \ \ (8)

then we obtain {d} translation symmetries

\displaystyle  \Phi[s](x) := \Phi(x - s e^a )

for {a=1,\dots,d}, where {e^1,\dots,e^d} is the standard basis for {{\bf R}^d}. For a fixed {a}, the left-hand side of (7) then becomes

\displaystyle  \frac{d}{ds} L( \Phi(x-se^a), D\Phi(x-se^a) )|_{s=0} = -\partial_{x^a} [ L( \Phi(x), D\Phi(x) ) ]

\displaystyle  = \hbox{div} Y

where {Y(x) = - L(\Phi(x), D\Phi(x)) e^a}. Another common type of symmetry is a pointwise symmetry, in which

\displaystyle  L( x, \Phi[s](x), D\Phi[s](x) ) = L( x, \Phi[0](x), D\Phi[0](x) ) \ \ \ \ \ (9)

for all {x}, in which case (7) clearly holds with {Y=0}.

If we subtract (4) from (7), we obtain the celebrated theorem of Noether linking symmetries with conservation laws:

Theorem 2 (Noether’s theorem) Suppose that {\Phi_0} is a critical point of the functional (1), and let {\Phi[s]} be a one-parameter continuous symmetry with {\Phi[0] = \Phi_0}. Let {X} be the vector field in (5), and let {Y} be the vector field in (7). Then we have the pointwise conservation law

\displaystyle  \hbox{div}(X-Y) = 0.

In particular, for one-dimensional variational problems, in which {\Omega \subset {\bf R}}, we have the conservation law {(X-Y)(t) = (X-Y)(0)} for all {t \in \Omega} (assuming of course that {\Omega} is connected and contains {0}).

Noether’s theorem gives a systematic way to locate conservation laws for solutions to variational problems. For instance, if {\Omega \subset {\bf R}} and the Lagrangian has no explicit time dependence, thus

\displaystyle  L(t, \Phi(t), \dot \Phi(t)) = L(\Phi(t), \dot \Phi(t)),

then by using the time translation symmetry {\Phi[s](t) := \Phi(t-s)}, we have

\displaystyle  Y(t) = - L( \Phi(t), \dot\Phi(t) )

as discussed previously, whereas we have {\delta \Phi(t) = - \dot \Phi(t)}, and hence by (5)

\displaystyle  X(t) := - \dot \Phi^i(x) L_{\dot q^i}(\Phi(t), \dot \Phi(t)),

and so Noether’s theorem gives conservation of the Hamiltonian

\displaystyle  H(t) := \dot \Phi^i(x) L_{\dot q^i}(\Phi(t), \dot \Phi(t))- L(\Phi(t), \dot \Phi(t)). \ \ \ \ \ (10)

For instance, for geodesic flow, the Hamiltonian works out to be

\displaystyle  H(t) = \frac{1}{2} g_{ij}(\gamma(t)) \dot \gamma^i(t) \dot \gamma^j(t),

so we see that the speed of the geodesic is conserved over time.

For pointwise symmetries (9), {Y} vanishes, and so Noether’s theorem simplifies to {\hbox{div} X = 0}; in the one-dimensional case {\Omega \subset {\bf R}}, we thus see from (5) that the quantity

\displaystyle  \delta \Phi^i(t) L_{\dot q^i}(t,\Phi_0(t), \dot \Phi_0(t)) \ \ \ \ \ (11)

is conserved in time. For instance, for the {N}-particle system in Example 2, if we have the translation invariance

\displaystyle  V( q_1 + h, \dots, q_N + h ) = V( q_1, \dots, q_N )

for all {q_1,\dots,q_N,h \in {\bf R}^3}, then we have the pointwise translation symmetry

\displaystyle  q_i[s](t) := q_i(t) + s e^j

for all {i=1,\dots,N}, {s \in{\bf R}} and some {j=1,\dots,3}, in which case {\dot q_i(t) = e^j}, and the conserved quantity (11) becomes

\displaystyle  \sum_{i=1}^n m_i \dot q_i^j(t);

as {j=1,\dots,3} was arbitrary, this establishes conservation of the total momentum

\displaystyle  \sum_{i=1}^n m_i \dot q_i(t).

Similarly, if we have the rotation invariance

\displaystyle  V( R q_1, \dots, Rq_N ) = V( q_1, \dots, q_N )

for any {q_1,\dots,q_N \in {\bf R}^3} and {R \in SO(3)}, then we have the pointwise rotation symmetry

\displaystyle  q_i[s](t) := \exp( s A ) q_i(t)

for any skew-symmetric real {3 \times 3} matrix {A}, in which case {\dot q_i(t) = A q_i(t)}, and the conserved quantity (11) becomes

\displaystyle  \sum_{i=1}^n m_i \langle A q_i(t), \dot q_i(t) \rangle;

since {A} is an arbitrary skew-symmetric matrix, this establishes conservation of the total angular momentum

\displaystyle  \sum_{i=1}^n m_i q_i(t) \wedge \dot q_i(t).

Below the fold, I will describe how Noether’s theorem can be used to locate all of the conserved quantities for the Euler equations of inviscid fluid flow, discussed in this previous post, by interpreting that flow as geodesic flow in an infinite dimensional manifold.

Read the rest of this entry »

The Euler equations for incompressible inviscid fluids may be written as

\displaystyle \partial_t u + (u \cdot \nabla) u = -\nabla p

\displaystyle \nabla \cdot u = 0

where {u: [0,T] \times {\bf R}^n \rightarrow {\bf R}^n} is the velocity field, and {p: [0,T] \times {\bf R}^n \rightarrow {\bf R}} is the pressure field. To avoid technicalities we will assume that both fields are smooth, and that {u} is bounded. We will take the dimension {n} to be at least two, with the three-dimensional case {n=3} being of course especially interesting.

The Euler equations are the inviscid limit of the Navier-Stokes equations; as discussed in my previous post, one potential route to establishing finite time blowup for the latter equations when {n=3} is to be able to construct “computers” solving the Euler equations, which generate smaller replicas of themselves in a noise-tolerant manner (as the viscosity term in the Navier-Stokes equation is to be viewed as perturbative noise).

Perhaps the most prominent obstacles to this route are the conservation laws for the Euler equations, which limit the types of final states that a putative computer could reach from a given initial state. Most famously, we have the conservation of energy

\displaystyle \int_{{\bf R}^n} |u|^2\ dx \ \ \ \ \ (1)

 

(assuming sufficient decay of the velocity field at infinity); thus for instance it would not be possible for a computer to generate a replica of itself which had greater total energy than the initial computer. This by itself is not a fatal obstruction (in this paper of mine, I constructed such a “computer” for an averaged Euler equation that still obeyed energy conservation). However, there are other conservation laws also, for instance in three dimensions one also has conservation of helicity

\displaystyle \int_{{\bf R}^3} u \cdot (\nabla \times u)\ dx \ \ \ \ \ (2)

 

and (formally, at least) one has conservation of momentum

\displaystyle \int_{{\bf R}^3} u\ dx

and angular momentum

\displaystyle \int_{{\bf R}^3} x \times u\ dx

(although, as we shall discuss below, due to the slow decay of {u} at infinity, these integrals have to either be interpreted in a principal value sense, or else replaced with their vorticity-based formulations, namely impulse and moment of impulse). Total vorticity

\displaystyle \int_{{\bf R}^3} \nabla \times u\ dx

is also conserved, although it turns out in three dimensions that this quantity vanishes when one assumes sufficient decay at infinity. Then there are the pointwise conservation laws: the vorticity and the volume form are both transported by the fluid flow, while the velocity field (when viewed as a covector) is transported up to a gradient; among other things, this gives the transport of vortex lines as well as Kelvin’s circulation theorem, and can also be used to deduce the helicity conservation law mentioned above. In my opinion, none of these laws actually prohibits a self-replicating computer from existing within the laws of ideal fluid flow, but they do significantly complicate the task of actually designing such a computer, or of the basic “gates” that such a computer would consist of.

Below the fold I would like to record and derive all the conservation laws mentioned above, which to my knowledge essentially form the complete set of known conserved quantities for the Euler equations. The material here (although not the notation) is drawn from this text of Majda and Bertozzi.

Read the rest of this entry »

Today I’d like to discuss (in the Tricks Wiki format) a fundamental trick in “soft” analysis, sometimes known as the “limiting argument” or “epsilon regularisation argument”.

Title: Give yourself an epsilon of room.

Quick description: You want to prove some statement S_0 about some object x_0 (which could be a number, a point, a function, a set, etc.).  To do so, pick a small \varepsilon > 0, and first prove a weaker statement S_\varepsilon (which allows for “losses” which go to zero as \varepsilon \to 0) about some perturbed object x_\varepsilon.  Then, take limits \varepsilon \to 0.  Provided that the dependency and continuity of the weaker conclusion S_\varepsilon on \varepsilon are sufficiently controlled, and x_\varepsilon is converging to x_0 in an appropriately strong sense, you will recover the original statement.

One can of course play a similar game when proving a statement S_\infty about some object X_\infty, by first proving a weaker statement S_N on some approximation X_N to X_\infty for some large parameter N, and then send N \to \infty at the end.

General discussion: Here are some typical examples of a target statement S_0, and the approximating statements S_\varepsilon that would converge to S:

S_0 S_\varepsilon
f(x_0) = g(x_0) f(x_\varepsilon) = g(x_\varepsilon) + o(1)
f(x_0) \leq g(x_0) f(x_\varepsilon) \leq g(x_\varepsilon) + o(1)
f(x_0) > 0 f(x_\varepsilon) \geq c - o(1) for some c>0 independent of \varepsilon
f(x_0) is finite f(x_\varepsilon) is bounded uniformly in \varepsilon
f(x_0) \geq f(x) for all x \in X (i.e. x_0 maximises f) f(x_\varepsilon) \geq f(x)-o(1) for all x \in X (i.e. x_\varepsilon nearly maximises f)
f_n(x_0) converges as n \to \infty f_n(x_\varepsilon) fluctuates by at most o(1) for sufficiently large n
f_0 is a measurable function f_\varepsilon is a measurable function converging pointwise to f_0
f_0 is a continuous function f_\varepsilon is an equicontinuous family of functions converging pointwise to f_0 OR f_\varepsilon is continuous and converges (locally) uniformly to f_0
The event E_0 holds almost surely The event E_\varepsilon holds with probability 1-o(1)
The statement P_0(x) holds for almost every x The statement P_\varepsilon(x) holds for x outside of a set of measure o(1)

Of course, to justify the convergence of S_\varepsilon to S_0, it is necessary that x_\varepsilon converge to x_0 (or f_\varepsilon converge to f_0, etc.) in a suitably strong sense. (But for the purposes of proving just upper bounds, such as f(x_0) \leq M, one can often get by with quite weak forms of convergence, thanks to tools such as Fatou’s lemma or the weak closure of the unit ball.)  Similarly, we need some continuity (or at least semi-continuity) hypotheses on the functions f, g appearing above.

It is also necessary in many cases that the control S_\varepsilon on the approximating object x_\varepsilon is somehow “uniform in \varepsilon“, although for “\sigma-closed” conclusions, such as measurability, this is not required. [It is important to note that it is only the final conclusion S_\varepsilon on x_\varepsilon that needs to have this uniformity in \varepsilon; one is permitted to have some intermediate stages in the derivation of S_\varepsilon that depend on \varepsilon in a non-uniform manner, so long as these non-uniformities cancel out or otherwise disappear at the end of the argument.]

By giving oneself an epsilon of room, one can evade a lot of familiar issues in soft analysis.  For instance, by replacing “rough”, “infinite-complexity”, “continuous”,  “global”, or otherwise “infinitary” objects x_0 with “smooth”, “finite-complexity”, “discrete”, “local”, or otherwise “finitary” approximants x_\varepsilon, one can finesse most issues regarding the justification of various formal operations (e.g. exchanging limits, sums, derivatives, and integrals).  [It is important to be aware, though, that any quantitative measure on how smooth, discrete, finite, etc. x_\varepsilon should be expected to degrade in the limit \varepsilon \to 0, and so one should take extreme caution in using such quantitative measures to derive estimates that are uniform in \varepsilon.]  Similarly, issues such as whether the supremum M := \sup \{ f(x): x \in X \} of a function on a set is actually attained by some maximiser x_0 become moot if one is willing to settle instead for an almost-maximiser x_\varepsilon, e.g. one which comes within an epsilon of that supremum M (or which is larger than 1/\varepsilon, if M turns out to be infinite).  Last, but not least, one can use the epsilon room to avoid degenerate solutions, for instance by perturbing a non-negative function to be strictly positive, perturbing a non-strictly monotone function to be strictly monotone, and so forth.

To summarise: one can view the epsilon regularisation argument as a “loan” in which one borrows an epsilon here and there in order to be able to ignore soft analysis difficulties, and can temporarily be able to utilise estimates which are non-uniform in epsilon, but at the end of the day one needs to “pay back” the loan by establishing a final “hard analysis” estimate which is uniform in epsilon (or whose error terms decay to zero as epsilon goes to zero).

A variant: It may seem that the epsilon regularisation trick is useless if one is already in “hard analysis” situations when all objects are already “finitary”, and all formal computations easily justified.  However, there is an important variant of this trick which applies in this case: namely, instead of sending the epsilon parameter to zero, choose epsilon to be a sufficiently small (but not infinitesimally small) quantity, depending on other parameters in the problem, so that one can eventually neglect various error terms and to obtain a useful bound at the end of the day.  (For instance, any result proven using the Szemerédi regularity lemma is likely to be of this type.)  Since one is not sending epsilon to zero, not every term in the final bound needs to be uniform in epsilon, though for quantitative applications one still would like the dependencies on such parameters to be as favourable as possible.

Prerequisites: Graduate real analysis.  (Actually, this isn’t so much a prerequisite as it is a corequisite: the limiting argument plays a central role in many fundamental results in real analysis.)  Some examples also require some exposure to PDE.

Read the rest of this entry »

Archives

RSS Google+ feed

  • An error has occurred; the feed is probably down. Try again later.
Follow

Get every new post delivered to your Inbox.

Join 3,862 other followers