This coming fall quarter, I am teaching a class on topics in the mathematical theory of incompressible fluid equations, focusing particularly on the incompressible Euler and Navier-Stokes equations. These two equations are by no means the only equations used to model fluids, but I will restrict attention to them in this course in order to keep the scope manageable. I have not fully decided on the choice of topics to cover in this course, but I would probably begin with some core topics such as local well-posedness theory and blowup criteria, conservation laws, and construction of weak solutions, then move on to some topics such as boundary layers and the Prandtl equations, the Euler-Poincare-Arnold interpretation of the Euler equations as an infinite dimensional geodesic flow, and some discussion of the Onsager conjecture. I will probably also continue to more advanced and recent topics in the winter quarter.

In this initial set of notes, we begin by reviewing the physical derivation of the Euler and Navier-Stokes equations from the first principles of Newtonian mechanics, and specifically from Newton’s famous three laws of motion. Strictly speaking, this derivation is not needed for the mathematical analysis of these equations, which can be viewed if one wishes as an arbitrarily chosen system of partial differential equations without any physical motivation; however, I feel that the derivation provides some insight and intuition into these equations, and is also worth knowing on purely intellectual grounds regardless of its mathematical consequences. I also find it instructive to actually see the journey from Newton’s law

\displaystyle F = ma

to the seemingly rather different-looking law

\displaystyle \partial_t u + (u \cdot \nabla) u = -\nabla p + \nu \Delta u

\displaystyle \nabla \cdot u = 0

for incompressible Navier-Stokes (or, if one drops the viscosity term {\nu \Delta u}, the Euler equations).

Our discussion in this set of notes is physical rather than mathematical, and so we will not be working at mathematical levels of rigour and precision. In particular we will be fairly casual about interchanging summations, limits, and integrals, we will manipulate approximate identities {X \approx Y} as if they were exact identities (e.g., by differentiating both sides of the approximate identity), and we will not attempt to verify any regularity or convergence hypotheses in the expressions being manipulated. (The same holds for the exercises in this text, which also do not need to be justified at mathematical levels of rigour.) Of course, once we resume the mathematical portion of this course in subsequent notes, such issues will be an important focus of careful attention. This is a basic division of labour in mathematical modeling: non-rigorous heuristic reasoning is used to derive a mathematical model from physical (or other “real-life”) principles, but once a precise model is obtained, the analysis of that model should be completely rigorous if at all possible (even if this requires applying the model to regimes which do not correspond to the original physical motivation of that model). See the discussion by John Ball quoted at the end of these slides of Gero Friesecke for an expansion of these points.

Note: our treatment here will differ slightly from that presented in many fluid mechanics texts, in that it will emphasise first-principles derivations from many-particle systems, rather than relying on bulk laws of physics, such as the laws of thermodynamics, which we will not cover here. (However, the derivations from bulk laws tend to be more robust, in that they are not as reliant on assumptions about the particular interactions between particles. In particular, the physical hypotheses we assume in this post are probably quite a bit stronger than the minimal assumptions needed to justify the Euler or Navier-Stokes equations, which can hold even in situations in which one or more of the hypotheses assumed here break down.)

— 1. From Newton’s laws to the Euler and Navier-Stokes equations —

For obvious reasons, the derivation of the equations of fluid mechanics is customarily presented in the three dimensional setting {d=3} (and sometimes also in the two-dimensional setting {d=2}), but actually the general dimensional case is not that much more difficult (and in some ways clearer, as it reveals that the derivation does not depend on any structures specific to three dimensions, such as the cross product), so for this derivation we will work in the spatial domain {{\bf R}^d} for arbitrary {d}. One could also work with bounded domains {\Omega \subset {\bf R}^d}, or periodic domains such as {{\bf R}^d/{\bf Z}^d}; the derivation is basically the same, thanks to the local nature of the forces of fluid mechanics, except at the boundary {\partial \Omega} where the situation is more subtle (and may be discussed in more detail in later posts). For sake of notational simplicity, we will assume that the time variable {t} ranges over the entire real line {{\bf R}}; again, since the laws of classical mechanics are local in time, one could just as well restrict {t} to some sub-interval of this line, such as {[0,T)} for some time {T>0}.

Our starting point is Newton’s second law {F=ma}, which (partially) describes the motion of a particle {P} of some fixed mass {m > 0} moving in the spatial domain {{\bf R}^d}. (Here we assume that the mass {m} of a particle does not vary with time; in particular, our discussion will be purely non-relativistic in nature, though it is possible to derive a relativistic version of the Euler equations by variants of the arguments given here.) We write Newton’s second law as the ordinary differential equation

\displaystyle m \frac{d^2}{dt^2} x(t) = F(t)

where {x: {\bf R} \rightarrow {\bf R}^d} is the trajectory of the particle (thus {x(t) \in {\bf R}^d} denotes the position of the particle at time {t}), and {F: {\bf R} \rightarrow {\bf R}^d} is the force applied to that particle. If we write {x_1,\dots,x_d: {\bf R} \rightarrow {\bf R}} for the {d} coordinates of the vector-valued function {x: {\bf R} \rightarrow {\bf R}^d}, and similarly write {F_1,\dots,F_d: {\bf R} \rightarrow {\bf R}} for the components of {F}, we therefore have

\displaystyle m \frac{d^2}{dt^2} x_i(t) = F_i(t)

where we adopt in this section the convention that the indices {i,j,k,l} are always understood to range from {1} to {d}.
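Although the rest of these notes work in the continuum limit, the ODE form of Newton’s second law is easy to explore numerically. The following sketch (my own illustration, not part of the derivation; the harmonic force law is an arbitrary choice) integrates {m \frac{d^2}{dt^2} x(t) = F(x(t))} with the standard velocity Verlet scheme:

```python
import numpy as np

def verlet(x0, v0, m, force, dt, steps):
    """Integrate m x'' = F(x) with the velocity Verlet scheme."""
    x, v = np.array(x0, float), np.array(v0, float)
    a = force(x) / m
    traj = [x.copy()]
    for _ in range(steps):
        x = x + v * dt + 0.5 * a * dt**2   # position update
        a_new = force(x) / m               # acceleration from Newton's second law
        v = v + 0.5 * (a + a_new) * dt     # velocity update
        a = a_new
        traj.append(x.copy())
    return np.array(traj)

# Harmonic force F(x) = -x with m = 1: the exact orbit for the initial
# data below is uniform motion on the unit circle in d = 2.
traj = verlet([1.0, 0.0], [0.0, 1.0], 1.0, lambda x: -x, 1e-3, 10_000)
print(np.linalg.norm(traj[-1]))   # stays close to 1
```

For this force and initial data the exact trajectory is {(\cos t, \sin t)}, so the computed positions should remain at distance roughly {1} from the origin.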

If one has some collection {(P^{(a)})_{a \in A}} of particles instead of a single particle, indexed by some set of labels {A} (e.g. the numbers from {1} to {N}, if there are a finite number {N} of particles; for unbounded domains such as {{\bf R}^d} one can also imagine situations in which {A} is infinite), then for each {a \in A}, the {a^{th}} particle {P^{(a)}} has some mass {m^{(a)} > 0}, some trajectory {x^{(a)}: {\bf R} \rightarrow {\bf R}^d} (with components {x^{(a)}_i: {\bf R} \rightarrow {\bf R}}), and some applied force {F^{(a)}: {\bf R} \rightarrow {\bf R}^d} (with components {F^{(a)}_i: {\bf R} \rightarrow {\bf R}}); we thus have the equation of motion

\displaystyle m^{(a)} \frac{d^2}{dt^2} x^{(a)}(t) = F^{(a)}(t)

or in components

\displaystyle m^{(a)} \frac{d^2}{dt^2} x^{(a)}_i(t) = F^{(a)}_i(t).

In this section we adopt the convention that the indices {a,b,c} are always understood to range over the set of labels {A} for the particles; in particular, their role should not be confused with those of the coordinate indices {i,j,k}.

Newton’s second law does not, by itself, completely specify the evolution of a system of {N} particles, because it does not specify exactly how the forces {F^{(a)}} depend on the current state of the system. For {N} particles, the current state at a given time {t} is given by the positions {x^{(b)}(t), b \in A} of all the particles {P^{(b)}}, as well as their velocities {\frac{d}{dt} x^{(b)}(t), b \in A}; we assume for simplicity that the particles have no further physical characteristics or internal structure of relevance that would require more state variables than these. (No higher derivatives need to be specified beyond the first, thanks to Newton’s second law. On the other hand, specifying position alone is insufficient to describe the state of the system; this was noticed as far back as Zeno in his paradox of the arrow, which in retrospect can be viewed as a precursor to Newton’s second law insofar as it demonstrated that the laws of motion needed to be second-order in time (in contrast, for instance, to Aristotelian physics, which was morally first-order in nature).) At a fundamental level, the dependency of forces on the current state is governed by the laws of physics for such forces; for instance, if the particles interact primarily through electrostatic forces, then one needs the laws of electrostatics to describe these forces. (In some cases, such as electromagnetic interactions, one cannot accurately model the situation purely in terms of interacting particles, and the equations of motion will then involve some additional mediating fields such as the electromagnetic field; but we will ignore this possibility in the current discussion for sake of simplicity.)

Fortunately, thanks to other laws of physics, and in particular Newton’s other two laws of motion, one can still obtain partial information about the forces {F^{(a)}} without having to analyse the fundamental laws producing these forces. For instance, Newton’s first law of motion (when combined with the second) tells us that a single particle {P^{(a)}} does not exert any force on itself; the net force on {P^{(a)}} only arises from interaction with other particles {P^{(b)}}, {b \neq a} (for this discussion we neglect external forces, such as gravity, although one could easily incorporate such forces into this discussion; see Exercise 3 below). We will assume that the only forces present are pair interactions coming from individual pairs of particles {(P^{(a)},P^{(b)})}; it is theoretically possible that one could have more complicated interactions between, say, a triplet {(P^{(a)},P^{(b)}, P^{(c)})} of particles that do not simply arise from the interactions between the three pairs {(P^{(a)},P^{(b)})}, {(P^{(a)},P^{(c)})}, {(P^{(b)},P^{(c)})}, but we will not consider this possibility here. We also assume that the net force on a particle is just the sum of all the interacting forces (i.e., the force addition law contains no nonlinear terms). This gives us a decomposition

\displaystyle F^{(a)} = \sum_{b: b \neq a} F^{(ab)}

of the net force {F^{(a)}: {\bf R} \rightarrow {\bf R}^d} on a particle {P^{(a)}} into the interaction force {F^{(ab)}: {\bf R} \rightarrow {\bf R}^d} exerted on {P^{(a)}} by another particle {P^{(b)}}. Thus the equation of motion is now

\displaystyle m^{(a)} \frac{d^2}{dt^2} x^{(a)}(t) = \sum_{b: b \neq a} F^{(ab)}(t) \ \ \ \ \ (1)


Of course, this description is still incomplete, because we have not specified exactly what the interaction forces {F^{(ab)}} are. But one important constraint on these forces is provided by Newton’s third law

\displaystyle F^{(ba)} = - F^{(ab)}. \ \ \ \ \ (2)


This already gives some restrictions on the possible dynamics. For instance, it implies (formally, at least) that the total momentum

\displaystyle \sum_a m^{(a)} \frac{d}{dt} x^{(a)} \ \ \ \ \ (3)


(which takes values in {{\bf R}^d}) is conserved in time:

\displaystyle \frac{d}{dt} \sum_a m^{(a)} \frac{d}{dt} x^{(a)} = \sum_a m^{(a)} \frac{d^2}{dt^2} x^{(a)}

\displaystyle = \sum_a \sum_{b \neq a} F^{(ab)}

\displaystyle = \frac{1}{2} \sum_{a,b: a \neq b} (F^{(ab)} + F^{(ba)})

\displaystyle = 0.
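As a quick numerical sanity check of this cancellation (my own illustration; the particular central force law below is an arbitrary choice), one can verify that pairwise forces obeying Newton’s third law sum to zero, so that the time derivative of the total momentum (3) vanishes:

```python
import numpy as np

rng = np.random.default_rng(0)
N, d = 6, 3
x = rng.standard_normal((N, d))            # particle positions at a fixed time

def pair_force(a, b):
    """F^{(ab)}: an arbitrary central interaction. Since it is odd under
    swapping a and b, Newton's third law F^{(ba)} = -F^{(ab)} holds."""
    r = x[a] - x[b]
    return np.exp(-np.dot(r, r)) * r

# Net force on each particle, as in (1), and the resulting time derivative of (3)
F_net = np.array([sum(pair_force(a, b) for b in range(N) if b != a)
                  for a in range(N)])
total = F_net.sum(axis=0)
print(np.abs(total).max())                 # ~ 0 up to rounding
```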

We will also assume that the interaction force {F^{(ab)}} between a pair {P^{(a)}, P^{(b)}} of particles is parallel to the displacement {x^{(a)}-x^{(b)}} between the pair; in other words, we assume the torque {(x^{(a)} - x^{(b)}) \wedge F^{(ab)}} created by this force vanishes, thus

\displaystyle (x^{(a)} - x^{(b)}) \wedge F^{(ab)} = 0. \ \ \ \ \ (4)


Here {\wedge: {\bf R}^d \times {\bf R}^d \rightarrow \bigwedge^2 {\bf R}^d} is the exterior product on {{\bf R}^d} (which in three dimensions can be transformed if one wishes to the cross product, but is well defined in all dimensions). Algebraically, {\wedge} is the universal alternating bilinear form on {{\bf R}^d}; in terms of the standard basis {e_1,\dots,e_d} of {{\bf R}^d}, the wedge product {x \wedge y} of two vectors {x = \sum_i x_i e_i}, {y = \sum_j y_j e_j} is given by

\displaystyle x \wedge y = \sum_{i,j} x_i y_j e_i \wedge e_j = \sum_{i,j: i < j} (x_i y_j - x_j y_i) (e_i \wedge e_j)

and the vector space {\bigwedge^2 {\bf R}^d} is the formal span of the basic wedge products {e_i \wedge e_j} for {i<j}.
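In coordinates the wedge product is straightforward to implement; the following sketch (an illustration of the formula above, not part of the derivation) computes the components {x_i y_j - x_j y_i} for {i < j} in any dimension:

```python
import numpy as np

def wedge(x, y):
    """Components x_i y_j - x_j y_i (i < j) of x wedge y, in any dimension d."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    M = np.outer(x, y) - np.outer(y, x)     # antisymmetric matrix x_i y_j - x_j y_i
    return M[np.triu_indices(len(x), k=1)]  # read off the entries with i < j

a, b = np.array([1.0, 2.0, 3.0]), np.array([4.0, 5.0, 6.0])
print(wedge(a, a))                 # zero vector: the form is alternating
print(wedge(a, b) + wedge(b, a))   # zero vector: antisymmetry
```

In {d=3} the three components recover the usual cross product up to signs and ordering of the entries.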

One consequence of the absence (4) of torque is the conservation of total angular momentum

\displaystyle \sum_a m^{(a)} x^{(a)} \wedge \frac{d}{dt} x^{(a)} \ \ \ \ \ (5)


(around the spatial origin {x=0}). Indeed, we may calculate

\displaystyle \frac{d}{dt} \sum_a m^{(a)} x^{(a)} \wedge \frac{d}{dt} x^{(a)} = \sum_a m^{(a)} \frac{d}{dt} x^{(a)} \wedge \frac{d}{dt} x^{(a)} + m^{(a)} x^{(a)} \wedge \frac{d^2}{dt^2} x^{(a)}

\displaystyle = \sum_a x^{(a)} \wedge m^{(a)} \frac{d^2}{dt^2} x^{(a)}

\displaystyle = \sum_{a,b: a \neq b} x^{(a)} \wedge F^{(ab)}

\displaystyle = \frac{1}{2} \sum_{a,b: a \neq b} (x^{(a)} \wedge F^{(ab)} + x^{(b)} \wedge F^{(ba)})

\displaystyle = \frac{1}{2} \sum_{a,b: a \neq b} (x^{(a)}-x^{(b)}) \wedge F^{(ab)}

\displaystyle =0.

Note that there is nothing special about the spatial origin {x=0}; the angular momentum

\displaystyle \sum_a m^{(a)} (x^{(a)} - x_0) \wedge \frac{d}{dt} x^{(a)}

around any other point {x_0 \in {\bf R}^d} is also conserved in time, as is clear from repeating the above calculation, or by combining the existing conservation laws for (3) and (5).
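The cancellation driving this conservation law is again easy to check numerically (my own sketch; the symmetric interaction scalar below is an arbitrary choice): for torque-free pair forces {F^{(ab)} = f^{(ab)} (x^{(a)} - x^{(b)})} with {f^{(ab)} = f^{(ba)}}, the total torque vanishes in any dimension:

```python
import numpy as np

rng = np.random.default_rng(1)
N, d = 5, 4                                  # any dimension works
x = rng.standard_normal((N, d))

def wedge(u, v):
    return np.outer(u, v) - np.outer(v, u)   # u wedge v as an antisymmetric matrix

def f(a, b):
    """Symmetric interaction scalar f^{(ab)} = f^{(ba)} (arbitrary choice)."""
    r2 = np.dot(x[a] - x[b], x[a] - x[b])
    return 1.0 / (1.0 + r2)

# Sum of x^{(a)} wedge F^{(ab)} over all distinct pairs, as in the display above
torque = sum(wedge(x[a], f(a, b) * (x[a] - x[b]))
             for a in range(N) for b in range(N) if b != a)
print(np.abs(torque).max())                  # ~ 0: angular momentum (5) is conserved
```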

Now we pass from particle mechanics to continuum mechanics, by considering the limiting (bulk) behaviour of many particle systems as the number {N} of particles per unit volume goes to infinity. (In physically realistic scenarios, {N} will be comparable to Avogadro’s constant, which is large enough that such limits should be a good approximation to the truth.) To take such limits, we assume that the distribution of particles (and various properties of these particles, such as their velocities and net interaction forces) are approximated in a certain bulk sense by continuous fields. For instance, the mass distribution of a system of particles {(P^{(a)})_{a \in A}} at a given time {t} is given by the discrete measure

\displaystyle \mu_{\mathrm{mass}}(t) := \sum_a m^{(a)} \delta_{x^{(a)}(t)} \ \ \ \ \ (6)


where {\delta_{x_0}} denotes the Dirac measure (or distribution). We will assume that at each time {t}, we have a “bulk” approximation

\displaystyle \mu_{\mathrm{mass}}(t) \approx \rho(t,x)\ dx \ \ \ \ \ (7)


by some continuous measure {\rho(t,x)\ dx}, where the density function {\rho: {\bf R} \times {\bf R}^d \rightarrow {\bf R}^+} is some smooth function of time and space, and {dx} denotes Lebesgue measure on {{\bf R}^d}. What does bulk approximation mean? One could work with various notions of approximation, but we will adopt the viewpoint of the theory of distributions (as reviewed for instance in these old lecture notes of mine) and consider approximation against test functions in spacetime, thus we assume that

\displaystyle \int_{\bf R} \int_{{\bf R}^d} \psi(t,x)\ d \mu_{\mathrm{mass}}(t)(x)\ dt \approx \int_{\bf R} \int_{{\bf R}^d} \psi(t,x) \rho(t,x)\ dx dt. \ \ \ \ \ (8)


for all spacetime test functions {\psi \in C^\infty_c( {\bf R} \times {\bf R}^d )}. (One could also work with purely spatial test functions at each fixed time {t}, or work with “infinitesimal” parallelepipeds or similar domains instead of using test functions; the arguments look slightly different when doing so, but the final equations of motion obtained are the same in all cases. See Exercise 1 for an example of this.) We will be deliberately vague as to what {\approx} means, other than to say that the approximation should only be considered accurate (in the sense that it becomes exact in the limit {N \rightarrow \infty}) when the test function {\psi} varies only at “macroscopic” (or “bulk”) spatial scales; in particular it should not oscillate with a wavelength that goes to zero as {N} goes to infinity. (For instance, one certainly expects the approximation (7) to break down if one tries to test it on scales comparable to the mean spacing between particles.)

Applying (6) and evaluating the delta integrations, the approximation (8) becomes

\displaystyle \int_{\bf R} \sum_a m^{(a)} \psi(t,x^{(a)}(t))\ dt \approx \int_{\bf R} \int_{{\bf R}^d} \psi(t,x) \rho(t,x)\ dx dt. \ \ \ \ \ (9)


In a physical liquid, particles in a given small region of space tend to move at nearly identical velocities (as opposed to gases, where Brownian motion effects lead one to expect velocities to be distributed stochastically, for instance in a Maxwellian distribution). To model this, we assume that there exists a smooth velocity field {u: {\bf R} \times {\bf R}^d \rightarrow {\bf R}^d} for which we have the approximation

\displaystyle \frac{d}{dt} x^{(a)}(t) \approx u( t, x^{(a)}(t) ) \ \ \ \ \ (10)


for all particles {P^{(a)}} and all times {t}. (When stochastic effects are significant, the continuum limit of the fluid will be the Boltzmann equations rather than the Euler or Navier-Stokes equations; however the latter equations can still emerge as an approximation of the former in various regimes. See also Remark 7 below.)

Implicit in our model of many-particle interactions is the conservation of mass: each particle {P^{(a)}} has a fixed mass {m^{(a)}}, and no particle is created or destroyed by the evolution. This conservation of mass, when combined with the approximations (9) and (10), gives rise to a certain differential equation relating the density function {\rho} and the velocity field {u}. To see this, first observe from the fundamental theorem of calculus in time that

\displaystyle \int_{\bf R} \frac{d}{dt}(\int_{{\bf R}^d} \psi(t,x) \ d\mu_{\mathrm{mass}}(t)(x)) dt = 0 \ \ \ \ \ (11)


or equivalently (after applying (6) and evaluating the delta integrations)

\displaystyle \int_{\bf R} \frac{d}{dt} \sum_a m^{(a)} \psi(t,x^{(a)}(t))\ dt = 0 \ \ \ \ \ (12)


for any test function {\psi} (note that we require {\psi} to be compactly supported in both space and time). By the chain rule and (10), we have

\displaystyle \frac{d}{dt} \psi(t,x^{(a)}(t)) = (\partial_t \psi)(t,x^{(a)}(t)) + \frac{d}{dt} x^{(a)}(t) \cdot (\nabla \psi)(t, x^{(a)}(t))

\displaystyle \approx (\partial_t \psi + u \cdot \nabla \psi)(t, x^{(a)}(t))

for any particle {P^{(a)}} and any time {t}, where {\partial_t} denotes the partial derivative with respect to the time variable, {\nabla = (\partial_i)_{i=1,\dots,d}} denotes the spatial gradient (with {\partial_i} denoting partial differentiation in the {x_i} coordinate), and {\cdot} denotes the Euclidean inner product. (One could also use notation here that avoids explicit use of Euclidean structure, for instance writing {d\psi(u)} in place of {u \cdot \nabla \psi}, but it is traditional to use Euclidean notation in fluid mechanics.) This particular combination of derivatives {\partial_t + u \cdot \nabla} appears so often in the subject that we will give it a special name, the material derivative {D_t}:

\displaystyle D_t := \partial_t + u \cdot \nabla.
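One can confirm symbolically that the chain-rule computation above is exact when (10) holds exactly. In the following sketch (my own illustration; the rotation field and test function are hypothetical examples), the curve {t \mapsto (\cos t, \sin t)} is an integral curve of the velocity field {u = (-y, x)}:

```python
import sympy as sp

t, x, y = sp.symbols('t x y')

u1, u2 = -y, x                              # velocity field u = (-y, x)
psi = sp.exp(-(x**2 + y**2)) * sp.sin(t)    # a smooth test-function-like expression

# Material derivative D_t psi = psi_t + u . grad psi
Dt_psi = sp.diff(psi, t) + u1 * sp.diff(psi, x) + u2 * sp.diff(psi, y)

# The curve (cos t, sin t) solves dx/dt = u(x): check the approximation (13),
# which becomes exact along an exact integral curve
traj = {x: sp.cos(t), y: sp.sin(t)}
lhs = sp.diff(psi.subs(traj), t)            # d/dt of psi along the trajectory
rhs = Dt_psi.subs(traj)
print(sp.simplify(lhs - rhs))               # 0
```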

We have thus obtained the approximation

\displaystyle \frac{d}{dt} \psi(t,x^{(a)}(t)) \approx (D_t \psi)(t,x^{(a)}(t)), \ \ \ \ \ (13)


which on insertion back into (12) yields

\displaystyle \int_{\bf R} \sum_a m^{(a)} (D_t \psi)(t,x^{(a)}(t))\ dt \approx 0.

The material derivative {D_t \psi} of a test function {\psi} will still be a test function. Therefore we can use our field approximation (9) to conclude that

\displaystyle \int_{\bf R} \int_{{\bf R}^d} (D_t \psi)(t,x) \rho(t,x)\ dx dt \approx 0

for all test functions {\psi}. The left-hand side consists entirely of the limiting fields, so on taking limits {N \rightarrow \infty} we should therefore have the exact equation

\displaystyle \int_{\bf R} \int_{{\bf R}^d} (D_t \psi)(t,x) \rho(t,x)\ dx dt = 0.

We can integrate by parts to obtain

\displaystyle \int_{\bf R} \int_{{\bf R}^d} \psi(t,x) (D_t^* \rho)(t,x)\ dx dt = 0

where {D_t^*} is the adjoint material derivative

\displaystyle D_t^* \rho := - \partial_t \rho - \nabla \cdot (\rho u).

Since the test function {\psi} was arbitrary, we conclude the continuity equation

\displaystyle D_t^* \rho = 0 \ \ \ \ \ (14)


or equivalently (and more customarily)

\displaystyle \partial_t \rho + \nabla \cdot (\rho u) = 0. \ \ \ \ \ (15)
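As a concrete illustration of (15) (my own example, not from the text): a rigid rotation field transports any radially symmetric density without change, which can be checked symbolically:

```python
import sympy as sp

t, x, y = sp.symbols('t x y')

u1, u2 = -y, x                      # rigid rotation: a divergence-free field
rho = sp.exp(-(x**2 + y**2))        # a time-independent radial density

# Continuity equation (15): rho_t + div(rho u)
lhs = sp.diff(rho, t) + sp.diff(rho * u1, x) + sp.diff(rho * u2, y)
print(sp.simplify(lhs))             # 0: this density/velocity pair satisfies (15)
```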


We will eventually specialise to the case of incompressible fluids in which the density {\rho} is a non-zero constant in both space and time. (In some texts, one uses incompressibility to refer only to constancy of {\rho} along trajectories: {D_t \rho = 0}. But in this course we always use incompressibility to refer to homogeneous incompressibility, in which {\rho} is constant in both space and time.) In this incompressible case, the continuity equation (15) simplifies to a divergence-free condition on the velocity:

\displaystyle \nabla \cdot u = 0. \ \ \ \ \ (16)


For now, though, we allow for the possibility of compressibility by allowing {\rho} to vary in space and time. We also note that by integrating (15) in space, we formally obtain conservation of the total mass

\displaystyle \int_{{\bf R}^d} \rho(t,x)\ dx

since on differentiating under the integral sign and then integrating by parts we formally have

\displaystyle \partial_t \int_{{\bf R}^d} \rho(t,x)\ dx = 0.

Of course, this conservation law degenerates in the incompressible case, since the total mass {\int_{{\bf R}^d} \rho(t,x)\ dx} is manifestly an infinite constant. (In periodic settings, for instance if one is working in {{\bf R}^d/{\bf Z}^d} instead of {{\bf R}^d}, the total mass {\int_{{\bf R}^d/{\bf Z}^d} \rho(t,x)\ dx} is manifestly a finite constant in the incompressible case.)

Exercise 1

  • (i) Assume that the spacetime bulk approximation (8) is replaced by the spatial bulk approximation

    \displaystyle \int_{{\bf R}^d} \psi(x)\ d \mu_{\mathrm{mass}}(t)(x) \approx \int_{{\bf R}^d} \psi(x) \rho(t,x)\ dx

    for any time {t} and any spatial test function {\psi \in C^\infty_c({\bf R}^d)}. Give an alternate heuristic derivation of the continuity equation (15) in this case, without using any integration in time. (Feel free to differentiate under the integral sign.)

  • (ii) Assume that the spacetime bulk approximation (8) is replaced by the spatial bulk approximation

    \displaystyle \int_\Omega\ d\mu_{\mathrm{mass}}(t)(x) \approx \int_\Omega \rho(t,x)\ dx

    for any time {t} and any “reasonable” set {\Omega} (e.g., a rectangular box). Give an alternate heuristic derivation of the continuity equation (15) in this case. (Feel free to introduce infinitesimals and argue non-rigorously with them.)

We can repeat the above analysis with the mass distribution (6) replaced by the momentum distribution

\displaystyle \mu_{\mathrm{momentum}} := \sum_a m^{(a)} (\frac{d}{dt} x^{(a)}(t)) \delta_{x^{(a)}(t)}, \ \ \ \ \ (17)


thus we now wish to exploit the conservation of momentum rather than conservation of mass. The measure {\mu_{\mathrm{momentum}}} is a vector-valued measure, or equivalently a vector {\mu_{\mathrm{momentum}} = (\mu_{\mathrm{momentum},1},\dots,\mu_{\mathrm{momentum},d})} of scalar measures

\displaystyle \mu_{\mathrm{momentum},i} := \sum_a m^{(a)} (\frac{d}{dt} x^{(a)}_i(t)) \delta_{x^{(a)}(t)}.

Instead of starting with the identity (11), we begin with the momentum counterpart

\displaystyle \int_{\bf R} \frac{d}{dt}(\int_{{\bf R}^d} \psi(t,x) \ d\mu_{\mathrm{momentum}}(t)(x)) dt = 0

which on applying (17) and evaluating the delta integrations becomes

\displaystyle \int_{\bf R} \frac{d}{dt} \sum_a m^{(a)} (\frac{d}{dt} x^{(a)}(t)) \psi(t,x^{(a)}(t))\ dt = 0.

Using the product rule and (13), the left-hand side is approximately

\displaystyle \int_{\bf R} \sum_a m^{(a)} (\frac{d^2}{dt^2} x^{(a)}(t)) \psi(t,x^{(a)}(t)) + m^{(a)} \frac{d}{dt} x^{(a)}(t) D_t \psi(t, x^{(a)}(t))\ dt. \ \ \ \ \ (18)


Applying (1) and (10), we conclude

\displaystyle \int_{\bf R} \sum_{a,b: b \neq a} F^{(ab)}(t) \psi(t,x^{(a)}(t)) + m^{(a)} (u D_t \psi)(t, x^{(a)}(t))\ dt \approx 0.

We can evaluate the second term using (9) to obtain

\displaystyle \int_{\bf R} \sum_{a,b: b \neq a} F^{(ab)}(t) \psi(t,x^{(a)}(t))\ dt + \int_{\bf R} \int_{{\bf R}^d} (u D_t \psi)(t, x) \rho(t,x)\ dx dt \approx 0.

What about the first term? We can use symmetry and Newton’s third law (2) to write

\displaystyle \sum_{a,b: b \neq a} F^{(ab)}(t) \psi(t,x^{(a)}(t))

\displaystyle = \frac{1}{2} \sum_{a,b: b \neq a} ( F^{(ab)}(t) \psi(t,x^{(a)}(t)) + F^{(ba)}(t) \psi(t,x^{(b)}(t)) )

\displaystyle = \frac{1}{2} \sum_{a,b: b \neq a} F^{(ab)}(t) ( \psi(t,x^{(a)}(t)) -\psi(t,x^{(b)}(t)) ).

Now we make the further physical assumption that the only significant interactions between particles {P^{(a)}, P^{(b)}} are short-range interactions, in which {x^{(a)}(t)} and {x^{(b)}(t)} are very close to each other. With this hypothesis, it is then plausible to make the Taylor approximation

\displaystyle \psi(t,x^{(a)}(t)) -\psi(t,x^{(b)}(t)) \approx (x^{(a)}(t)-x^{(b)}(t)) \cdot \nabla \psi( t, x^{(a)}(t) ).

We thus have

\displaystyle \frac{1}{2} \int_{\bf R} \sum_{a,b: b \neq a} F^{(ab)}(t) (x^{(a)}(t)-x^{(b)}(t)) \cdot \nabla \psi( t, x^{(a)}(t) ) \ dt

\displaystyle + \int_{\bf R} \int_{{\bf R}^d} (u D_t \psi)(t, x) \rho(t,x)\ dx dt \approx 0.

We write this in coordinates as

\displaystyle -\int_{\bf R} \sum_a \Sigma^{(a)}_{ij}(t) \partial_j \psi(t, x^{(a)}(t))\ dt \ \ \ \ \ (19)


\displaystyle + \int_{\bf R} \int_{{\bf R}^d} (u_i D_t \psi)(t,x) \rho(t,x)\ dx dt \approx 0

for {i=1,\dots,d}, where we use the Einstein convention that indices {i,j,k} are implicitly summed over {1,\dots,d} if they are repeated in an expression, and the stress {\Sigma^{(a)}(t)} on the particle {P^{(a)}} at time {t} is the rank {2}-tensor defined by the formula

\displaystyle \Sigma^{(a)}_{ij}(t) := \frac{1}{2} \sum_{b: b \neq a} F^{(ab)}_i(t) (x^{(b)}_j(t) - x^{(a)}_j(t)) \ \ \ \ \ (20)


where {F^{(ab)}_1,\dots,F^{(ab)}_d} denote the components of {F^{(ab)}}. Recall from the torque-free hypothesis (4) that {F^{(ab)}} is parallel to {x^{(b)}-x^{(a)}}, thus we could write {F^{(ab)}(t) = f^{(ab)}(t) (x^{(b)}(t)-x^{(a)}(t))} for some scalar {f^{(ab)}(t)}. Thus we have

\displaystyle \Sigma^{(a)}_{ij}(t) = \frac{1}{2} \sum_{b: b \neq a} f^{(ab)}(t) (x^{(b)}_i(t) - x^{(a)}_i(t)) (x^{(b)}_j(t) - x^{(a)}_j(t)).

In particular, we see that the torque-free hypothesis makes the stress tensor symmetric:

\displaystyle \Sigma^{(a)}_{ij} = \Sigma^{(a)}_{ji}.
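This symmetry can be seen concretely in a small numerical example (my own sketch; the interaction scalars are arbitrary): assembling the stress (20) from central forces {F^{(ab)} = f^{(ab)}(x^{(b)} - x^{(a)})} always produces a symmetric matrix.

```python
import numpy as np

rng = np.random.default_rng(2)
N, d = 8, 3
x = rng.standard_normal((N, d))

# Symmetric interaction scalars f^{(ab)} = f^{(ba)} (arbitrary values)
f = rng.standard_normal((N, N))
f = f + f.T

# Stress tensor (20) for particle a = 0, with F^{(ab)} = f^{(ab)} (x^{(b)} - x^{(a)})
a = 0
Sigma = 0.5 * sum(f[a, b] * np.outer(x[b] - x[a], x[b] - x[a])
                  for b in range(N) if b != a)
print(np.allclose(Sigma, Sigma.T))   # True: central forces give a symmetric stress
```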

To proceed further, we make the assumption (similar to (9)) that the stress tensor {\Sigma^{(a)}} (or more precisely, the measure {\sum_a \Sigma^{(a)} \delta_{x^{(a)}(t)}}) is approximated in the bulk by a smooth tensor field {\sigma: {\bf R} \times {\bf R}^d \rightarrow {\bf R}^d \otimes {\bf R}^d} (with components {\sigma_{ij}: {\bf R} \times {\bf R}^d \rightarrow {\bf R}} for {i,j=1,\dots,d}), in the sense that

\displaystyle \int_{\bf R} \sum_a \Sigma^{(a)}_{ij}(t) \psi(t,x^{(a)}(t))\ dt \approx \int_{\bf R} \int_{{\bf R}^d} \psi(t,x) \sigma_{ij}(t,x)\ dx dt. \ \ \ \ \ (21)


The tensor {\sigma_{ij}} is known as the Cauchy stress tensor. Since {\Sigma^{(a)}} is symmetric in {i,j}, the right-hand side of (21) is also symmetric in {i,j}, which by the arbitrariness of {\psi} implies that the tensor {\sigma} is symmetric also:

\displaystyle \sigma_{ij} = \sigma_{ji}.

This is also known as Cauchy’s second law of motion.

Inserting (21) (with {\psi} replaced by {\partial_j \psi}) into (19), we arrive at

\displaystyle \int_{\bf R} \int_{{\bf R}^d} (-\sigma_{ij} \partial_j \psi)(t, x)\ dx dt + \int_{\bf R} \int_{{\bf R}^d} (u_i D_t \psi)(t,x) \rho(t,x)\ dx dt \approx 0

for any test function {\psi} and {i=1,\dots,d}. Taking limits as {N \rightarrow \infty} we obtain the exact equation

\displaystyle \int_{\bf R} \int_{{\bf R}^d} (-\sigma_{ij} \partial_j \psi)(t, x)\ dx dt + \int_{\bf R} \int_{{\bf R}^d} (u_i D_t \psi)(t,x) \rho(t,x)\ dx dt = 0

and then integrating by parts we have

\displaystyle \int_{\bf R} \int_{{\bf R}^d} \psi(t,x) (\partial_j \sigma_{ij} + D^*_t (\rho u_i))(t,x)\ dx dt = 0;

as the test function {\psi} is arbitrary, we conclude the Cauchy momentum equation

\displaystyle \partial_j \sigma_{ij} + D^*_t (\rho u_i) = 0.

From the Leibniz rule (in an adjoint form) we see that

\displaystyle D^*_t (\rho u_i) = (D^*_t \rho) u_i - \rho D_t( u_i);

using (14) we can thus also write the Cauchy momentum equation in the more conventional form

\displaystyle D_t u_i = \frac{1}{\rho} \partial_j \sigma_{ij}. \ \ \ \ \ (22)


(This is a dynamical version of Cauchy’s first law of motion.)
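The adjoint Leibniz identity used above can be verified symbolically; the following sketch (my own check, written in one space dimension for brevity) confirms that {D^*_t (\rho u) = (D^*_t \rho) u - \rho D_t u} for arbitrary smooth {\rho, u}:

```python
import sympy as sp

t, x = sp.symbols('t x')             # one space dimension for brevity
rho = sp.Function('rho')(t, x)
u = sp.Function('u')(t, x)

def Dt(g):
    """Material derivative D_t g = g_t + u g_x."""
    return sp.diff(g, t) + u * sp.diff(g, x)

def Dt_star(g):
    """Adjoint material derivative D_t^* g = -g_t - (g u)_x, as defined above."""
    return -sp.diff(g, t) - sp.diff(g * u, x)

lhs = Dt_star(rho * u)
rhs = Dt_star(rho) * u - rho * Dt(u)
print(sp.expand(lhs - rhs))          # 0
```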

To summarise so far, the unknown density field {\rho} and velocity field {u} obey two equations of motion: the continuity equation (15) (or (14)) and the momentum equation (22). As the former is a scalar equation and the latter is a vector equation, this is {d+1} equations for {d+1} unknowns, which looks good – so long as the stress tensor {\sigma} is known. However, the stress tensor is not given to us in advance, and so further physical assumptions on the underlying fluid are needed to derive additional equations to yield a more complete set of equations of motion.

One of the simplest such assumptions is isotropy – that, in the vicinity of a given particle {P^{(a)}} at a given point in time, the distribution of the nearby particles {P^{(b)}} (and of the forces {F^{(ab)}}) is effectively rotationally symmetric, in the sense that rotation of the fluid around that particle does not significantly affect the net stresses acting on the particle. To give more mathematical meaning to this assumption, let us fix {a} and {t}, and let us set {x^{(a)}(t)} to be the spatial origin {0} for simplicity. In particular the stress tensor {\Sigma^{(a)}_{ij}(t)} now simplifies a little to

\displaystyle \Sigma^{(a)}_{ij}(t) = \frac{1}{2} \sum_{b: b \neq a} f^{(ab)}(t) x^{(b)}_i(t) x^{(b)}_j(t);

viewing this tensor as a {d \times d} symmetric matrix, we can also write

\displaystyle \Sigma^{(a)}(t) = \frac{1}{2} \sum_{b: b \neq a} f^{(ab)}(t) x^{(b)}(t) x^{(b)}(t)^T

where we now think of the vector {x^{(b)}(t)} as a {d}-dimensional column vector, and {x^T} denotes the transpose of {x}.

Imagine that we rotate the fluid around this spatial origin using some rotation matrix {U \in SO(d)}, thus replacing {x^{(b)}(t)} with {U x^{(b)}(t)} (and hence {x^{(b)}(t)^T} is replaced with {x^{(b)}(t)^T U^T}). If we assume that interaction forces are rotationally symmetric, the interaction scalars {f^{(ab)}} should not be affected by this rotation. As such, {\Sigma^{(a)}(t)} would be replaced with {U \Sigma^{(a)}(t) U^T}. If we assume isotropy, though, this rotated fluid should generate the same stress as the original fluid, thus we have

\displaystyle \Sigma^{(a)}(t) =U \Sigma^{(a)}(t) U^T

for all rotation matrices {U}, that is to say that {\Sigma^{(a)}} commutes with all rotations. This implies that all eigenspaces of {\Sigma^{(a)}} are rotation-invariant, but the only rotation-invariant subspaces of {{\bf R}^d} are {\{0\}} and {{\bf R}^d}. Thus the spectral decomposition of the symmetric matrix {\Sigma^{(a)}} only involves a single eigenspace {{\bf R}^d}, or equivalently {\Sigma^{(a)}} is a multiple of the identity. In coordinates, we have

\displaystyle \Sigma^{(a)}_{ij}(t) = - p^{(a)}(t) \delta_{ij}

for some scalar {p^{(a)}} (known as the pressure exerted on the particle {P^{(a)}}), where {\delta_{ij}} denotes the Kronecker delta (the negative sign here is for compatibility with other physical definitions of pressure). Passing from the individual stresses {\Sigma^{(a)}} to the stress field {\sigma}, we see that {\sigma_{ij}} is also rotationally invariant and thus is also a multiple of the identity, thus

\displaystyle \sigma_{ij}(t,x) = -p(t,x) \delta_{ij} \ \ \ \ \ (23)


for some field {p: {\bf R} \times {\bf R}^d \rightarrow {\bf R}}, which we call the pressure field, and which we assume to be smooth. The equations (15), (22) now become the Euler equations

\displaystyle \partial_t u + (u \cdot \nabla) u = - \frac{1}{\rho} \nabla p \ \ \ \ \ (24)


\displaystyle \partial_t \rho + \nabla \cdot (\rho u) = 0. \ \ \ \ \ (25)


This is still an underdetermined system, being {d+1} equations for {d+2} unknowns (two scalar fields {\rho,p} and one vector field {u}). But if we assume incompressibility (normalising {\rho=1} for simplicity), we obtain the incompressible Euler equations

\displaystyle \partial_t u + (u \cdot \nabla) u = - \nabla p \ \ \ \ \ (26)


\displaystyle \nabla \cdot u = 0.

Without incompressibility, one can still reach a determined system of equations if one postulates a relationship (known as an equation of state) between the pressure {p} and the density {\rho} (in some physical situations one also needs to introduce further thermodynamic variables, such as temperature or entropy, which also influence this relationship and obey their own equation of motion). Alternatively, one can proceed using an analysis of the energy conservation law, similar to how (15) arose from conservation of mass and (22) arose from conservation of momentum, though at the end of the day one would still need an equation of state connecting energy density to other thermodynamic variables. For details, see Exercise 4 below.

Now we consider relaxing the assumption of isotropy in the stress. Many fluids, in addition to experiencing isotropic stress coming from a scalar pressure field, also experience additional shear stress associated to strain in the fluid – distortion in the shape of the fluid arising from fluctuations in the velocity field. One can thus postulate a generalisation to (23) of the form

\displaystyle \sigma_{ij}(t,x) = -p(t,x) \delta_{ij} + \tau_{ij}( u(t,x), \nabla u(t,x) ) \ \ \ \ \ (27)


where {\tau_{ij}: {\bf R}^d \times {\bf R}^{d^2} \rightarrow {\bf R}} is some function of the velocity {u(t,x) \in {\bf R}^d} at the point {(t,x)} in spacetime, as well as of the first derivatives {\nabla u = (\partial_i u_j)_{1 \leq i,j \leq d} \in {\bf R}^{d^2}}, that arises from changes in shape. (Here we assume that the response to stress is autonomous – it does not depend directly on time, location, or other statistics of the fluid, such as pressure, except insofar as those variables are related to the quantities {u(t,x)} and {\nabla u(t,x)}. One can of course consider more complex models in which there is a dependence on such quantities or on higher derivatives {\nabla^2 u, \nabla^3 u}, etc. of the velocity, but we will not do so here.)

It is a general principle in physics that functional relationships between physical quantities, such as the one in (27), can be very heavily constrained by requiring the relationship to be invariant with respect to various physically natural symmetries. This is certainly the case for (27). First of all, we can impose Galilean invariance: if one changes to a different inertial frame of reference, thus adding a constant vector {u_0} to the velocity field {u(t,x)} (and not affecting the gradient {\nabla u} at all), this should not actually introduce any new stresses on the fluid. This leads to the postulate

\displaystyle \tau_{ij}( u + u_0, \nabla u ) = \tau_{ij}( u, \nabla u )

for any {u_0}, and hence {\tau_{ij}} should not actually depend on the velocity {u(t,x)} and should only depend on the first derivative. Thus we now write {\tau_{ij}( \nabla u(t,x) )} for {\tau_{ij}(u(t,x), \nabla u(t,x))} (thus {\tau_{ij}} is now a function from {{\bf R}^{d^2}} to {{\bf R}}).

If {\nabla u = 0} then there should be no additional shear stress, so we should have {\tau_{ij}(0) = 0}. We now make a key assumption that the fluid is a Newtonian fluid, in that the linear term in the Taylor expansion of {\tau} dominates, or in other words we assume that {\tau_{ij}} is a linear function of {\nabla u}. (One can certainly study the mechanics of non-Newtonian fluids as well, in which {\tau} depends nonlinearly on {\nabla u}, or even on past values of {\nabla u}, but these are no longer governed by the Navier-Stokes equations and will not be considered further here.) One can also think of {\tau} as a linear map from the space of {d \times d} matrices (which is the space where {\nabla u(t,x)} takes values) to the space of {d \times d} matrices. In coefficients, this means we are postulating a relationship of the form

\displaystyle \tau_{ij} = \mu_{ijkl} \partial_k u_l(t,x)

for some constants {\mu_{ijkl}} (recall we are using the Einstein summation conventions). This looks like a lot of unspecified constants, but again we can use physical principles to impose significant constraints. Firstly, because stress is symmetric in {i} and {j}, the coefficients {\mu_{ijkl}} must also be symmetric in {i} and {j}: {\mu_{ijkl} = \mu_{jikl}}. Next, let us for simplicity set {x} to be the spatial origin {x=0}, and consider a rotating velocity field of the form

\displaystyle u(t,x) = A x

for some constant-coefficient anti-symmetric matrix {A}, or in coordinates

\displaystyle u_i(t,x) = A_{ij} x_j.

The derivative field {\partial_i u_j} is then just the anti-symmetric matrix {A^T = -A}; in coordinates, {\partial_i u_j = A_{ji} = -A_{ij}}. This corresponds to fluids moving according to the rotating trajectories {x(t) = \exp(tA) x(0)}, where {\exp} denotes the matrix exponential. (For instance, in two dimensions, the velocity field {u(t, x_1,x_2) = (-x_2, x_1)} gives rise to trajectories {x_1(t) = \cos(t) x_1(0) - \sin(t) x_2(0)}, {x_2(t) = \sin(t) x_1(0) + \cos(t) x_2(0)} corresponding to counter-clockwise rotation around the origin.)
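The two-dimensional example just mentioned can be checked numerically. In the Python snippet below (using numpy; the truncated-series `expm` helper is ad hoc, introduced here just for illustration), we verify that {\exp(tA)} for the anti-symmetric matrix {A} with {A_{12} = -1}, {A_{21} = 1} is the counter-clockwise rotation by angle {t}:

```python
import numpy as np

# ad hoc matrix exponential via a truncated Taylor series (fine for small matrices)
def expm(M, terms=40):
    out, term = np.eye(M.shape[0]), np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ M / k
        out = out + term
    return out

# u_i = A_{ij} x_j with u = (-x_2, x_1), i.e. A_{12} = -1 and A_{21} = 1
A = np.array([[0.0, -1.0], [1.0, 0.0]])

t = 0.7
R = expm(t * A)
expected = np.array([[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]])
assert np.allclose(R, expected)   # exp(tA) is counter-clockwise rotation by t

# the trajectory x(t) = exp(tA) x(0) of the point x(0) = (1, 0)
assert np.allclose(R @ np.array([1.0, 0.0]), [np.cos(t), np.sin(t)])
```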

Exercise 2 When {A} is anti-symmetric, show that the matrix {\exp( tA )} is orthogonal for all {t}, thus

\displaystyle \exp( tA ) \exp( tA )^T = \exp(tA)^T \exp(tA) = 1

for all {t \in {\bf R}}, where {1} denotes the identity matrix. (Hint: differentiate the expressions appearing in the above equation with respect to time.) Also show that {\exp(tA)} is a rotation matrix, that is to say an orthogonal matrix of determinant {+1}.
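(This is of course no substitute for the proof requested in the exercise, but the orthogonality of {\exp(tA)} is easy to sanity-check numerically, as in the following Python snippet; the truncated-series `expm` helper and the random choice of {A} are just for illustration.)

```python
import numpy as np

# ad hoc matrix exponential via a truncated Taylor series
def expm(M, terms=60):
    out, term = np.eye(M.shape[0]), np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ M / k
        out = out + term
    return out

rng = np.random.default_rng(1)
B = rng.normal(size=(3, 3))
A = B - B.T                                    # a random anti-symmetric matrix
for t in (0.3, 1.0, 2.5):
    E = expm(t * A)
    assert np.allclose(E @ E.T, np.eye(3))     # exp(tA) is orthogonal
    assert np.allclose(np.linalg.det(E), 1.0)  # and has determinant +1
```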

As {\exp(tA)} is orthogonal, it describes a rigid motion. Thus this rotating motion does not change the shape of the fluid, and so should not give rise to any shear stress. That is to say, the linear map {\tau} should vanish when applied to an anti-symmetric matrix, or in coordinates {\mu_{ijkl} = \mu_{ijlk}}. Another way of saying this is that {\tau(\nabla u)} only depends on {\nabla u} through its symmetric component {\frac{1}{2} (\nabla u + (\nabla u)^T)}, known as the rate-of-strain tensor (or the deformation tensor). (The anti-symmetric part {\frac{1}{2} (\nabla u - (\nabla u)^T)} does not cause any strain, but instead measures the infinitesimal rotation of the fluid; up to trivial factors such as {\frac{1}{2}}, it is essentially the vorticity of the fluid, which will be an important field to study in subsequent notes.)
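In two dimensions the relation between the anti-symmetric part of {\nabla u} and the scalar vorticity {\omega = \partial_1 u_2 - \partial_2 u_1} can be verified symbolically. Here is a quick check in Python using sympy (with {\nabla u} interpreted, as above, as the matrix {(\partial_i u_j)}; purely a check on the algebra):

```python
import sympy as sp

t, x1, x2 = sp.symbols('t x1 x2')
u1 = sp.Function('u1')(t, x1, x2)
u2 = sp.Function('u2')(t, x1, x2)

# gradient matrix (nabla u)_{ij} = d_i u_j
G = sp.Matrix([[sp.diff(u1, x1), sp.diff(u2, x1)],
               [sp.diff(u1, x2), sp.diff(u2, x2)]])

S = (G + G.T) / 2    # rate-of-strain tensor (symmetric part)
W = (G - G.T) / 2    # anti-symmetric part

# the anti-symmetric part equals (omega/2) J, with omega the 2D vorticity
omega = sp.diff(u2, x1) - sp.diff(u1, x2)
J = sp.Matrix([[0, 1], [-1, 0]])
assert all(sp.simplify(e) == 0 for e in (W - (omega / 2) * J))
```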

To constrain the behaviour of {\tau} further, we introduce a hypothesis of rotational (and reflectional) symmetry. If one rotates the fluid by a rotation matrix {U \in SO(d)} around the origin, then if the original fluid had a velocity of {u(t,x)} at {(t,x)}, the new fluid should have a velocity of {U u(t,x)} at {(t,Ux)}, thus the new velocity field {\tilde u} is given by the formula

\displaystyle \tilde u(t,x) = U u(t, U^T x)

and the derivative {\nabla \tilde u} of this velocity field at the origin {0} is then related to the original derivative {\nabla u} by the formula

\displaystyle \nabla \tilde u(t,0) = U \nabla u(t, 0) U^T.

Meanwhile, as discussed in the analysis of the isotropic case, the new stress {\tilde \tau} at the origin {0} is related to the original stress {\tau} by the same relation:

\displaystyle \tilde \tau = U \tau U^T.

This means that the linear map {\tau} is rotationally equivariant in the sense that

\displaystyle \tau( U A U^T ) = U \tau(A) U^T \ \ \ \ \ (28)


for any matrix {A}. Actually the same argument also applies for reflections, so one could also take {U} in the orthogonal group {O(d)} rather than the special orthogonal group.

This severely constrains the possible behaviour of {\tau}. First consider applying {\tau} to the rank {1} matrix {e_1 e_1^T}, where {e_1} is the first basis (column) vector of {{\bf R}^d}. The equivariance property (28) then implies that {\tau(e_1 e_1^T)} is invariant with respect to any rotation or reflection of the remaining coordinates {x_2,\dots,x_d}. As in the isotropic analysis, this implies that the lower right {(d-1) \times (d-1)} minor of {\tau(e_1 e_1^T)} is a multiple of the identity; when {d \geq 3} it also implies that the entries {\tau_{1i}(e_1 e_1^T)} of the first row and the entries {\tau_{i1}(e_1 e_1^T)} of the first column vanish for {i \neq 1} (one can also obtain this by applying (28) with {U} the reflection in the {x_1} variable). Thus we have

\displaystyle \tau_{ij}(e_1 e_1^T) = 2 \mu \delta_{1i} \delta_{1j} + \lambda \delta_{ij}

for some constant scalars {\mu,\lambda} (known as the dynamic viscosity and second viscosity respectively); in matrix form we have

\displaystyle \tau(e_1 e_1^T) = 2 \mu e_1 e_1^T + \lambda I

where {I} is the identity matrix. Applying equivariance, we conclude that

\displaystyle \tau(v v^T) = 2 \mu v v^T + \lambda I

for any unit vector {v}; applying the spectral theorem this implies that

\displaystyle \tau(A) = 2 \mu A + \lambda \mathrm{tr}(A) I

for any symmetric matrix {A}. Since {\tau} is already known to vanish on anti-symmetric matrices, upon decomposing a general matrix {A} into the symmetric part {\frac{1}{2} (A + A^T)} and anti-symmetric part {\frac{1}{2} (A - A^T)} (with the latter having trace zero) we conclude that

\displaystyle \tau(A) = \mu (A + A^T) + \lambda \mathrm{tr}(A) I

for an arbitrary matrix {A}. In particular

\displaystyle \tau(\nabla u) = \mu ( \nabla u + (\nabla u)^T ) + \lambda (\nabla \cdot u) I.

In the incompressible case {\nabla \cdot u = 0}, the second term vanishes, and this equation simply says that the shear stress is proportional to the (rate of) strain.
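The derived form of {\tau} can be checked against the constraints that produced it. The short Python snippet below (using numpy, with arbitrary illustrative values for {\mu, \lambda} and a randomly generated matrix and rotation) confirms that {\tau(A) = \mu(A + A^T) + \lambda \mathrm{tr}(A) I} obeys the equivariance property (28) and vanishes on anti-symmetric matrices:

```python
import numpy as np

mu, lam = 0.7, 0.3   # illustrative values of the two viscosities

def tau(A):
    return mu * (A + A.T) + lam * np.trace(A) * np.eye(A.shape[0])

rng = np.random.default_rng(2)
d = 3
A = rng.normal(size=(d, d))

# a rotation U in SO(d) from the QR factorisation of a random matrix
Q, _ = np.linalg.qr(rng.normal(size=(d, d)))
if np.linalg.det(Q) < 0:
    Q[:, 0] = -Q[:, 0]
U = Q

# equivariance (28): tau(U A U^T) = U tau(A) U^T
assert np.allclose(tau(U @ A @ U.T), U @ tau(A) @ U.T)

# tau vanishes on the anti-symmetric part: rigid rotation causes no shear stress
W = A - A.T
assert np.allclose(tau(W), 0)
```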

Inserting the above law back into (27) and then (15), (22), we obtain the Navier-Stokes equations

\displaystyle \partial_t u + (u \cdot \nabla) u = - \frac{1}{\rho} \nabla p + \frac{\mu}{\rho} (\Delta u + \nabla (\nabla \cdot u) ) + \frac{\lambda}{\rho} \nabla(\nabla \cdot u)

\displaystyle \partial_t \rho + \nabla \cdot (\rho u) = 0,

where {\Delta u = \partial_i \partial_i u} is the spatial Laplacian. In the incompressible case, setting {\nu := \frac{\mu}{\rho}} (this ratio is known as the kinematic viscosity), and also normalising {\rho=1} for simplicity, this simplifies to the incompressible Navier-Stokes equations

\displaystyle \partial_t u + (u \cdot \nabla) u = - \nabla p + \nu \Delta u

\displaystyle \nabla \cdot u = 0.
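The computation of the viscous term in the momentum equation rests on the identity {\partial_i \tau_{ij} = \mu (\Delta u_j + \partial_j (\nabla \cdot u)) + \lambda \partial_j (\nabla \cdot u)}, which can be verified symbolically. Here is a check in two dimensions using sympy in Python (purely as a sanity check on the algebra):

```python
import sympy as sp

t, x1, x2 = sp.symbols('t x1 x2')
mu, lam = sp.symbols('mu lambda')
X = [x1, x2]
u = [sp.Function('u1')(t, x1, x2), sp.Function('u2')(t, x1, x2)]

div_u = sum(sp.diff(u[i], X[i]) for i in range(2))

# viscous stress tau_ij = mu (d_i u_j + d_j u_i) + lam (div u) delta_ij
tau = [[mu * (sp.diff(u[j], X[i]) + sp.diff(u[i], X[j]))
        + lam * div_u * (1 if i == j else 0) for j in range(2)] for i in range(2)]

# (div tau)_j = d_i tau_ij should equal mu (Delta u_j + d_j div u) + lam d_j div u
residuals = []
for j in range(2):
    div_tau_j = sum(sp.diff(tau[i][j], X[i]) for i in range(2))
    lap_uj = sum(sp.diff(u[j], X[i], 2) for i in range(2))
    rhs = mu * (lap_uj + sp.diff(div_u, X[j])) + lam * sp.diff(div_u, X[j])
    residuals.append(sp.simplify(div_tau_j - rhs))
assert residuals == [0, 0]
```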

Of course, the incompressible Euler equations arise as the special case when the viscosity {\nu} is set to zero. For physical fluids, {\nu} is positive, though it can be so small that the Euler equations serve as a good approximation. Negative values of {\nu} are mathematically possible, but physically unrealistic for several reasons (for instance, the total energy of the fluid would then increase over time rather than dissipate) and the equations also become mathematically quite ill-behaved in this case (as they carry essentially all of the pathologies of the backwards heat equation).
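To illustrate the last point, here is a minimal Python sketch of the linearised energy mechanism, keeping only the heat part {\partial_t u = \nu \Delta u} on the torus, where each Fourier mode is multiplied by the factor {e^{-\nu |k|^2 t}} (the mode data below is arbitrary, chosen only for illustration):

```python
import numpy as np

k = np.arange(1, 6)                    # a few Fourier mode frequencies
u_hat0 = np.ones_like(k, dtype=float)  # arbitrary initial mode amplitudes
T = 0.1

def energy(nu):
    # each mode of the heat flow u_t = nu * u_xx is damped by exp(-nu k^2 t)
    u_hatT = u_hat0 * np.exp(-nu * k**2 * T)
    return float(np.sum(u_hatT**2))

E0 = float(np.sum(u_hat0**2))
assert energy(0.5) < E0    # positive viscosity dissipates energy
assert energy(-0.5) > E0   # negative viscosity pumps energy in (backwards heat)
```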

Exercise 3 In a constant gravity field oriented in the direction {-e_d}, each particle {P^{(a)}} will experience an external gravitational force {- m^{(a)} g e_d}, where {g>0} is a fixed constant. Argue that the incompressible Euler equations in the presence of such a gravitational field should be modified to

\displaystyle \partial_t u + (u \cdot \nabla) u = - \nabla p - g e_d

\displaystyle \nabla \cdot u = 0

and the incompressible Navier-Stokes equations should similarly be modified to

\displaystyle \partial_t u + (u \cdot \nabla) u = - \nabla p + \nu \Delta u - g e_d

\displaystyle \nabla \cdot u = 0.
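(As a quick consistency check on the gravity-modified system, not part of the exercise: the fluid at rest {u = 0} with the hydrostatic pressure {p = p_0 - g x_d} should solve these equations. The following Python snippet, using sympy in three dimensions, verifies this:)

```python
import sympy as sp

x1, x2, x3, g, p0 = sp.symbols('x1 x2 x3 g p0')
X = [x1, x2, x3]

# hydrostatic ansatz: fluid at rest, pressure decreasing linearly with height
u = [0, 0, 0]
p = p0 - g * x3

# momentum residual: (u . nabla) u_i + d_i p + g (e_3)_i should vanish
e3 = [0, 0, 1]
residual = [sum(u[j] * sp.diff(u[i], X[j]) for j in range(3))
            + sp.diff(p, X[i]) + g * e3[i] for i in range(3)]
assert residual == [0, 0, 0]
```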

Exercise 4

  • (i) Suppose that in an {N}-particle system, the force {F^{(ab)}(t)} between two particles is given by the conservative law

    \displaystyle F^{(ab)}(t) = - (\nabla V^{(ab)})(x^{(a)}(t) - x^{(b)}(t)) \ \ \ \ \ (29)


    for some smooth potential function {V^{(ab)}: {\bf R}^d \rightarrow {\bf R}}, which we assume to obey the symmetry {V^{(ab)}(x) = V^{(ba)}(-x)} to be consistent with Newton’s third law. Show that the total energy

    \displaystyle \sum_a \frac{1}{2} m^{(a)} |\frac{d}{dt} x^{(a)}(t)|^2 + \frac{1}{2} \sum_{a,b: a \neq b} V^{(ab)}( x^{(a)}(t) - x^{(b)}(t) )

    is conserved in time (that is to say, the time derivative of this quantity vanishes). Here of course we use {|v| := \sqrt{v \cdot v}} to denote the Euclidean magnitude of a vector {v}.

  • (ii) Assume furthermore that in the large {N} limit, we have the velocity approximation (10), the stress approximation (21), and the isotropy condition (23). Assume all interactions are short-range. Assume we also have a potential energy approximation

    \displaystyle \int_{\bf R} \frac{1}{2} \sum_{a,b: a\neq b} V^{(ab)}(x^{(a)}(t) - x^{(b)}(t)) \psi(t,x^{(a)}(t))\ dt \ \ \ \ \ (30)


    \displaystyle \approx \int_{\bf R} \int_{{\bf R}^d} \psi(t,x) \rho(t,x) e(t,x)\ dx dt

    for all test functions {\psi \in C^\infty_c( {\bf R} \times {\bf R}^d )} and some smooth function {e: {\bf R} \times {\bf R}^d \rightarrow {\bf R}} (usually called the specific internal energy). By analysing the energy distribution

    \displaystyle \mu_{\mathrm{energy}}(t) := \sum_a (\frac{1}{2} m^{(a)} |\frac{d}{dt} x^{(a)}(t)|^2

    \displaystyle + \frac{1}{2} \sum_{b: b \neq a} V^{(ab)}( x^{(a)}(t) - x^{(b)}(t) )) \delta_{x^{(a)}(t)}

    in analogy with the previous analysis of the mass distribution {\mu_{\mathrm{mass}}(t)} and momentum distribution {\mu_{\mathrm{momentum}}(t)}, make a heuristic derivation of the energy conservation law

    \displaystyle D_t^* E - \nabla \cdot (p u) = 0

    or equivalently

    \displaystyle \partial_t E + \nabla \cdot ((E+p) u) = 0 \ \ \ \ \ (31)


    where the energy density {E(t,x)} is defined by the formula

    \displaystyle E(t,x) := \frac{1}{2} \rho(t,x) |u(t,x)|^2 + \rho(t,x) e(t,x) \ \ \ \ \ (32)


    (that is to say, {E(t,x)} is the sum of the kinetic energy density {\frac{1}{2} \rho(t,x) |u(t,x)|^2} and the internal energy density {\rho(t,x) e(t,x)}). Conclude in particular that the total energy

    \displaystyle \int_{{\bf R}^d} E(t,x)\ dx

    is formally conserved in time.

  • (iii) In addition to the previous assumptions, assume that there is a functional relationship between the specific energy density {e(t,x)} and the density {\rho(t,x)}, in the sense that there is a smooth function {F: {\bf R} \rightarrow {\bf R}} such that

    \displaystyle e(t,x) = F( \rho(t,x) ). \ \ \ \ \ (33)


    at all points in spacetime and all physically relevant configurations of particles. (In particular, we are in the compressible situation in which we allow the density to vary.) By dilating the position variables {x^{(a)}} by a scaling factor {\lambda > 0}, scaling the test function {\psi} appropriately, and working out (via (9)) what the scaling does to the density function {\rho(t,x)}, generalise (30) to

    \displaystyle \int_{\bf R} \frac{1}{2} \sum_{a,b: a \neq b} V^{(ab)}(\lambda(x^{(a)}(t) - x^{(b)}(t))) \psi(t,x^{(a)}(t))\ dt

    \displaystyle \approx \int_{\bf R} \int_{{\bf R}^d} \psi(t,x) \rho(t,x) F(\lambda^{-d} \rho(t,x))\ dx dt.

    Formally differentiate this approximation with respect to {\lambda} at {\lambda = 1} and use (29), (20), (21), (23) to heuristically derive the equation of state

    \displaystyle p(t,x) = \rho^2(t,x) F'( \rho(t,x) ). \ \ \ \ \ (34)


  • (iv) Give an alternate derivation of the energy conservation law (31) from the equations (24), (25), (33), (34). (Note that as the fluid here is compressible, one cannot use any equations in this post that rely on an incompressibility hypothesis.)
  • (v) Suppose the functional relationship (33) uses a function {F} of the form

    \displaystyle F(\rho) = e_0 + A (\rho-1)^2

    for some constant {e_0} (the specific energy density at equilibrium) and some large and positive {A}; physically, this represents a fluid which is nearly incompressible in the sense that it is very resistant to having its density {\rho} deviate from {1}. Assuming that the pressure {p} stays bounded and {A} is large, heuristically derive the approximations

    \displaystyle \rho \approx 1 + \frac{p}{2A}


    \displaystyle e \approx e_0 + \frac{p^2}{4A}.

    Formally, in the limit {A \rightarrow \infty}, we thus heuristically recover an incompressible fluid {\rho=1} with a constant specific energy density {e = e_0}. In particular the contributions of the specific energy {e = e_0} to the energy conservation law (31) may be ignored in the incompressible case thanks to (25).
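(As a sanity check on the energy conservation claim in part (i) of the above exercise, one can verify the two-particle one-dimensional case symbolically, as in the Python snippet below; the concrete potential is an arbitrary smooth choice standing in for a general {V^{(ab)}}, and by the symmetry hypothesis the total potential energy reduces to the single term {V(x_1 - x_2)}.)

```python
import sympy as sp

t = sp.symbols('t')
m1, m2 = sp.symbols('m1 m2', positive=True)
x1, x2 = sp.Function('x1')(t), sp.Function('x2')(t)

# an arbitrary smooth interaction potential (any concrete choice works the same way)
r = sp.Symbol('r')
V = r**4 / 4 + sp.sin(r)
Vp = V.diff(r)

F = -Vp.subs(r, x1 - x2)   # force on particle 1 from particle 2, as in (29)
# Newton's third law gives the opposite force on particle 2
eqs = {x1.diff(t, 2): F / m1, x2.diff(t, 2): -F / m2}

E = m1 * x1.diff(t)**2 / 2 + m2 * x2.diff(t)**2 / 2 + V.subs(r, x1 - x2)
dE = E.diff(t).subs(eqs)   # time derivative of the total energy on solutions
assert sp.simplify(dE) == 0
```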

Remark 5 In the literature, the relationship between the functional relationship (33) and the equation of state (34) is usually derived instead using the laws of thermodynamics. However, as the above exercise demonstrates, it is also possible to recover this relationship from first principles. In the case of an (isentropic) ideal gas, the laws of thermodynamics can be used to establish an equation of state of the form {p = C \rho^\gamma} for some constants {C,\gamma} with {\gamma \neq 1}, as well as the corresponding functional relationship {e = \frac{C}{\gamma - 1} \rho^{\gamma-1} = \frac{p}{(\gamma-1) \rho}}, so that the internal energy density {\rho e} is {\frac{p}{\gamma-1}}. This is of course consistent with (33) and (34), after choosing {F} appropriately.
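One can confirm the ideal gas consistency claim of this remark symbolically: with {F(\rho) = \frac{C}{\gamma-1}\rho^{\gamma-1}}, the equation of state (34) indeed returns {p = C\rho^\gamma}, and the internal energy density is {\rho e = \frac{p}{\gamma-1}}. A quick check using sympy in Python:

```python
import sympy as sp

rho, C, gamma = sp.symbols('rho C gamma', positive=True)

# functional relationship e = F(rho) for the isentropic ideal gas
F = C * rho**(gamma - 1) / (gamma - 1)

# equation of state (34): p = rho^2 F'(rho)
p = sp.simplify(rho**2 * F.diff(rho))
assert sp.simplify(p - C * rho**gamma) == 0

# internal energy density rho e = p / (gamma - 1)
assert sp.simplify(rho * F - p / (gamma - 1)) == 0
```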

Exercise 6 Suppose one has a (compressible or incompressible) fluid obeying the velocity approximation (10), the stress approximation (21), the isotropy condition (23), and the torque free condition (4). Assume also that all interactions are short range. Derive the angular momentum equation

\displaystyle D_t ( x_i u_j - x_j u_i ) = \frac{1}{\rho} ( \partial_i (x_j p) - \partial_j (x_i p ) )

for {i,j= 1,\dots,d} in two different ways:

  • (a) From a heuristic analysis of the angular momentum distribution

    \displaystyle \mu_{\mathrm{ang}, ij}(t) := \sum_a m^{(a)} ( x^{(a)}_i(t) \frac{d}{dt} x^{(a)}_j(t) - x^{(a)}_j(t) \frac{d}{dt} x^{(a)}_i(t)) \delta_{x^{(a)}(t)}

    analogous to how the mass, momentum, and energy distributions were analysed previously; and

  • (b) Directly from the system (24), (25).

  • (c) Conclude in particular that the total angular momentum

    \displaystyle \int_{{\bf R}^d} (x_i u_j(t,x) - x_j u_i(t,x)) \rho(t,x)\ dx

    is formally conserved in time.
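(For part (b) of the exercise, the underlying pointwise identity can also be confirmed symbolically; the following sympy computation in Python carries out the {d = 2}, {(i,j) = (1,2)} case, substituting the Euler equation (24) into the material derivative. This is a check of the algebra rather than a solution of the exercise.)

```python
import sympy as sp

t, x1, x2 = sp.symbols('t x1 x2')
u1 = sp.Function('u1')(t, x1, x2)
u2 = sp.Function('u2')(t, x1, x2)
p = sp.Function('p')(t, x1, x2)
rho = sp.Function('rho')(t, x1, x2)

# material derivative D_t = partial_t + u . nabla (note that D_t x_i = u_i)
def Dt(f):
    return sp.diff(f, t) + u1 * sp.diff(f, x1) + u2 * sp.diff(f, x2)

# impose the Euler equation (24): partial_t u_i = -(u . nabla) u_i - (1/rho) d_i p
euler = {
    sp.diff(u1, t): -(u1 * sp.diff(u1, x1) + u2 * sp.diff(u1, x2)) - sp.diff(p, x1) / rho,
    sp.diff(u2, t): -(u1 * sp.diff(u2, x1) + u2 * sp.diff(u2, x2)) - sp.diff(p, x2) / rho,
}

lhs = Dt(x1 * u2 - x2 * u1).subs(euler)
rhs = (sp.diff(x2 * p, x1) - sp.diff(x1 * p, x2)) / rho
assert sp.simplify(lhs - rhs) == 0
```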

Remark 7 In this set of notes we made the rather strong assumption (10) that the velocities of particles could be approximated by a smooth function {u(t,x)} of their time and position. In practice, most fluids will violate this hypothesis due to thermal fluctuations in the velocity. However, one can still proceed with a similar derivation assuming that the velocities behave on the average like a smooth function {u(t,x)}, in the sense that

\displaystyle \int_{\bf R} \sum_a m^{(a)} (\frac{d}{dt} x^{(a)}(t)) \psi(t,x^{(a)}(t))\ dt \ \ \ \ \ (35)


\displaystyle \approx \int_{\bf R} \int_{{\bf R}^d} \psi(t,x) u(t,x) \rho(t,x)\ dx dt

for any test function {\psi}. The approximation (13) now has to be replaced by the more general

\displaystyle \frac{d}{dt} \psi(t,x^{(a)}(t)) \approx (D_t \psi)(t,x^{(a)}(t)) + w^{(a)}(t) \cdot \nabla \psi(t, x^{(a)}(t))

where

\displaystyle w^{(a)}(t) := \frac{d}{dt} x^{(a)}(t) - u(t, x^{(a)}(t))

is the deviation of the particle velocity from its mean.

\displaystyle \int_{\bf R} \sum_a m^{(a)} (u(t,x^{(a)}(t)) + w^{(a)}(t)) \times \ \ \ \ \ (36)


\displaystyle \times (D_t \psi(t, x^{(a)}(t)) + w^{(a)}(t) \cdot \nabla \psi(t, x^{(a)}(t)))\ dt.

From (35), (9) one has

\displaystyle \int_{\bf R} \sum_a m^{(a)} w^{(a)}(t) \psi(t,x^{(a)}(t))\ dt \approx 0

for any test function {\psi}. This allows us to heuristically drop the cross terms from (36) involving a single factor of {w^{(a)}}, and simplify this expression (up to negligible errors) as

\displaystyle \int_{\bf R} \sum_a m^{(a)} u(t, x^{(a)}(t)) D_t \psi(t, x^{(a)}(t))\ dt

\displaystyle + \int_{\bf R} \sum_a m^{(a)} w^{(a)}(t) (w^{(a)}(t) \cdot \nabla) \psi(t, x^{(a)}(t))\ dt.

Repeating the analysis after (18), one eventually arrives again at (19), except that one has to add an additional term

\displaystyle - m^{(a)} w^{(a)}_i(t) w^{(a)}_j(t)

to the stress tensor {\Sigma^{(a)}_{ij}(t)} of a single particle {P^{(a)}}. However, this term is still symmetric, and one can still continue most of the heuristic analysis in this post after suitable adjustments to the various physical hypotheses (for instance, assuming some form of the molecular chaos hypothesis to be able to neglect some correlation terms between {w^{(a)}} and other quantities). We leave the details to the interested reader.