You are currently browsing the category archive for the ‘math.MP’ category.

This post is an unofficial sequel to one of my first blog posts from 2007, which was entitled “Quantum mechanics and Tomb Raider“.

One of the oldest and most famous allegories is Plato’s allegory of the cave. This allegory centers around a group of people chained to a wall in a cave that cannot see themselves or each other, but only the two-dimensional shadows of themselves cast on the wall in front of them by some light source they cannot directly see. Because of this, they identify reality with this two-dimensional representation, and have significant conceptual difficulties in trying to view themselves (or the world as a whole) as three-dimensional, until they are freed from the cave and able to venture into the sunlight.

There is a similar conceptual difficulty when trying to understand Einstein’s theory of special relativity (and more so for general relativity, but let us focus on special relativity for now). We are very much accustomed to thinking of reality as a three-dimensional space endowed with a Euclidean geometry that we traverse through in time, but in order to have the clearest view of the universe of special relativity it is better to think of reality instead as a four-dimensional spacetime that is endowed instead with a Minkowski geometry, which mathematically is similar to a (four-dimensional) Euclidean space but with a crucial change of sign in the underlying metric. Indeed, whereas the distance between two points in Euclidean space is given by the three-dimensional Pythagorean theorem

under some standard Cartesian coordinate system of that space, and the distance in a four-dimensional Euclidean space would be similarly given by under a standard four-dimensional Cartesian coordinate system , the spacetime interval in Minkowski space is given by (though in many texts the opposite sign convention is preferred) in spacetime coordinates , where is the speed of light. The geometry of Minkowski space is then quite similar algebraically to the geometry of Euclidean space (with the sign change replacing the traditional trigonometric functions , etc. by their hyperbolic counterparts , and with various factors involving “” inserted in the formulae), but also has some qualitative differences to Euclidean space, most notably a causality structure connected to light cones that has no obvious counterpart in Euclidean space.That said, the analogy between Minkowski space and four-dimensional Euclidean space is strong enough that it serves as a useful conceptual aid when first learning special relativity; for instance the excellent introductory text “Spacetime physics” by Taylor and Wheeler very much adopts this view. On the other hand, this analogy doesn’t directly address the conceptual problem mentioned earlier of viewing reality as a four-dimensional spacetime in the first place, rather than as a three-dimensional space that objects move around in as time progresses. Of course, part of the issue is that we aren’t good at directly visualizing four dimensions in the first place. This latter problem can at least be easily addressed by removing one or two spatial dimensions from this framework – and indeed many relativity texts start with the simplified setting of only having one spatial dimension, so that spacetime becomes two-dimensional and can be depicted with relative ease by spacetime diagrams – but still there is conceptual resistance to the idea of treating time as another spatial dimension, since we clearly cannot “move around” in time as freely as we can in space, nor do we seem able to easily “rotate” between the spatial and temporal axes, the way that we can between the three coordinate axes of Euclidean space.

With this in mind, I thought it might be worth attempting a Plato-type allegory to reconcile the spatial and spacetime views of reality, in a way that can be used to describe (analogues of) some of the less intuitive features of relativity, such as time dilation, length contraction, and the relativity of simultaneity. I have (somewhat whimsically) decided to place this allegory in a Tolkienesque fantasy world (similarly to how my previous allegory to describe quantum mechanics was phrased in a world based on the computer game “Tomb Raider”). This is something of an experiment, and (like any other analogy) the allegory will not be able to perfectly capture every aspect of the phenomenon it is trying to represent, so any feedback to improve the allegory would be appreciated.

Let be a divergence-free vector field, thus , which we interpret as a velocity field. In this post we will proceed formally, largely ignoring the analytic issues of whether the fields in question have sufficient regularity and decay to justify the calculations. The vorticity field is then defined as the curl of the velocity:

(From a differential geometry viewpoint, it would be more accurate (especially in other dimensions than three) to define the vorticity as the exterior derivative of the musical isomorphism of the Euclidean metric applied to the velocity field ; see these previous lecture notes. However, we will not need this geometric formalism in this post.)

Assuming suitable regularity and decay hypotheses of the velocity field , it is possible to recover the velocity from the vorticity as follows. From the general vector identity applied to the velocity field , we see that

and thus (by the commutativity of all the differential operators involved)

Using the Newton potential formula

and formally differentiating under the integral sign, we obtain the Biot-Savart law

This law is of fundamental importance in the study of incompressible fluid equations, such as the Euler equations

since on applying the curl operator one obtains the vorticity equation

and then by substituting (1) one gets an autonomous equation for the vorticity field . Unfortunately, this equation is non-local, due to the integration present in (1).

In a recent work, it was observed by Elgindi that in a certain regime, the Biot-Savart law can be approximated by a more “low rank” law, which makes the non-local effects significantly simpler in nature. This simplification was carried out in spherical coordinates, and hinged on a study of the invertibility properties of a certain second order linear differential operator in the latitude variable ; however in this post I would like to observe that the approximation can also be seen directly in Cartesian coordinates from the classical Biot-Savart law (1). As a consequence one can also initiate the beginning of Elgindi’s analysis in constructing somewhat regular solutions to the Euler equations that exhibit self-similar blowup in finite time, though I have not attempted to execute the entirety of the analysis in this setting.

Elgindi’s approximation applies under the following hypotheses:

- (i) (Axial symmetry without swirl) The velocity field is assumed to take the form
for some functions of the cylindrical radial variable and the vertical coordinate . As a consequence, the vorticity field takes the form

- (ii) (Odd symmetry) We assume that and , so that .

A model example of a divergence-free vector field obeying these properties (but without good decay at infinity) is the linear vector field

which is of the form (3) with and . The associated vorticity vanishes.

We can now give an illustration of Elgindi’s approximation:

Proposition 1 (Elgindi’s approximation)Under the above hypotheses (and assuing suitable regularity and decay), we have the pointwise boundsfor any , where is the vector field (5), and is the scalar function

Thus under the hypotheses (i), (ii), and assuming that is slowly varying, we expect to behave like the linear vector field modulated by a radial scalar function. In applications one needs to control the error in various function spaces instead of pointwise, and with similarly controlled in other function space norms than the norm, but this proposition already gives a flavour of the approximation. If one uses spherical coordinates

then we have (using the spherical change of variables formula and the odd nature of )

where

is the operator introduced in Elgindi’s paper.

*Proof:* By a limiting argument we may assume that is non-zero, and we may normalise . From the triangle inequality we have

and hence by (1)

In the regime we may perform the Taylor expansion

Since

we see from the triangle inequality that the error term contributes to . We thus have

where is the constant term

and are the linear term

By the hypotheses (i), (ii), we have the symmetries

The even symmetry (8) ensures that the integrand in is odd, so vanishes. The symmetry (6) or (7) similarly ensures that , so vanishes. Since , we conclude that

Using (4), the right-hand side is

where . Because of the odd nature of , only those terms with one factor of give a non-vanishing contribution to the integral. Using the rotation symmetry we also see that any term with a factor of also vanishes. We can thus simplify the above expression as

Using the rotation symmetry again, we see that the term in the first component can be replaced by or by , and similarly for the term in the second component. Thus the above expression is

giving the claim.

Example 2Consider the divergence-free vector field , where the vector potential takes the formfor some bump function supported in . We can then calculate

and

In particular the hypotheses (i), (ii) are satisfied with

One can then calculate

If we take the specific choice

where is a fixed bump function supported some interval and is a small parameter (so that is spread out over the range ), then we see that

(with implied constants allowed to depend on ),

and

which is completely consistent with Proposition 1.

One can use this approximation to extract a plausible ansatz for a self-similar blowup to the Euler equations. We let be a small parameter and let be a time-dependent vorticity field obeying (i), (ii) of the form

where and is a smooth field to be chosen later. Admittedly the signum function is not smooth at , but let us ignore this issue for now (to rigorously make an ansatz one will have to smooth out this function a little bit; Elgindi uses the choice , where ). With this ansatz one may compute

By Proposition 1, we thus expect to have the approximation

We insert this into the vorticity equation (2). The transport term will be expected to be negligible because , and hence , is slowly varying (the discontinuity of will not be encountered because the vector field is parallel to this singularity). The modulating function is similarly slowly varying, so derivatives falling on this function should be lower order. Neglecting such terms, we arrive at the approximation

and so in the limit we expect obtain a simple model equation for the evolution of the vorticity envelope :

If we write for the logarithmic primitive of , then we have and hence

which integrates to the Ricatti equation

which can be explicitly solved as

where is any function of that one pleases. (In Elgindi’s work a time dilation is used to remove the unsightly factor of appearing here in the denominator.) If for instance we set , we obtain the self-similar solution

and then on applying

Thus, we expect to be able to construct a self-similar blowup to the Euler equations with a vorticity field approximately behaving like

and velocity field behaving like

In particular, would be expected to be of regularity (and smooth away from the origin), and blows up in (say) norm at time , and one has the self-similarity

and

A self-similar solution of this approximate shape is in fact constructed rigorously in Elgindi’s paper (using spherical coordinates instead of the Cartesian approach adopted here), using a nonlinear stability analysis of the above ansatz. It seems plausible that one could also carry out this stability analysis using this Cartesian coordinate approach, although I have not tried to do this in detail.

I was recently asked to contribute a short comment to Nature Reviews Physics, as part of a series of articles on fluid dynamics on the occasion of the 200th anniversary (this August) of the birthday of George Stokes. My contribution is now online as “Searching for singularities in the Navier–Stokes equations“, where I discuss the global regularity problem for Navier-Stokes and my thoughts on how one could try to construct a solution that blows up in finite time via an approximately discretely self-similar “fluid computer”. (The rest of the series does not currently seem to be available online, but I expect they will become so shortly.)

This coming fall quarter, I am teaching a class on topics in the mathematical theory of incompressible fluid equations, focusing particularly on the incompressible Euler and Navier-Stokes equations. These two equations are by no means the only equations used to model fluids, but I will focus on these two equations in this course to narrow the focus down to something manageable. I have not fully decided on the choice of topics to cover in this course, but I would probably begin with some core topics such as local well-posedness theory and blowup criteria, conservation laws, and construction of weak solutions, then move on to some topics such as boundary layers and the Prandtl equations, the Euler-Poincare-Arnold interpretation of the Euler equations as an infinite dimensional geodesic flow, and some discussion of the Onsager conjecture. I will probably also continue to more advanced and recent topics in the winter quarter.

In this initial set of notes, we begin by reviewing the physical derivation of the Euler and Navier-Stokes equations from the first principles of Newtonian mechanics, and specifically from Newton’s famous three laws of motion. Strictly speaking, this derivation is not needed for the mathematical analysis of these equations, which can be viewed if one wishes as an arbitrarily chosen system of partial differential equations without any physical motivation; however, I feel that the derivation sheds some insight and intuition on these equations, and is also worth knowing on purely intellectual grounds regardless of its mathematical consequences. I also find it instructive to actually see the journey from Newton’s law

to the seemingly rather different-looking law

for incompressible Navier-Stokes (or, if one drops the viscosity term , the Euler equations).

Our discussion in this set of notes is physical rather than mathematical, and so we will not be working at mathematical levels of rigour and precision. In particular we will be fairly casual about interchanging summations, limits, and integrals, we will manipulate approximate identities as if they were exact identities (e.g., by differentiating both sides of the approximate identity), and we will not attempt to verify any regularity or convergence hypotheses in the expressions being manipulated. (The same holds for the exercises in this text, which also do not need to be justified at mathematical levels of rigour.) Of course, once we resume the mathematical portion of this course in subsequent notes, such issues will be an important focus of careful attention. This is a basic division of labour in mathematical modeling: non-rigorous heuristic reasoning is used to derive a mathematical model from physical (or other “real-life”) principles, but once a precise model is obtained, the analysis of that model should be completely rigorous if at all possible (even if this requires applying the model to regimes which do not correspond to the original physical motivation of that model). See the discussion by John Ball quoted at the end of these slides of Gero Friesecke for an expansion of these points.

Note: our treatment here will differ slightly from that presented in many fluid mechanics texts, in that it will emphasise first-principles derivations from many-particle systems, rather than relying on bulk laws of physics, such as the laws of thermodynamics, which we will not cover here. (However, the derivations from bulk laws tend to be more robust, in that they are not as reliant on assumptions about the particular interactions between particles. In particular, the physical hypotheses we assume in this post are probably quite a bit stronger than the minimal assumptions needed to justify the Euler or Navier-Stokes equations, which can hold even in situations in which one or more of the hypotheses assumed here break down.)

The 2014 Fields medallists have just been announced as (in alphabetical order of surname) Artur Avila, Manjul Bhargava, Martin Hairer, and Maryam Mirzakhani (see also these nice video profiles for the winners, which is a new initiative of the IMU and the Simons foundation). This time four years ago, I wrote a blog post discussing one result from each of the 2010 medallists; I thought I would try to repeat the exercise here, although the work of the medallists this time around is a little bit further away from my own direct area of expertise than last time, and so my discussion will unfortunately be a bit superficial (and possibly not completely accurate) in places. As before, I am picking these results based on my own idiosyncratic tastes, and they should not be viewed as necessarily being the “best” work of these medallists. (See also the press releases for Avila, Bhargava, Hairer, and Mirzakhani.)

Artur Avila works in dynamical systems and in the study of Schrödinger operators. The work of Avila that I am most familiar with is his solution with Svetlana Jitormiskaya of the *ten martini problem* of Kac, the solution to which (according to Barry Simon) he offered ten martinis for, hence the name. (The problem had also been previously posed in the work of Azbel and of Hofstadter.) The problem involves perhaps the simplest example of a Schrödinger operator with non-trivial spectral properties, namely the almost Mathieu operator defined for parameters and by a discrete one-dimensional Schrödinger operator with cosine potential:

This is a bounded self-adjoint operator and thus has a spectrum that is a compact subset of the real line; it arises in a number of physical contexts, most notably in the theory of the integer quantum Hall effect, though I will not discuss these applications here. Remarkably, the structure of this spectrum depends crucially on the Diophantine properties of the frequency . For instance, if is a rational number, then the operator is periodic with period , and then basic (discrete) Floquet theory tells us that the spectrum is simply the union of (possibly touching) intervals. But for irrational (in which case the spectrum is independent of the phase ), the situation is much more fractal in nature, for instance in the critical case the spectrum (as a function of ) gives rise to the Hofstadter butterfly. The “ten martini problem” asserts that for *every* irrational and every choice of coupling constant , the spectrum is homeomorphic to a Cantor set. Prior to the work of Avila and Jitormiskaya, there were a number of partial results on this problem, notably the result of Puig establishing Cantor spectrum for a full measure set of parameters , as well as results requiring a perturbative hypothesis, such as being very small or very large. The result was also already known for being either very close to rational (i.e. a Liouville number) or very far from rational (a Diophantine number), although the analyses for these two cases failed to meet in the middle, leaving some cases untreated. The argument uses a wide variety of existing techniques, both perturbative and non-perturbative, to attack this problem, as well as an amusing argument by contradiction: they assume (in certain regimes) that the spectrum *fails* to be a Cantor set, and use this hypothesis to obtain additional Lipschitz control on the spectrum (as a function of the frequency ), which they can then use (after much effort) to improve existing arguments and conclude that the spectrum was in fact Cantor after all!

Manjul Bhargava produces amazingly beautiful mathematics, though most of it is outside of my own area of expertise. One part of his work that touches on an area of my own interest (namely, random matrix theory) is his ongoing work with many co-authors on modeling (both conjecturally and rigorously) the statistics of various key number-theoretic features of elliptic curves (such as their rank, their Selmer group, or their Tate-Shafarevich groups). For instance, with Kane, Lenstra, Poonen, and Rains, Manjul has proposed a very general random matrix model that predicts all of these statistics (for instance, predicting that the -component of the Tate-Shafarevich group is distributed like the cokernel of a certain random -adic matrix, very much in the spirit of the Cohen-Lenstra heuristics discussed in this previous post). But what is even more impressive is that Manjul and his coauthors have been able to *verify* several non-trivial fragments of this model (e.g. showing that certain moments have the predicted asymptotics), giving for the first time non-trivial upper and lower bounds for various statistics, for instance obtaining lower bounds on how often an elliptic curve has rank or rank , leading most recently (in combination with existing work of Gross-Zagier and of Kolyvagin, among others) to his amazing result with Skinner and Zhang that at least of all elliptic curves over (ordered by height) obey the Birch and Swinnerton-Dyer conjecture. Previously it was not even known that a positive proportion of curves obeyed the conjecture. This is still a fair ways from resolving the conjecture fully (in particular, the situation with the presumably small number of curves of rank and higher is still very poorly understood, and the theory of Gross-Zagier and Kolyvagin that this work relies on, which was initially only available for , has only been extended to totally real number fields thus far, by the work of Zhang), but it certainly does provide hope that the conjecture could be within reach in a statistical sense at least.

Martin Hairer works in at the interface between probability and partial differential equations, and in particular in the theory of stochastic differential equations (SDEs). The result of his that is closest to my own interests is his remarkable demonstration with Jonathan Mattingly of unique invariant measure for the two-dimensional stochastically forced Navier-Stokes equation

on the two-torus , where is a Gaussian field that forces a fixed set of frequencies. It is expected that for any reasonable choice of initial data, the solution to this equation should asymptotically be distributed according to Kolmogorov’s power law, as discussed in this previous post. This is still far from established rigorously (although there are some results in this direction for dyadic models, see e.g. this paper of Cheskidov, Shvydkoy, and Friedlander). However, Hairer and Mattingly were able to show that there was a unique probability distribution to almost every initial data would converge to asymptotically; by the ergodic theorem, this is equivalent to demonstrating the existence and uniqueness of an invariant measure for the flow. Existence can be established using standard methods, but uniqueness is much more difficult. One of the standard routes to uniqueness is to establish a “strong Feller property” that enforces some continuity on the transition operators; among other things, this would mean that two ergodic probability measures with intersecting supports would in fact have a non-trivial common component, contradicting the ergodic theorem (which forces different ergodic measures to be mutually singular). Since all ergodic measures for Navier-Stokes can be seen to contain the origin in their support, this would give uniqueness. Unfortunately, the strong Feller property is unlikely to hold in the infinite-dimensional phase space for Navier-Stokes; but Hairer and Mattingly develop a clean abstract substitute for this property, which they call the *asymptotic strong Feller* property, which is again a regularity property on the transition operator; this in turn is then demonstrated by a careful application of Malliavin calculus.

Maryam Mirzakhani has mostly focused on the geometry and dynamics of Teichmuller-type moduli spaces, such as the moduli space of Riemann surfaces with a fixed genus and a fixed number of cusps (or with a fixed number of boundaries that are geodesics of a prescribed length). These spaces have an incredibly rich structure, ranging from geometric structure (such as the Kahler geometry given by the Weil-Petersson metric), to dynamical structure (through the action of the mapping class group on this and related spaces), to algebraic structure (viewing these spaces as algebraic varieties), and are thus connected to many other objects of interest in geometry and dynamics. For instance, by developing a new recursive formula for the Weil-Petersson volume of this space, Mirzakhani was able to asymptotically count the number of *simple* prime geodesics of length up to some threshold in a hyperbolic surface (or more precisely, she obtained asymptotics for the number of such geodesics in a given orbit of the mapping class group); the answer turns out to be polynomial in , in contrast to the much larger class of *non-simple* prime geodesics, whose asymptotics are exponential in (the “prime number theorem for geodesics”, developed in a classic series of works by Delsart, Huber, Selberg, and Margulis); she also used this formula to establish a new proof of a conjecture of Witten on intersection numbers that was first proven by Kontsevich. More recently, in two lengthy papers with Eskin and with Eskin-Mohammadi, Mirzakhani established rigidity theorems for the action of on such moduli spaces that are close analogues of Ratner’s celebrated rigidity theorems for unipotently generated groups (discussed in this previous blog post). Ratner’s theorems are already notoriously difficult to prove, and rely very much on the polynomial stability properties of unipotent flows; in this even more complicated setting, the unipotent flows are no longer tractable, and Mirzakhani instead uses a recent “exponential drift” method of Benoist and Quint as a substitute. Ratner’s theorems are incredibly useful for all sorts of problems connected to homogeneous dynamics, and the analogous theorems established by Mirzakhani, Eskin, and Mohammadi have a similarly broad range of applications, for instance in counting periodic billiard trajectories in rational polygons.

Many fluid equations are expected to exhibit turbulence in their solutions, in which a significant portion of their energy ends up in high frequency modes. A typical example arises from the three-dimensional periodic Navier-Stokes equations

where is the velocity field, is a forcing term, is a pressure field, and is the viscosity. To study the dynamics of energy for this system, we first pass to the Fourier transform

so that the system becomes

We may normalise (and ) to have mean zero, so that . Then we introduce the dyadic energies

where ranges over the powers of two, and is shorthand for . Taking the inner product of (1) with , we obtain the energy flow equation

where range over powers of two, is the energy flow rate

is the energy dissipation rate

and is the energy injection rate

The Navier-Stokes equations are notoriously difficult to solve in general. Despite this, Kolmogorov in 1941 was able to give a convincing heuristic argument for what the distribution of the dyadic energies should become over long times, assuming that some sort of distributional steady state is reached. It is common to present this argument in the form of dimensional analysis, but one can also give a more “first principles” form Kolmogorov’s argument, which I will do here. Heuristically, one can divide the frequency scales into three regimes:

- The
*injection regime*in which the energy injection rate dominates the right-hand side of (2); - The
*energy flow regime*in which the flow rates dominate the right-hand side of (2); and - The
*dissipation regime*in which the dissipation dominates the right-hand side of (2).

If we assume a fairly steady and smooth forcing term , then will be supported on the low frequency modes , and so we heuristically expect the injection regime to consist of the low scales . Conversely, if we take the viscosity to be small, we expect the dissipation regime to only occur for very large frequencies , with the energy flow regime occupying the intermediate frequencies.

We can heuristically predict the dividing line between the energy flow regime. Of all the flow rates , it turns out in practice that the terms in which (i.e., interactions between comparable scales, rather than widely separated scales) will dominate the other flow rates, so we will focus just on these terms. It is convenient to return back to physical space, decomposing the velocity field into Littlewood-Paley components

of the velocity field at frequency . By Plancherel’s theorem, this field will have an norm of , and as a naive model of turbulence we expect this field to be spread out more or less uniformly on the torus, so we have the heuristic

and a similar heuristic applied to gives

(One can consider modifications of the Kolmogorov model in which is concentrated on a lower-dimensional subset of the three-dimensional torus, leading to some changes in the numerology below, but we will not consider such variants here.) Since

we thus arrive at the heuristic

Of course, there is the possibility that due to significant cancellation, the energy flow is significantly less than , but we will assume that cancellation effects are not that significant, so that we typically have

or (assuming that does not oscillate too much in , and are close to )

On the other hand, we clearly have

We thus expect to be in the dissipation regime when

and in the energy flow regime when

Now we study the energy flow regime further. We assume a “statistically scale-invariant” dynamics in this regime, in particular assuming a power law

for some . From (3), we then expect an average asymptotic of the form

for some structure constants that depend on the exact nature of the turbulence; here we have replaced the factor by the comparable term to make things more symmetric. In order to attain a steady state in the energy flow regime, we thus need a cancellation in the structure constants:

On the other hand, if one is assuming statistical scale invariance, we expect the structure constants to be scale-invariant (in the energy flow regime), in that

for dyadic . Also, since the Euler equations conserve energy, the energy flows symmetrise to zero,

which from (7) suggests a similar cancellation among the structure constants

Combining this with the scale-invariance (9), we see that for fixed , we may organise the structure constants for dyadic into sextuples which sum to zero (including some degenerate tuples of order less than six). This will *automatically* guarantee the cancellation (8) required for a steady state energy distribution, provided that

or in other words

for any other value of , there is no particular reason to expect this cancellation (8) to hold. Thus we are led to the heuristic conclusion that the most stable power law distribution for the energies is the law

or in terms of shell energies, we have the famous Kolmogorov 5/3 law

Given that frequency interactions tend to cascade from low frequencies to high (if only because there are so many more high frequencies than low ones), the above analysis predicts a stablising effect around this power law: scales at which a law (6) holds for some are likely to lose energy in the near-term, while scales at which a law (6) hold for some are conversely expected to gain energy, thus nudging the exponent of power law towards .

We can solve for in terms of energy dissipation as follows. If we let be the frequency scale demarcating the transition from the energy flow regime (5) to the dissipation regime (4), we have

and hence by (10)

On the other hand, if we let be the energy dissipation at this scale (which we expect to be the dominant scale of energy dissipation), we have

Some simple algebra then lets us solve for and as

and

Thus, we have the Kolmogorov prediction

for

with energy dissipation occuring at the high end of this scale, which is counterbalanced by the energy injection at the low end of the scale.

As in the previous post, all computations here are at the formal level only.

In the previous blog post, the Euler equations for inviscid incompressible fluid flow were interpreted in a Lagrangian fashion, and then Noether’s theorem invoked to derive the known conservation laws for these equations. In a bit more detail: starting with *Lagrangian space* and *Eulerian space* , we let be the space of volume-preserving, orientation-preserving maps from Lagrangian space to Eulerian space. Given a curve , we can define the *Lagrangian velocity field* as the time derivative of , and the *Eulerian velocity field* . The volume-preserving nature of ensures that is a divergence-free vector field:

If we formally define the functional

then one can show that the critical points of this functional (with appropriate boundary conditions) obey the Euler equations

for some pressure field . As discussed in the previous post, the time translation symmetry of this functional yields conservation of the Hamiltonian

the rigid motion symmetries of Eulerian space give conservation of the total momentum

and total angular momentum

and the diffeomorphism symmetries of Lagrangian space give conservation of circulation

for any closed loop in , or equivalently pointwise conservation of the Lagrangian vorticity , where is the -form associated with the vector field using the Euclidean metric on , with denoting pullback by .

It turns out that one can generalise the above calculations. Given any self-adjoint operator on divergence-free vector fields , we can define the functional

as we shall see below the fold, critical points of this functional (with appropriate boundary conditions) obey the generalised Euler equations

for some pressure field , where in coordinates is with the usual summation conventions. (When , , and this term can be absorbed into the pressure , and we recover the usual Euler equations.) Time translation symmetry then gives conservation of the Hamiltonian

If the operator commutes with rigid motions on , then we have conservation of total momentum

and total angular momentum

and the diffeomorphism symmetries of Lagrangian space give conservation of circulation

or pointwise conservation of the Lagrangian vorticity . These applications of Noether’s theorem proceed exactly as the previous post; we leave the details to the interested reader.

One particular special case of interest arises in two dimensions , when is the inverse derivative . The vorticity is a -form, which in the two-dimensional setting may be identified with a scalar. In coordinates, if we write , then

Since is also divergence-free, we may therefore write

where the stream function is given by the formula

If we take the curl of the generalised Euler equation (2), we obtain (after some computation) the surface quasi-geostrophic equation

This equation has strong analogies with the three-dimensional incompressible Euler equations, and can be viewed as a simplified model for that system; see this paper of Constantin, Majda, and Tabak for details.

Now we can specialise the general conservation laws derived previously to this setting. The conserved Hamiltonian is

(a law previously observed for this equation in the abovementioned paper of Constantin, Majda, and Tabak). As commutes with rigid motions, we also have (formally, at least) conservation of momentum

(which up to trivial transformations is also expressible in impulse form as , after integration by parts), and conservation of angular momentum

(which up to trivial transformations is ). Finally, diffeomorphism invariance gives pointwise conservation of Lagrangian vorticity , thus is transported by the flow (which is also evident from (3). In particular, all integrals of the form for a fixed function are conserved by the flow.

Mathematicians study a variety of different mathematical structures, but perhaps the structures that are most commonly associated with mathematics are the *number systems*, such as the integers or the real numbers . Indeed, the use of number systems is so closely identified with the practice of mathematics that one sometimes forgets that it is possible to do mathematics without explicit reference to any concept of number. For instance, the ancient Greeks were able to prove many theorems in Euclidean geometry, well before the development of Cartesian coordinates and analytic geometry in the seventeenth century, or the formal constructions or axiomatisations of the real number system that emerged in the nineteenth century (not to mention precursor concepts such as zero or negative numbers, whose very existence was highly controversial, if entertained at all, to the ancient Greeks). To do this, the Greeks used geometric operations as substitutes for the arithmetic operations that would be more familiar to modern mathematicians. For instance, concatenation of line segments or planar regions serves as a substitute for addition; the operation of forming a rectangle out of two line segments would serve as a substitute for multiplication; the concept of similarity can be used as a substitute for ratios or division; and so forth.

A similar situation exists in modern physics. Physical quantities such as length, mass, momentum, charge, and so forth are routinely measured and manipulated using the real number system (or related systems, such as if one wishes to measure a vector-valued physical quantity such as velocity). Much as analytic geometry allows one to use the laws of algebra and trigonometry to calculate and prove theorems in geometry, the identification of physical quantities with numbers allows one to express physical laws and relationships (such as Einstein’s famous mass-energy equivalence ) as algebraic (or differential) equations, which can then be solved and otherwise manipulated through the extensive mathematical toolbox that has been developed over the centuries to deal with such equations.

However, as any student of physics is aware, most physical quantities are not represented *purely* by one or more numbers, but instead by a combination of a number and some sort of *unit*. For instance, it would be a category error to assert that the length of some object was a number such as ; instead, one has to say something like “the length of this object is yards”, combining both a number and a unit (in this case, the yard). Changing the unit leads to a change in the numerical value assigned to this physical quantity, even though no physical change to the object being measured has occurred. For instance, if one decides to use feet as the unit of length instead of yards, then the length of the object is now feet; if one instead uses metres, the length is now metres; and so forth. But nothing physical has changed when performing this change of units, and these lengths are considered all equal to each other:

It is then common to declare that while physical quantities and units are not, strictly speaking, numbers, they should be manipulated using the laws of algebra *as if* they were numerical quantities. For instance, if an object travels metres in seconds, then its speed should be

where we use the usual abbreviations of and for metres and seconds respectively. Similarly, if the speed of light is and an object has mass , then Einstein’s mass-energy equivalence then tells us that the energy-content of this object is

Note that the symbols are being manipulated algebraically as if they were mathematical variables such as and . By collecting all these units together, we see that every physical quantity gets assigned a unit of a certain *dimension*: for instance, we see here that the energy of an object can be given the unit of (more commonly known as a Joule), which has the dimension of where are the dimensions of mass, length, and time respectively.

There is however one important limitation to the ability to manipulate “dimensionful” quantities as if they were numbers: one is not supposed to add, subtract, or compare two physical quantities if they have different dimensions, although it is acceptable to multiply or divide two such quantities. For instance, if is a mass (having the units ) and is a speed (having the units ), then it is physically “legitimate” to form an expression such as , but not an expression such as or ; in a similar spirit, statements such as or are physically meaningless. This combines well with the mathematical distinction between vector, scalar, and matrix quantities, which among other things prohibits one from adding together two such quantities if their vector or matrix type are different (e.g. one cannot add a scalar to a vector, or a vector to a matrix), and also places limitations on when two such quantities can be multiplied together. A related limitation, which is not always made explicit in physics texts, is that transcendental mathematical functions such as or should only be applied to arguments that are *dimensionless*; thus, for instance, if is a speed, then is not physically meaningful, but is (this particular quantity is known as the rapidity associated to this speed).

These limitations may seem like a weakness in the mathematical modeling of physical quantities; one may think that one could get a more “powerful” mathematical framework if one were allowed to perform dimensionally inconsistent operations, such as add together a mass and a velocity, add together a vector and a scalar, exponentiate a length, etc. Certainly there is some precedent for this in mathematics; for instance, the formalism of Clifford algebras does in fact allow one to (among other things) add vectors with scalars, and in differential geometry it is quite common to formally apply transcendental functions (such as the exponential function) to a differential form (for instance, the Liouville measure of a symplectic manifold can be usefully thought of as a component of the exponential of the symplectic form ).

However, there are several reasons why it is advantageous to retain the limitation to only perform dimensionally consistent operations. One is that of error correction: one can often catch (and correct for) errors in one’s calculations by discovering a dimensional inconsistency, and tracing it back to the first step where it occurs. Also, by performing dimensional analysis, one can often identify the form of a physical law before one has fully derived it. For instance, if one postulates the existence of a mass-energy relationship involving only the mass of an object , the energy content , and the speed of light , dimensional analysis is already sufficient to deduce that the relationship must be of the form for some dimensionless absolute constant ; the only remaining task is then to work out the constant of proportionality , which requires physical arguments beyond that provided by dimensional analysis. (This is a simple instance of a more general application of dimensional analysis known as the Buckingham theorem.)

The use of units and dimensional analysis has certainly been proven to be very effective tools in physics. But one can pose the question of whether it has a properly grounded mathematical foundation, in order to settle any lingering unease about using such tools in physics, and also in order to rigorously develop such tools for purely mathematical purposes (such as analysing identities and inequalities in such fields of mathematics as harmonic analysis or partial differential equations).

The example of Euclidean geometry mentioned previously offers one possible approach to formalising the use of dimensions. For instance, one could model the length of a line segment not by a number, but rather by the equivalence class of all line segments congruent to the original line segment (cf. the Frege-Russell definition of a number). Similarly, the area of a planar region can be modeled not by a number, but by the equivalence class of all regions that are equidecomposable with the original region (one can, if one wishes, restrict attention here to measurable sets in order to avoid Banach-Tarski-type paradoxes, though that particular paradox actually only arises in three and higher dimensions). As mentioned before, it is then geometrically natural to multiply two lengths to form an area, by taking a rectangle whose line segments have the stated lengths, and using the area of that rectangle as a product. This geometric picture works well for units such as length and volume that have a spatial geometric interpretation, but it is less clear how to apply it for more general units. For instance, it does not seem geometrically natural (or, for that matter, conceptually helpful) to envision the equation as the assertion that the energy is the volume of a rectangular box whose height is the mass and whose length and width is given by the speed of light .

But there are at least two other ways to formalise dimensionful quantities in mathematics, which I will discuss below the fold. The first is a “parametric” model in which dimensionful objects are modeled as numbers (or vectors, matrices, etc.) depending on some base dimensional parameters (such as units of length, mass, and time, or perhaps a coordinate system for space or spacetime), and transforming according to some representation of a *structure group* that encodes the range of these parameters; this type of “coordinate-heavy” model is often used (either implicitly or explicitly) by physicists in order to efficiently perform calculations, particularly when manipulating vector or tensor-valued quantities. The second is an “abstract” model in which dimensionful objects now live in an abstract mathematical space (e.g. an abstract vector space), in which only a subset of the operations available to general-purpose number systems such as or are available, namely those operations which are “dimensionally consistent” or invariant (or more precisely, equivariant) with respect to the action of the underlying structure group. This sort of “coordinate-free” approach tends to be the one which is preferred by pure mathematicians, particularly in the various branches of modern geometry, in part because it can lead to greater conceptual clarity, as well as results of great generality; it is also close to the more informal practice of treating mathematical manipulations that do not preserve dimensional consistency as being physically meaningless.

Things are pretty quiet here during the holiday season, but one small thing I have been working on recently is a set of notes on special relativity that I will be working through in a few weeks with some bright high school students here at our local math circle. I have only two hours to spend with this group, and it is unlikely that we will reach the end of the notes (in which I derive the famous mass-energy equivalence relation E=mc^2, largely following Einstein’s original derivation as discussed in this previous blog post); instead we will probably spend a fair chunk of time on related topics which do not actually require special relativity *per se*, such as spacetime diagrams, the Doppler shift effect, and an analysis of my airport puzzle. This will be my first time doing something of this sort (in which I will be spending as much time interacting directly with the students as I would lecturing); I’m not sure exactly how it will play out, being a little outside of my usual comfort zone of undergraduate and graduate teaching, but am looking forward to finding out how it goes. (In particular, it may end up that the discussion deviates somewhat from my prepared notes.)

The material covered in my notes is certainly not new, but I ultimately decided that it was worth putting up here in case some readers here had any corrections or other feedback to contribute (which, as always, would be greatly appreciated).

*[Dec 24 and then Jan 21: notes updated, in response to comments.]*

Way back in 2007, I wrote a blog post giving Einstein’s derivation of his famous equation for the rest energy of a body with mass . (Throughout this post, mass is used to refer to the invariant mass (also known as *rest mass*) of an object.) This derivation used a number of physical assumptions, including the following:

- The two postulates of special relativity: firstly, that the laws of physics are the same in every inertial reference frame, and secondly that the speed of light in vacuum is equal in every such inertial frame.
- Planck’s relation and de Broglie’s law for photons, relating the frequency, energy, and momentum of such photons together.
- The law of conservation of energy, and the law of conservation of momentum, as well as the additivity of these quantities (i.e. the energy of a system is the sum of the energy of its components, and similarly for momentum).
- The Newtonian approximations , to energy and momentum at low velocities.

The argument was one-dimensional in nature, in the sense that only one of the three spatial dimensions was actually used in the proof.

As was pointed out in comments in the previous post by Laurens Gunnarsen, this derivation has the curious feature of needing some laws from quantum mechanics (specifically, the Planck and de Broglie laws) in order to derive an equation in special relativity (which does not ostensibly require quantum mechanics). One can then ask whether one can give a derivation that does not require such laws. As pointed out in previous comments, one can use the representation theory of the Lorentz group to give a nice derivation that avoids any quantum mechanics, but it now needs at least two spatial dimensions instead of just one. I decided to work out this derivation in a way that does not explicitly use representation theory (although it is certainly lurking beneath the surface). The concept of momentum is only barely used in this derivation, and the main ingredients are now reduced to the following:

- The two postulates of special relativity;
- The law of conservation of energy (and the additivity of energy);
- The Newtonian approximation at low velocities.

The argument (which uses a little bit of calculus, but is otherwise elementary) is given below the fold. Whereas Einstein’s original argument considers a mass emitting two photons in several different reference frames, the argument here considers a large mass breaking up into two equal smaller masses. Viewing this situation in different reference frames gives a functional equation for the relationship between energy, mass, and velocity, which can then be solved using some calculus, using the Newtonian approximation as a boundary condition, to give the famous formula.

*Disclaimer:* As with the previous post, the arguments here are physical arguments rather than purely mathematical ones, and thus do not really qualify as a rigorous mathematical argument, due to the implicit use of a number of physical and metaphysical hypotheses beyond the ones explicitly listed above. (But it would be difficult to say anything non-tautological at all about the physical world if one could rely solely on rigorous mathematical reasoning.)

## Recent Comments