Einstein’s derivation of E=mc^2 revisited

2 October, 2012 in expository, math.MP, math.RT | Tags: Albert Einstein, mass-energy equivalence, special relativity | by Terence Tao

Way back in 2007, I wrote a blog post giving Einstein’s derivation of his famous equation ${E=mc^2}$ for the rest energy of a body with mass ${m}$ . (Throughout this post, mass is used to refer to the invariant mass (also known as rest mass) of an object.) This derivation used a number of physical assumptions, including the following:

The two postulates of special relativity: firstly, that the laws of physics are the same in every inertial reference frame, and secondly that the speed of light in vacuum is equal to ${c}$ in every such inertial frame.
Planck’s relation and de Broglie’s law for photons, relating the frequency, energy, and momentum of such photons together.
The law of conservation of energy, and the law of conservation of momentum, as well as the additivity of these quantities (i.e. the energy of a system is the sum of the energy of its components, and similarly for momentum).
The Newtonian approximations ${E \approx E_0 + \frac{1}{2} m|v|^2}$ , ${p \approx m v}$ to energy and momentum at low velocities.

The argument was one-dimensional in nature, in the sense that only one of the three spatial dimensions was actually used in the proof.

As was pointed out in comments in the previous post by Laurens Gunnarsen, this derivation has the curious feature of needing some laws from quantum mechanics (specifically, the Planck and de Broglie laws) in order to derive an equation in special relativity (which does not ostensibly require quantum mechanics). One can then ask whether one can give a derivation that does not require such laws. As pointed out in previous comments, one can use the representation theory of the Lorentz group ${SO(d,1)}$ to give a nice derivation that avoids any quantum mechanics, but it now needs at least two spatial dimensions instead of just one. I decided to work out this derivation in a way that does not explicitly use representation theory (although it is certainly lurking beneath the surface). The concept of momentum is only barely used in this derivation, and the main ingredients are now reduced to the following:

The two postulates of special relativity;
The law of conservation of energy (and the additivity of energy);
The Newtonian approximation ${E \approx E_0 + \frac{1}{2} m|v|^2}$ at low velocities.

The argument (which uses a little bit of calculus, but is otherwise elementary) is given below the fold. Whereas Einstein’s original argument considers a mass emitting two photons in several different reference frames, the argument here considers a large mass breaking up into two equal smaller masses. Viewing this situation in different reference frames gives a functional equation for the relationship between energy, mass, and velocity, which can then be solved using some calculus, using the Newtonian approximation as a boundary condition, to give the famous ${E=mc^2}$ formula.

Disclaimer: As with the previous post, the arguments here are physical arguments rather than purely mathematical ones, and thus do not really qualify as a rigorous mathematical argument, due to the implicit use of a number of physical and metaphysical hypotheses beyond the ones explicitly listed above. (But it would be difficult to say anything non-tautological at all about the physical world if one could rely solely on ${100\%}$ rigorous mathematical reasoning.)

— 1. The main argument —

We will assume that the total energy ${E}$ of a moving body depends only on the mass ${m}$ of that body, and the velocity ${v}$ of that body:

$\displaystyle E = E(m,v).$

(This is actually a non-trivial assumption; it excludes the possibility that the energy might also be dependent on other features of the body, such as spin or charge.) At present, this functional relationship ${E: (m,v) \mapsto E(m,v)}$ is arbitrary. However, we can use some physical arguments to constrain this relationship. We first use the following argument of Galileo. Consider two bodies side by side, traveling at the same velocity ${v}$ , with the first body of mass ${m_1}$ and the second of mass ${m_2}$ . Then, the first body has energy ${E(m_1,v)}$ and the second has energy ${E(m_2,v)}$ , so the combined system of two bodies has total energy ${E(m_1,v)+E(m_2,v)}$ . On the other hand, if we imagine connecting the two bodies by an infinitesimally thin thread, we can view the system as a single body of mass ${m_1+m_2}$ traveling at the same velocity ${v}$ . This leads us to the relationship

$\displaystyle E(m_1,v) + E(m_2,v) = E(m_1+m_2,v)$

for any ${m_1,m_2,v}$ , which (under reasonable hypotheses of continuity) implies a linear relationship between energy and mass, thus

$\displaystyle E(m,v) = m f(v)$

for some function ${f: v \mapsto f(v)}$ depending only on the velocity ${v}$ .

We still have to determine this unknown functional relationship ${f: v \mapsto f(v)}$ . We assume rotational symmetry of the laws of physics (which one can view as a special case of the first postulate of special relativity): if two bodies of equal mass move at the same speed, but in different directions, their energies should be the same. In other words, ${f}$ should be spherically symmetric, so by abuse of notation we write

$\displaystyle f(v) = f(|v|). \ \ \ \ \ (1)$

Now consider a body ${B}$ of mass ${M}$ at rest at the origin (in some reference frame ${O}$ ), which somehow disintegrates (at time ${t=0}$ , for simplicity) into two smaller bodies ${B_+, B_-}$ of equal mass ${m}$ , one moving in the positive ${x}$ direction at some velocity ${(+v,0,0)}$ , and the other moving in the negative ${x}$ direction at the opposite velocity ${(-v,0,0)}$ (note that this situation is consistent with the law of conservation of momentum). (If one prefers, one could also view the time-reversed situation, in which two masses of equal and opposite velocity collide to form a large stationary mass; the analysis of this situation is basically identical to the one given here.) In Newtonian mechanics, we have conservation and additivity of mass, so that ${M}$ must equal ${m+m}$ ; but we will not assume conservation and additivity of mass here (and in fact at least one of these laws must break down in special relativity, at least if one insists on using an invariant notion of mass). Instead, we can link ${M}$ , ${m}$ , and ${v}$ to each other by the law of conservation of energy. Before the disintegration, the body ${B}$ has total energy ${E(M,0) = M f(0)}$ , while after the disintegration the system ${B_++B_-}$ has total energy ${E(m,(+v,0,0)) + E(m,(-v,0,0)) = 2 m f(|v|)}$ (using the spherically symmetric nature (1) of ${f}$ ), and so

$\displaystyle M f(0) = 2m f(v). \ \ \ \ \ (2)$

Now we view the same system relative to another reference frame ${O'}$ , which relative to ${O}$ is moving at a velocity ${(w,0,0)}$ in the ${x}$ direction for some ${w}$ , while keeping the ${y}$ and ${z}$ coordinates unchanged. The spacetime coordinates ${(x',y',z',t')}$ of ${O'}$ are then related to those ${(x,y,z,t)}$ of ${O}$ by the usual Lorentz transformations

$\displaystyle x' = \frac{x-wt}{\sqrt{1-w^2/c^2}}$

$\displaystyle y' = y$

$\displaystyle z' = z$

$\displaystyle t' = \frac{t-wx/c^2}{\sqrt{1-w^2/c^2}}$

which can be deduced from the postulates of special relativity by a standard derivation that we will not give here (it is sketched in the previous blog post). The pre-disintegration body ${B}$ is moving along the worldline ${\{ (0,0,0,t): t < 0\}}$ in the ${O}$ reference frame, and is thus moving along the line ${\{ (\frac{-wt}{\sqrt{1-w^2/c^2}}, 0, 0, \frac{t}{\sqrt{1-w^2/c^2}}): t < 0 \}}$ in the ${O'}$ reference frame; in particular, it has velocity ${-w}$ in this frame and thus has energy ${M f(w)}$ in this frame.

Now consider the first post-disintegration body ${B_+}$ . It is moving along the worldline ${\{ (vt, 0,0,t): t>0 \}}$ in the ${O}$ reference frame, and thus along the line ${\{ (\frac{vt-wt}{\sqrt{1-w^2/c^2}}, 0, 0, \frac{t-vwt/c^2}{\sqrt{1-w^2/c^2}}): t > 0 \}}$ in the ${O'}$ reference frame; in particular, the speed of ${B_+}$ in this frame is ${\frac{v-w}{1-vw/c^2}}$ (the well known velocity addition (or subtraction) formula), and so the energy of this body is ${m f( \frac{v-w}{1-vw/c^2} )}$ . Similarly, ${B_-}$ has energy ${m f( \frac{v+w}{1+vw/c^2} )}$ . Equating energies, we are thus led to

$\displaystyle M f(w) = m (f( \frac{v-w}{1-vw/c^2} ) + f(\frac{v+w}{1+vw/c^2})).$

We can eliminate ${M,m}$ using (2), to obtain a functional equation for ${f}$ :

$\displaystyle 2f(v) f(w) = f(0) (f( \frac{v-w}{1-vw/c^2} ) + f(\frac{v+w}{1+vw/c^2})). \ \ \ \ \ (3)$

This equation should hold for all (physically attainable) velocities ${v,w}$ . To solve this equation, it is convenient to work with the change of variables

$\displaystyle v = c \tanh \alpha; w = c \tanh \beta;$

the hyperbolic angles ${\alpha,\beta}$ are known as the rapidities associated to ${v}$ and ${w}$ respectively. The point of using this change of variables is that the hyperbolic tangent addition formula yields

$\displaystyle \frac{v+w}{1+vw/c^2} = c \tanh(\alpha+\beta); \quad \frac{v-w}{1-vw/c^2} = c \tanh(\alpha-\beta).$
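The identity above can be spot-checked numerically. The following is a minimal sketch, assuming units in which ${c=1}$ (a convenience choice, not something the argument requires); the function names are hypothetical and chosen only for illustration.

```python
import math

C = 1.0  # speed of light; units with c = 1 are an assumption made for convenience


def rapidity(v, c=C):
    """Rapidity alpha such that v = c * tanh(alpha)."""
    return math.atanh(v / c)


def add_velocities(v, w, c=C):
    """One-dimensional relativistic velocity addition (v + w) / (1 + v w / c^2)."""
    return (v + w) / (1.0 + v * w / c**2)


def add_via_rapidity(v, w, c=C):
    """The same composition, computed by adding the two rapidities."""
    return c * math.tanh(rapidity(v, c) + rapidity(w, c))


if __name__ == "__main__":
    for v, w in [(0.6, 0.7), (0.6, -0.7), (0.1, 0.2)]:
        print(v, w, add_velocities(v, w), add_via_rapidity(v, w))
```

Passing a negative ${w}$ gives the subtraction formula, so both identities are covered by the same two functions.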

Thus if we make the change of variables

$\displaystyle g(\alpha) := f( c \tanh \alpha )$

then (3) simplifies to

$\displaystyle 2 g(\alpha) g(\beta) = g(0) ( g(\alpha+\beta) + g(\alpha-\beta) ).$

It is tempting to plug in some special values into this equation, such as ${\beta=0}$ , but this only gives a trivial equation. However, if we first differentiate twice in ${\beta}$ to obtain

$\displaystyle 2 g(\alpha) g''(\beta) = g(0) ( g''(\alpha+\beta) + g''(\alpha-\beta) )$

and then set ${\beta=0}$ , we obtain the non-trivial equation

$\displaystyle g(\alpha) g''(0) = g(0) g''(\alpha).$

This is a differential equation in ${g}$ , and can be solved as

$\displaystyle g(\alpha) = A \cosh(k \alpha) + B \sinh(k \alpha)$

for some unknowns ${A,B}$ , where ${k}$ is the square root of ${g''(0)/g(0)}$ . From (1), ${g}$ should have vanishing derivative at the origin, and so ${B=0}$ , and so we have

$\displaystyle f( c \tanh \alpha ) = A \cosh(k \alpha). \ \ \ \ \ (4)$
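As a sanity check, one can confirm numerically that ${g(\alpha) = A \cosh(k\alpha)}$ satisfies the rapidity form of the functional equation for every value of ${k}$ tried, which is why the one-dimensional argument alone cannot pin ${k}$ down. This is only a sketch; the names `g` and `residual` and the sample values of ${A}$ and ${k}$ are illustrative assumptions.

```python
import math


def g(alpha, A=1.0, k=1.0):
    """Candidate solution g(alpha) = A cosh(k alpha)."""
    return A * math.cosh(k * alpha)


def residual(alpha, beta, A=1.0, k=1.0):
    """Left side minus right side of 2 g(a) g(b) = g(0) (g(a+b) + g(a-b))."""
    lhs = 2.0 * g(alpha, A, k) * g(beta, A, k)
    rhs = g(0.0, A, k) * (g(alpha + beta, A, k) + g(alpha - beta, A, k))
    return lhs - rhs


if __name__ == "__main__":
    for k in (0.5, 1.0, 2.0, 3.7):
        print(k, residual(0.8, 0.3, A=2.5, k=k))
```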

This is significant progress in constraining the behaviour of ${f}$ , but there are still two unknown parameters ${A, k}$ . To proceed further, it becomes necessary to utilise a second dimension. Namely, we repeat the previous arguments, but with ${O'}$ now moving at velocity ${(0,w,0)}$ instead of ${(w,0,0)}$ . The Lorentz transformations are now

$\displaystyle x' = x$

$\displaystyle y' = \frac{y-wt}{\sqrt{1-w^2/c^2}}$

$\displaystyle z' = z$

$\displaystyle t' = \frac{t-wy/c^2}{\sqrt{1-w^2/c^2}}.$

The pre-disintegration body ${B}$ is moving along the worldline ${\{ (0,0,0,t): t < 0\}}$ in the ${O}$ reference frame, and is thus moving along the line ${\{ (0, \frac{-wt}{\sqrt{1-w^2/c^2}}, 0, \frac{t}{\sqrt{1-w^2/c^2}}): t < 0 \}}$ in the ${O'}$ reference frame; in particular, it has velocity ${-w}$ in this frame and thus has energy ${M f(w)}$ in this frame.

Now consider the first post-disintegration body ${B_+}$ . It is moving along the worldline ${\{ (vt, 0,0,t): t>0 \}}$ in the ${O}$ reference frame, and thus along the line ${\{ (vt, \frac{-wt}{\sqrt{1-w^2/c^2}}, 0, \frac{t}{\sqrt{1-w^2/c^2}}): t > 0 \}}$ in the ${O'}$ reference frame; in particular, the speed of ${B_+}$ in this frame is ${\sqrt{ v^2 (1-w^2/c^2) + w^2 }}$ , and so the energy of this body is ${m f( \sqrt{ v^2 (1-w^2/c^2) + w^2 } )}$ . Similarly for ${B_-}$ . Equating energies, we are thus led to

$\displaystyle M f(w) = 2m f( \sqrt{ v^2 (1-w^2/c^2) + w^2 } ).$

We can eliminate ${M,m}$ using (2), to obtain a functional equation for ${f}$ :

$\displaystyle f(v) f(w) = f(0) f( \sqrt{ v^2 (1-w^2/c^2) + w^2 } ).$
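The speed ${\sqrt{ v^2 (1-w^2/c^2) + w^2 }}$ used in this step can be spot-checked by applying the ${y}$ -direction Lorentz transformation to a point on the worldline of ${B_+}$ . The sketch below assumes units with ${c=1}$ ; the function names are hypothetical.

```python
import math

C = 1.0  # speed of light; c = 1 is an assumption made for convenience


def boost_y(event, w, c=C):
    """Lorentz boost with velocity (0, w, 0), applied to an event (x, y, z, t)."""
    x, y, z, t = event
    gamma = 1.0 / math.sqrt(1.0 - w**2 / c**2)
    return (x, gamma * (y - w * t), z, gamma * (t - w * y / c**2))


def speed_after_boost(v, w, c=C, t=1.0):
    """Speed in O' of a body moving along the worldline (v t, 0, 0, t) in O."""
    x1, y1, z1, t1 = boost_y((v * t, 0.0, 0.0, t), w, c)
    # The worldline passes through the common origin of O and O',
    # so distance divided by elapsed time is the (constant) speed.
    return math.sqrt(x1**2 + y1**2 + z1**2) / t1


def predicted_speed(v, w, c=C):
    """Closed form sqrt(v^2 (1 - w^2/c^2) + w^2) quoted in the text."""
    return math.sqrt(v**2 * (1.0 - w**2 / c**2) + w**2)


if __name__ == "__main__":
    print(speed_after_boost(0.5, 0.3), predicted_speed(0.5, 0.3))
```

Replacing ${v}$ by ${-v}$ gives the same speed, which is the "similarly for ${B_-}$ " step.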

This equation should hold for all (physically attainable) velocities ${v,w}$ . To solve this equation, we work with infinitesimal ${w}$ and perform a Taylor expansion. From the symmetry (1), ${f}$ should be flat at the origin, and so (assuming sufficient smoothness for ${f}$ ) we have

$\displaystyle f(w) = f(0) + \frac{1}{2} f''(0) w^2 + o(w^2),$

while from the Taylor approximation

$\displaystyle \sqrt{ v^2 (1-w^2/c^2) + w^2 } = v + \frac{1-v^2/c^2}{2v} w^2 + o(w^2)$

we have

$\displaystyle f( \sqrt{ v^2 (1-w^2/c^2) + w^2 } ) = f(v) + \frac{1-v^2/c^2}{2v} w^2 f'(v) + o(w^2).$
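The second-order approximation of the speed can be checked numerically: for fixed ${v>0}$ the error should shrink roughly like ${w^4}$ , so halving ${w}$ should reduce it by a factor of about sixteen. A minimal sketch, again assuming ${c=1}$ and using hypothetical function names:

```python
import math


def exact_speed(v, w, c=1.0):
    """sqrt(v^2 (1 - w^2/c^2) + w^2), the speed of B_+ in the frame O'."""
    return math.sqrt(v**2 * (1.0 - w**2 / c**2) + w**2)


def second_order_speed(v, w, c=1.0):
    """Taylor approximation v + (1 - v^2/c^2) / (2 v) * w^2, valid for v > 0."""
    return v + (1.0 - v**2 / c**2) / (2.0 * v) * w**2


if __name__ == "__main__":
    v = 0.5
    for w in (0.04, 0.02, 0.01):
        print(w, abs(exact_speed(v, w) - second_order_speed(v, w)))
```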

Inserting these expansions and extracting the ${w^2}$ coefficient, we obtain the differential equation

$\displaystyle \frac{1}{2} f(v) f''(0) = f(0) \frac{1-v^2/c^2}{2v} f'(v)$

which we can rewrite as

$\displaystyle \frac{d}{dv} \log f(v) = C \frac{v}{1-v^2/c^2}$

for some constant ${C}$ . We can integrate this (after rescaling the constant ${C}$ by a factor of ${c^2/2}$ ) as

$\displaystyle \log f(v) = -C \log(1-v^2/c^2) + C'$

and thus

$\displaystyle f(v) = D (1-v^2/c^2)^{-C}$

for some parameters ${C, D}$ . In rapidity coordinates ${v = c \tanh \alpha}$ , this becomes

$\displaystyle f( c \tanh \alpha) = D \cosh^{2C} \alpha.$

Comparing this with (4) (e.g. by performing a Taylor expansion to fourth order around ${\alpha=0}$ ) we see that ${k = 2C = 1}$ , thus

$\displaystyle f(c \tanh \alpha) = A \cosh \alpha$

or equivalently

$\displaystyle f(v) = \frac{A}{\sqrt{1-v^2/c^2}}.$
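To see concretely why the second dimension is needed, one can test the one-parameter family ${f(c \tanh \alpha) = A \cosh(k\alpha)}$ from (4) against both functional equations. In the sketch below (units with ${c=1}$ , hypothetical function names, arbitrarily chosen sample velocities), the longitudinal equation (3) holds for every ${k}$ tried, while the transverse equation holds for ${k=1}$ and fails for ${k=2}$ .

```python
import math

C = 1.0  # speed of light; c = 1 is an assumption made for convenience


def f_from_k(v, k, A=1.0, c=C):
    """f(v) = A cosh(k * atanh(|v| / c)), i.e. the family (4) written in terms of speed."""
    return A * math.cosh(k * math.atanh(abs(v) / c))


def residual_parallel(v, w, k, c=C):
    """Left side minus right side of the longitudinal equation (3)."""
    u_minus = (v - w) / (1.0 - v * w / c**2)
    u_plus = (v + w) / (1.0 + v * w / c**2)
    lhs = 2.0 * f_from_k(v, k, c=c) * f_from_k(w, k, c=c)
    rhs = f_from_k(0.0, k, c=c) * (f_from_k(u_minus, k, c=c) + f_from_k(u_plus, k, c=c))
    return lhs - rhs


def residual_transverse(v, w, k, c=C):
    """Left side minus right side of f(v) f(w) = f(0) f(sqrt(v^2 (1 - w^2/c^2) + w^2))."""
    u = math.sqrt(v**2 * (1.0 - w**2 / c**2) + w**2)
    return f_from_k(v, k, c=c) * f_from_k(w, k, c=c) - f_from_k(0.0, k, c=c) * f_from_k(u, k, c=c)


if __name__ == "__main__":
    for k in (1.0, 2.0):
        print(k, residual_parallel(0.5, 0.3, k), residual_transverse(0.5, 0.3, k))
```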

Thus we have

$\displaystyle E(m,v) = \frac{Am}{\sqrt{1-|v|^2/c^2}}.$

For infinitesimal velocities ${v}$ , we may Taylor expand

$\displaystyle E(m,v) = A m + \frac{1}{2} \frac{A}{c^2} m|v|^2 + o(|v|^2)$

and so the kinetic energy of a slowly moving mass is ${\frac{1}{2} \frac{A}{c^2} m|v|^2 + o(|v|^2)}$ . Comparing this with the Newtonian approximation of ${\frac{1}{2} m|v|^2}$ we conclude that ${A=c^2}$ , and thus

$\displaystyle E(m,v) = \frac{mc^2}{\sqrt{1-|v|^2/c^2}}. \ \ \ \ \ (5)$

In particular, setting ${v=0}$ we see that the rest energy ${E(m,0)}$ of a body of mass ${m}$ is ${mc^2}$ , as required.
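The comparison with the Newtonian approximation can be illustrated numerically using formula (5). The following sketch assumes units with ${c=1}$ and uses hypothetical function names; it compares the relativistic kinetic energy ${E(m,v)-E(m,0)}$ with ${\frac{1}{2} m|v|^2}$ at a slow and a fast speed.

```python
import math

C = 1.0  # speed of light; c = 1 is an assumption made for convenience


def energy(m, v, c=C):
    """Total energy (5): m c^2 / sqrt(1 - v^2 / c^2)."""
    return m * c**2 / math.sqrt(1.0 - v**2 / c**2)


def kinetic_energy(m, v, c=C):
    """Total energy minus rest energy."""
    return energy(m, v, c) - energy(m, 0.0, c)


def newtonian_kinetic_energy(m, v):
    """Classical kinetic energy m v^2 / 2."""
    return 0.5 * m * v**2


if __name__ == "__main__":
    for v in (0.01, 0.9):
        ratio = kinetic_energy(1.0, v) / newtonian_kinetic_energy(1.0, v)
        print(v, ratio)
```

At ${v = 0.01c}$ the two kinetic energies agree to better than one part in a thousand, while at ${v = 0.9c}$ the relativistic value is several times larger than the Newtonian one.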

Remark 1 The above derivation did not explicitly use the law of conservation of momentum (other than to observe that the scenario of one mass at rest splitting into two smaller masses moving in equal and opposite directions was compatible with this law). Actually, if one defines the momentum ${p(m,v)}$ of a body of mass ${m}$ and velocity ${v}$ by the formula

$\displaystyle p(m,v) := \frac{mv}{\sqrt{1-|v|^2/c^2}}$

and the momentum of a system as the sum of the momenta of its components, one can use (5) and the Lorentz transformations to (after some algebra) express the total momentum of a system as a linear combination of the total energy of that system viewed in a couple of reference frames (or, if one prefers, as the derivatives of the total energy with respect to infinitesimal reference frame changes), and as a consequence one can actually derive the law of conservation of momentum from the law of conservation of energy, together with special relativity. (Actually, this can also be done in Galilean relativity, using the classical formula ${E(m,v)=E(m,0) + \frac{1}{2} m |v|^2}$ ; we leave this as an exercise to the reader.) Indeed, in special relativity it is natural to unify energy and momentum together as a single quantity known as the four-momentum.
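The following sketch does not carry out the derivation described in Remark 1; it only spot-checks the conclusion on the disintegration scenario from the main argument. Assuming units with ${c=1}$ , the relation ${M = 2m/\sqrt{1-v^2/c^2}}$ (which is (2) combined with (5)), and hypothetical function names, it computes total energy and total momentum before and after the disintegration in a frame moving at velocity ${(w,0,0)}$ .

```python
import math

C = 1.0  # speed of light; c = 1 is an assumption made for convenience


def gamma(v, c=C):
    """Lorentz factor 1 / sqrt(1 - v^2 / c^2)."""
    return 1.0 / math.sqrt(1.0 - v**2 / c**2)


def energy(m, v, c=C):
    """Energy formula (5)."""
    return m * c**2 * gamma(v, c)


def momentum(m, v, c=C):
    """Momentum m v / sqrt(1 - v^2 / c^2) from Remark 1 (one spatial component)."""
    return m * v * gamma(v, c)


def boosted_velocity(v, w, c=C):
    """Velocity in O' of a body moving at velocity v along x in O."""
    return (v - w) / (1.0 - v * w / c**2)


def disintegration_totals(m, v, w, c=C):
    """Return (E_before, E_after, p_before, p_after) as seen in O'."""
    big_mass = 2.0 * m * gamma(v, c)  # from (2) with f(v) = c^2 gamma(v)
    u_parent = boosted_velocity(0.0, w, c)
    u_plus = boosted_velocity(v, w, c)
    u_minus = boosted_velocity(-v, w, c)
    e_before = energy(big_mass, u_parent, c)
    e_after = energy(m, u_plus, c) + energy(m, u_minus, c)
    p_before = momentum(big_mass, u_parent, c)
    p_after = momentum(m, u_plus, c) + momentum(m, u_minus, c)
    return e_before, e_after, p_before, p_after


if __name__ == "__main__":
    print(disintegration_totals(1.0, 0.6, 0.3))
```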

Remark 2 The above arguments ultimately rely on the fact that the Lorentz group ${SO(d,1)}$ has an essentially unique linear action on ${{\bf R}^{1+d}}$ when the spatial dimension ${d}$ is at least two. For ${d=1}$ , the group ${SO(d,1)}$ becomes abelian, and there is a multiplicity of such actions (parameterised by the different possibilities for the quantity ${k}$ appearing in (4)), and one could a priori have a number of different laws relating energy and momentum with mass and velocity that are consistent with special relativity and the conservation laws. Indeed, for any choice of ${k > 0}$ , one could postulate the laws

$\displaystyle E(m,c \tanh \alpha) = \frac{m c^2}{k^2} \cosh(k \alpha)$

and

$\displaystyle p(m, c \tanh \alpha) = \frac{mc}{k} \sinh(k \alpha)$

for the energy and momentum of a body of mass ${m}$ moving at rapidity ${\alpha}$ (i.e. at velocity ${c\tanh \alpha}$ ). One can verify that such laws are consistent with the laws of conservation of mass and energy, with the postulates of special relativity, and with the Newtonian approximation, as long as one is only in one spatial dimension; one needs to use at least one other dimension to be able to reduce to the ${k=1}$ case. Thus we see that higher-dimensional relativity is more rigid than one-dimensional relativity. In the case of Einstein’s original argument, the quantum mechanical properties of photons are used instead to show that ${E/|p| \rightarrow c}$ in the lightspeed limit ${\alpha \rightarrow \infty}$ , which gives the reduction to ${k=1}$ .
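The one-dimensional consistency claim in Remark 2 can be illustrated on the disintegration scenario. The sketch below assumes units with ${c=1}$ , uses the fact that in one spatial dimension a change of frame simply shifts all rapidities by a constant, and takes the parent mass from the rest-frame energy balance; the function names and the sample value of ${k}$ are hypothetical.

```python
import math

C = 1.0  # speed of light; c = 1 is an assumption made for convenience


def energy_k(m, alpha, k, c=C):
    """Hypothetical law E = (m c^2 / k^2) cosh(k alpha)."""
    return m * c**2 / k**2 * math.cosh(k * alpha)


def momentum_k(m, alpha, k, c=C):
    """Hypothetical law p = (m c / k) sinh(k alpha)."""
    return m * c / k * math.sinh(k * alpha)


def totals_in_boosted_frame(m, alpha, beta, k, c=C):
    """Energy and momentum before and after, seen from a frame of rapidity beta.

    A body at rest splits into two bodies of mass m at rapidities +alpha and
    -alpha; in the boosted frame every rapidity is shifted by -beta.
    """
    big_mass = 2.0 * m * math.cosh(k * alpha)  # from E(M, 0) = 2 E(m, alpha)
    before = (energy_k(big_mass, -beta, k, c), momentum_k(big_mass, -beta, k, c))
    after = (
        energy_k(m, alpha - beta, k, c) + energy_k(m, -alpha - beta, k, c),
        momentum_k(m, alpha - beta, k, c) + momentum_k(m, -alpha - beta, k, c),
    )
    return before, after


if __name__ == "__main__":
    print(totals_in_boosted_frame(1.0, 0.7, 0.4, k=2.5))
```

Both totals agree before and after for any ${k>0}$ tried, so nothing in this one-dimensional scenario distinguishes ${k=1}$ from other values.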

51 comments

2 October, 2012 at 8:07 pm

Anonymous

Thank you Prof. Tao. It is a great post.

2 October, 2012 at 9:39 pm

Anonymous

isn’t m missing in the Taylor’s expression for E(m,v)?

[Corrected, thanks – T.]

3 October, 2012 at 12:27 am

mircea

Is Galileo’s reasoning about linearity of energy in mass ok with special relativity? In which reference frame do we compare the velocities?

3 October, 2012 at 1:41 pm

Terence Tao

Galileo’s argument only requires a single frame of reference. (But thanks to the first postulate of special relativity, the relationship between energy, mass, and velocity should be the same in all reference frames.)

EDIT: Actually, on thinking about it a bit more, Galileo’s argument also needs the fact that the combined mass of two bodies traveling together at the same velocity is the sum of the masses of each individual mass. By working in a comoving reference frame, it suffices to assume this when both bodies are at rest, which is a plausible enough assumption, but it does require the use of the additional reference frame.

3 October, 2012 at 2:25 am

Bo Jacoby

If a is a constant and x is a variable the function y=ax is conventionally not written y=xa. How come that E=mc^2 is not written E=c^2m ?

3 October, 2012 at 3:38 am

Instead of using the one-dimensional velocity-addition (or subtraction) formula on two separate parts of the proof, what would happen if the two-dimensional formula is used just once. Would that work or would it just complicate things and not even give you two conditions ?

$\mathbf{v} \oplus\mathbf{u}=\frac{\mathbf{v}+\mathbf{u}_{\parallel} + \alpha_{\mathbf{v}}\mathbf{u}_{\perp}}{1+\frac{\mathbf{v}\cdot\mathbf{u}}{c^2}}=\frac{1}{1+\frac{\mathbf{v}\cdot\mathbf{u}}{c^2}}\left\{\mathbf{v}+\frac{1}{\gamma_\mathbf{v}}\mathbf{u}+\frac{1}{c^2}\frac{\gamma_\mathbf{v}}{1+\gamma_\mathbf{v}}(\mathbf{v}\cdot\mathbf{u})\mathbf{v}\right\}$

3 October, 2012 at 7:58 am

Terence Tao

Yes, this approach would work too (and ultimately leads back to understanding the representation theory of SO(3,1), which I alluded to earlier).

3 October, 2012 at 3:42 am

Another blog with special relativity stuff is:
http://thespectrumofriemannium.wordpress.com/

3 October, 2012 at 3:52 am

Number8

“In Newtonian mechanics, we have conservation of mass, so that M must equal m+m; but we will not assume conservation of mass here (and in fact this law turns out to be false in special relativity).” It’s still true in special relativity if the system is closed. The energy given out in a collision still has mass w.r.t. the system’s frame, even if it’s in the form of photons.

3 October, 2012 at 7:57 am

Terence Tao

Ah, I realised that “conservation of mass” is a bit ambiguous. It’s really the conjunction of “conservation of mass” (total mass of a system remains constant), “additivity of mass” (mass of a system equals the sum of the masses of the components), and “invariance of mass” (mass is the same in every reference frame) which cannot simultaneously hold in special relativity. Depending on how exactly one defines mass, one can make two of these three properties true in SR, but not all three at once. I’ve reworded the text to reflect this. (Also I now realise, for similar reasons, that additivity of energy is implicitly being relied on rather crucially in the discussion.)

4 September, 2023 at 6:06 am

Douglas J. Callahan

Hypothesis on Energy

E=NC2

Energy equals Neutrinos times Velocity Squared

Energy never dissipates.

E=MC2 is a fallacy theory because the Neutrinos are the energy. How could Energy equal Mass without another factor?

The reason that hyper space travel is possible is because Neutrinos are the energy. Dark Matter is space with the absence of Neutrinos. The Neutrinos move freely because they are the Energy.

One Scientist was able to “catch” or isolate 3 Neutrinos. Those that travel among and between galaxies have learned to harvest neutrinos and pierce Dark Matter with them.

Douglas J. Callahan
American Scientist
djcall242@gmail.com
September 4, 2023

3 October, 2012 at 5:20 am

philh

Typo? In the equation just before (3), Mf(w) = 2m(f(…) + f(…)), I don’t think the 2 should be there.

[Corrected, thanks – T.]

3 October, 2012 at 6:36 am

updog

Nice, but I’m not going to pretend to understand any of that at 7am.

3 October, 2012 at 6:39 am

quantummoxie

Tom Moore has a similar derivation in his fantastic textbook, Six Ideas That Shaped Physics, Unit R: The Laws of Physics Are Frame-Independent, Second Edition (McGraw-Hill, 2003). It’s a bit simpler than yours, but maybe slightly more heuristic.

3 October, 2012 at 7:08 am

Marco M

I have somewhat similar derivation in section 11 of my paper http://arxiv.org/abs/physics/0605204. There I do not use the Planck energy frequency relation, but use the relation between energy and momentum in a light wave, which was known before the theory of relativity or quantum mechanics.

16 October, 2012 at 7:04 am

mmoriconi

oops, sorry about this comment. I had no intention to promote any results. Just read the “about” section. My apologies if the post was inappropriate…

3 October, 2012 at 9:33 am

Hager El-Boghdady (@H_Sayed_M)

It is a great post …

3 October, 2012 at 1:18 pm

teacher

You lost me right at equation (2). Why would we assume that the energies before and after the “disintegration” are equal in your scenario? You say that a body B of mass M “somehow disintegrates”… but in real life, massive bodies don’t split into two pieces flying off in equal and opposite directions without an energy input — some dynamite, perhaps. (Similarly, if you reverse time and have the two bodies collide, the energy of their motion will dissipate in one way or another, heat and light and sound, say.) So I’m not getting the physical intuition here that would lead me to equate the left and right sides in (2). Am I being too literal here, or missing some subtlety?

3 October, 2012 at 1:33 pm

Terence Tao

Well, the body can certainly contain within it some energy source (like the dynamite you mention, which contains energy and hence mass within its chemical bonds), and the total mass of that body will then have to take that energy into consideration. In the context of two bodies colliding, the new body might acquire some additional energy (through heat, etc.), and again this would be reflected in the total mass of the new body.

Note at the elementary particle level that it is quite common for one particle to spontaneously disintegrate into two or more particles (e.g. a neutron can decay into a proton, electron, and neutrino, a radioactive nucleus can decay into a smaller nucleus and an alpha particle, and so forth). Implicit in the above analysis is the assumption that there are enough interactions of this form to span enough physically realisable values of mass, velocity, etc. that one can safely extrapolate to arbitrary values of these parameters. (By the way, the assumption that the initial body splits into two bodies of equal mass is mostly for mathematical convenience; a similar argument would also work for more complicated collisions in which some number of bodies of various masses came together and formed some other number of bodies, again of various masses, moving with different velocities, but the mathematical analysis of such collisions becomes messier.)

3 October, 2012 at 2:27 pm

teacher

Thanks for the clarification! I guess the problem is that I’m not automatically taking into account the equivalence of mass and energy – I mean, I understand it abstractly well enough (in a limited, undergraduate-physics sense), but it’s not intuitive to me when I’m picturing concrete physical objects, as in your scenario. (I know, of course, that it can be measured concretely under the right experimental conditions — e.g. the gases released by the dynamite, cooled down and collected, would weigh ever so slightly less than the dynamite itself — but the effect is so far from perceptible under ordinary conditions that it doesn’t translate into real intuition.)

I think that the problem is that I was looking to your derivation of the relationship between mass and energy precisely in order to *supply* some of that intuition. And I guess to do so would be a bit circular, since in order to accept equation (2), I need to understand that the total mass of the object must take into account the potential energy of the “dynamite,” which means that I have to accept mass-energy equivalence in the first place. Which I suppose is OK — you’re not trying to prove mass-energy equivalence, just to quantify it (E=mc^2) — is that a reasonable assessment?

3 October, 2012 at 2:49 pm

Terence Tao

Yes, by assuming the functional relationship E = E(m,v), one is implicitly assuming some sort of mass-energy equivalence, which one is then trying to quantify. As mentioned at the start of the derivation, this is a non-trivial assumption. (But given that kinetic energy, at least, depends only on mass and velocity in Newtonian mechanics, it is not too unreasonable of a hypothesis.)

At the elementary particle level, this sort of assumption becomes more plausible because the particles have almost no internal structure which could conceivably supply a non-mass source of energy.

3 October, 2012 at 6:14 pm

teacher

Gotcha! After reading the whole derivation through a couple of times I had a moment of sudden enlightenment – in which I, all at once, thoroughly understood, and more importantly felt intuition for, the whole story. Thanks for this post and for engaging with me here in the comments – I really learned something.

4 October, 2012 at 5:38 am

Wednesday/Thursday Highlights | Pseudo-Polymath

[…] Einsteins most famous equation (relation?) revisited. […]

4 October, 2012 at 5:40 am

Stones Cry Out - If they keep silent… » Things Heard: e240v3n4

[…] Einsteins most famous equation (relation?) revisited. […]

4 October, 2012 at 7:39 pm

Aleksey

In the part where you say that “The pre-disintegration body B is moving along the worldline in the O reference frame, and is thus moving along the line in the O’ reference frame”, you are missing a 0 in one of the spatial coordinates.

[Corrected, thanks – T.]

5 October, 2012 at 1:25 am

Anonymous

Referring to second volume of Course of Theoretical Physics by Landau and Lifshitz, one can first find action principle for special relativity. Action must be invariant under lorentz transformations and integrand must be a differential of first order, so the integrand should be A*ds where A is a constant and ds = sqrt(c^2(dt)^2 – (dx)^2- (dy)^2 – (dz)^2). Limit in the low velocity should give the action for newton mechanics so A can be found. Then p = dL/dv and E = p.v – L .

5 October, 2012 at 3:10 am

chorasimilarity

You take the soul out of the craft. On the other side Landau & Lifshitz derivation is clear and physical (i.e. frame independent, like in geometry).

6 October, 2012 at 6:09 pm

Weekly Science Picks | Australian Science

[…] We all know the equation. Perhaps a lot less can actually tell you what it means. And then there are those special few that can decipher it, mathematically speaking, that is. Terence Tao, a name befitting of a mathematician, revisits the famous equation. […]

13 October, 2012 at 7:37 am

tubinhpham

It’s great. But $E=mc^2$ remains forever true and beautiful.
Thank you, Prof. T. Tao

13 October, 2012 at 6:28 pm

Sujit Nair

In Remark 1, you state that “…one can actually derive the law of conservation of momentum from the law of conservation of energy, together with special relativity.” I always think of conservation laws as a consequence of symmetry + action principle + Noether’s Theorem. I am curious to know if what you state in this remark is equivalent to the Noether approach.

14 October, 2012 at 7:54 am

Terence Tao

When the least action principle is one of the laws of physics, yes; the fact that momentum conservation can be derived from energy conservation and Lorentz invariance is equivalent (via Noether’s theorem) to the fact that spatial translation symmetry can be derived from time translation symmetry and Lorentz invariance. But one could conceivably imagine alternate laws of physics in which there was no action principle, but for which Lorentz invariance and conservation of energy/momentum were still valid (but coming from some other source other than minimisation of a Lagrangian).

14 October, 2012 at 6:46 am

John Jiang

This really fills in the missing part of Einstein’s layman treatment of SR, where he derives all the Lorentz spatial time transformations like a charm but fell short of giving a convincing (or at least accessible) argument for the energy mass relation. I like the mention of rigidity of SO(1,2) representations, but wonder if anything special occurs in d =3. I remember reading an article in scientific American years ago on that we are actually living in 2-d, like a holograph. Maybe there is some evidence in this derivation?
It also seems to me all the non-rigorous part of the derivation attributes to smoothness or continuity assumptions. Is that correct? I wonder if there is a formalism that clarifies the relation between regularity assumptions and physics derivation of the kind above.

15 October, 2012 at 1:56 am

E.L. Wisty

Reblogged this on Pink Iguana.

16 November, 2012 at 2:16 am

Anonymous

I got it…

20 November, 2012 at 8:42 pm

Bob

You say that “it would be difficult to say anything non-tautological at all about the physical world if one could rely solely on 100% rigorous mathematical reasoning”.

I think, in fact, that even tautological statements could not be asserted, for in the absence of any information about the physical world, we can’t assume that we can even use logic (of any kind) and rules of inference to say anything about the world. It could very well be that there are worlds where completely different logical systems apply (or worlds where no such systems apply at all), and where tautologies deduced using rules of reasoning appropriate to our particular world would not apply.

14 November, 2014 at 6:35 pm

Anonymous

You are all mislead by not knowing what is the elements of matter called mass …? secondly what are the subtle elements that also matter which is unseen by naked eyes ………how can that be measured if no one knows ?haha

14 January, 2016 at 5:14 am

Anonymous

Your arguments can be made more precise by claiming that the very basic concepts of mass, energy and elementary particles (including the differences between real and virtual ones) still don’t have sufficiently rigorous mathematical definitions to allow any definite conclusion.

13 January, 2016 at 10:38 pm

Parsing Ajay Sharma v. E = mc2 – gaplogs

[…] this equation, Einstein was aware that it was held up by multiple approximations. As Terence Tao sets out, these would include (but not be limited to) p being equal to mv at low velocities, the laws of […]

21 April, 2019 at 1:09 am

Mike

Old post, sorry for the late comment.

Would just like to make a small correction (applies to your first post on this subject also): Einstein’s original derivation does NOT mention photons or use the Planck-Einstein relation. Rather, it cites the result from his previous SR paper that the energy and frequency of a classical light wave transform in the same way under a Lorentz boost. He obtains the light-energy transformation in the SR paper by: A) transforming the energy density of a plane wave; B) transforming the volume of a sphere enclosing part of the wave and moving along at the speed of light with it; and C) multiplying these two transformations. There’s nothing quantum about it.

24 May, 2019 at 1:51 am

Anonymous

In order to make physics a true mathematical theory (i.e. a part of mathematics, as chemistry is by now a part of physics) one should give rigorous(!) definitions (in terms of the spacetime metric) for all the physical concepts (e.g. elementary particles and their intrinsic quantum parameters like mass, electric charge, spin and also all the classical observer-dependent concepts like mass, electric charge, linear and angular momentum).
Clearly, the true mathematical foundations of the current physical theories are still incomplete(!). (The situation of physical theories today is similar to the situation of mathematical theories before the introduction of modern set theory and mathematical logic, which provided the needed solid foundation for all currently used mathematical concepts.)

24 May, 2019 at 10:41 am

Anonymous

There is no reason to believe physics can be made into a precise “mathematical theory.” Physical models make assumptions about the world to provide predictions about it. There is no sense in which you can simply have a “true” and physically meaningful concept of, say, mass following purely from mathematical axioms. You ultimately have to make some simplifying assumptions about the world and take some facts for granted if you want to get anywhere serious. Conversely, having a mathematically sound axiomatization of an effective physical theory doesn’t make the theory in any sense “true” as a model for the universe: “the map is not the territory.”

Chemistry can only be seen as a part of physics insofar as the reduction to the axioms of quantum mechanics proves practical. In general, the systems studied in chemistry are often too complicated to be approached simply as quantum many-body systems while also too structured for that approach to be the most fruitful. The parts of chemistry that can be fruitfully addressed as applications of AMO (so-called “quantum chemistry”) are the exceptions.

24 May, 2019 at 11:55 am

Anonymous

On the other hand, there is no reason to be pessimistic about the existence of a precise and rigorous representation of physics as a mathematical model (e.g. by explicit rigorous realization of Tegmark’s “Mathematical universe hypothesis” in which our physical universe is precisely represented by a certain mathematical structure.)
History tells us that pessimism and conservatism often delay the progress of scientific knowledge, so in order to make real fundamental progress on the true mathematical foundation of physics, one should try to seriously consider any logically consistent idea for such a theory (without making unnecessary “simplifying assumptions” – following Newton’s phrase “Hypotheses non fingo”).

28 June, 2020 at 12:54 pm

Anonymous

Dear Terence,
Thank you for your wonderful post.
Does your comment

“But it would be difficult to say anything non-tautological at all about the physical world if one could rely solely on 100% rigorous mathematical reasoning.”

confirm that the equation E = mc^2 is an empirical claim about the physical world that could have been different in another universe?
Thanks.

14 January, 2021 at 12:14 am

Aziz Lokhandwala

Mr. Tao,
Your arguments are much appreciated. I would like to request that you derive the Lorentz transformation using the concept of four-plane systems. In a 4-plane system we know that the position of an object is determined by 3 coordinates regardless of the 4th one. The reason is that the intersection of 3 planes is a point. Also, in 2D systems we know that it is the distance which identifies the relationship between coordinates; likewise, in 3-plane systems it is the sum of areas and in 4-plane systems it is the volume. For instance, consider a pyramid with a triangular base. If a point P is inside this pyramid, and if we know three coordinates, we can determine the fourth one from the volumes of the small pyramids formed inside the bigger one. I have discrete concepts in mind, but I am not able to form a rigid block of those discrete ideas to derive the Lorentz transformations. I would be thankful if you could help me out with this.

Sincerely,
Aziz Lokhandwala

1 January, 2022 at 5:35 am

Anonymous

That’s not how it happened. He knew that energy and matter were related for existence. He chose the letter c to represent the speed of light because what he really meant was consciousness. But he was not in a world that was ready to accept that, and he understood that light equals consciousness.
