From Bose-Einstein condensates to the nonlinear Schrodinger equation

26 November, 2009 in expository, math.AP, math.MP | Tags: BBGKY hierarchy, Bose-Einstein condensates, Gross-Pitaevskii equation, Gross-Pitaevskii hierarchy, NLS, quantum mechanics, Schrodinger equation | by Terence Tao

The Schrödinger equation

$\displaystyle i \hbar \partial_t |\psi \rangle = H |\psi\rangle$

is the fundamental equation of motion for (non-relativistic) quantum mechanics, modeling both one-particle systems and ${N}$ -particle systems for ${N>1}$ . Remarkably, despite being a linear equation, solutions ${|\psi\rangle}$ to this equation can be governed by a non-linear equation in the large particle limit ${N \rightarrow \infty}$ . In particular, when modeling a Bose-Einstein condensate with a suitably scaled interaction potential ${V}$ in the large particle limit, the solution can be governed by the cubic nonlinear Schrödinger equation

$\displaystyle i \partial_t \phi = \Delta \phi + \lambda |\phi|^2 \phi. \ \ \ \ \ (1)$

I recently attended a talk by Natasa Pavlovic on the rigorous derivation of this type of limiting behaviour, which was initiated by the pioneering work of Hepp and Spohn, and has now attracted a vast recent literature. The rigorous details here are rather sophisticated; but the heuristic explanation of the phenomenon is fairly simple, and actually rather pretty in my opinion, involving the foundational quantum mechanics of ${N}$ -particle systems. I am recording this heuristic derivation here, partly for my own benefit, but perhaps it will be of interest to some readers.

This discussion will be purely formal, in the sense that (important) analytic issues such as differentiability, existence and uniqueness, etc. will be largely ignored.

— 1. A quick review of classical mechanics —

The phenomena discussed here are purely quantum mechanical in nature, but to motivate the quantum mechanical discussion, it is helpful to first quickly review the more familiar (and more conceptually intuitive) classical situation.

Classical mechanics can be formulated in a number of essentially equivalent ways: Newtonian, Hamiltonian, and Lagrangian. The formalism of Hamiltonian mechanics for a given physical system can be summarised briefly as follows:

The physical system has a phase space ${\Omega}$ of states ${\vec x}$ (which is often parameterised by position variables ${q}$ and momentum variables ${p}$ ). Mathematically, it has the structure of a symplectic manifold, with some symplectic form ${\omega}$ (which would be ${\omega = dp \wedge dq}$ if one had position and momentum coordinates available).
The complete state of the system at any given time ${t}$ is given (in the case of pure states) by a point ${\vec x(t)}$ in the phase space ${\Omega}$ .
Every physical observable (e.g., energy, momentum, position, etc.) ${A}$ is associated to a function (also called ${A}$ ) mapping the phase space ${\Omega}$ to the range of the observable (e.g. for real observables, ${A}$ would be a function from ${\Omega}$ to ${{\mathbb R}}$ ). If one measures the observable ${A}$ at time ${t}$ , one will obtain the measurement ${A(\vec x(t))}$ .
There is a special observable, the Hamiltonian ${H: \Omega \rightarrow {\mathbb R}}$ , which governs the evolution of the state ${\vec x(t)}$ through time, via Hamilton’s equations of motion. If one has position and momentum coordinates ${\vec x(t) = (q_i(t), p_i(t))_{i=1}^n}$ , these equations are given by the formulae
$\displaystyle \partial_t p_i = - \frac{\partial H}{\partial q_i}; \partial_t q_i = \frac{\partial H}{\partial p_i};$

more abstractly, just from the symplectic form ${\omega}$ on the phase space, the equations of motion can be written as

$\displaystyle \partial_t \vec x(t) = - \nabla_\omega H(\vec x(t)), \ \ \ \ \ (2)$

where ${\nabla_\omega H}$ is the symplectic gradient of ${H}$ .
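As a concrete illustration of Hamilton's equations of motion, here is a minimal numerical sketch for a one-dimensional harmonic oscillator. The Hamiltonian ${H(q,p) = \frac{p^2}{2m} + \frac{k q^2}{2}}$ , the parameter values, and the choice of the symplectic Euler scheme are all assumptions made for the sake of the example; they are not part of the discussion above.

```python
# Minimal sketch: integrate Hamilton's equations for a 1D harmonic oscillator.
# H(q, p) = p^2/(2m) + k q^2/2 and all parameter values are illustrative
# assumptions, not taken from the text.

def hamiltonian(q, p, m=1.0, k=1.0):
    return p * p / (2.0 * m) + 0.5 * k * q * q

def symplectic_euler(q, p, dt, steps, m=1.0, k=1.0):
    """Advance (q, p) using dp/dt = -dH/dq and dq/dt = dH/dp."""
    for _ in range(steps):
        p = p - dt * k * q      # dp/dt = -dH/dq = -k q
        q = q + dt * p / m      # dq/dt =  dH/dp = p / m (uses the updated p)
    return q, p

q0, p0 = 1.0, 0.0
q1, p1 = symplectic_euler(q0, p0, dt=1e-3, steps=10_000)

# The Hamiltonian is itself an observable, and it should be (nearly) conserved.
energy_drift = abs(hamiltonian(q1, p1) - hamiltonian(q0, p0))
print(q1, p1, energy_drift)
```

The energy drift stays small because the update rule respects the symplectic structure of phase space; a naive explicit Euler step would let the energy grow steadily instead.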

Hamilton’s equation of motion can also be expressed in a dual form in terms of observables ${A}$ , as Poisson’s equation of motion

$\displaystyle \partial_t A(\vec x(t)) = - \{ H, A \}(\vec x(t))$

for any observable ${A}$ , where ${\{H,A\} := \nabla_\omega H \cdot \nabla A}$ is the Poisson bracket. One can express Poisson’s equation more abstractly as

$\displaystyle \partial_t A = -\{H,A\}. \ \ \ \ \ (3)$

In the above formalism, we are assuming that the system is in a pure state at each time ${t}$ , which means that it only occupies a single point ${\vec x(t)}$ in phase space. One can also consider mixed states in which the state of the system at a time ${t}$ is not fully known, but is instead given by a probability distribution ${\rho(t,\vec x)\ d\vec x}$ on phase space. The act of measuring an observable ${A}$ at a time ${t}$ will thus no longer be deterministic, but will itself be a random variable, whose expectation ${\langle A \rangle}$ is given by

$\displaystyle \langle A \rangle(t) = \int_\Omega A(\vec x) \rho(t,\vec x)\ d\vec x. \ \ \ \ \ (4)$

The equation of motion of a mixed state ${\rho}$ is given by the advection equation

$\displaystyle \partial_t \rho = \hbox{div}( \rho \nabla_\omega H )$

using the same vector field ${-\nabla_\omega H}$ that appears in (2); this equation can also be derived from (3), (4), and a duality argument.

Pure states can be viewed as the special case of mixed states in which the probability distribution ${\rho(t,\vec x)\ d\vec x}$ is a Dirac mass ${\delta_{\vec x(t)}(\vec x)}$ . (We ignore for now the formal issues of how to perform operations such as derivatives on Dirac masses; this can be accomplished using the theory of distributions (or, equivalently, by working in the dual setting of observables) but this is not our concern here.) One can thus think of mixed states as continuous averages of pure states, or equivalently the space of mixed states is the convex hull of the space of pure states.

Suppose one had a ${2}$ -particle system, in which the joint phase space ${\Omega = \Omega_1 \times \Omega_2}$ is the product of the two one-particle phase spaces. A pure joint state is then a point ${\vec x = (\vec x_1,\vec x_2)}$ in ${\Omega}$ , where ${\vec x_1}$ represents the state of the first particle, and ${\vec x_2}$ is the state of the second particle. If the joint Hamiltonian ${H: \Omega \rightarrow {\mathbb R}}$ split as

$\displaystyle H(\vec x_1,\vec x_2) = H_1(\vec x_1) + H_2(\vec x_2)$

then the equations of motion for the first and second particles would be completely decoupled, with no interactions between the two particles. However, in practice, the joint Hamiltonian contains coupling terms between ${\vec x_1, \vec x_2}$ that prevent one from totally decoupling the system; for instance, one may have

$\displaystyle H(\vec x_1,\vec x_2) = \frac{|p_1|^2}{2m_1} + \frac{|p_2|^2}{2m_2} + V(q_1-q_2),$

where ${\vec x_1=(q_1,p_1)}$ , ${\vec x_2=(q_2,p_2)}$ are written using position coordinates ${q_i}$ and momentum coordinates ${p_i}$ , ${m_1,m_2 > 0}$ are constants (representing mass), and ${V(q_1-q_2)}$ is some interaction potential that depends on the spatial separation ${q_1-q_2}$ between the two particles.

In a similar spirit, a mixed joint state is a joint probability distribution ${\rho(\vec x_1,\vec x_2)\ d\vec x_1 d\vec x_2}$ on the product state space. To recover the (mixed) state of an individual particle, one must consider a marginal distribution such as

$\displaystyle \rho_1(\vec x_1) := \int_{\Omega_2} \rho(\vec x_1,\vec x_2) \ d\vec x_2$

(for the first particle) or

$\displaystyle \rho_2(\vec x_2) := \int_{\Omega_1} \rho(\vec x_1,\vec x_2) \ d\vec x_1$

(for the second particle). Similarly for ${N}$ -particle systems: if the joint distribution of ${N}$ distinct particles is given by ${\rho(\vec x_1,\ldots,\vec x_N)\ d\vec x_1 \ldots d\vec x_N}$ , then the distribution of the first particle (say) is given by

$\displaystyle \rho_1(\vec x_1) = \int_{\Omega_2 \times \ldots \times \Omega_N} \rho(\vec x_1,\vec x_2,\ldots,\vec x_N)\ d\vec x_2 \ldots d\vec x_N,$

the distribution of the first two particles is given by

$\displaystyle \rho_{12}(\vec x_1,\vec x_2) = \int_{\Omega_3 \times \ldots \times \Omega_N} \rho(\vec x_1,\vec x_2,\ldots,\vec x_N)\ d\vec x_3 \ldots d\vec x_N,$

and so forth.
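On a finite phase space the marginal integrals above become sums, which makes them easy to experiment with. The following is a small sketch under that assumption; the two-point phase space ${\{a,b\}}$ and the probabilities are hypothetical numbers chosen only for illustration.

```python
# Sketch: marginals of a joint distribution on a finite two-particle phase space.
# The states "a", "b" and the probabilities are hypothetical illustration values.

joint = {("a", "a"): 0.1, ("a", "b"): 0.3, ("b", "a"): 0.2, ("b", "b"): 0.4}

def marginal(joint, index):
    """Sum the joint distribution over every particle except the one at `index`."""
    out = {}
    for states, prob in joint.items():
        out[states[index]] = out.get(states[index], 0.0) + prob
    return out

rho_1 = marginal(joint, 0)   # distribution of the first particle
rho_2 = marginal(joint, 1)   # distribution of the second particle

# The joint state is the product of its marginals only if the particles are independent.
is_factored = all(
    abs(prob - rho_1[x1] * rho_2[x2]) < 1e-12 for (x1, x2), prob in joint.items()
)
print(rho_1, rho_2, is_factored)
```

In this example the joint distribution is not the product of its marginals, so the two particles are correlated even though each marginal is a perfectly good one-particle mixed state.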

A typical Hamiltonian in this case may take the form

$\displaystyle H(\vec x_1,\ldots,\vec x_N) = \sum_{j=1}^N \frac{|p_j|^2}{2m_j} + \sum_{1 \leq j < k \leq N} V_{jk}(q_j-q_k)$

which is a combination of single-particle Hamiltonians ${H_j}$ and interaction perturbations. If the momenta ${p_j}$ and masses ${m_j}$ are normalised to be of size ${O(1)}$ , and the potential ${V_{jk}}$ has an average value (i.e. an ${L^1}$ norm) of ${O(1)}$ also, then the former sum has size ${O(N)}$ and the latter sum has size ${O(N^2)}$ , so the latter will dominate. In order to balance the two components and get a more interesting limiting dynamics when ${N \rightarrow \infty}$ , we shall therefore insert a normalising factor of ${\frac{1}{N}}$ in front of the interaction sum, giving a Hamiltonian

$\displaystyle H(\vec x_1,\ldots,\vec x_N) = \sum_{j=1}^N \frac{|p_j|^2}{2m_j} + \frac{1}{N} \sum_{1 \leq j < k \leq N} V_{jk}(q_j-q_k).$
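The counting behind this normalisation can be made explicit with a few lines of arithmetic. The sketch below simply counts terms, under the assumption (as in the heuristic above) that every individual kinetic and interaction term has size ${O(1)}$ ; the function name is made up for this example.

```python
# Sketch: term counts behind the 1/N normalisation, assuming each term is O(1).

def interaction_scaling(n):
    kinetic_terms = n                     # sum over j = 1..N
    pair_terms = n * (n - 1) // 2         # sum over 1 <= j < k <= N
    normalised_pairs = pair_terms / n     # after inserting the factor 1/N
    return kinetic_terms, pair_terms, normalised_pairs

for n in (10, 1000, 100000):
    kinetic, pairs, normalised = interaction_scaling(n)
    # First ratio grows like N/2; second ratio tends to 1/2.
    print(n, pairs / kinetic, normalised / kinetic)
```

Without the factor, the interaction-to-kinetic ratio grows linearly in ${N}$ ; with it, the ratio is ${\frac{N-1}{2N}}$ , which tends to ${\frac{1}{2}}$ , so both parts of the Hamiltonian survive in the limit.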

Now imagine a system of ${N}$ indistinguishable particles. By this, we mean that all the state spaces ${\Omega_1 = \ldots = \Omega_N}$ are identical, and all observables (including the Hamiltonian) are symmetric functions of the product space ${\Omega = \Omega_1^N}$ (i.e. invariant under the action of the symmetric group ${S_N}$ ). In such a case, one may as well average over this group (since this does not affect any physical observable), and assume that all mixed states ${\rho}$ are also symmetric. (One cost of doing this, though, is one has to largely give up pure states ${(\vec x_1,\ldots,\vec x_N)}$ , since such states will not be symmetric except in the very exceptional case ${\vec x_1=\ldots=\vec x_N}$ .)

A typical example of a symmetric Hamiltonian is

$\displaystyle H(\vec x_1,\ldots,\vec x_N) = \sum_{j=1}^N \frac{|p_j|^2}{2m} + \frac{1}{N} \sum_{1 \leq j < k \leq N} V(q_j-q_k)$

where ${V}$ is even (thus all particles have the same individual Hamiltonian, and interact with the other particles using the same interaction potential). In many physical systems, it is natural to consider only short-range interaction potentials, in which the interaction between ${q_j}$ and ${q_k}$ is localised to the region ${q_j-q_k=O(r)}$ for some small ${r}$ . We model this by considering Hamiltonians of the form

$\displaystyle H(\vec x_1,\ldots,\vec x_N) = \sum_{j=1}^N \frac{|p_j|^2}{2m} + \frac{1}{N} \sum_{1 \leq j < k \leq N} \frac{1}{r^d} V(\frac{q_j-q_k}{r})$

where ${d}$ is the ambient dimension of each particle (thus in physical models, ${d}$ would usually be ${3}$ ); the factor of ${\frac{1}{r^d}}$ is a normalisation factor designed to keep the ${L^1}$ norm of the interaction potential of size ${O(1)}$ . It turns out that an interesting limit occurs when ${r}$ goes to zero as ${N}$ goes to infinity by some power law ${r = N^{-\beta}}$ ; imagine for instance ${N}$ particles of “radius” ${r}$ bouncing around in a box, which is a basic model for classical gases.

An important example of a symmetric mixed state is a factored state

$\displaystyle \rho(\vec x_1,\ldots,\vec x_N) = \rho_1(\vec x_1) \ldots \rho_1(\vec x_N)$

where ${\rho_1}$ is a single-particle probability density function; thus ${\rho}$ is the tensor product of ${N}$ copies of ${\rho_1}$ . If there are no interaction terms in the Hamiltonian, then Hamilton’s equations of motion will preserve the property of being a factored state (with ${\rho_1}$ evolving according to the one-particle equation); but with interactions, the factored nature may be lost over time.

— 2. A quick review of quantum mechanics —

Now we turn to quantum mechanics. This theory is fundamentally rather different in nature than classical mechanics (in the sense that the basic objects, such as states and observables, are a different type of mathematical object than in the classical case), but shares many features in common also, particularly those relating to the Hamiltonian and other observables. (This relationship is made more precise via the correspondence principle, and more precise still using semi-classical analysis.)

The formalism of quantum mechanics for a given physical system can be summarised briefly as follows:

The physical system has a phase space ${{\Bbb H}}$ of states ${|\psi\rangle}$ (which is often parameterised as a complex-valued function of the position space). Mathematically, it has the structure of a complex Hilbert space, which is traditionally manipulated using bra-ket notation.
The complete state of the system at any given time ${t}$ is given (in the case of pure states) by a unit vector ${|\psi(t)\rangle}$ in the phase space ${{\Bbb H}}$ .
Every physical observable ${A}$ is associated to a linear operator on ${{\Bbb H}}$ ; real-valued observables are associated to self-adjoint linear operators. If one measures the observable ${A}$ at time ${t}$ , one will obtain the random variable whose expectation ${\langle A \rangle}$ is given by ${\langle \psi(t) | A | \psi(t) \rangle}$ . (The full distribution of ${A}$ is given by the spectral measure of ${A}$ relative to ${|\psi(t)\rangle}$ .)
There is a special observable, the Hamiltonian ${H: {\Bbb H} \rightarrow {\Bbb H}}$ , which governs the evolution of the state ${|\psi(t)\rangle}$ through time, via Schrödinger’s equations of motion
$\displaystyle i\hbar \partial_t |\psi(t)\rangle = H |\psi(t) \rangle. \ \ \ \ \ (5)$

Schrödinger’s equation of motion can also be expressed in a dual form in terms of observables ${A}$ , as Heisenberg’s equation of motion

$\displaystyle \partial_t \langle \psi | A | \psi \rangle = \frac{i}{\hbar} \langle \psi | [ H, A ] | \psi \rangle$

or more abstractly as

$\displaystyle \partial_t A = \frac{i}{\hbar} [ H, A ] \ \ \ \ \ (6)$

where ${[,]}$ is the commutator or Lie bracket (compare with (3)).
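Heisenberg's equation can be sanity-checked numerically on the simplest possible quantum system. The sketch below uses a two-level system with a diagonal Hamiltonian, in units where ${\hbar = 1}$ ; the energies, the observable, and the initial state are arbitrary assumed values, and the time derivative is approximated by a central difference.

```python
import cmath

# Sketch: check d/dt <psi|A|psi> = (i/hbar) <psi|[H, A]|psi> on a two-level system.
# All numerical values below are arbitrary illustration choices.
HBAR = 1.0
E = (0.3, 1.1)                          # eigenvalues of the diagonal Hamiltonian
H = ((E[0], 0.0), (0.0, E[1]))
A = ((0.0, 1.0), (1.0, 0.0))            # a self-adjoint observable
PSI0 = (0.6 + 0j, 0.8j)                 # unit vector: 0.36 + 0.64 = 1

def evolve(t):
    """Exact solution of i hbar d/dt psi = H psi for diagonal H."""
    return tuple(c * cmath.exp(-1j * e * t / HBAR) for c, e in zip(PSI0, E))

def matmul(x, y):
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def commutator(x, y):
    xy, yx = matmul(x, y), matmul(y, x)
    return tuple(tuple(xy[i][j] - yx[i][j] for j in range(2)) for i in range(2))

def expectation(op, psi):
    return sum(
        psi[i].conjugate() * op[i][j] * psi[j] for i in range(2) for j in range(2)
    )

t, dt = 0.7, 1e-5
lhs = (expectation(A, evolve(t + dt)) - expectation(A, evolve(t - dt))).real / (2 * dt)
rhs = ((1j / HBAR) * expectation(commutator(H, A), evolve(t))).real
print(lhs, rhs)
```

The two printed numbers agree up to the finite-difference error, which is the content of Heisenberg's equation for this particular state and observable.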

The states ${|\psi\rangle}$ are pure states, analogous to the pure states ${x}$ in Hamiltonian mechanics. One also has mixed states ${\rho}$ in quantum mechanics. Whereas in classical mechanics, a mixed state ${\rho}$ is a probability distribution (a non-negative function of total mass ${\int_\Omega \rho = 1}$ ), in quantum mechanics a mixed state is a non-negative (i.e. positive semi-definite) operator ${\rho}$ on ${{\Bbb H}}$ of total trace ${\hbox{tr} \rho = 1}$ . If one measures an observable ${A}$ at a mixed state ${\rho}$ , one obtains a random variable with expectation ${\hbox{tr} A \rho}$ . From (6) and duality, one can infer that the correct equation of motion for mixed states must be given by

$\displaystyle \partial_t \rho = \frac{i}{\hbar} [H,\rho]. \ \ \ \ \ (7)$

One can view pure states as the special case of mixed states which are rank one projections,

$\displaystyle \rho= |\psi\rangle \langle \psi|.$

Morally speaking, the space of mixed states is the convex hull of the space of pure states (just as in the classical case), though things are a little trickier than this when the phase space ${{\Bbb H}}$ is infinite dimensional, due to the presence of continuous spectrum in the spectral theorem.

Pure states suffer from a phase ambiguity: a phase rotation ${e^{i\theta} |\psi\rangle}$ of a pure state ${|\psi \rangle}$ leads to the same mixed state, and the two states cannot be distinguished by any physical observable.

In a single particle system, modeling a (scalar) quantum particle in a ${d}$ -dimensional position space ${{\mathbb R}^d}$ , one can identify the Hilbert space ${{\Bbb H}}$ with ${L^2({\mathbb R}^d \rightarrow {\mathbb C})}$ , and describe the pure state ${|\psi\rangle}$ as a wave function ${\psi: {\mathbb R}^d \rightarrow {\mathbb C}}$ , which is normalised as

$\displaystyle \int_{{\mathbb R}^d} |\psi(x)|^2\ dx = 1$

as ${|\psi\rangle}$ has to be a unit vector. (If the quantum particle has additional features such as spin, then one needs a fancier wave function, but let’s ignore this for now.) A mixed state is then a function ${\rho: {\mathbb R}^d \times {\mathbb R}^d \rightarrow {\mathbb C}}$ which is Hermitian (i.e. ${\rho(x,x') = \overline{\rho(x',x)}}$ ) and positive semi-definite, with unit trace ${\int_{{\mathbb R}^d} \rho(x,x)\ dx = 1}$ ; a pure state ${\psi}$ corresponds to the mixed state ${\rho(x,x') = \psi(x) \overline{\psi(x')}}$ .

A typical Hamiltonian in this setting is given by the operator

$\displaystyle H \psi(x) := \frac{|p|^2}{2m} \psi(x) + V(x) \psi(x)$

where ${m > 0}$ is a constant, ${p}$ is the momentum operator ${p := -i \hbar \nabla_x}$ , and ${\nabla_x}$ is the gradient in the ${x}$ variable (so ${|p|^2 = -\hbar^2 \Delta_x}$ , where ${\Delta_x}$ is the Laplacian; note that ${\nabla_x}$ is skew-adjoint and should thus be thought of as being imaginary rather than real), and ${V: {\mathbb R}^d \rightarrow {\mathbb R}}$ is some potential. Physically, this depicts a particle of mass ${m}$ in a potential well given by the potential ${V}$ .

Now suppose one has an ${N}$ -particle system of scalar particles. A pure state of such a system can then be given by an ${N}$ -particle wave function ${\psi: ({\mathbb R}^d)^N \rightarrow {\mathbb C}}$ , normalised so that

$\displaystyle \int_{({\mathbb R}^d)^N} |\psi(x_1,\ldots,x_N)|^2\ dx_1 \ldots dx_N = 1$

and a mixed state is a Hermitian positive semi-definite function ${\rho: ({\mathbb R}^d)^N \times ({\mathbb R}^d)^N \rightarrow {\mathbb C}}$ with trace

$\displaystyle \int_{({\mathbb R}^d)^N} \rho(x_1,\ldots,x_N; x_1,\ldots,x_N)\ dx_1 \ldots dx_N = 1,$

with a pure state ${\psi}$ being identified with the mixed state

$\displaystyle \rho(x_1,\ldots,x_N; x'_1,\ldots,x'_N) := \psi(x_1,\ldots,x_N) \overline{\psi(x'_1,\ldots,x'_N)}.$

In classical mechanics, the state of a single particle was the marginal distribution of the joint state. In quantum mechanics, the state of a single particle is instead obtained as the partial trace of the joint state. For instance, the state of the first particle is given as

$\displaystyle \rho_1(x_1; x'_1) := \int_{({\mathbb R}^d)^{N-1}} \rho(x_1,x_2,\ldots,x_N; x'_1,x_2,\ldots,x_N)\ dx_2 \ldots dx_N,$

the state of the first two particles is given as

$\displaystyle \rho_{12}(x_1,x_2; x'_1,x'_2) := \int_{({\mathbb R}^d)^{N-2}} \rho(x_1,x_2,x_3,\ldots,x_N;$

$\displaystyle x'_1,x'_2,x_3,\ldots,x_N)\ dx_3 \ldots dx_N,$

and so forth. (These formulae can be justified by considering observables of the joint state that only affect, say, the first two position coordinates ${x_1,x_2}$ and using duality.)

A typical Hamiltonian in this setting is given by the operator

$\displaystyle H \psi(x_1,\ldots,x_N) = \sum_{j=1}^N \frac{|p_j|^2}{2m_j} \psi(x_1,\ldots,x_N)$

$\displaystyle + \frac{1}{N} \sum_{1 \leq j < k \leq N} V_{jk}(x_j-x_k) \psi(x_1,\ldots,x_N)$

where we normalise just as in the classical case, and ${p_j := -i \hbar\nabla_{x_j}}$ .

An interesting feature of quantum mechanics – not present in the classical world – is that even if the ${N}$ -particle system is in a pure state, individual particles may be in a mixed state: the partial trace of a pure state need not remain pure. Because of this, when considering a subsystem of a larger system, one cannot always assume that the subsystem is in a pure state, but must work instead with mixed states throughout, unless there is some reason (e.g. a lack of coupling) to assume that pure states are somehow preserved.
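This phenomenon already appears in the smallest possible example. The sketch below replaces the position space ${{\mathbb R}^d}$ by a two-point set, so that each particle is a two-level system and the partial trace integral becomes a finite sum; this discretisation, and the two specific states, are assumptions made purely for illustration. The purity ${\hbox{tr} \rho^2}$ equals ${1}$ exactly when ${\rho}$ is a pure state.

```python
import math

# Sketch: the partial trace of a pure two-particle state can be mixed.
# Each particle has two basis states; the joint basis index is 2 * first + second.

def density_matrix(psi):
    """Rank-one projection |psi><psi| of a pure state."""
    n = len(psi)
    return [[psi[i] * psi[j].conjugate() for j in range(n)] for i in range(n)]

def partial_trace_second(rho):
    """Trace out the second particle of a 4x4 two-particle density matrix."""
    return [
        [sum(rho[2 * a + s][2 * b + s] for s in range(2)) for b in range(2)]
        for a in range(2)
    ]

def purity(rho):
    """tr(rho^2); equal to 1 exactly for pure states."""
    n = len(rho)
    return sum(rho[i][j] * rho[j][i] for i in range(n) for j in range(n)).real

s = complex(1.0 / math.sqrt(2.0))
entangled = [s, 0j, 0j, s]     # symmetric, but not a product of one-particle states
product = [s, s, 0j, 0j]       # first particle in state 0, second in a superposition

reduced_entangled = partial_trace_second(density_matrix(entangled))
reduced_product = partial_trace_second(density_matrix(product))
print(purity(reduced_entangled), purity(reduced_product))
```

The entangled joint state is pure, yet its one-particle partial trace is half the identity, with purity ${\frac{1}{2}}$ ; the product state, by contrast, has a pure partial trace.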

Now consider a system of ${N}$ indistinguishable quantum particles. As in the classical case, this means that all observables (including the Hamiltonian) for the joint system are invariant with respect to the action of the symmetric group ${S_N}$ . Because of this, one may as well assume that the (mixed) state of the joint system is also symmetric with respect to this action. In the special case when the particles are bosons, one can assume that pure states ${|\psi\rangle}$ are also symmetric with respect to this action (in contrast to fermions, where the action on pure states is anti-symmetric). A typical Hamiltonian in this setting is given by the operator

$\displaystyle H \psi(x_1,\ldots,x_N) = \sum_{j=1}^N \frac{|p_j|^2}{2m} \psi(x_1,\ldots,x_N)$

$\displaystyle + \frac{1}{N} \sum_{1 \leq j < k \leq N} V(x_j-x_k) \psi(x_1,\ldots,x_N)$

for some even potential ${V}$ ; if one wants to model short-range interactions, one might instead pick the variant

$\displaystyle H \psi(x_1,\ldots,x_N) = \sum_{j=1}^N \frac{|p_j|^2}{2m} \psi(x_1,\ldots,x_N) + \frac{1}{N} \sum_{1 \leq j < k \leq N} \frac{1}{r^d} V(\frac{x_j-x_k}{r}) \psi(x_1,\ldots,x_N) \ \ \ \ \ (8)$

for some ${r>0}$ . This is a typical model for an ${N}$ -particle Bose-Einstein condensate. (Longer-range models can lead to more non-local variants of NLS for the limiting equation, such as the Hartree equation.)

— 3. NLS —

Suppose we have a Bose-Einstein condensate given by a (symmetric) mixed state

$\displaystyle \rho(t, x_1,\ldots,x_N; x'_1,\ldots,x'_N )$

evolving according to the equation of motion (7) using the Hamiltonian (8). One can take a partial trace of the equation of motion (7) to obtain an equation for the state ${\rho_1(t, x_1; x'_1)}$ of the first particle (note from symmetry that all the other particles will have the same state function). If one does take this trace, one soon finds that the equation of motion becomes

$\displaystyle \partial_t \rho_1(t,x_1;x'_1) = \frac{i}{\hbar} [ (\frac{|p_1|^2}{2m} - \frac{|p'_1|^2}{2m}) \rho_1(t,x_1;x'_1)$

$\displaystyle + \frac{1}{N} \sum_{j=2}^N \int_{{\mathbb R}^d} \frac{1}{r^d} [ V( \frac{x_1 - x_j}{r} ) - V( \frac{x'_1 -x_j}{r} ) ] \rho_{1j}(t,x_1,x_j; x'_1,x_j)\ dx_j ]$

where ${\rho_{1j}}$ is the partial trace to the ${1, j}$ particles. Using symmetry, we see that all the summands in the ${j}$ summation are identical, so we can simplify this as

$\displaystyle \partial_t \rho_1(t,x_1;x'_1) = \frac{i}{\hbar} [ (\frac{|p_1|^2}{2m} - \frac{|p'_1|^2}{2m}) \rho_1(t,x_1;x'_1)$

$\displaystyle + \frac{N-1}{N} \int_{{\mathbb R}^d} \frac{1}{r^d} [ V( \frac{x_1 - x_2}{r} ) - V( \frac{x'_1 -x_2}{r} ) ] \rho_{12}(t,x_1,x_2; x'_1,x_2)\ dx_2 ].$

This does not completely describe the dynamics of ${\rho_1}$ , as one also needs an equation for ${\rho_{12}}$ . But one can repeat the same argument to get an equation for ${\rho_{12}}$ involving ${\rho_{123}}$ , and so forth, leading to a system of equations known as the BBGKY hierarchy. But for simplicity we shall just look at the first equation in this hierarchy.

Let us now formally take two limits in the above equation, sending the number of particles ${N}$ to infinity and the interaction scale ${r}$ to zero. The effect of sending ${N}$ to infinity should simply be to eliminate the ${\frac{N-1}{N}}$ factor. The effect of sending ${r}$ to zero should be to send ${\frac{1}{r^d} V(\frac{x}{r})}$ to the Dirac mass ${\lambda \delta(x)}$ , where ${\lambda := \int_{{\mathbb R}^d} V}$ is the total mass of ${V}$ . Formally performing these two limits, one is led to the equation

$\displaystyle \partial_t \rho_1(t,x_1;x'_1) = \frac{i}{\hbar} [ (\frac{|p_1|^2}{2m} - \frac{|p'_1|^2}{2m}) \rho_1(t,x_1;x'_1)$

$\displaystyle + \lambda (\rho_{12}(t,x_1,x_1;x'_1,x_1) - \rho_{12}(t,x_1,x'_1;x'_1,x'_1)) ].$

One can perform a similar formal limiting procedure for the other equations in the BBGKY hierarchy, obtaining a system of equations known as the Gross-Pitaevskii hierarchy.
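The Dirac mass limit used in this step can be checked numerically in one dimension. The sketch below assumes a Gaussian profile ${V(x) = e^{-x^2}}$ (so that ${\lambda = \sqrt{\pi}}$ ) and an arbitrary smooth test function ${f}$ with ${f(0)=1}$ ; both are illustrative choices, and the integral is approximated by a midpoint rule.

```python
import math

# Sketch: r^{-d} V(x/r) tends to lambda * delta(x) as r -> 0, tested in d = 1.
# V and f are illustrative choices, not taken from the text.

def V(x):
    """Interaction profile; its integral over the real line is sqrt(pi)."""
    return math.exp(-x * x)

LAMBDA = math.sqrt(math.pi)

def f(x):
    """Smooth test function with f(0) = 1."""
    return math.cos(x) * math.exp(-x * x / 4.0)

def smeared_value(r, cutoff=12.0, points=4800):
    """Midpoint rule for the integral of r^-1 V(x/r) f(x) over |x| <= cutoff * r."""
    h = 2.0 * cutoff * r / points
    total = 0.0
    for i in range(points):
        x = -cutoff * r + (i + 0.5) * h
        total += V(x / r) / r * f(x)
    return total * h

# Distance between the smeared integral and lambda * f(0) for shrinking r.
errors = {r: abs(smeared_value(r) - LAMBDA * f(0.0)) for r in (1.0, 0.1, 0.01)}
print(errors)
```

The error shrinks as ${r}$ decreases, consistent with the formal replacement of ${\frac{1}{r^d} V(\frac{x}{r})}$ by ${\lambda \delta(x)}$ ; as noted in the update at the end of this post, this naive limit is only expected to be valid when ${r}$ is not too small relative to ${N^{-1}}$ .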

We next make an important simplifying assumption, which is that in the limit ${N \rightarrow \infty}$ any two particles in this system become decoupled, which means that the two-particle mixed state factors as the tensor product of two one-particle states:

$\displaystyle \rho_{12}(t,x_1,x_2; x'_1,x'_2) \approx \rho_1(t,x_1;x'_1) \rho_1(t,x_2;x'_2).$

One can view this as a mean field approximation, modeling the interaction of one particle ${x_1}$ with all the other particles by the mean field ${\rho_1}$ .

Making this assumption, the previous equation simplifies to

$\displaystyle \partial_t \rho_1(t,x_1;x'_1) = \frac{i}{\hbar} [ (\frac{|p_1|^2}{2m} - \frac{|p'_1|^2}{2m})$

$\displaystyle + \lambda (\rho_1(t,x_1;x_1) - \rho_1(t,x'_1;x'_1))] \rho_1(t,x_1;x'_1).$

If we assume furthermore that ${\rho_1}$ is a pure state, thus

$\displaystyle \rho_1(t,x_1;x'_1) = \psi(t,x_1) \overline{\psi(t,x'_1)}$

then (up to the phase ambiguity mentioned earlier), ${\psi(t,x)}$ obeys the Gross-Pitaevskii equation

$\displaystyle \partial_t \psi(t,x) = \frac{i}{\hbar} [ \frac{|p|^2}{2m} + \lambda |\psi(t,x)|^2 ] \psi(t,x)$

which (up to some factors of ${\hbar}$ and ${m}$ , which can be renormalised away) is essentially (1).
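The passage from the mean field equation for ${\rho_1}$ to the Gross-Pitaevskii equation can be spelled out as follows. This is only a formal sketch, and the shorthand ${H_\psi}$ is introduced here for convenience; it is not notation used elsewhere in this post. Write

$\displaystyle H_\psi := \frac{|p|^2}{2m} + \lambda |\psi(t,x)|^2$

for the (state-dependent) one-particle operator. If ${\rho_1(t,x_1;x'_1) = \psi(t,x_1) \overline{\psi(t,x'_1)}}$ , then ${\rho_1(t,x_1;x_1) = |\psi(t,x_1)|^2}$ , and since ${|p'_1|^2}$ and multiplication by ${|\psi|^2}$ both commute with complex conjugation, the mean field equation for ${\rho_1}$ becomes

$\displaystyle \partial_t [ \psi(t,x_1) \overline{\psi(t,x'_1)} ] = \frac{i}{\hbar} [ (H_\psi \psi)(t,x_1) \overline{\psi(t,x'_1)} - \psi(t,x_1) \overline{(H_\psi \psi)(t,x'_1)} ].$

By the product rule, this identity holds whenever ${\partial_t \psi = \frac{i}{\hbar} H_\psi \psi}$ . Replacing ${\psi}$ by ${e^{i\theta(t)} \psi}$ for a real-valued function ${\theta(t)}$ changes neither ${\rho_1}$ nor ${H_\psi}$ , so the equation for ${\rho_1}$ can only determine ${\psi}$ up to such a factor; this is the phase ambiguity.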

An alternate derivation of (1), using a slight variant of the above mean field approximation, comes from studying the Hamiltonian (8). Let us make the (very strong) assumption that at some fixed time ${t}$ , one is in a completely factored pure state

$\displaystyle \psi(x_1,\ldots,x_N) = \psi_1(x_1) \ldots \psi_1(x_N),$

where ${\psi_1}$ is a one-particle wave function, in particular obeying the normalisation

$\displaystyle \int_{{\mathbb R}^d} |\psi_1(x)|^2\ dx = 1.$

(This is an unrealistically strong version of the mean field approximation. In practice, one only needs the two-particle partial traces to be completely factored for the discussion below.) The expected value of the Hamiltonian,

$\displaystyle \langle \psi|H|\psi\rangle = \int_{({\mathbb R}^d)^N} \psi(x_1,\ldots,x_N) \overline{H \psi(x_1,\ldots,x_N)}\ dx_1 \ldots dx_N,$

can then be simplified as

$\displaystyle N \int_{{\mathbb R}^d} \psi_1(x) \overline{\frac{|p_1|^2}{2m} \psi_1(x)}\ dx$

$\displaystyle + \frac{N-1}{2} \int_{{\mathbb R}^d \times {\mathbb R}^d} r^{-d} V(\frac{x_1-x_2}{r}) |\psi_1(x_1)|^2 |\psi_1(x_2)|^2\ dx_1 dx_2.$

Again sending ${r \rightarrow 0}$ , this formally becomes

$\displaystyle N \int_{{\mathbb R}^d} \psi_1(x) \overline{\frac{|p_1|^2}{2m} \psi_1(x)}\ dx + \frac{N-1}{2} \lambda \int_{{\mathbb R}^d} |\psi_1(x_1)|^4\ dx_1$

which in the limit ${N \rightarrow \infty}$ is asymptotically

$\displaystyle N \int_{{\mathbb R}^d} \psi_1(x) \overline{\frac{|p_1|^2}{2m} \psi_1(x)} + \frac{\lambda}{2} |\psi_1(x)|^4\ dx.$

Up to some normalisations, this is the Hamiltonian for the NLS equation (1).
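The ${r \rightarrow 0}$ step in this energy computation can also be tested numerically. The sketch below works in one dimension with a Gaussian one-particle wave function ${\psi_1(x) = \pi^{-1/4} e^{-x^2/2}}$ and a Gaussian profile ${V(x) = e^{-x^2}}$ ; these choices, the grid, and the midpoint rule are all assumptions made for the example. It compares the two-particle interaction integral with its formal limit ${\lambda \int |\psi_1|^4}$ .

```python
import math

# Sketch: the two-body interaction integral of a factored state approaches
# lambda * integral of |psi_1|^4 as r -> 0 (d = 1, Gaussian illustration data).

HALF_WIDTH, STEP = 6.0, 0.05
GRID = [-HALF_WIDTH + (i + 0.5) * STEP for i in range(int(round(2 * HALF_WIDTH / STEP)))]
LAMBDA = math.sqrt(math.pi)           # integral of V(x) = exp(-x^2) over the real line

def V(x):
    return math.exp(-x * x)

def density(x):
    """|psi_1(x)|^2 for the normalised Gaussian psi_1(x) = pi^(-1/4) exp(-x^2/2)."""
    return math.exp(-x * x) / math.sqrt(math.pi)

WEIGHTS = [density(x) for x in GRID]

def pair_energy(r):
    """Midpoint rule for the double integral of r^-1 V((x1-x2)/r) |psi_1(x1)|^2 |psi_1(x2)|^2."""
    total = 0.0
    for x1, w1 in zip(GRID, WEIGHTS):
        for x2, w2 in zip(GRID, WEIGHTS):
            total += V((x1 - x2) / r) / r * w1 * w2
    return total * STEP * STEP

# Formal r -> 0 limit: lambda times the integral of |psi_1|^4.
quartic_limit = LAMBDA * sum(w * w for w in WEIGHTS) * STEP
results = {r: pair_energy(r) for r in (1.0, 0.5, 0.25)}
print(results, quartic_limit)
```

For these particular Gaussians the double integral can be evaluated in closed form as ${(2+r^2)^{-1/2}}$ , which tends to ${2^{-1/2} = \lambda \int |\psi_1|^4}$ as ${r \rightarrow 0}$ , so the numerical values can be checked against an exact answer.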

There has been much progress recently in making the above derivations precise, by Erdős-Schlein-Yau, Klainerman-Machedon, Kirkpatrick-Schlein-Staffilani, Chen-Pavlovic, and others. A key step is to show that the Gross-Pitaevskii hierarchy necessarily preserves the property of being a completely factored state. This requires a uniqueness theory for this hierarchy, which is surprisingly delicate, due to the fact that it is a system of infinitely many coupled equations over an unbounded number of variables.

[Update, Dec 8: Interestingly, the above heuristic derivation only works when the interaction scale ${r}$ is much larger than ${N^{-1}}$ . For ${r \sim N^{-1}}$ , the coupling constant ${\lambda}$ acquires a nonlinear correction, becoming essentially the scattering length of the potential rather than its mean. See comments below.]

14 comments

26 November, 2009 at 10:25 pm

Anonymous

Professor Tao:
In the third paragraph, do you mean to say that the discussion will be purely informal?

26 November, 2009 at 11:39 pm

Anonymous

Mathematicians tend to use the word “formal” to describe an argument in which symbolic manipulations may not be justified rigorously. (You just cross your fingers and hope for the best.)

27 November, 2009 at 2:46 am

Mio

Dear Prof. Tao, looks like there’s a typo in (7), A should be H instead. Also, the ket above (6) is missing a \psi inside. Thanks for the post.

[Corrected, thanks – T.]

28 November, 2009 at 2:28 pm

M.S.

Really beautiful and clear, as your other post!
It made me enjoy my long train trip today, thanks.

I saw a typo immediately before the introduction of the interaction potential sum normalization, I think it should be:
If the momenta $p_j$ and masses $m_j$ are normalised to be of size

[Corrected, thanks – T.]

29 November, 2009 at 3:53 pm

Prof. Tao–There seems to be a missing equation in your definition of Hamilton’s equations on a symplectic manifold, right after “the equations of motion can be written as”.

29 November, 2009 at 3:58 pm

Prof. Tao–Actually, it looks like all the numbered equations are having difficulties being displayed. (At least I don’t see them running firefox on Ubuntu.)

[Hmm, a strange glitch – I think the equations are restored now. -T]

30 November, 2009 at 7:41 am

liuyao

minor typo: $\rho(x, x')=\psi(x)\bar{\psi(x')}$ , the prime on x, not on $\psi$
Momentum is usually identified with $-i\hbar\nabla$ , though the minus sign is immaterial when you square it, and is more of a convention.
Great post, by the way!

[Corrected, thanks – T.]

1 December, 2009 at 1:40 am

A semana nos arXivs… « Ars Physica

[…] From Bose-Einstein condensates to the nonlinear Schrodinger equation […]

2 December, 2009 at 2:29 pm

John Sidles

Please let me echo the above comments in saying that this is a wonderfully interesting and enjoyable post!

I would like to offer three comments on how engineering students might read (and mis-read) this post, recognizing that increasing numbers of engineering students are seeking to upgrade their mathematical understanding.

None of the following remarks should be construed as being in any respect critical of Prof. Tao’s fine essay. Rather, they should be read as fan mail—-and as an expression of thanks—from the engineering community to the mathematical community.

One of Bjarne Stroustrup’s maxims is “Whenever something can be done in two ways, someone will be confused.” And when it comes to quantum mechanics—with its plethora of invariances and conventions–few people are more easily confused than literal-minded engineering students!

Engineering students can become confused in ways that might not occur to mathematicians, as follows:

(1) When discussing dynamical state-spaces endowed with a metric and/or symplectic structure, is it better to give equations in terms of vectors, or in terms of forms? Mathematicians are happy either way, but they tend to choose vector frameworks (as Tao’s essay does), perhaps for the reason that vectors are easier to sketch than forms.

However, if we have in mind (sooner or later) to pullback dynamical equations onto lower-dimension, noneuclidean manifolds (as engineers ubiquitously do), then it is convenient to express dynamical equations (and complex structures, etc.) in terms of forms rather than vectors … and it helps engineering students to be reminded that forms pullback naturally and vectors don’t.

This boils down to assuring students that symplectic gradients can be defined to map functions to vectors, or alternatively map functions to forms, with equal validity (given a symplectic and/or metric structure that establishes a natural isomorphism).

(2) On the arxiv server there is an (unpublished, but very clear) essay by Prof. Tao titled Perelman’s proof of the Poincare conjecture: a nonlinear PDE perspective (arXiv.org:math/0610903). In particular, footnote three of this article is in itself a short yet powerfully thought-provoking essay to the effect that “a PDE flow is in many ways ‘dumber’ than a combinatorial algorithm” and yet “if the flow is sufficiently geometrical in nature then the flows acquire a number of deep and delicate additional properties”.

In quantum mechanics as in topology, there is steadily increasing use of flow/PDE algorithms in conjunction with combinatorial/algebraic algorithms; an essay on this general topic would be *very* welcome (IMHO) to many students/researchers in quantum mechanics (in engineering and otherwise).

(3) Quantum mechanics has a reputation for being mysterious, and in particular, there is a widespread impression that its basic postulates are inviolate. But as is often the case with no-go arguments, a loophole exists that Prof. Tao’s present essay illustrates beautifully.

That loophole is that (in Prof. Tao’s words) “despite [quantum dynamics] being a linear equation, solutions can be governed by a non-linear equation”. Thus we are free to invent nonlinear versions of quantum mechanics, without fear of experimental contradiction, provided that we can derive the nonlinear dynamics from linear quantum mechanics.

This principle applies broadly in quantum mechanics and many other physical theories; for example there is a recent article by Stephen Adler and Angelo Bassi titled Is Quantum Theory Exact? that can be read as another example of this same general principle.

Here too an essay on “Mathematical methods for circumventing no-go arguments in physical theories” would be very interesting—and very stimulating too!—to many students.

That’s all! And thanks also to everyone who contributes to what is becoming (IMHO) the present-day “Golden Era of Mathematical Blogging”. :)

8 December, 2009 at 10:29 am

Bob Jerrard

Nice post. Just a few days ago I saw a talk by Laszlo Erdos on some of his work (with Schlein and Yau and others) on these problems, and he emphasized that the correct value of the coupling constant $\lambda$ is not the total mass of the interaction potential $V$ , but rather is $8\pi a_0$ , where $a_0$ is the scattering length, defined as follows: consider a solution $f$ of the equation $-\Delta f + \frac 12 V f = 0$ in $\mathbb{R}^d$ , such that $f\to 1$ at $\infty$ . Then if the potential $V$ is sufficiently short-range, it is a fact that $w = 1-f$ is asymptotic to $a_0/|x|$ for some $a_0>0$ (for example this is clear if $V$ is compactly supported), and the constant $a_0$ is defined to be the scattering length.

In order to see the scattering length appear in the Gross-Pitaevskii equation, one needs to modify the product state ansatz you have given above. The modified ansatz has the form

$\Psi(x) = \prod_{j<k} (1 - w(x_j -x_k)) \prod_{j=1}^N \psi(x_j),$

writing it for wave functions rather than density matrices, and so takes into account short-range correlations between particles. If I understand correctly, the definitions of $a_0$ and $w$ imply that

$\int 2|\nabla(1-w)|^2+ V(1-w)^2 = 8\pi a_0,$

and it is via this fact that the above modified ansatz gives rise to $8\pi a_0$ in the GP equation.

The justification of the limit thus requires establishing some information about short-range repulsive interactions between particles.
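
As a sanity check on the definition of the scattering length, here is a minimal sketch for one exactly solvable case. The choice of potential is an assumption of this example and does not come from the talk: a soft-sphere barrier $V = V_0 > 0$ on $|x| < R$ (zero outside) in $d = 3$. Writing $u(r) = r f(r)$, the equation $-\Delta f + \frac 12 V f = 0$ becomes $u'' = \frac 12 V u$, so $u$ is proportional to $\sinh(\kappa r)$ inside with $\kappa = \sqrt{V_0/2}$ and equals $r - a_0$ outside; matching at $r = R$ gives $a_0 = R - \tanh(\kappa R)/\kappa$. The function names are hypothetical. The code also checks numerically the closely related identity $\int V(1-w) = 8\pi a_0$, which follows from integrating the equation over space.

```python
import math


def soft_sphere_scattering_length(v0: float, radius: float) -> float:
    """Scattering length a_0 for V = v0 on |x| < radius, 0 outside (d = 3).

    Assumes the normalisation -Laplace(f) + (1/2) V f = 0 with f -> 1 at
    infinity, so that f = 1 - a_0/|x| outside the barrier.
    """
    kappa = math.sqrt(v0 / 2.0)
    return radius - math.tanh(kappa * radius) / kappa


def integral_V_f(v0: float, radius: float, n: int = 20000) -> float:
    """Midpoint-rule approximation of the integral of V(x) f(x) over R^3."""
    kappa = math.sqrt(v0 / 2.0)
    # Inside the barrier u(r) = r f(r) = amp * sinh(kappa r); amp is fixed by
    # matching u' = 1 with the exterior solution u = r - a_0 at r = radius.
    amp = 1.0 / (kappa * math.cosh(kappa * radius))
    h = radius / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * h
        total += r * amp * math.sinh(kappa * r)  # r^2 f(r) = r u(r)
    return 4.0 * math.pi * v0 * total * h


if __name__ == "__main__":
    a0 = soft_sphere_scattering_length(3.0, 1.0)
    print("a_0          =", a0)
    print("8 pi a_0     =", 8.0 * math.pi * a0)
    print("integral V f =", integral_V_f(3.0, 1.0))
```

In this model $a_0$ stays strictly below $R$ and tends to $R$ as $V_0 \to \infty$ (the hard-sphere limit), while for small $V_0$ it approaches $\frac{1}{8\pi} \int V$, which is where the naive value of the coupling constant comes from.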

8 December, 2009 at 12:15 pm

Terence Tao

That’s a very interesting subtlety! I think it shows up for some ranges of r and N and not others, in particular if the range r of the potential is significantly longer than the mean spacing of the particles then the naive approximation should work, I think (this is for instance the case in the Chen-Pavlovic work, where the potential is rather long range and the nonlinear correction does not appear).

It is good to have examples of why one should not always trust naive limiting arguments, though…

8 December, 2009 at 6:44 pm

H.S.

Thank you for this wonderful post. I just have a couple of questions:
1) For the classical interacting system, what is the “interesting limit” you mentioned in the post? Could you please explain more about that limit? For instance, is there a nonlinear equation in the limit? The power law relating r and N is kind of mysterious to me. What is the value of the exponent there explicitly?
2) In the quantum case, has the mean-field approximation, that the two-particle mixed state factorizes in the limit, been rigorously justified? [In the classical case at positive temperature, there are “propagation of chaos” type results, which can justify the mean-field approximation.]
3) I feel like, usually, the mean-field approximation is valid only for long-range interactions. But here we are concerned with short-range interactions. Well, this is not actually a question; maybe my feeling is just incorrect.
Thank you again!

8 December, 2009 at 7:07 pm

Terence Tao

(1) I don’t have a formal derivation, but it seems to me that the limiting dynamics of the classical model should be governed by the one-particle advection equation but with an effective Hamiltonian containing a potential term proportional to the spatial density $\rho(q) := \int \rho(q,p) dp$ . There is undoubtedly a name for this type of nonlinear kinetic equation but it escapes me at the moment. This type of limit should obtain whenever r is much smaller than 1 but much larger than 1/N; for r=1/N I suppose one should have a correction in analogy with the quantum case as pointed out by Bob Jerrard above.

(2) As far as I am aware, most of the rigorous results require the initial state to already be factored or close to factored, and the conclusion is that this near-factored property is more or less preserved (with dynamics given by the effective equation). There are certainly efforts to generalise to broader classes of data, though.

(3) There is a parallel set of results for long-range interactions, in which the nonlinearity becomes nonlocal (of Hartree type, generally).
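
A formal sketch of the equation described in (1), under assumptions that are not in the comment itself (unit mass, a coupling constant $\lambda$, and the convention that the force is minus the gradient of the potential): the one-particle density $\rho(t,q,p)$ would be transported by the effective Hamiltonian $H_{eff}(t,q,p) = \frac{1}{2}|p|^2 + \lambda \rho(t,q)$ with $\rho(t,q) := \int \rho(t,q,p)\, dp$ , that is

$\displaystyle \partial_t \rho(t,q,p) + p \cdot \nabla_q \rho(t,q,p) - \lambda \nabla_q \rho(t,q) \cdot \nabla_p \rho(t,q,p) = 0.$

Kinetic equations of this shape, in which the potential is determined self-consistently by the density, commonly go by the name of Vlasov-type equations; the version here has a purely local interaction, and no derivation of it is attempted.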

