One of the basic problems in the field of operator algebras is to develop a functional calculus for either a single operator {A}, or a collection {A_1, A_2, \ldots, A_k} of operators. These operators could in principle act on any function space, but typically one either considers complex matrices (which act on a complex finite dimensional space), or operators (either bounded or unbounded) on a complex Hilbert space. (One can of course also obtain analogous results for real operators, but we will work throughout with complex operators in this post.)

Roughly speaking, a functional calculus is a way to assign an operator {F(A)} or {F(A_1,\ldots,A_k)} to any function {F} in a suitable function space, which is linear over the complex numbers, preserves the scalars (i.e. {c(A) = c} when {c \in {\bf C}}), and should be either an exact or approximate homomorphism in the sense that

\displaystyle  FG(A_1,\ldots,A_k) = F(A_1,\ldots,A_k) G(A_1,\ldots,A_k), \ \ \ \ \ (1)

should hold either exactly or approximately. In the case when the {A_i} are self-adjoint operators acting on a Hilbert space (or Hermitian matrices), one often also desires the identity

\displaystyle  \overline{F}(A_1,\ldots,A_k) = F(A_1,\ldots,A_k)^* \ \ \ \ \ (2)

to also hold either exactly or approximately. (Note that one cannot reasonably expect (1) and (2) to hold exactly for all {F,G} if the {A_1,\ldots,A_k} and their adjoints {A_1^*,\ldots,A_k^*} do not commute with each other, so in those cases one has to be willing to allow some error terms in the above wish list of properties of the calculus.) Ideally, one should also be able to relate the operator norm of {F(A)} or {F(A_1,\ldots,A_k)} with something like the uniform norm on {F}. In principle, the existence of a good functional calculus allows one to manipulate operators as if they were scalars (or at least approximately as if they were scalars), which is very helpful for a number of applications, such as partial differential equations, spectral theory, noncommutative probability, and semiclassical mechanics. A functional calculus for multiple operators {A_1,\ldots,A_k} can be particularly valuable as it allows one to treat {A_1,\ldots,A_k} as being exact or approximate scalars simultaneously. For instance, if one is trying to solve a linear differential equation that can (formally at least) be expressed in the form

\displaystyle  F(X,D) u = f

for some data {f}, unknown function {u}, some differential operators {X,D}, and some nice function {F}, then if one’s functional calculus is good enough (and {F} is suitably “elliptic” in the sense that it does not vanish or otherwise degenerate too often), one should be able to solve this equation either exactly or approximately by the formula

\displaystyle  u = F^{-1}(X,D) f,

which is of course how one would solve this equation if one pretended that the operators {X,D} were in fact scalars. Formalising this calculus rigorously leads to the theory of pseudodifferential operators, which allows one to (approximately) solve or at least simplify a much wider range of differential equations than what one can achieve with more elementary algebraic transformations (e.g. integrating factors, change of variables, variation of parameters, etc.). In quantum mechanics, a functional calculus that allows one to treat operators as if they were approximately scalar can be used to rigorously justify the correspondence principle in physics, namely that the predictions of quantum mechanics approximate those of classical mechanics in the semiclassical limit {\hbar \rightarrow 0}.

There is no universal functional calculus that works in all situations; the strongest functional calculi, which are close to being exact *-homomorphisms on a very large class of functions, tend to only work under very restrictive hypotheses on {A} or {A_1,\ldots,A_k} (in particular, when {k > 1}, one needs the {A_1,\ldots,A_k} to commute either exactly, or very close to exactly), while there are weaker functional calculi which have fewer nice properties and only work for a very small class of functions, but can be applied to quite general operators {A} or {A_1,\ldots,A_k}. In some cases the functional calculus is only formal, in the sense that {F(A)} or {F(A_1,\ldots,A_k)} has to be interpreted as an infinite formal series that does not converge in a traditional sense. Also, when one wishes to select a functional calculus on non-commuting operators {A_1,\ldots,A_k}, there is a certain amount of non-uniqueness: one generally has a number of slightly different functional calculi to choose from, which generally have the same properties but differ in some minor technical details (particularly with regards to the behaviour of “lower order” components of the calculus). This is similar to how one has a variety of slightly different coordinate systems available to parameterise a Riemannian manifold or Lie group. This is in contrast to the {k=1} case when the underlying operator {A = A_1} is (essentially) normal (so that {A} commutes with {A^*}); in this special case (which includes the important subcases when {A} is unitary or (essentially) self-adjoint), spectral theory gives us a canonical and very powerful functional calculus which can be used without further modification in applications.

Despite this lack of uniqueness, there is one standard choice for a functional calculus available for general operators {A_1,\ldots,A_k}, namely the Weyl functional calculus; it is analogous in some ways to normal coordinates for Riemannian manifolds, or exponential coordinates of the first kind for Lie groups, in that it treats lower order terms in a reasonably nice fashion. (But it is important to keep in mind that, like its analogues in Riemannian geometry or Lie theory, there will be some instances in which the Weyl calculus is not the optimal calculus to use for the application at hand.)

I decided to write some notes on the Weyl functional calculus (also known as Weyl quantisation), and to sketch the applications of this calculus to the theory of pseudodifferential operators. They are mostly for my own benefit (so that I won’t have to redo these particular calculations again), but perhaps they will also be of interest to some readers here. (Of course, this material is also covered in many other places, e.g. Folland’s “Harmonic Analysis in Phase Space”.)

— 1. Weyl quantisation of polynomials —

The simplest class of functions to which one can set up a functional calculus are the polynomials, as this does not require any analytic tools to define. To begin with we will ignore the conjugation structure, thus we will not attempt to implement (2).

In order to be able to freely compose all the operators {A_1,\ldots,A_k} under consideration, we will assume that there is some dense space of test functions (or perhaps Schwartz functions) which is preserved by all of the {A_1,\ldots,A_k}, so that any composition of finitely many of the {A_1,\ldots,A_k} will be densely defined. (Alternatively, one could proceed at a purely formal level for this discussion, working in an abstract algebra generated by the {A_1,\ldots,A_k}.)

In the {k=1} case of a single operator {A}, the polynomial calculus is obvious: given any polynomial

\displaystyle  F(a) := \sum_{i=0}^n c_i a^i

with complex coefficients {c_0,\ldots,c_n}, one can define {F(A)} to be the operator

\displaystyle  F(A) := \sum_{i=0}^n c_i A^i.

In other words, the functional calculus {F \mapsto F(A)} is the linear map that takes each monomial {a^i} to the operator {A^i}. This calculus is of course an exact homomorphism, as it is linear and obeys (1) exactly.

Now we consider the situation with multiple operators. For simplicity, let us just consider the case of two operators {A,B}. We then consider a polynomial

\displaystyle  F(a,b) := \sum_{i=0}^n \sum_{j=0}^m c_{i,j} a^i b^j

with complex coefficients {c_{i,j}}, and ask how to define the operator {F(A,B)}.

The most obvious way to define {F(A,B)} is by direct substitution, which I will call the Kohn-Nirenberg calculus:

\displaystyle  F(A,B)_{KN} := \sum_{i=0}^n \sum_{j=0}^m c_{i,j} A^i B^j.

(Depending on the interpretation of the operators {A,B}, this calculus might also be referred to as the Wick-ordered calculus or normal-ordered calculus.) This is certainly a well-defined calculus, but when {A} and {B} do not commute, the calculus has a bias in that it always places {A} to the left of {B}; in particular, if one interchanges the roles of {a} and {b} then one obtains a different calculus, which one might call the anti-Kohn-Nirenberg calculus:

\displaystyle  F(A,B)_{KN^*} := \sum_{i=0}^n \sum_{j=0}^m c_{i,j} B^j A^i.

Intermediate, and more symmetric, between these two calculi, is the Weyl calculus

\displaystyle  F(A,B)_{W} := \sum_{i=0}^n \sum_{j=0}^m c_{i,j} (A^i B^j)_W

where the Weyl ordering {(A^i B^j)_W} of the monomial {A^i B^j} is defined to be the average of all the {\binom{i+j}{i}} ways to multiply {i} copies of {A} and {j} copies of {B} together:

\displaystyle  (A^i B^j)_W := \frac{1}{\binom{i+j}{i}} \sum_{C_1,\ldots,C_{i+j}} C_1 \ldots C_{i+j}

where {C_1,\ldots,C_{i+j}} range over all tuples which contain {i} copies of {A} and {j} copies of {B}, thus for instance

\displaystyle  (AB)_W = \frac{1}{2}( AB + BA )

\displaystyle  (AB^2)_W = \frac{1}{3}( ABB + BAB + BBA)

\displaystyle  (A^2 B^2)_W = \frac{1}{6} (AABB + ABAB + ABBA

\displaystyle  + BAAB + BABA + BBAA).

and so forth.
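The averaging over orderings can be made concrete with a short computation. The following is a minimal sketch, assuming {A} and {B} are realised as small square matrices with exact rational entries; the helper names `matmul`, `identity` and `weyl_monomial`, and the particular matrices, are illustrative choices and not part of the discussion above.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

def matmul(P, Q):
    """Product of two square matrices given as lists of rows."""
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

def weyl_monomial(A, B, i, j):
    """Average of all C(i+j, i) words containing i copies of A and j copies of B."""
    n = len(A)
    total = [[Fraction(0)] * n for _ in range(n)]
    for a_slots in combinations(range(i + j), i):   # positions occupied by A
        word = identity(n)
        for pos in range(i + j):
            word = matmul(word, A if pos in a_slots else B)
        total = [[total[r][c] + word[r][c] for c in range(n)] for r in range(n)]
    count = comb(i + j, i)
    return [[entry / count for entry in row] for row in total]

# Two non-commuting 2x2 matrices (an arbitrary illustrative choice).
A = [[Fraction(0), Fraction(1)], [Fraction(0), Fraction(0)]]
B = [[Fraction(0), Fraction(0)], [Fraction(1), Fraction(0)]]

AB, BA = matmul(A, B), matmul(B, A)
sym = [[(AB[r][c] + BA[r][c]) / 2 for c in range(2)] for r in range(2)]
print(weyl_monomial(A, B, 1, 1) == sym)   # True: (AB)_W = (AB + BA)/2
```

Enumerating the positions of the {A} factors, rather than all permutations, visits each distinct word exactly once, which matches the normalisation by {\binom{i+j}{i}}.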

Remark 1 Strictly speaking, the use of the terminology {(A^i B^j)_W} here is an abuse of notation, because it suggests a functional relationship between {A^i B^j} and {(A^i B^j)_W} which need not be the case. In particular, if the monomials {A^i B^j} and {A^k B^l} are equal as operators, this does not necessarily imply that the Weyl-ordered monomials {(A^i B^j)_W} and {(A^k B^l)_W} are equal. One could fix this notation by working first with formal symbols {{\bf A}, {\bf B}} generating a free commutative algebra, and writing {({\bf A}^i {\bf B}^j)_W} in place of {(A^i B^j)_W}, so that the Weyl map {F \mapsto F_W} becomes a not-quite-homomorphism from the commutative algebra generated by {{\bf A}} and {{\bf B}} to the non-commutative algebra generated by {A} and {B}. We have chosen however to not be quite so formal, and allow some abuse of notation to simplify the exposition.

When {A} and {B} commute, of course, all these calculi coincide with each other; it is only in the non-commuting case that some distinctions between the calculi emerge. The Weyl calculus may seem complicated, but it has somewhat cleaner formulae with regard to products. For instance, the Weyl calculus works perfectly with respect to powers {(sA+tB+u)^n} of affine combinations {sA+tB+u} of {A,B}, where {s,t,u \in {\bf C}} are scalars:

\displaystyle  ((sA+tB+u)^n)_W = (sA+tB+u)^n. \ \ \ \ \ (3)

(By this equation, we mean that if {F} is the polynomial {F(a,b) := (sa+tb+u)^n}, then {F(A,B)_W} is the operator {F(A,B)_W = (sA+tB+u)^n}.) Thus for instance

\displaystyle  ((A+B)^2)_W = (A^2)_W + (2AB)_W + (B^2)_W

\displaystyle  = A^2 + AB + BA + B^2

\displaystyle  = (A+B)^2.

In contrast, the Kohn-Nirenberg calculus (or the anti-Kohn-Nirenberg calculus) distorts this ordering, for instance we have

\displaystyle  ((A+B)^2)_{KN} = A^2 + 2AB + B^2 \neq (A+B)^2.

Indeed, by comparing coefficients in (3) and using linearity we see that the identity (3) (for arbitrary {s,t,u,n}) in fact uniquely defines the Weyl calculus. One consequence of this is that the Weyl calculus is not only symmetric with respect to interchange of the underlying operators {A,B}, but in fact respects all affine changes of variable: if {A' = cA + dB+e, B' = fA + gB+h} are scalar affine combinations of {A,B}, {F': {\bf C} \times {\bf C} \rightarrow {\bf C}} is a polynomial of two variables, and {F} is the polynomial {F(a,b) := F'(ca+db+e,fa+gb+h)}, then we have that

\displaystyle  F'(A',B')_W = F(A,B)_W.

Indeed, from (3) we see that this identity holds when {F'(a,b) = (sa+tb+u)^n} is a power of an affine form (as {F} is then also such a power), and then by linearity it is true for all polynomials, since such powers span the space of polynomials.
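As a sanity check on (3), one can expand {(sa+tb+u)^n} by the multinomial theorem, quantise monomial by monomial, and compare with the honest matrix power. The sketch below does this with exact rational arithmetic; the matrices, the scalars, and all helper names are assumptions made purely for illustration.

```python
from fractions import Fraction
from itertools import combinations
from math import comb, factorial

def matmul(P, Q):
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(P, Q, scale=1):
    """Entrywise P + scale * Q."""
    return [[p + scale * q for p, q in zip(rp, rq)] for rp, rq in zip(P, Q)]

def power(P, n):
    out = [[Fraction(int(i == j)) for j in range(len(P))] for i in range(len(P))]
    for _ in range(n):
        out = matmul(out, P)
    return out

def weyl_monomial(A, B, i, j):
    """Average over all words with i copies of A and j copies of B."""
    total = [[Fraction(0)] * len(A) for _ in A]
    for a_slots in combinations(range(i + j), i):
        word = power(A, 0)
        for pos in range(i + j):
            word = matmul(word, A if pos in a_slots else B)
        total = add(total, word)
    return [[x / comb(i + j, i) for x in row] for row in total]

def kn_monomial(A, B, i, j):
    """Kohn-Nirenberg ordering: all copies of A to the left of B."""
    return matmul(power(A, i), power(B, j))

def quantise_affine_power(A, B, s, t, u, n, monomial):
    """Apply a calculus, specified on monomials a^i b^j, to (s a + t b + u)^n."""
    out = [[Fraction(0)] * len(A) for _ in A]
    for i in range(n + 1):
        for j in range(n + 1 - i):
            l = n - i - j
            multinomial = Fraction(factorial(n),
                                   factorial(i) * factorial(j) * factorial(l))
            out = add(out, monomial(A, B, i, j), multinomial * s**i * t**j * u**l)
    return out

def affine(A, B, s, t, u):
    return add(add([[s * x for x in row] for row in A], B, t), power(A, 0), u)

A = [[Fraction(1), Fraction(2)], [Fraction(0), Fraction(3)]]
B = [[Fraction(0), Fraction(1)], [Fraction(4), Fraction(1)]]
s, t, u = Fraction(2), Fraction(-1), Fraction(1, 2)
one, zero = Fraction(1), Fraction(0)

# Weyl ordering reproduces the true fourth power of sA + tB + u.
print(quantise_affine_power(A, B, s, t, u, 4, weyl_monomial)
      == power(affine(A, B, s, t, u), 4))                      # True
# Kohn-Nirenberg ordering already fails for (A + B)^2, since AB != BA here.
print(quantise_affine_power(A, B, one, one, zero, 2, kn_monomial)
      == power(affine(A, B, one, one, zero), 2))               # False
```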

One can also extend the Weyl calculus to formal (i.e. not necessarily convergent) infinite series

\displaystyle  F(a,b) := \sum_{i=0}^\infty \sum_{j=0}^\infty c_{i,j} a^i b^j

by declaring {F(A,B)_W} to be the formal series of operators

\displaystyle  F(A,B)_W := \sum_{i=0}^\infty \sum_{j=0}^\infty c_{i,j} (A^i B^j)_W.

In doing so, the identity (3) can be expressed in a very convenient form

\displaystyle  (\exp(sA+tB+u))_W = \exp(sA+tB+u) \ \ \ \ \ (4)

where we view the exponential function {\exp} here as the formal infinite series

\displaystyle  \exp(z) = \sum_{i=0}^\infty \frac{z^i}{i!}. \ \ \ \ \ (5)

Remark 2 The fact that Weyl calculus preserves the exponential map is the reason why we view this calculus as being analogous to normal coordinates in Riemannian geometry, as well as exponential coordinates (of the first kind) on Lie groups, discussed for instance in this previous blog post. In contrast, the Kohn-Nirenberg quantisation gives

\displaystyle  (\exp(sA+tB+u))_{KN} = \exp(sA) \exp(tB) \exp(u)

which is analogous to exponential coordinates of the second kind on Lie groups.

Now we study the extent to which the homomorphism property (1) holds in the Weyl calculus, i.e. we study the discrepancy between {(FG)(A,B)_W} and {F(A,B)_W G(A,B)_W}. When {A} and {B} commute, it is easy to see that (1) holds exactly. In general, these two expressions can be quite different; but when {A} and {B} almost commute, so that the commutator {[A,B] = AB-BA} is non-zero but small, we expect (1) to be approximately true. Motivated by quantum-mechanical examples, we will study a model case when {A} and {B} obey the Heisenberg commutator relationship

\displaystyle  [A,B] = i\hbar \ \ \ \ \ (6)

where {\hbar} is a small positive real number. (In some literature, the sign convention here is reversed.) In this case, we see that any two ways to multiply {j} copies of {A} and {k} copies of {B} together will differ by polynomials of degree strictly less than {j+k}. Among other things, this shows that any polynomial of {A} and {B} can be rewritten into a Weyl-ordered form, by first ordering the top degree terms (at the cost of introducing some messy lower degree terms) and then recursively working on the lower order terms. The same procedure also implies that this Weyl-ordered form is unique. For instance,

\displaystyle  AB = (AB)_W +\frac{1}{2}[A,B] = (AB + \frac{1}{2} i\hbar)_W.

This already implies that for any two polynomials {F,G: {\bf C}^2 \rightarrow {\bf C}}, there must be a unique “product” polynomial {F \ast_\hbar G: {\bf C}^2 \rightarrow {\bf C}} with the property that

\displaystyle  F(A,B)_W G(A,B)_W = (F \ast_\hbar G)(A,B)_W. \ \ \ \ \ (7)

In the classical limit {\hbar=0}, this operation is simply pointwise product:

\displaystyle  F \ast_0 G = FG.
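The rewriting of {AB} into Weyl-ordered form can be checked in a concrete model. The sketch below assumes the realisation {A = X}, {B = D} used later in these notes (multiplication by {x} and {\frac{\hbar}{i} \frac{d}{dx}}), acting on polynomials stored as coefficient lists; the value of `HBAR`, the test polynomial, and the helper names are illustrative assumptions.

```python
HBAR = 0.5  # illustrative value of hbar (an assumption)

def X(f):
    """Multiply by x: shift every coefficient up by one degree."""
    return [0j] + list(f)

def D(f):
    """Apply (hbar/i) d/dx to a polynomial given by its coefficient list."""
    return [-1j * HBAR * k * f[k] for k in range(1, len(f))] or [0j]

def lin(*terms):
    """Linear combination sum(c * g) over pairs (c, g) of scalars and polynomials."""
    n = max(len(g) for _, g in terms)
    return [sum(c * (g[k] if k < len(g) else 0) for c, g in terms)
            for k in range(n)]

def close(f, g, tol=1e-12):
    return all(abs(c) <= tol for c in lin((1, f), (-1, g)))

f = [1, -2, 0, 3]                      # test polynomial 1 - 2x + 3x^3

XD, DX = X(D(f)), D(X(f))
weyl_XD = lin((0.5, XD), (0.5, DX))    # (XD)_W applied to f

print(close(lin((1, XD), (-1, DX)), lin((1j * HBAR, f))))       # [X, D] = i hbar
print(close(XD, lin((1, weyl_XD), (0.5j * HBAR, f))))           # XD = (XD)_W + i hbar/2
print(close(DX, lin((1, weyl_XD), (-0.5j * HBAR, f))))          # DX = (XD)_W - i hbar/2
```

All three comparisons print `True`; the lower order term has opposite signs for the two orderings, so it cancels in the symmetric average.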

Now we work out what the product operation {\ast_\hbar} (known as the Moyal product) is for non-zero {\hbar}. To avoid some rather messy combinatorics, it turns out to be cleanest to proceed using the formal exponential function (5) applied to various combinations {sA+tB} of {A} and {B}. (This is an instance of the method of generating functions in action.)

We first observe from (6) that we have the general commutation relationship

\displaystyle  [sA+tB, s'A+t'B] = i\hbar \omega( (s,t), (s',t') ) \ \ \ \ \ (8)

where {\omega} is the anti-symmetric form

\displaystyle  \omega( (s,t), (s',t') ) := st' - s't.

This is already the first hint of the correspondence between quantum mechanics (whose dynamics is based on the commutator {\frac{i}{\hbar} [,]}) and classical mechanics (whose dynamics is based on the Poisson bracket {\{f,g\} = \omega( \nabla f, \nabla g)}). From this and the Baker-Campbell-Hausdorff formula, we conclude (formally, at least) that

\displaystyle  \exp(sA+tB) \exp(s'A+t'B) \ \ \ \ \ (9)

\displaystyle  = \exp(sA+tB+s'A+t'B) \exp( \frac{1}{2} i\hbar \omega((s,t),(s',t')) ).

One can make this identity rigorous (at the level of formal power series) as follows. We first observe that the Hadamard lemma

\displaystyle  \exp(X) Y \exp(-X) = \exp(\hbox{ad}(X)) Y

holds at the level of formal power series, where {\hbox{ad}(X)} is the commutator operation {\hbox{ad}(X)(Y) := [X,Y]}. This can be seen by first establishing the formal differentiation identity

\displaystyle  \frac{d}{dt} \exp(tX) Y \exp(-tX) = \hbox{ad}(X)(\exp(tX) Y \exp(-tX))

(which comes from the formal identity {\frac{d}{dt} \exp(tX) = X \exp(tX) = \exp(tX) X} and the product rule) and then using this to solve for {\exp(tX) Y \exp(-tX)} as a formal power series to conclude that

\displaystyle  \exp(tX) Y \exp(-tX) = \exp(t\hbox{ad}(X)) Y,

and then setting {t=1}. Using this lemma together with (8), we see in particular that

\displaystyle  \exp(sA+tB) (s'A+t'B) \exp(-sA-tB) = s'A+t'B + i\hbar \omega( (s,t), (s',t') ). \ \ \ \ \ (10)

Now consider the expression

\displaystyle  C(\lambda) := \exp(\lambda (sA+tB)) \exp(\lambda(s'A+t'B)) \exp( - \frac{\lambda^2}{2} i\hbar \omega((s,t),(s',t')) )

as a formal power series in {\lambda}. Then {C(0)} is the identity, and the formal derivative {C'(\lambda)} can be expressed as

\displaystyle  C'(\lambda) = (sA+tB) \exp(\lambda (sA+tB)) \exp(\lambda(s'A+t'B)) \times

\displaystyle  \times \exp( - \frac{\lambda^2}{2} i\hbar \omega((s,t),(s',t')) )

\displaystyle  + \exp(\lambda (sA+tB)) (s'A+t'B) \exp(\lambda(s'A+t'B))\times

\displaystyle \times \exp( - \frac{\lambda^2}{2} i\hbar \omega((s,t),(s',t')) )

\displaystyle  - \exp(\lambda (sA+tB)) \exp(\lambda(s'A+t'B)) (\lambda i\hbar \omega((s,t),(s',t'))) \times

\displaystyle  \times \exp( - \frac{\lambda^2}{2} i\hbar \omega((s,t),(s',t')) )

which after using (10) simplifies to

\displaystyle  C'(\lambda) = (sA + tB + s'A + t'B) C(\lambda)

which on solving the formal series leads to

\displaystyle  C(\lambda) = \exp( \lambda(sA + tB + s'A + t'B) )

which gives (9) after setting {\lambda=1}.
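The identity (9) can be tested exactly in a finite-dimensional stand-in for the Heisenberg relation. No pair of finite matrices can satisfy (6) with a non-zero scalar on the right-hand side (take traces), so the sketch below instead assumes the {3 \times 3} strictly upper triangular model, in which {Z := [A,B]} is a central nilpotent matrix playing the role of {i\hbar}. All names and numerical values are illustrative assumptions.

```python
from fractions import Fraction

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def add(P, Q, scale=1):
    """Entrywise P + scale * Q."""
    return [[p + scale * q for p, q in zip(rp, rq)] for rp, rq in zip(P, Q)]

I3 = [[Fraction(int(i == j)) for j in range(3)] for i in range(3)]
ZERO = [[Fraction(0)] * 3 for _ in range(3)]

def unit(r, c):
    """Matrix unit with a single 1 in row r, column c."""
    return [[Fraction(int((i, j) == (r, c))) for j in range(3)] for i in range(3)]

A, B = unit(0, 1), unit(1, 2)
Z = add(matmul(A, B), matmul(B, A), -1)   # [A, B]; commutes with A and B

def exp_nilpotent(N):
    """exp(N) = I + N + N^2/2, exact since N^3 = 0 for these matrices."""
    return add(add(I3, N), matmul(N, N), Fraction(1, 2))

def combo(s, t):
    return add(add(ZERO, A, s), B, t)     # s*A + t*B

s, t, s2, t2 = Fraction(2), Fraction(3), Fraction(5), Fraction(7)
omega = s * t2 - s2 * t                   # the anti-symmetric form from (8)

lhs = matmul(exp_nilpotent(combo(s, t)), exp_nilpotent(combo(s2, t2)))
rhs = matmul(exp_nilpotent(combo(s + s2, t + t2)),
             exp_nilpotent(add(ZERO, Z, omega / 2)))

print(lhs == rhs)                                     # True: analogue of (9)
print(lhs == exp_nilpotent(combo(s + s2, t + t2)))    # False without the correction
```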

Combining (9), (5) and (7) (as well as the uniqueness of the Weyl ordering), we conclude that

\displaystyle  \exp(sa+tb) \ast_{\hbar} \exp(s'a+t'b)

\displaystyle  = \exp(sa+tb+s'a+t'b) \exp( \frac{1}{2} i\hbar \omega((s,t),(s',t')) )

where both sides are viewed as formal power series in indeterminates {a,b}. Comparing coefficients, we see that

\displaystyle  (sa+tb)^n \ast_{\hbar} (s'a+t'b)^m

\displaystyle  = \sum_{j=0}^\infty j! \binom{n}{j} \binom{m}{j} (sa+tb)^{n-j} (s'a+t'b)^{m-j} ( \frac{i\hbar}{2} \omega((s,t),(s',t')) )^j

with the convention that {\binom{n}{j}} vanishes for {j>n}, and similarly for {\binom{m}{j}}. We write

\displaystyle  \omega((s,t),(s',t'))^j = \omega^{\otimes j}( (s,t)^{\otimes j}, (s',t')^{\otimes j} ),

where {\omega^{\otimes j}: ({\bf C}^2)^{\otimes j} \times ({\bf C}^2)^{\otimes j} \rightarrow {\bf C}} is the tensor product of {j} copies of {\omega: {\bf C}^2 \times {\bf C}^2 \rightarrow {\bf C}}. Since

\displaystyle  \nabla^j (sa+tb)^n = j! \binom{n}{j} (s,t)^{\otimes j} (sa+tb)^{n-j}

and similarly

\displaystyle  \nabla^j (s'a+t'b)^m = j! \binom{m}{j} (s',t')^{\otimes j} (s'a+t'b)^{m-j},

where {\nabla} denotes the gradient with respect to the {a,b} variables, we conclude that

\displaystyle  (sa+tb)^n \ast_{\hbar} (s'a+t'b)^m

\displaystyle = \sum_{j=0}^\infty \frac{1}{j!} (\frac{i\hbar}{2})^j \omega^{\otimes j}( \nabla^j (sa+tb)^n, \nabla^j (s'a+t'b)^m ).

This formula is valid for all monomials {(sa+tb)^n, (s'a+t'b)^m}, and so by bilinearity we conclude the explicit formula

\displaystyle  F \ast_\hbar G = \sum_{j=0}^\infty \frac{1}{j!} (\frac{i\hbar}{2})^j \omega^{\otimes j}(\nabla^j F, \nabla^j G) \ \ \ \ \ (11)

for the Moyal product, thus

\displaystyle  F \ast_\hbar G = F G + \frac{i\hbar}{2} \omega(\nabla F, \nabla G) - \frac{\hbar^2}{8} \omega^{\otimes 2}(\nabla^2 F, \nabla^2 G)

\displaystyle  - \frac{i\hbar^3}{48} \omega^{\otimes 3}(\nabla^3 F, \nabla^3 G) + \ldots.

The expression {\omega(\nabla F, \nabla G)} is also known as the Poisson bracket of {F} and {G}, and is denoted {\{F,G\}}, thus

\displaystyle  F \ast_\hbar G = FG + \frac{i\hbar}{2} \{F,G\} + \ldots.
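For polynomial symbols the sum (11) is finite and can be evaluated mechanically. The following sketch stores a polynomial in {a,b} as a dictionary from exponent pairs to coefficients, and expands {\omega^{\otimes j}(\nabla^j F, \nabla^j G)} by the binomial theorem as {\sum_{k=0}^j \binom{j}{k} (-1)^k (\partial_a^{j-k} \partial_b^k F)(\partial_a^k \partial_b^{j-k} G)}. The data representation and the function names are assumptions made for illustration only.

```python
from math import comb, factorial

def derivative(F, da, db):
    """Apply d^da/da^da d^db/db^db to a polynomial {(i, j): coefficient of a^i b^j}."""
    out = {}
    for (i, j), c in F.items():
        if i >= da and j >= db:
            falling = (factorial(i) // factorial(i - da)) * (factorial(j) // factorial(j - db))
            key = (i - da, j - db)
            out[key] = out.get(key, 0) + falling * c
    return out

def multiply(F, G):
    """Ordinary pointwise product of two polynomials."""
    out = {}
    for (i, j), c in F.items():
        for (k, l), d in G.items():
            out[(i + k, j + l)] = out.get((i + k, j + l), 0) + c * d
    return out

def moyal(F, G, hbar):
    """Moyal product via (11); F and G must be non-empty dictionaries."""
    top = min(max(i + j for i, j in F), max(i + j for i, j in G))
    out = {}
    for order in range(top + 1):          # higher-order terms vanish identically
        weight = (1j * hbar / 2) ** order / factorial(order)
        for k in range(order + 1):
            term = multiply(derivative(F, order - k, k), derivative(G, k, order - k))
            for key, c in term.items():
                out[key] = out.get(key, 0) + weight * comb(order, k) * (-1) ** k * c
    return {key: c for key, c in out.items() if abs(c) > 1e-12}

a, b = {(1, 0): 1}, {(0, 1): 1}
hbar = 0.5

print(moyal(a, b, hbar) == {(1, 1): 1, (0, 0): 0.25j})   # a * b = ab + i hbar/2
print(moyal(b, a, hbar) == {(1, 1): 1, (0, 0): -0.25j})  # b * a = ab - i hbar/2
print(moyal(moyal(a, b, hbar), a, hbar) == moyal(a, moyal(b, a, hbar), hbar))
```

All three lines print `True`; the last one is a small instance of the associativity that the Moyal product inherits from operator composition via (7).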

Note that {\omega^{\otimes j}} is symmetric for even {j} and anti-symmetric for odd {j}. Subtracting, we conclude that

\displaystyle  F \ast_\hbar G - G \ast_\hbar F = \sum_{j=0}^\infty \frac{2}{(2j+1)!} (\frac{i\hbar}{2})^{2j+1} \omega^{\otimes 2j+1}(\nabla^{2j+1} F, \nabla^{2j+1} G),

thus

\displaystyle  \frac{1}{i\hbar} (F \ast_\hbar G - G \ast_\hbar F) = \{ F,G\} - \frac{\hbar^2}{24} \omega^{\otimes 3}(\nabla^3 F, \nabla^3 G) + \ldots. \ \ \ \ \ (12)

Thus we see that the normalised commutator {\frac{1}{i\hbar} (F \ast_\hbar G - G \ast_\hbar F)} of the Moyal product converges to the Poisson bracket {\{F,G\}} in the semiclassical limit {\hbar \rightarrow 0}; this is one manifestation of the correspondence principle.

The above discussion was for two operators {A,B} obeying the Heisenberg commutation relation (6), but one can check that a similar calculus can also be developed for any tuple {A_1,\ldots,A_k} of operators, and if they obey a Heisenberg-like commutation law

\displaystyle  [ \sum_{i=1}^k s_i A_i, \sum_{j=1}^k t_j A_j] = i\hbar \omega( (s_1,\ldots,s_k), (t_1,\ldots,t_k) )

for some anti-symmetric form {\omega:{\bf C}^k\times {\bf C}^k \rightarrow {\bf C}}, then the Moyal product formula (11) holds; the arguments are basically identical to the two variable case (except for some notational complications) and are left as an exercise to the reader.

Once one has set up a calculus for complex polynomials {P( a_1,\ldots,a_k)} of some non-self-adjoint operators {A_1,\ldots,A_k}, one can then extend it to polynomials {P(a_1,\ldots,a_k,\overline{a_1},\ldots,\overline{a_k})} of the original arguments {a_1,\ldots,a_k} and their conjugates (or equivalently, to polynomials of the real and imaginary parts of the {a_1,\ldots,a_k}) by using the calculus for {A_1,\ldots,A_k,A_1^*,\ldots,A_k^*}. This allows one to define operators {P(A_1,\ldots,A_k)_W} when {P: ({\bf R}^2)^k \rightarrow {\bf C}} is a polynomial of the real and imaginary parts of {A_1,\ldots,A_k}, basically by substituting {\frac{A_j+A_j^*}{2}} and {\frac{A_j-A_j^*}{2i}} for {\hbox{Re}(A_j)} and {\hbox{Im}(A_j)} respectively. One advantage of the Weyl calculus, not shared in general by the Kohn-Nirenberg calculus, is that the identity (2) holds exactly; this is basically a special case of the affine invariance properties of the Weyl calculus mentioned earlier. In a similar vein, the Weyl calculus ensures that

\displaystyle F(A_1,\ldots,A_k)^*_W = F^*(A_1^*,\ldots,A_k^*)_W

for any complex polynomial {F}, where {F^*(z) := \overline{F(\overline{z})}} is the complex polynomial whose coefficients are the complex conjugates of the coefficients of {F}. In particular, if {A_1,\ldots,A_k} are self-adjoint and {F} has real coefficients, then {F(A_1,\ldots,A_k)_W} is again self-adjoint.

— 2. Weyl quantisation of smooth symbols —

The above algebraic formalism allowed one to define the Weyl quantisation {F(A_1,\ldots,A_k)_W} of any {k}-tuple of operators (preserving some dense space of test functions) as long as {F} was a polynomial; one could also handle the case when {F} was a formal power series, provided that one was willing to make {F(A_1,\ldots,A_k)_W} a formal power series also. But of course we would like to expand the calculus beyond the spaces of polynomials and formal power series.

One obvious way is to work with convergent power series instead of formal power series, but this approach, by definition, only applies for real analytic functions {F}. Also, being real analytic (which, by the root test, can be viewed as an exponential decay hypothesis on the coefficients {c_{i,j}}) may not be enough in some cases to ensure convergence of the formal power series {F(A_1,\ldots,A_k)_W}; one may need faster decay on the {c_{i,j}} (e.g. decay like {1/(i+j)!} or something similar) before one can justify convergence in a suitable sense.

To get beyond the analytic category, it turns out that one can use Fourier-analytic methods, using Fourier expansions as a substitute for power series expansions. The key point, of course, is that Fourier analysis works well for spaces of functions that are significantly rougher than real analytic.

To illustrate the method, let us again work with just two operators {A} and {B}. We will assume that {A} and {B} are (essentially) self-adjoint, which by spectral theory suggests that one should be able to define {F(A,B)_W} for (sufficiently nice) functions {F: {\bf R} \times {\bf R} \rightarrow {\bf C}} of two real variables. (The distinction between real and complex functions was not apparent at the polynomial level, since any polynomial on a real vector space can be automatically complexified uniquely to a polynomial on the associated complex vector space.) The starting point is the Fourier inversion formula

\displaystyle  F(a,b) = \int_{\bf R} \int_{\bf R} \hat F(s,t) e^{i (a s + bt)}\ ds dt

where {\hat F(s,t)} is the Fourier transform, which in our choice of normalisations becomes

\displaystyle  \hat F(s,t) = \frac{1}{(2\pi)^2} \int_{\bf R} \int_{\bf R} F(a,b) e^{-i(as+bt)}\ da db.

From the inversion formula, it is natural to expect that the Weyl quantisation {F(A,B)_W} of {F(A,B)} should be given by the formula

\displaystyle  F(A,B)_W = \int_{\bf R} \int_{\bf R} \hat F(s,t) (e^{i (A s + Bt)})_W\ ds dt

which in view of (4) leads us to

\displaystyle  F(A,B)_W := \int_{\bf R} \int_{\bf R} \hat F(s,t) e^{i (A s + Bt)}\ ds dt \ \ \ \ \ (13)

which we will take as a definition of {F(A,B)_W} for general functions {F: {\bf R} \times {\bf R} \rightarrow {\bf C}}.

Of course, to make this definition rigorous, one has to specify some regularity hypotheses on {F}, and check that the integral actually converges in some suitable sense; also, it would be reassuring to verify that the definition is compatible with the polynomial calculus given earlier. At present, {A} and {B} are completely abstract operators, which makes this task quite difficult; so we will now work with a much more concrete situation, namely that of the one-dimensional position operator {X} and momentum operator {D}, defined on test functions in {L^2({\bf R})} by the formulae

\displaystyle  X f(x) := xf(x), \quad Df(x) := \frac{\hbar}{i} \frac{d}{dx} f(x).

Note that this obeys the Heisenberg relation

\displaystyle  [X,D] = i \hbar

and thus by (9) one has (formally, at least)

\displaystyle  \exp(isX + itD) = \exp( i\hbar st/2) \exp(isX) \exp(itD)

for any {s,t \in {\bf R}}. On the other hand, we have

\displaystyle  \exp(isX) f(x) = \exp(isx) f(x)

while from solving the transport equation {\frac{\partial}{\partial t} f(t,x) = iD f(t,x)} we see that

\displaystyle  \exp(itD) f(x) = f(x+\hbar t)

for suitable test functions {f}, and thus

\displaystyle \exp(isX + itD) f(x) = \exp( i\hbar st/2) \exp( isx ) f(x+\hbar t).
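One way to gain confidence in this formula without invoking (9) is to replace {(s,t)} by {(\lambda s, \lambda t)} and check numerically that the right-hand side solves the evolution equation {\partial_\lambda u = i(sX+tD) u} with initial data {f}. The sketch below does this with central finite differences; the test function, the parameter values, the step size and the function names are all illustrative assumptions.

```python
import cmath
import math

HBAR, s, t = 0.3, 0.7, -1.1      # illustrative parameters (assumptions)

def f(x):
    return math.exp(-x * x)      # a Schwartz test function

def u(lam, x):
    """The claimed formula with (s, t) scaled to (lam*s, lam*t)."""
    return (cmath.exp(1j * HBAR * s * t * lam ** 2 / 2)
            * cmath.exp(1j * s * lam * x) * f(x + HBAR * t * lam))

def residual(lam, x, h=1e-5):
    """|d/dlam u - i(sX + tD)u| with D = (hbar/i) d/dx, via central differences."""
    du_dlam = (u(lam + h, x) - u(lam - h, x)) / (2 * h)
    du_dx = (u(lam, x + h) - u(lam, x - h)) / (2 * h)
    generator = 1j * s * x * u(lam, x) + t * HBAR * du_dx   # i*t*D = t*hbar*d/dx
    return abs(du_dlam - generator)

points = [(lam, x) for lam in (0.0, 0.5, 1.0) for x in (-1.0, 0.4, 2.0)]
print(max(residual(lam, x) for lam, x in points) < 1e-6)    # True
print(u(0.0, 0.4) == f(0.4))                                # True: initial condition
```

This is of course only a consistency check at a few sample points, not a proof; the residual is limited by the finite difference error rather than vanishing exactly.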

Again working formally, we conclude that

\displaystyle  F(X,D)_W f(x) = \int_{\bf R} \int_{\bf R} \hat F(s,t) \exp( i\hbar st/2) \exp( isx ) f(x+\hbar t)\ ds dt

which we expand as

\displaystyle \frac{1}{(2\pi)^2} \int_{\bf R} \int_{\bf R} \int_{\bf R} \int_{\bf R} F(\tilde x,p) \exp(-is\tilde x-itp) \exp( i\hbar st/2) \exp( isx )

\displaystyle  f(x+\hbar t)\ ds dt d\tilde x dp.

Making the change of variables {t = \frac{y-x}{\hbar}}, we rewrite this as

\displaystyle \frac{1}{(2\pi)^2 \hbar} \int_{\bf R} \int_{\bf R} \int_{\bf R} \int_{\bf R} F(\tilde x,p) \exp(-is\tilde x-i(y-x)p/\hbar) \exp( is(x+y)/2)

\displaystyle  f(y)\ ds dy d\tilde x dp.

Next, from the distributional Fourier inversion formula

\displaystyle  \frac{1}{2\pi} \int_{\bf R} \exp(isu)\ ds = \delta(u)

where {\delta} is the Dirac delta, we can rewrite this as

\displaystyle \frac{1}{2\pi \hbar} \int_{\bf R} \int_{\bf R} \int_{\bf R} F(\tilde x,p) \exp(-i(y-x)p/\hbar) \delta( \tilde x - (x+y)/2) f(y)\ dy d\tilde x dp

which on performing the {\tilde x} integral simplifies to

\displaystyle \frac{1}{2\pi \hbar} \int_{\bf R} \int_{\bf R} F(\frac{x+y}{2},p) \exp(-i(y-x)p/\hbar) f(y)\ dy dp

or after the change of variables {p = 2\pi \hbar \xi},

\displaystyle  F(X,D)_W f(x) = \int_{\bf R} (\int_{\bf R} F(\frac{x+y}{2}, 2\pi \hbar \xi) e^{-2\pi i (y-x) \xi} f(y)\ dy)d\xi. \ \ \ \ \ (14)

We remark that the Kohn-Nirenberg calculus {F(X,D)_{KN}} comes out almost identically, except that the argument {\frac{x+y}{2}} is replaced with {x}. Similarly for the anti-Kohn-Nirenberg calculus (in which {\frac{x+y}{2}} is replaced by {y}). The Weyl calculus treats the input variable {y} and the output variable {x} on equal footing, whereas the Kohn-Nirenberg calculus prefers one to the other. In particular, we have the convenient formal identity

\displaystyle  F(X,D)_W^* = \overline{F}(X,D)_W \ \ \ \ \ (15)

so that {F(X,D)_W} is formally self-adjoint when {F} is real-valued.
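To see the difference between the midpoint {\frac{x+y}{2}} and the endpoint {x} numerically, one can discretise (14) by a double Riemann sum. The sketch below assumes {\hbar = 1}, the symbol {F(x,p) = x e^{-p^2/2}} and the test function {f(y) = e^{-y^2/2}}; for this particular choice a hand computation (not taken from the discussion above) gives {\frac{3x}{4} \cdot \frac{1}{\sqrt{2}} e^{-x^2/4}} for the Weyl quantisation and {x \cdot \frac{1}{\sqrt{2}} e^{-x^2/4}} for the Kohn-Nirenberg one. The grid sizes and function names are likewise illustrative.

```python
import cmath
import math

HBAR = 1.0                       # illustrative value (an assumption)

def F(x, p):
    return x * math.exp(-p * p / 2)     # illustrative symbol obeying (16)

def f(y):
    return math.exp(-y * y / 2)         # Schwartz test function

def quantise(x, first_argument):
    """Riemann-sum version of (14); first_argument(x, y) is what gets fed to F."""
    d_xi, d_y = 0.02, 0.05
    total = 0j
    for m in range(-150, 151):          # xi ranges over [-3, 3]
        xi = m * d_xi
        for n in range(-160, 161):      # y ranges over [-8, 8]
            y = n * d_y
            total += (F(first_argument(x, y), 2 * math.pi * HBAR * xi)
                      * cmath.exp(-2j * math.pi * (y - x) * xi) * f(y))
    return total * d_xi * d_y

x0 = 1.2
gauss = math.exp(-x0 * x0 / 4) / math.sqrt(2)

weyl = quantise(x0, lambda x, y: (x + y) / 2)   # Weyl: midpoint
kn = quantise(x0, lambda x, y: x)               # Kohn-Nirenberg: output variable

print(abs(weyl - 0.75 * x0 * gauss) < 1e-6)     # True
print(abs(kn - x0 * gauss) < 1e-6)              # True
```

The truncation to a bounded grid is harmless here only because both the symbol (in {p}) and the test function decay like gaussians; for symbols that merely obey (16) the {\xi} integral is oscillatory and a naive Riemann sum would not be a sensible way to evaluate it.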

We will take the formula (14) as our definition of the Weyl quantisation {F(X,D)_W f(x)}, at least when {f} is a Schwartz function and {F} obeys the weak regularity bounds

\displaystyle  |\nabla_x^i \nabla_p^j F(x,p)| \leq C_{i,j} (1+|x|+|p|)^C \ \ \ \ \ (16)

for all {i,j \geq 0}, all {x,p \in {\bf R}}, some constant {C>0}, and some constants {C_{i,j}>0} depending on {i,j}. (This is weaker than the symbol bounds that usually arise in the theory of pseudodifferential operators, but will suffice for the discussion here.)

Exercise 1 Show that for {F} and {f} as above, the integrand {F(\frac{x+y}{2}, 2\pi \hbar \xi) e^{-2\pi i (y-x) \xi} f(y)} is absolutely integrable in {y}, and that the integral {\int_{\bf R} F(\frac{x+y}{2}, 2\pi \hbar \xi) e^{-2\pi i (y-x) \xi} f(y)\ dy} is absolutely integrable in {\xi}, making {F(X,D)_W f(x)} well-defined. Show in fact that {F(X,D)_W f} is a Schwartz function.

Exercise 2 Show that the above definition agrees with the previous definitions of {\exp(sX+tD)_W} and {(X^i D^j)_W}. (One approach is by direct computation; another is to try to make the above formal calculations fully rigorous.)

Exercise 3 (Translation invariance) If {F} obeys the weak regularity bound (16) and {F_{x_0,p_0}(x,p) := F(x-x_0,p-p_0)} for some {x_0,p_0 \in {\bf R}}, show that

\displaystyle  F_{x_0,p_0}(X,D)_W = \mu_{p_0} \tau_{x_0} F(X,D)_W \tau_{x_0}^{-1} \mu_{p_0}^{-1}

where {\tau_{x_0} = \exp(-ix_0D/\hbar)} is the translation operator

\displaystyle  \tau_{x_0} f(x) := f(x-x_0)

and {\mu_{p_0} = \exp(ip_0X/\hbar)} is the modulation operator

\displaystyle  \mu_{p_0} f(x) := e^{i p_0x /\hbar} f(x).

One can in fact obtain similar symmetries for the entire Weil representation of the (affine) metaplectic group, but we will not do so here.

One can get some intuition as to what the operators {F(X,D)_W} do by testing them on gaussian wave packets

\displaystyle  g_{x_0,p_0}(x) := e^{-(x-x_0)^2/\hbar} e^{i p_0 x/\hbar}.

It turns out that the Taylor series of {F} around {(x_0,p_0)} is a good approximation for the action of {F(X,D)_W} on such a packet:

Lemma 1 For any {k \geq 0}, one has

\displaystyle  F(X,D)_W g_{x_0,p_0} = \sum_{i+j \leq k} \frac{\partial_x^i \partial_p^j F(x_0,p_0)}{i!j!} ( (X-x_0)^i (D-p_0)^j )_W g_{x_0,p_0} \ \ \ \ \ (17)

\displaystyle  + O_{k,L^2}(\hbar^{k-O(1)})

as {\hbar \rightarrow 0}, where {O_{k,L^2}(\hbar^{k-O(1)})} denotes a quantity with {L^2} norm of {O_k(\hbar^{k-O(1)})}.

Thus we have the asymptotic series

\displaystyle  F(X,D)_W g_{x_0,p_0} = F(x_0,p_0) g_{x_0,p_0} + (\partial_x F(x_0,p_0)) (X-x_0) g_{x_0,p_0}

\displaystyle + (\partial_p F(x_0,p_0)) (D-p_0) g_{x_0,p_0}

\displaystyle  + \frac{1}{2} (\partial_{xx} F(x_0,p_0)) (X-x_0)^2 g_{x_0,p_0}

\displaystyle  + \frac{1}{2} (\partial_{pp} F(x_0,p_0)) (D-p_0)^2 g_{x_0,p_0}

\displaystyle  + (\partial_{xp} F(x_0,p_0)) ((X-x_0)(D-p_0))_W g_{x_0,p_0} + \ldots

with each term of order {k} being of {L^2} norm {O_k(\hbar^{k-O(1)})}. (Note that this does not imply that the series is convergent in {L^2}, because we have no control of the dependence on {k} of the implied constant in the {O_k()} notation.)

Proof: (Sketch) By using the symmetries in Exercise 3, we may normalise {x_0=p_0=0}. If {F} is a polynomial of degree at most {k}, then the two sides of (17) agree exactly, with no need for the error term {O_{k,L^2}(\hbar^{k-O(1)})}. By subtracting off the order {k} Taylor expansion, we thus see that it suffices to show that

\displaystyle  F(X,D)_W g_{0,0} = O_{k,L^2}( \hbar^{k-O(1)} )

whenever {F} vanishes to order at least {k} at the origin. But this can be verified from (14) and a routine stationary phase computation. \Box

The same computation allows one to replace the {L^2} norm by any other of the Fréchet norms in the Schwartz space, as long as {x_0, p_0} are assumed to be of polynomial size in {\hbar}, i.e. {x_0, p_0 = O(\hbar^{-O(1)})}.

From (14), we see that any quantisation {F(X,D)_W} is formally an integral operator

\displaystyle  Tf(x) = \int_{\bf R} K(x,y) f(y)\ dy \ \ \ \ \ (18)

with distributional kernel

\displaystyle  K(x,y) := \int_{\bf R} F(\frac{x+y}{2}, 2\pi \hbar \xi) e^{-2\pi i (y-x) \xi}\ d\xi.

Applying the Fourier inversion formula, we thus see (formally, at least) that any integral operator (18) is also a quantisation {F(X,D)_W} where {F} is uniquely determined by the formula

\displaystyle  F(x,2\pi \hbar \xi) := \int_{\bf R} K( x-a/2, x+a/2 ) e^{2\pi i a \xi}\ da. \ \ \ \ \ (19)

In particular, given two quantisations {F(X,D)_W} and {G(X,D)_W}, their composition {F(X,D)_W G(X,D)_W} is formally an integral operator with distributional kernel

\displaystyle  K(x,y) := \int_{\bf R} \int_{\bf R} \int_{\bf R} F(\frac{x+z}{2}, 2\pi \hbar \xi) G(\frac{z+y}{2}, 2\pi \hbar \eta) e^{-2\pi i (z-x) \xi} e^{-2\pi i (y-z)\eta}

\displaystyle  \ d\xi d\eta dz

and hence is formally a quantisation {F \ast_\hbar G(X,D)_W}, where

\displaystyle  F \ast_\hbar G( x_0, 2\pi \hbar \xi_0) := \int_{\bf R} \int_{\bf R} \int_{\bf R} \int_{\bf R} F(\frac{x_0+z-a/2}{2}, 2\pi \hbar \xi) \times

\displaystyle  \times G(\frac{z+x_0+a/2}{2}, 2\pi \hbar \eta) e^{-2\pi i (z-x_0+a/2) \xi} e^{-2\pi i (x_0+a/2-z)\eta} e^{2\pi i a \xi_0}\ d\xi d\eta dz da.

Thus, for instance,

\displaystyle  F \ast_\hbar G( 0, 0) = \int_{\bf R} \int_{\bf R} \int_{\bf R} \int_{\bf R} F(\frac{z-a/2}{2}, 2\pi \hbar \xi) \times

\displaystyle  \times G(\frac{z+a/2}{2}, 2\pi \hbar \eta) e^{-2\pi i (z+a/2) \xi} e^{-2\pi i (a/2-z)\eta}\ d\xi d\eta da dz.

This integral will, in general, not be absolutely integrable. However, the phase {(\xi,\eta,a,z) \mapsto (z+a/2)\xi + (a/2-z)\eta} is only stationary at {(\xi,\eta,a,z) = (0,0,0,0)}. By using a suitable smooth dyadic decomposition and many integrations by parts, we can then show that this integral can be given a convergent interpretation if {F,G} obey the weak regularity bound (16). The same argument applies to other evaluations {F \ast_\hbar G(x_0,2\pi \hbar \xi_0)}, as well as to higher derivatives of {F \ast_\hbar G} evaluated at arbitrary points {(x_0,2\pi\hbar\xi_0)}. Indeed, with a certain amount of technical effort (which we omit here) we see that {F \ast_\hbar G} is well-defined and obeys (16), and we can rigorously establish the composition law

\displaystyle  (F \ast_\hbar G)(X,D)_W = F(X,D)_W G(X,D)_W. \ \ \ \ \ (20)

If {F} and {G} are polynomials, we see (using the fact that any operator has a unique symbol {F}) that the product {\ast_\hbar} given here coincides exactly with the Moyal product (11). As a consequence, by testing on gaussian wave packets using Lemma 1, one can show that this composition law {\ast_\hbar} has an asymptotic series given by the Moyal product, thus for any {k}, one has

\displaystyle  F \ast_\hbar G = \sum_{j=0}^k \frac{1}{j!} (\frac{i\hbar}{2})^j \omega^{\otimes j}(\nabla^j F, \nabla^j G) + \hbar^{k-O(1)} E_{k,\hbar}

where the error {E_{k,\hbar}} obeys the bound (16) (with the constant {C} independent of {k} and {\hbar}, and the constant {C_{i,j}} depending on {k} but uniform in {\hbar}). We will not give full details here, but roughly speaking, to bound {E_{k,\hbar}} to high order at some point {(x_0, 2\pi \hbar \xi_0)}, one can subtract off a Taylor polynomial from both {F} and {G} to reduce to the case when at least one of {F,G} vanishes to high order at {(x_0, 2\pi \hbar \xi_0)}. At that point, Lemma 1 shows that {F \ast_\hbar G(X,D)_W} almost annihilates any gaussian wave packet supported near {(x_0, 2\pi \hbar \xi_0)} in phase space, and thus must have very small symbol at {(x_0,2\pi \hbar \xi_0)}; a similar (but more complicated) argument also allows one to control derivatives of the symbol of {F \ast_\hbar G}. In a similar vein, the commutator {\frac{1}{\hbar} [F(X,D)_W, G(X,D)_W]} of two operators {F(X,D)_W, G(X,D)_W} with symbols {F,G} obeying (16) can be described by a symbol with asymptotic series (12); we omit the details. These facts form the basis for the pseudodifferential calculus, which we will not develop further here; see e.g. Taylor’s book or Folland’s book for more details.
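For polynomial symbols the Moyal series terminates, so the product can be computed exactly. The following is a minimal sketch in plain Python, not a definitive implementation. It assumes the sign convention {\omega(\nabla F, \nabla G) = \partial_x F \partial_p G - \partial_p F \partial_x G} (formula (11) is not restated in this part of the post, so this is an assumption chosen to match {[X,D] = i\hbar}), and the helper names `deriv`, `multiply`, `degree`, `moyal`, `subtract` are hypothetical.

```python
from math import comb, factorial

# A polynomial in (x, p) is stored as {(i, j): coeff}, meaning sum of coeff * x^i * p^j.

def deriv(poly, nx, n_p):
    """Apply d_x^nx d_p^n_p to poly."""
    out = {}
    for (i, j), c in poly.items():
        if i >= nx and j >= n_p:
            # Falling factorials i!/(i-nx)! and j!/(j-n_p)!.
            scale = (factorial(i) // factorial(i - nx)) * (factorial(j) // factorial(j - n_p))
            key = (i - nx, j - n_p)
            out[key] = out.get(key, 0) + c * scale
    return out

def multiply(f, g):
    """Ordinary (commutative) product of two polynomials."""
    out = {}
    for (i, j), a in f.items():
        for (k, l), b in g.items():
            key = (i + k, j + l)
            out[key] = out.get(key, 0) + a * b
    return out

def degree(poly):
    return max((i + j for (i, j) in poly), default=0)

def moyal(f, g, hbar):
    """Sum over n of (i*hbar/2)^n / n! * omega^{(n)}(grad^n f, grad^n g).

    ASSUMPTION: omega(grad F, grad G) = d_x F d_p G - d_p F d_x G, so the n-th
    tensor power expands binomially as below.
    """
    result = {}
    # Terms with n above the smaller degree vanish, so the sum is finite.
    for n in range(min(degree(f), degree(g)) + 1):
        prefactor = (1j * hbar / 2) ** n / factorial(n)
        for m in range(n + 1):
            term = multiply(deriv(f, n - m, m), deriv(g, m, n - m))
            weight = prefactor * comb(n, m) * (-1) ** m
            for key, c in term.items():
                result[key] = result.get(key, 0) + weight * c
    return {k: c for k, c in result.items() if abs(c) > 1e-12}

def subtract(f, g):
    out = dict(f)
    for key, c in g.items():
        out[key] = out.get(key, 0) - c
    return {k: c for k, c in out.items() if abs(c) > 1e-12}

hbar = 0.5                     # illustrative value only
x = {(1, 0): 1}
p = {(0, 1): 1}
# Symbol of the commutator [X, D]; under the stated convention this is i*hbar.
print(subtract(moyal(x, p, hbar), moyal(p, x, hbar)))
# Both bracketings of x * p * x should give the same symbol.
print(moyal(moyal(x, p, hbar), x, hbar), moyal(x, moyal(p, x, hbar), hbar))
```

Storing polynomials as coefficient dictionaries keeps the sketch dependency-free; with a computer algebra system one could keep {\hbar} symbolic instead.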

We can also relate the Moyal product to the Wigner transform

\displaystyle  W(f)(x, 2\pi\hbar \xi) := \int_{\bf R} f( x-a/2) \overline{f}(x+a/2 ) e^{2\pi i a \xi}\ da.

Comparing this with (19), we see that {W(f)} can also be interpreted as the symbol of the rank one operator

\displaystyle  T_f: g \mapsto \langle g, f \rangle f

where {\langle g,f \rangle := \int_{\bf R} g(x) \overline{f(x)}\ dx}. Given a quantised operator {F(X,D)_W}, we (formally, at least) have the relation

\displaystyle  F(X,D)_W T_f \overline{F}(X,D)_W = T_{F(X,D)_W f}

and hence (by (15) and (20)) we see (again formally, at least) that

\displaystyle  W( F(X,D)_W f ) = F \ast_\hbar W(f) \ast_\hbar \overline{F},

so that the action of pseudodifferential operators such as {F(X,D)_W} can be described through the Wigner transform by some applications of the Moyal product.
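As a sanity check, one can apply (19) to the kernel {K(x,y) = f(x) \overline{f(y)}} of the rank one operator {T_f} when {f = g_{x_0,p_0}} is a gaussian wave packet. An elementary gaussian integral then gives the symbol {\sqrt{2\pi\hbar}\, e^{-2(x-x_0)^2/\hbar} e^{-(p-p_0)^2/(2\hbar)}} at {(x,p)} with {p = 2\pi\hbar\xi}, which is concentrated near {(x_0,p_0)} in phase space. The sketch below compares this closed form against a direct Riemann sum; the function names, the value of {\hbar} and the integration window are illustrative assumptions.

```python
import cmath
import math

HBAR = 0.5  # illustrative value only

def packet(x, x0, p0, hbar=HBAR):
    """Gaussian wave packet g_{x0,p0}(x) = exp(-(x-x0)^2/hbar) exp(i p0 x/hbar)."""
    return math.exp(-(x - x0) ** 2 / hbar) * cmath.exp(1j * p0 * x / hbar)

def rank_one_symbol(f, x, p, hbar=HBAR, half_width=12.0, steps=4801):
    """Riemann sum for (19) with K(x, y) = f(x) conj(f(y)).

    With p = 2*pi*hbar*xi the phase e^{2 pi i a xi} becomes e^{i a p / hbar}.
    The window [-half_width, half_width] is an assumption that suits
    rapidly decaying f such as the gaussian packet above.
    """
    da = 2 * half_width / (steps - 1)
    total = 0j
    for n in range(steps):
        a = -half_width + n * da
        total += f(x - a / 2) * f(x + a / 2).conjugate() * cmath.exp(1j * a * p / hbar)
    return total * da

def closed_form(x, p, x0, p0, hbar=HBAR):
    """Closed form of the same integral for the gaussian packet."""
    return math.sqrt(2 * math.pi * hbar) * math.exp(
        -2 * (x - x0) ** 2 / hbar - (p - p0) ** 2 / (2 * hbar)
    )

x0, p0 = 1.0, 2.0
g = lambda x: packet(x, x0, p0)
for (x, p) in [(1.0, 2.0), (1.3, 1.5), (0.6, 2.8)]:
    numeric = rank_one_symbol(g, x, p)
    print(x, p, abs(numeric - closed_form(x, p, x0, p0)))
```

The symbol comes out real and, for this particular {f}, non-negative; gaussians are a special case in this respect.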

Continuing this line of thought, suppose that {\psi} is a time-dependent function obeying Schrödinger’s equation

\displaystyle  i\hbar \partial_t \psi = H(X,D)_W \psi

for some real-valued symbol {H} (representing the Hamiltonian of the system). Passing from the wave function {\psi} to the density matrix {T_\psi}, we observe that Schrödinger’s equation implies Heisenberg’s equation

\displaystyle  \partial_t T_\psi = \frac{1}{i\hbar} [H(X,D)_W, T_\psi]

and hence (by formal application of the Moyal product)

\displaystyle  \partial_t W(\psi) = \frac{1}{i\hbar} ( H \ast_\hbar W(\psi) - W(\psi) \ast_\hbar H ).

Applying (12), we see that the Wigner transform of {\psi} obeys the perturbed Hamiltonian flow equation

\displaystyle  \partial_t W(\psi) = \{ H, W(\psi) \} - \frac{\hbar^2}{24} \omega^{\otimes 3}(\nabla^3 H, \nabla^3 W(\psi)) + \ldots. \ \ \ \ \ (21)

This should be compared with the evolution of a classical density function {\rho: {\bf R} \times {\bf R} \rightarrow {\bf R}^+} in phase space with respect to transport along Hamilton’s equations of motion

\displaystyle  \dot x = \partial_\xi H; \quad \dot \xi = - \partial_x H,

which is given by the transport equation

\displaystyle  \partial_t \rho = \{ H,\rho\}. \ \ \ \ \ (22)

Thus we see that in the semiclassical limit {\hbar \rightarrow 0}, the equation of motion (21) for the Wigner transform under the Schrödinger flow converges formally to the equation of motion (22) for a classical density transported under Hamiltonian flow. This is one of the basic manifestations of the correspondence principle relating quantum and classical mechanics, and suggests that the Wigner transform of a wave function is a quantum analogue of a density function in phase space. (But the analogy is not perfect; in particular, the Wigner transform is almost never a non-negative function. See Folland’s book for more discussion.) Making all of these formal calculations and heuristics rigorous is a non-trivial task, requiring the development of the theory of Fourier integral operators, and will not be discussed here.
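The classical transport equation (22) can be checked numerically in a simple case. The sketch below takes the harmonic oscillator {H = (x^2+\xi^2)/2}, whose Hamiltonian flow is a rotation of phase space, transports an arbitrarily chosen gaussian density along it, and compares {\partial_t \rho} with {\{H,\rho\}} by finite differences. It assumes the convention {\{F,G\} = \partial_x F \partial_\xi G - \partial_\xi F \partial_x G} for the Poisson bracket (not restated in this part of the post); the function names and the initial density are hypothetical.

```python
import math

def rho0(x, xi):
    """Initial density: a gaussian bump centred at (1, 0) (arbitrary choice)."""
    return math.exp(-(x - 1.0) ** 2 - xi ** 2)

def rho(t, x, xi):
    """Density at time t, i.e. rho0 evaluated at the point flowed backwards by time t.

    For H = (x^2 + xi^2)/2 the equations x' = xi, xi' = -x generate a rotation,
    so the backward flow sends (x, xi) to (x cos t - xi sin t, x sin t + xi cos t).
    """
    c, s = math.cos(t), math.sin(t)
    return rho0(x * c - xi * s, x * s + xi * c)

def time_derivative(t, x, xi, h=1e-5):
    """Central difference approximation to d_t rho."""
    return (rho(t + h, x, xi) - rho(t - h, x, xi)) / (2 * h)

def poisson_bracket_H_rho(t, x, xi, h=1e-5):
    """{H, rho} = d_x H d_xi rho - d_xi H d_x rho, with d_x H = x and d_xi H = xi.

    ASSUMPTION: this sign convention for the bracket.
    """
    d_x = (rho(t, x + h, xi) - rho(t, x - h, xi)) / (2 * h)
    d_xi = (rho(t, x, xi + h) - rho(t, x, xi - h)) / (2 * h)
    return x * d_xi - xi * d_x

for (t, x, xi) in [(0.3, 0.7, -0.4), (1.1, 1.2, 0.5)]:
    print(time_derivative(t, x, xi), poisson_bracket_H_rho(t, x, xi))
```

For a quadratic Hamiltonian such as this one, {\nabla^3 H = 0}, so the displayed {\hbar^2} correction in (21) vanishes as well.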