You are currently browsing the category archive for the ‘math.QA’ category.

In contrast to previous notes, in this set of notes we shall focus exclusively on Fourier analysis in the one-dimensional setting {d=1} for simplicity of notation, although all of the results here have natural extensions to higher dimensions. Depending on the physical context, one can view the physical domain {{\bf R}} as representing either space or time; we will mostly think in terms of the former interpretation, even though the standard terminology of “time-frequency analysis”, which we will make more prominent use of in later notes, clearly originates from the latter.

In previous notes we have often performed various localisations in either physical space or Fourier space {{\bf R}}, for instance in order to take advantage of the uncertainty principle. One can formalise these operations in terms of the functional calculus of two basic operations on Schwartz functions {{\mathcal S}({\bf R})}, the position operator {X: {\mathcal S}({\bf R}) \rightarrow {\mathcal S}({\bf R})} defined by

\displaystyle  (Xf)(x) := x f(x)

and the momentum operator {D: {\mathcal S}({\bf R}) \rightarrow {\mathcal S}({\bf R})}, defined by

\displaystyle  (Df)(x) := \frac{1}{2\pi i} \frac{d}{dx} f(x). \ \ \ \ \ (1)

(The terminology comes from quantum mechanics, where it is customary to also insert a small constant {h} on the right-hand side of (1) in accordance with de Broglie’s law. Such a normalisation is also used in several branches of mathematics, most notably semiclassical analysis and microlocal analysis, where it becomes profitable to consider the semiclassical limit {h \rightarrow 0}, but we will not emphasise this perspective here.) The momentum operator can be viewed as the counterpart to the position operator, but in frequency space instead of physical space, since we have the standard identity

\displaystyle  \widehat{Df}(\xi) = \xi \hat f(\xi)

for any {\xi \in {\bf R}} and {f \in {\mathcal S}({\bf R})}. We observe that both operators {X,D} are formally self-adjoint in the sense that

\displaystyle  \langle Xf, g \rangle = \langle f, Xg \rangle; \quad \langle Df, g \rangle = \langle f, Dg \rangle

for all {f,g \in {\mathcal S}({\bf R})}, where we use the {L^2({\bf R})} Hermitian inner product

\displaystyle  \langle f, g\rangle := \int_{\bf R} f(x) \overline{g(x)}\ dx.

Clearly, for any polynomial {P(x)} of one real variable {x} (with complex coefficients), the operator {P(X): {\mathcal S}({\bf R}) \rightarrow {\mathcal S}({\bf R})} is given by the spatial multiplier operator

\displaystyle  (P(X) f)(x) = P(x) f(x)

and similarly the operator {P(D): {\mathcal S}({\bf R}) \rightarrow {\mathcal S}({\bf R})} is given by the Fourier multiplier operator

\displaystyle  \widehat{P(D) f}(\xi) = P(\xi) \hat f(\xi).

Inspired by this, if {m: {\bf R} \rightarrow {\bf C}} is any smooth function that obeys the derivative bounds

\displaystyle  \frac{d^j}{dx^j} m(x) \lesssim_{m,j} \langle x \rangle^{O_{m,j}(1)} \ \ \ \ \ (2)

for all {j \geq 0} and {x \in {\bf R}} (that is to say, all derivatives of {m} grow at most polynomially), then we can define the spatial multiplier operator {m(X): {\mathcal S}({\bf R}) \rightarrow {\mathcal S}({\bf R})} by the formula

\displaystyle  (m(X) f)(x) := m(x) f(x);

one can easily verify from several applications of the Leibniz rule that {m(X)} maps Schwartz functions to Schwartz functions. We refer to {m(x)} as the symbol of this spatial multiplier operator. In a similar fashion, we define the Fourier multiplier operator {m(D)} associated to the symbol {m(\xi)} by the formula

\displaystyle  \widehat{m(D) f}(\xi) := m(\xi) \hat f(\xi).

For instance, any constant coefficient linear differential operators {\sum_{k=0}^n c_k \frac{d^k}{dx^k}} can be written in this notation as

\displaystyle \sum_{k=0}^n c_k \frac{d^k}{dx^k} =\sum_{k=0}^n c_k (2\pi i D)^k;

however there are many Fourier multiplier operators that are not of this form, such as fractional derivative operators {\langle D \rangle^s = (1- \frac{1}{4\pi^2} \frac{d^2}{dx^2})^{s/2}} for non-integer values of {s}, which is a Fourier multiplier operator with symbol {\langle \xi \rangle^s}. It is also very common to use spatial cutoffs {\psi(X)} and Fourier cutoffs {\psi(D)} for various bump functions {\psi} to localise functions in either space or frequency; we have seen several examples of such cutoffs in action in previous notes (often in the higher dimensional setting {d>1}).

We observe that the maps {m \mapsto m(X)} and {m \mapsto m(D)} are ring homomorphisms, thus for instance

\displaystyle  (m_1 + m_2)(D) = m_1(D) + m_2(D)

and

\displaystyle  (m_1 m_2)(D) = m_1(D) m_2(D)

for any {m_1,m_2} obeying the derivative bounds (2); also {m(D)} is formally adjoint to {\overline{m(D)}} in the sense that

\displaystyle  \langle m(D) f, g \rangle = \langle f, \overline{m}(D) g \rangle

for {f,g \in {\mathcal S}({\bf R})}, and similarly for {m(X)} and {\overline{m}(X)}. One can interpret these facts as part of the functional calculus of the operators {X,D}, which can be interpreted as densely defined self-adjoint operators on {L^2({\bf R})}. However, in this set of notes we will not develop the spectral theory necessary in order to fully set out this functional calculus rigorously.

In the field of PDE and ODE, it is also very common to study variable coefficient linear differential operators

\displaystyle  \sum_{k=0}^n c_k(x) \frac{d^k}{dx^k} \ \ \ \ \ (3)

where the {c_0,\dots,c_n} are now functions of the spatial variable {x} obeying the derivative bounds (2). A simple example is the quantum harmonic oscillator Hamiltonian {-\frac{d^2}{dx^2} + x^2}. One can rewrite this operator in our notation as

\displaystyle  \sum_{k=0}^n c_k(X) (2\pi i D)^k

and so it is natural to interpret this operator as a combination {a(X,D)} of both the position operator {X} and the momentum operator {D}, where the symbol {a: {\bf R} \times {\bf R} \rightarrow {\bf C}} this operator is the function

\displaystyle  a(x,\xi) := \sum_{k=0}^n c_k(x) (2\pi i \xi)^k. \ \ \ \ \ (4)

Indeed, from the Fourier inversion formula

\displaystyle  f(x) = \int_{\bf R} \hat f(\xi) e^{2\pi i x \xi}\ d\xi

for any {f \in {\mathcal S}({\bf R})} we have

\displaystyle  (2\pi i D)^k f(x) = \int_{\bf R} (2\pi i \xi)^k \hat f(\xi) e^{2\pi i x \xi}\ d\xi

and hence on multiplying by {c_k(x)} and summing we have

\displaystyle (\sum_{k=0}^n c_k(X) (2\pi i D)^k) f(x) = \int_{\bf R} a(x,\xi) \hat f(\xi) e^{2\pi i x \xi}\ d\xi.

Inspired by this, we can introduce the Kohn-Nirenberg quantisation by defining the operator {a(X,D) = a_{KN}(X,D): {\mathcal S}({\bf R}) \rightarrow {\mathcal S}({\bf R})} by the formula

\displaystyle  a(X,D) f(x) = \int_{\bf R} a(x,\xi) \hat f(\xi) e^{2\pi i x \xi}\ d\xi \ \ \ \ \ (5)

whenever {f \in {\mathcal S}({\bf R})} and {a: {\bf R} \times {\bf R} \rightarrow {\bf C}} is any smooth function obeying the derivative bounds

\displaystyle  \frac{\partial^j}{\partial x^j} \frac{\partial^l}{\partial \xi^l} a(x,\xi) \lesssim_{a,j,l} \langle x \rangle^{O_{a,j}(1)} \langle \xi \rangle^{O_{a,j,l}(1)} \ \ \ \ \ (6)

for all {j,l \geq 0} and {x \in {\bf R}} (note carefully that the exponent in {x} on the right-hand side is required to be uniform in {l}). This quantisation clearly generalises both the spatial multiplier operators {m(X)} and the Fourier multiplier operators {m(D)} defined earlier, which correspond to the cases when the symbol {a(x,\xi)} is a function of {x} only or {\xi} only respectively. Thus we have combined the physical space {{\bf R} = \{ x: x \in {\bf R}\}} and the frequency space {{\bf R} = \{ \xi: \xi \in {\bf R}\}} into a single domain, known as phase space {{\bf R} \times {\bf R} = \{ (x,\xi): x,\xi \in {\bf R} \}}. The term “time-frequency analysis” encompasses analysis based on decompositions and other manipulations of phase space, in much the same way that “Fourier analysis” encompasses analysis based on decompositions and other manipulations of frequency space. We remark that the Kohn-Nirenberg quantization is not the only choice of quantization one could use; see Remark 19 below.

Exercise 1

  • (i) Show that for {a} obeying (6), that {a(X,D)} does indeed map {{\mathcal S}({\bf R})} to {{\mathcal S}({\bf R})}.
  • (ii) Show that the symbol {a} is uniquely determined by the operator {a(X,D)}. That is to say, if {a,b} are two functions obeying (6) with {a(X,D) f = b(X,D) f} for all {f \in {\mathcal S}({\bf R})}, then {a=b}. (Hint: apply {a(X,D)-b(X,D)} to a suitable truncation of a plane wave {x \mapsto e^{2\pi i x \xi}} and then take limits.)

In principle, the quantisations {a(X,D)} are potentially very useful for such tasks as inverting variable coefficient linear operators, or to localize a function simultaneously in physical and Fourier space. However, a fundamental difficulty arises: map from symbols {a} to operators {a(X,D)} is now no longer a ring homomorphism, in particular

\displaystyle  (a_1 a_2)(X,D) \neq a_1(X,D) a_2(X,D) \ \ \ \ \ (7)

in general. Fundamentally, this is due to the fact that pointwise multiplication of symbols is a commutative operation, whereas the composition of operators such as {X} and {D} does not necessarily commute. This lack of commutativity can be measured by introducing the commutator

\displaystyle  [A,B] := AB - BA

of two operators {A,B}, and noting from the product rule that

\displaystyle  [X,D] = -\frac{1}{2\pi i} \neq 0.

(In the language of Lie groups and Lie algebras, this tells us that {X,D} are (up to complex constants) the standard Lie algebra generators of the Heisenberg group.) From a quantum mechanical perspective, this lack of commutativity is the root cause of the uncertainty principle that prevents one from simultaneously localizing in both position and momentum past a certain point. Here is one basic way of formalising this principle:

Exercise 2 (Heisenberg uncertainty principle) For any {x_0, \xi_0 \in {\bf R}} and {f \in \mathcal{S}({\bf R})}, show that

\displaystyle  \| (X-x_0) f \|_{L^2({\bf R})} \| (D-\xi_0) f\|_{L^2({\bf R})} \geq \frac{1}{4\pi} \|f\|_{L^2({\bf R})}^2.

(Hint: evaluate the expression {\langle [X-x_0, D - \xi_0] f, f \rangle} in two different ways and apply the Cauchy-Schwarz inequality.) Informally, this exercise asserts that the spatial uncertainty {\Delta x} and the frequency uncertainty {\Delta \xi} of a function obey the Heisenberg uncertainty relation {\Delta x \Delta \xi \gtrsim 1}.

Nevertheless, one still has the correspondence principle, which asserts that in certain regimes (which, with our choice of normalisations, corresponds to the high-frequency regime), quantum mechanics continues to behave like a commutative theory, and one can sometimes proceed as if the operators {X,D} (and the various operators {a(X,D)} constructed from them) commute up to “lower order” errors. This can be formalised using the pseudodifferential calculus, which we give below the fold, in which we restrict the symbol {a} to certain “symbol classes” of various orders (which then restricts {a(X,D)} to be pseudodifferential operators of various orders), and obtains approximate identities such as

\displaystyle  (a_1 a_2)(X,D) \approx a_1(X,D) a_2(X,D)

where the error between the left and right-hand sides is of “lower order” and can in fact enjoys a useful asymptotic expansion. As a first approximation to this calculus, one can think of functions {f \in {\mathcal S}({\bf R})} as having some sort of “phase space portrait{\tilde f(x,\xi)} which somehow combines the physical space representation {x \mapsto f(x)} with its Fourier representation {\xi \mapsto f(\xi)}, and pseudodifferential operators {a(X,D)} behave approximately like “phase space multiplier operators” in this representation in the sense that

\displaystyle  \widetilde{a(X,D) f}(x,\xi) \approx a(x,\xi) \tilde f(x,\xi).

Unfortunately the uncertainty principle (or the non-commutativity of {X} and {D}) prevents us from making these approximations perfectly precise, and it is not always clear how to even define a phase space portrait {\tilde f} of a function {f} precisely (although there are certain popular candidates for such a portrait, such as the FBI transform (also known as the Gabor transform in signal processing literature), or the Wigner quasiprobability distribution, each of which have some advantages and disadvantages). Nevertheless even if the concept of a phase space portrait is somewhat fuzzy, it is of great conceptual benefit both within mathematics and outside of it. For instance, the musical score one assigns a piece of music can be viewed as a phase space portrait of the sound waves generated by that music.

To complement the pseudodifferential calculus we have the basic Calderón-Vaillancourt theorem, which asserts that pseudodifferential operators of order zero are Calderón-Zygmund operators and thus bounded on {L^p({\bf R})} for {1 < p < \infty}. The standard proof of this theorem is a classic application of one of the basic techniques in harmonic analysis, namely the exploitation of almost orthogonality; the proof we will give here will achieve this through the elegant device of the Cotlar-Stein lemma.

Pseudodifferential operators (especially when generalised to higher dimensions {d \geq 1}) are a fundamental tool in the theory of linear PDE, as well as related fields such as semiclassical analysis, microlocal analysis, and geometric quantisation. There is an even wider class of operators that is also of interest, namely the Fourier integral operators, which roughly speaking not only approximately multiply the phase space portrait {\tilde f(x,\xi)} of a function by some multiplier {a(x,\xi)}, but also move the portrait around by a canonical transformation. However, the development of theory of these operators is beyond the scope of these notes; see for instance the texts of Hormander or Eskin.

This set of notes is only the briefest introduction to the theory of pseudodifferential operators. Many texts are available that cover the theory in more detail, for instance this text of Taylor.

Read the rest of this entry »

The fundamental notions of calculus, namely differentiation and integration, are often viewed as being the quintessential concepts in mathematical analysis, as their standard definitions involve the concept of a limit. However, it is possible to capture most of the essence of these notions by purely algebraic means (almost completely avoiding the use of limits, Riemann sums, and similar devices), which turns out to be useful when trying to generalise these concepts to more abstract situations in which it becomes convenient to permit the underlying number systems involved to be something other than the real or complex numbers, even if this makes many standard analysis constructions unavailable. For instance, the algebraic notion of a derivation often serves as a substitute for the analytic notion of a derivative in such cases, by abstracting out the key algebraic properties of differentiation, namely linearity and the Leibniz rule (also known as the product rule).

Abstract algebraic analogues of integration are less well known, but can still be developed. To motivate such an abstraction, consider the integration functional {I: {\mathcal S}({\bf R} \rightarrow {\bf C}) \rightarrow {\bf C}} from the space {{\mathcal S}({\bf R} \rightarrow {\bf C})} of complex-valued Schwarz functions {f: {\bf R} \rightarrow {\bf C}} to the complex numbers, defined by

\displaystyle I(f) := \int_{\bf R} f(x)\ dx

where the integration on the right is the usual Lebesgue integral (or improper Riemann integral) from analysis. This functional obeys two obvious algebraic properties. Firstly, it is linear over {{\bf C}}, thus

\displaystyle I(cf) = c I(f) \ \ \ \ \ (1)

and

\displaystyle I(f+g) = I(f) + I(g) \ \ \ \ \ (2)

for all {f,g \in {\mathcal S}({\bf R} \rightarrow {\bf C})} and {c \in {\bf C}}. Secondly, it is translation invariant, thus

\displaystyle I(\tau_h f) = I(f) \ \ \ \ \ (3)

for all {h \in {\bf C}}, where {\tau_h f(x) := f(x-h)} is the translation of {f} by {h}. Motivated by the uniqueness theory of Haar measure, one might expect that these two axioms already uniquely determine {I} after one sets a normalisation, for instance by requiring that

\displaystyle I( x \mapsto e^{-\pi x^2} ) = 1. \ \ \ \ \ (4)

This is not quite true as stated (one can modify the proof of the Hahn-Banach theorem, after first applying a Fourier transform, to create pathological translation-invariant linear functionals on {{\mathcal S}({\bf R} \rightarrow {\bf C})} that are not multiples of the standard Fourier transform), but if one adds a mild analytical axiom, such as continuity of {I} (using the usual Schwartz topology on {{\mathcal S}({\bf R} \rightarrow {\bf C})}), then the above axioms are enough to uniquely pin down the notion of integration. Indeed, if {I: {\mathcal S}({\bf R} \rightarrow {\bf C}) \rightarrow {\bf C}} is a continuous linear functional that is translation invariant, then from the linearity and translation invariance axioms one has

\displaystyle I( \frac{\tau_h f - f}{h} ) = 0

for all {f \in {\mathcal S}({\bf R} \rightarrow {\bf C})} and non-zero reals {h}. If {f} is Schwartz, then as {h \rightarrow 0}, one can verify that the Newton quotients {\frac{\tau_h f - f}{h}} converge in the Schwartz topology to the derivative {f'} of {f}, so by the continuity axiom one has

\displaystyle I(f') = 0.

Next, note that any Schwartz function of integral zero has an antiderivative which is also Schwartz, and so {I} annihilates all zero-integral Schwartz functions, and thus must be a scalar multiple of the usual integration functional. Using the normalisation (4), we see that {I} must therefore be the usual integration functional, giving the claimed uniqueness.

Motivated by the above discussion, we can define the notion of an abstract integration functional {I: X \rightarrow R} taking values in some vector space {R}, and applied to inputs {f} in some other vector space {X} that enjoys a linear action {h \mapsto \tau_h} (the “translation action”) of some group {V}, as being a functional which is both linear and translation invariant, thus one has the axioms (1), (2), (3) for all {f,g \in X}, scalars {c}, and {h \in V}. The previous discussion then considered the special case when {R = {\bf C}}, {X = {\mathcal S}({\bf R} \rightarrow {\bf C})}, {V = {\bf R}}, and {\tau} was the usual translation action.

Once we have performed this abstraction, we can now present analogues of classical integration which bear very little analytic resemblance to the classical concept, but which still have much of the algebraic structure of integration. Consider for instance the situation in which we keep the complex range {R = {\bf C}}, the translation group {V = {\bf R}}, and the usual translation action {h \mapsto \tau_h}, but we replace the space {{\mathcal S}({\bf R} \rightarrow {\bf C})} of Schwartz functions by the space {Poly_{\leq d}({\bf R} \rightarrow {\bf C})} of polynomials {x \mapsto a_0 + a_1 x + \ldots + a_d x^d} of degree at most {d} with complex coefficients, where {d} is a fixed natural number; note that this space is translation invariant, so it makes sense to talk about an abstract integration functional {I: Poly_{\leq d}({\bf R} \rightarrow {\bf C}) \rightarrow {\bf C}}. Of course, one cannot apply traditional integration concepts to non-zero polynomials, as they are not absolutely integrable. But one can repeat the previous arguments to show that any abstract integration functional must annihilate derivatives of polynomials of degree at most {d}:

\displaystyle I(f') = 0 \hbox{ for all } f \in Poly_{\leq d}({\bf R} \rightarrow {\bf C}). \ \ \ \ \ (5)

Clearly, every polynomial of degree at most {d-1} is thus annihilated by {I}, which makes {I} a scalar multiple of the functional that extracts the top coefficient {a_d} of a polynomial, thus if one sets a normalisation

\displaystyle I( x \mapsto x^d ) = c

for some constant {c}, then one has

\displaystyle I( x \mapsto a_0 + a_1 x + \ldots + a_d x^d ) = c a_d \ \ \ \ \ (6)

for any polynomial {x \mapsto a_0 + a_1 x + \ldots + a_d x^d}. So we see that up to a normalising constant, the operation of extracting the top order coefficient of a polynomial of fixed degree serves as the analogue of integration. In particular, despite the fact that integration is supposed to be the “opposite” of differentiation (as indicated for instance by (5)), we see in this case that integration is basically ({d}-fold) differentiation; indeed, compare (6) with the identity

\displaystyle (\frac{d}{dx})^d ( a_0 + a_1 x + \ldots + a_d x^d ) = d! a_d.

In particular, we see, in contrast to the usual Lebesgue integral, the integration functional (6) can be localised to an arbitrary location: one only needs to know the germ of the polynomial {x \mapsto a_0 + a_1 x + \ldots + a_d x^d} at a single point {x_0} in order to determine the value of the functional (6). This localisation property may initially seem at odds with the translation invariance, but the two can be reconciled thanks to the extremely rigid nature of the class {Poly_{\leq d}({\bf R} \rightarrow {\bf C})}, in contrast to the Schwartz class {{\mathcal S}({\bf R} \rightarrow {\bf C})} which admits bump functions and so can generate local phenomena that can only be detected in small regions of the underlying spatial domain, and which therefore forces any translation-invariant integration functional on such function classes to measure the function at every single point in space.

The reversal of the relationship between integration and differentiation is also reflected in the fact that the abstract integration operation on polynomials interacts with the scaling operation {\delta_\lambda f(x) := f(x/\lambda)} in essentially the opposite way from the classical integration operation. Indeed, for classical integration on {{\bf R}^d}, one has

\displaystyle \int_{{\bf R}^d} f(x/\lambda)\ dx = \lambda^d \int f(x)\ dx

for Schwartz functions {f \in {\mathcal S}({\bf R}^d \rightarrow {\bf C})}, and so in this case the integration functional {I(f) := \int_{{\bf R}^d} f(x)\ dx} obeys the scaling law

\displaystyle I( \delta_\lambda f ) = \lambda^d I(f).

In contrast, the abstract integration operation defined in (6) obeys the opposite scaling law

\displaystyle I( \delta_\lambda f ) = \lambda^{-d} I(f). \ \ \ \ \ (7)

Remark 1 One way to interpret what is going on is to view the integration operation (6) as a renormalised version of integration. A polynomial {x \mapsto a_0 + a_1 + \ldots + a_d x^d} is, in general, not absolutely integrable, and the partial integrals

\displaystyle \int_0^N a_0 + a_1 + \ldots + a_d x^d\ dx

diverge as {N \rightarrow \infty}. But if one renormalises these integrals by the factor {\frac{1}{N^{d+1}}}, then one recovers convergence,

\displaystyle \lim_{N \rightarrow \infty} \frac{1}{N^{d+1}} \int_0^N a_0 + a_1 + \ldots + a_d x^d\ dx = \frac{1}{d+1} a_d

thus giving an interpretation of (6) as a renormalised classical integral, with the renormalisation being responsible for the unusual scaling relationship in (7). However, this interpretation is a little artificial, and it seems that it is best to view functionals such as (6) from an abstract algebraic perspective, rather than to try to force an analytic interpretation on them.

Now we return to the classical Lebesgue integral

\displaystyle I(f) := \int_{\bf R} f(x)\ dx. \ \ \ \ \ (8)

As noted earlier, this integration functional has a translation invariance associated to translations along the real line {{\bf R}}, as well as a dilation invariance by real dilation parameters {\lambda>0}. However, if we refine the class {{\mathcal S}({\bf R} \rightarrow {\bf C})} of functions somewhat, we can obtain a stronger family of invariances, in which we allow complex translations and dilations. More precisely, let {\mathcal{SE}({\bf C} \rightarrow {\bf C})} denote the space of all functions {f: {\bf C} \rightarrow {\bf C}} which are entire (or equivalently, are given by a Taylor series with an infinite radius of convergence around the origin) and also admit rapid decay in a sectorial neighbourhood of the real line, or more precisely there exists an {\epsilon>0} such that for every {A > 0} there exists {C_A > 0} such that one has the bound

\displaystyle |f(z)| \leq C_A (1+|z|)^{-A}

whenever {|\hbox{Im}(z)| \leq A + \epsilon |\hbox{Re}(z)|}. For want of a better name, we shall call elements of this space Schwartz entire functions. This is clearly a complex vector space. A typical example of a Schwartz entire function are the complex gaussians

\displaystyle f(z) := e^{-\pi (az^2 + 2bz + c)}

where {a,b,c} are complex numbers with {\hbox{Re}(a) > 0}. From the Cauchy integral formula (and its derivatives) we see that if {f} lies in {\mathcal{SE}({\bf C} \rightarrow {\bf C})}, then the restriction of {f} to the real line lies in {{\mathcal S}({\bf R} \rightarrow {\bf C})}; conversely, from analytic continuation we see that every function in {{\mathcal S}({\bf R} \rightarrow {\bf C})} has at most one extension in {\mathcal{SE}({\bf C} \rightarrow {\bf C})}. Thus one can identify {\mathcal{SE}({\bf C} \rightarrow {\bf C})} with a subspace of {{\mathcal S}({\bf R} \rightarrow {\bf C})}, and in particular the integration functional (8) is inherited by {\mathcal{SE}({\bf C} \rightarrow {\bf C})}, and by abuse of notation we denote the resulting functional {I: \mathcal{SE}({\bf C} \rightarrow {\bf C}) \rightarrow {\bf C}} as {I} also. Note, in analogy with the situation with polynomials, that this abstract integration functional is somewhat localised; one only needs to evaluate the function {f} on the real line, rather than the entire complex plane, in order to compute {I(f)}. This is consistent with the rigid nature of Schwartz entire functions, as one can uniquely recover the entire function from its values on the real line by analytic continuation.

Of course, the functional {I: \mathcal{SE}({\bf C} \rightarrow {\bf C}) \rightarrow {\bf C}} remains translation invariant with respect to real translation:

\displaystyle I(\tau_h f) = I(f) \hbox{ for all } h \in {\bf R}.

However, thanks to contour shifting, we now also have translation invariance with respect to complex translation:

\displaystyle I(\tau_h f) = I(f) \hbox{ for all } h \in {\bf C},

where of course we continue to define the translation operator {\tau_h} for complex {h} by the usual formula {\tau_h f(x) := f(x-h)}. In a similar vein, we also have the scaling law

\displaystyle I(\delta_\lambda f) = \lambda I(f)

for any {f \in \mathcal{SE}({\bf C} \rightarrow {\bf C})}, if {\lambda} is a complex number sufficiently close to {1} (where “sufficiently close” depends on {f}, and more precisely depends on the sectoral aperture parameter {\epsilon} associated to {f}); again, one can verify that {\delta_\lambda f} lies in {\mathcal{SE}({\bf C} \rightarrow {\bf C})} for {\lambda} sufficiently close to {1}. These invariances (which relocalise the integration functional {I} onto other contours than the real line {{\bf R}}) are very useful for computing integrals, and in particular for computing gaussian integrals. For instance, the complex translation invariance tells us (after shifting by {b/a}) that

\displaystyle I( z \mapsto e^{-\pi (az^2 + 2bz + c) } ) = e^{-\pi (c-b^2/a)} I( z \mapsto e^{-\pi a z^2} )

when {a,b,c \in {\bf C}} with {\hbox{Re}(a) > 0}, and then an application of the complex scaling law (and a continuity argument, observing that there is a compact path connecting {a} to {1} in the right half plane) gives

\displaystyle I( z \mapsto e^{-\pi (az^2 + 2bz + c) } ) = a^{-1/2} e^{-\pi (c-b^2/a)} I( z \mapsto e^{-\pi z^2} )

using the branch of {a^{-1/2}} on the right half-plane for which {1^{-1/2} = 1}. Using the normalisation (4) we thus have

\displaystyle I( z \mapsto e^{-\pi (az^2 + 2bz + c) } ) = a^{-1/2} e^{-\pi (c-b^2/a)}

giving the usual gaussian integral formula

\displaystyle \int_{\bf R} e^{-\pi (ax^2 + 2bx + c)}\ dx = a^{-1/2} e^{-\pi (c-b^2/a)}. \ \ \ \ \ (9)

This is a basic illustration of the power that a large symmetry group (in this case, the complex homothety group) can bring to bear on the task of computing integrals.

One can extend this sort of analysis to higher dimensions. For any natural number {n \geq 1}, let {\mathcal{SE}({\bf C}^n \rightarrow {\bf C})} denote the space of all functions {f: {\bf C}^n \rightarrow {\bf C}} which is jointly entire in the sense that {f(z_1,\ldots,z_n)} can be expressed as a Taylor series in {z_1,\ldots,z_n} which is absolutely convergent for all choices of {z_1,\ldots,z_n}, and such that there exists an {\epsilon > 0} such that for any {A>0} there is {C_A>0} for which one has the bound

\displaystyle |f(z)| \leq C_A (1+|z|)^{-A}

whenever {|\hbox{Im}(z_j)| \leq A + \epsilon |\hbox{Re}(z_j)|} for all {1 \leq j \leq n}, where {z = \begin{pmatrix} z_1 \\ \vdots \\ z_n \end{pmatrix}} and {|z| := (|z_1|^2+\ldots+|z_n|^2)^{1/2}}. Again, we call such functions Schwartz entire functions; a typical example is the function

\displaystyle f(z) := e^{-\pi (z^T A z + 2b^T z + c)}

where {A} is an {n \times n} complex symmetric matrix with positive definite real part, {b} is a vector in {{\bf C}^n}, and {c} is a complex number. We can then define an abstract integration functional {I: \mathcal{SE}({\bf C}^n \rightarrow {\bf C}) \rightarrow {\bf C}} by integration on the real slice {{\bf R}^n}:

\displaystyle I(f) := \int_{{\bf R}^n} f(x)\ dx

where {dx} is the usual Lebesgue measure on {{\bf R}^n}. By contour shifting in each of the {n} variables {z_1,\ldots,z_n} separately, we see that {I} is invariant with respect to complex translations of each of the {z_j} variables, and is thus invariant under translating the joint variable {z} by {{\bf C}^n}. One can also verify the scaling law

\displaystyle I(\delta_A f) = \hbox{det}(A) I(f)

for {n \times n} complex matrices {A} sufficiently close to the origin, where {\delta_A f(z) := f(A^{-1} z)}. This can be seen for shear transformations {A} by Fubini’s theorem and the aforementioned translation invariance, while for diagonal transformations near the origin this can be seen from {n} applications of one-dimensional scaling law, and the general case then follows by composition. Among other things, these laws then easily lead to the higher-dimensional generalisation

\displaystyle \int_{{\bf R}^n} e^{-\pi (x^T A x + 2 b^T x + c)}\ dx = \hbox{det}(A)^{-1/2} e^{-\pi (c-b^T A^{-1} b)} \ \ \ \ \ (10)

whenever {A} is a complex symmetric matrix with positive definite real part, {b} is a vector in {{\bf C}^n}, and {c} is a complex number, basically by repeating the one-dimensional argument sketched earlier. Here, we choose the branch of {\hbox{det}(A)^{-1/2}} for all matrices {A} in the indicated class for which {\hbox{det}(1)^{-1/2} = 1}.

Now we turn to an integration functional suitable for computing complex gaussian integrals such as

\displaystyle \int_{{\bf C}^n} e^{-2\pi (z^\dagger A z + b^\dagger z + z^\dagger \tilde b + c)}\ dz d\overline{z}, \ \ \ \ \ (11)

where {z} is now a complex variable

\displaystyle z = \begin{pmatrix} z_1 \\ \vdots \\ z_n \end{pmatrix},

{z^\dagger} is the adjoint

\displaystyle z^\dagger := (\overline{z_1},\ldots, \overline{z_n}),

{A} is a complex {n \times n} matrix with positive definite Hermitian part, {b, \tilde b} are column vectors in {{\bf C}^n}, {c} is a complex number, and {dz d\overline{z} = \prod_{j=1}^n 2 d\hbox{Re}(z_j) d\hbox{Im}(z_j)} is {2^n} times Lebesgue measure on {{\bf C}^n}. (The factors of two here turn out to be a natural normalisation, but they can be ignored on a first reading.) As we shall see later, such integrals are relevant when performing computations on the Gaussian Unitary Ensemble (GUE) in random matrix theory. Note that the integrand here is not complex analytic due to the presence of the complex conjugates. However, this can be dealt with by the trick of replacing the complex conjugate {\overline{z}} by a variable {z^*} which is formally conjugate to {z}, but which is allowed to vary independently of {z}. More precisely, let {\mathcal{SA}({\bf C}^n \times {\bf C}^n \rightarrow {\bf C})} be the space of all functions {f: (z,z^*) \mapsto f(z,z^*)} of two independent {n}-tuples

\displaystyle z = \begin{pmatrix} z_1 \\ \vdots \\ z_n \end{pmatrix}, z^* = \begin{pmatrix} z_1^* \\ \vdots \\ z_n^* \end{pmatrix}

of complex variables, which is jointly entire in all {2n} variables (in the sense defined previously, i.e. there is a joint Taylor series that is absolutely convergent for all independent choices of {z, z^* \in {\bf C}^n}), and such that there is an {\epsilon>0} such that for every {A>0} there is {C_A>0} such that one has the bound

\displaystyle |f(z,z^*)| \leq C_A (1 + |z|)^{-A}

whenever {|z^* - \overline{z}| \leq A + \epsilon |z|}. We will call such functions Schwartz analytic. Note that the integrand in (11) is Schwartz analytic when {A} has positive definite Hermitian part, if we reinterpret {z^\dagger} as the transpose of {z^*} rather than as the adjoint of {z} in order to make the integrand entire in {z} and {z^*}. We can then define an abstract integration functional {I: \mathcal{SA}({\bf C}^n \times {\bf C}^n \rightarrow {\bf C}) \rightarrow {\bf C}} by the formula

\displaystyle I(f) := \int_{{\bf C}^n} f(z,\overline{z})\ dz d\overline{z}, \ \ \ \ \ (12)

thus {I} can be localised to the slice {\{ (z,\overline{z}): z \in {\bf C}^n\}} of {{\bf C}^n \times {\bf C}^n} (though, as with previous functionals, one can use contour shifting to relocalise {I} to other slices also.) One can also write this integral as

\displaystyle I(f) = 2^n \int_{{\bf R}^n \times {\bf R}^n} f(x+iy, x-iy)\ dx dy

and note that the integrand here is a Schwartz entire function on {{\bf C}^n \times {\bf C}^n}, thus linking the Schwartz analytic integral with the Schwartz entire integral. Using this connection, one can verify that this functional {I} is invariant with respect to translating {z} and {z^*} by independent shifts in {{\bf C}^n} (thus giving a {{\bf C}^n \times {\bf C}^n} translation symmetry), and one also has the independent dilation symmetry

\displaystyle I(\delta_{A,B} f) = \hbox{det}(A) \hbox{det}(B) I(f)

for {n \times n} complex matrices {A,B} that are sufficiently close to the identity, where {\delta_{A,B} f(z,z^*) := f(A^{-1} z, B^{-1} z^*)}. Arguing as before, we can then compute (11) as

\displaystyle \int_{{\bf C}^n} e^{-2\pi (z^\dagger A z + b^\dagger z + z^\dagger \tilde b + c)}\ dz d\overline{z} = \hbox{det}(A)^{-1} e^{-2\pi (c - b^\dagger A^{-1} \tilde b)}. \ \ \ \ \ (13)

In particular, this gives an integral representation for the determinant-reciprocal {\hbox{det}(A)^{-1}} of a complex {n \times n} matrix with positive definite Hermitian part, in terms of gaussian expressions in which {A} only appears linearly in the exponential:

\displaystyle \hbox{det}(A)^{-1} = \int_{{\bf C}^n} e^{-2\pi z^\dagger A z}\ dz d\overline{z}.

This formula is then convenient for computing statistics such as

\displaystyle \mathop{\bf E} \hbox{det}(W_n-E-i\eta)^{-1}

for random matrices {W_n} drawn from the Gaussian Unitary Ensemble (GUE), and some choice of spectral parameter {E+i\eta} with {\eta>0}; we review this computation later in this post. By the trick of matrix differentiation of the determinant (as reviewed in this recent blog post), one can also use this method to compute matrix-valued statistics such as

\displaystyle \mathop{\bf E} \hbox{det}(W_n-E-i\eta)^{-1} (W_n-E-i\eta)^{-1}.

However, if one restricts attention to classical integrals over real or complex (and in particular, commuting or bosonic) variables, it does not seem possible to easily eradicate the negative determinant factors in such calculations, which is unfortunate because many statistics of interest in random matrix theory, such as the expected Stieltjes transform

\displaystyle \mathop{\bf E} \frac{1}{n} \hbox{tr} (W_n-E-i\eta)^{-1},

which is the Stieltjes transform of the density of states. However, it turns out (as I learned recently from Peter Sarnak and Tom Spencer) that it is possible to cancel out these negative determinant factors by balancing the bosonic gaussian integrals with an equal number of fermionic gaussian integrals, in which one integrates over a family of anticommuting variables. These fermionic integrals are closer in spirit to the polynomial integral (6) than to Lebesgue type integrals, and in particular obey a scaling law which is inverse to the Lebesgue scaling (in particular, a linear change of fermionic variables {\zeta \mapsto A \zeta} ends up transforming a fermionic integral by {\hbox{det}(A)} rather than {\hbox{det}(A)^{-1}}), which conveniently cancels out the reciprocal determinants in the previous calculations. Furthermore, one can combine the bosonic and fermionic integrals into a unified integration concept, known as the Berezin integral (or Grassmann integral), in which one integrates functions of supervectors (vectors with both bosonic and fermionic components), and is of particular importance in the theory of supersymmetry in physics. (The prefix “super” in physics means, roughly speaking, that the object or concept that the prefix is attached to contains both bosonic and fermionic aspects.) When one applies this unified integration concept to gaussians, this can lead to quite compact and efficient calculations (provided that one is willing to work with “super”-analogues of various concepts in classical linear algebra, such as the supertrace or superdeterminant).

Abstract integrals of the flavour of (6) arose in quantum field theory, when physicists sought to formally compute integrals of the form

\displaystyle \int F( x_1, \ldots, x_n, \xi_1, \ldots, \xi_m )\ dx_1 \ldots dx_n d\xi_1 \ldots d\xi_m \ \ \ \ \ (14)

where {x_1,\ldots,x_n} are familiar commuting (or bosonic) variables (which, in particular, can often be localised to be scalar variables taking values in {{\bf R}} or {{\bf C}}), while {\xi_1,\ldots,\xi_m} were more exotic anticommuting (or fermionic) variables, taking values in some vector space of fermions. (As we shall see shortly, one can formalise these concepts by working in a supercommutative algebra.) The integrand {F(x_1,\ldots,x_n,\xi_1,\ldots,\xi_m)} was a formally analytic function of {x_1,\ldots,x_n,\xi_1,\ldots,\xi_m}, in that it could be expanded as a (formal, noncommutative) power series in the variables {x_1,\ldots,x_n,\xi_1,\ldots,\xi_m}. For functions {F(x_1,\ldots,x_n)} that depend only on bosonic variables, it is certainly possible for such analytic functions to be in the Schwartz class and thus fall under the scope of the classical integral, as discussed previously. However, functions {F(\xi_1,\ldots,\xi_m)} that depend on fermionic variables {\xi_1,\ldots,\xi_m} behave rather differently. Indeed, a fermonic variable {\xi} must anticommute with itself, so that {\xi^2 = 0}. In particular, any power series in {\xi} terminates after the linear term in {\xi}, so that a function {F(\xi)} can only be analytic in {\xi} if it is a polynomial of degree at most {1} in {\xi}; more generally, an analytic function {F(\xi_1,\ldots,\xi_m)} of {m} fermionic variables {\xi_1,\ldots,\xi_m} must be a polynomial of degree at most {m}, and an analytic function {F(x_1,\ldots,x_n,\xi_1,\ldots,\xi_m)} of {n} bosonic and {m} fermionic variables can be Schwartz in the bosonic variables but will be polynomial in the fermonic variables. As such, to interpret the integral (14), one can use classical (Lebesgue) integration (or the variants discussed above for integrating Schwartz entire or Schwartz analytic functions) for the bosonic variables, but must use abstract integrals such as (6) for the fermonic variables, leading to the concept of Berezin integration mentioned earlier.

In this post I would like to set out some of the basic algebraic formalism of Berezin integration, particularly with regards to integration of gaussian-type expressions, and then show how this formalism can be used to perform computations involving GUE (for instance, one can compute the density of states of GUE by this machinery without recourse to the theory of orthogonal polynomials). The use of supersymmetric gaussian integrals to analyse ensembles such as GUE appears in the work of Efetov (and was also proposed in the slightly earlier works of Parisi-Sourlas and McKane, with a related approach also appearing in the work of Wegner); the material here is adapted from this survey of Mirlin, as well as the later papers of Disertori-Pinson-Spencer and of Disertori.

Read the rest of this entry »

One of the basic problems in the field of operator algebras is to develop a functional calculus for either a single operator {A}, or a collection {A_1, A_2, \ldots, A_k} of operators. These operators could in principle act on any function space, but typically one either considers complex matrices (which act on a complex finite dimensional space), or operators (either bounded or unbounded) on a complex Hilbert space. (One can of course also obtain analogous results for real operators, but we will work throughout with complex operators in this post.)

Roughly speaking, a functional calculus is a way to assign an operator {F(A)} or {F(A_1,\ldots,A_k)} to any function {F} in a suitable function space, which is linear over the complex numbers, preserve the scalars (i.e. {c(A) = c} when {c \in {\bf C}}), and should be either an exact or approximate homomorphism in the sense that

\displaystyle  FG(A_1,\ldots,A_k) = F(A_1,\ldots,A_k) G(A_1,\ldots,A_k), \ \ \ \ \ (1)

should hold either exactly or approximately. In the case when the {A_i} are self-adjoint operators acting on a Hilbert space (or Hermitian matrices), one often also desires the identity

\displaystyle  \overline{F}(A_1,\ldots,A_k) = F(A_1,\ldots,A_k)^* \ \ \ \ \ (2)

to also hold either exactly or approximately. (Note that one cannot reasonably expect (1) and (2) to hold exactly for all {F,G} if the {A_1,\ldots,A_k} and their adjoints {A_1^*,\ldots,A_k^*} do not commute with each other, so in those cases one has to be willing to allow some error terms in the above wish list of properties of the calculus.) Ideally, one should also be able to relate the operator norm of {f(A)} or {f(A_1,\ldots,A_k)} with something like the uniform norm on {f}. In principle, the existence of a good functional calculus allows one to manipulate operators as if they were scalars (or at least approximately as if they were scalars), which is very helpful for a number of applications, such as partial differential equations, spectral theory, noncommutative probability, and semiclassical mechanics. A functional calculus for multiple operators {A_1,\ldots,A_k} can be particularly valuable as it allows one to treat {A_1,\ldots,A_k} as being exact or approximate scalars simultaneously. For instance, if one is trying to solve a linear differential equation that can (formally at least) be expressed in the form

\displaystyle  F(X,D) u = f

for some data {f}, unknown function {u}, some differential operators {X,D}, and some nice function {F}, then if one’s functional calculus is good enough (and {F} is suitably “elliptic” in the sense that it does not vanish or otherwise degenerate too often), one should be able to solve this equation either exactly or approximately by the formula

\displaystyle  u = F^{-1}(X,D) f,

which is of course how one would solve this equation if one pretended that the operators {X,D} were in fact scalars. Formalising this calculus rigorously leads to the theory of pseudodifferential operators, which allows one to (approximately) solve or at least simplify a much wider range of differential equations than one what can achieve with more elementary algebraic transformations (e.g. integrating factors, change of variables, variation of parameters, etc.). In quantum mechanics, a functional calculus that allows one to treat operators as if they were approximately scalar can be used to rigorously justify the correspondence principle in physics, namely that the predictions of quantum mechanics approximate that of classical mechanics in the semiclassical limit {\hbar \rightarrow 0}.

There is no universal functional calculus that works in all situations; the strongest functional calculi, which are close to being an exact *-homomorphisms on very large class of functions, tend to only work for under very restrictive hypotheses on {A} or {A_1,\ldots,A_k} (in particular, when {k > 1}, one needs the {A_1,\ldots,A_k} to commute either exactly, or very close to exactly), while there are weaker functional calculi which have fewer nice properties and only work for a very small class of functions, but can be applied to quite general operators {A} or {A_1,\ldots,A_k}. In some cases the functional calculus is only formal, in the sense that {f(A)} or {f(A_1,\ldots,A_k)} has to be interpreted as an infinite formal series that does not converge in a traditional sense. Also, when one wishes to select a functional calculus on non-commuting operators {A_1,\ldots,A_k}, there is a certain amount of non-uniqueness: one generally has a number of slightly different functional calculi to choose from, which generally have the same properties but differ in some minor technical details (particularly with regards to the behaviour of “lower order” components of the calculus). This is a similar to how one has a variety of slightly different coordinate systems available to parameterise a Riemannian manifold or Lie group. This is on contrast to the {k=1} case when the underlying operator {A = A_1} is (essentially) normal (so that {A} commutes with {A^*}); in this special case (which includes the important subcases when {A} is unitary or (essentially) self-adjoint), spectral theory gives us a canonical and very powerful functional calculus which can be used without further modification in applications.

Despite this lack of uniqueness, there is one standard choice for a functional calculus available for general operators {A_1,\ldots,A_k}, namely the Weyl functional calculus; it is analogous in some ways to normal coordinates for Riemannian manifolds, or exponential coordinates of the first kind for Lie groups, in that it treats lower order terms in a reasonably nice fashion. (But it is important to keep in mind that, like its analogues in Riemannian geometry or Lie theory, there will be some instances in which the Weyl calculus is not the optimal calculus to use for the application at hand.)

I decided to write some notes on the Weyl functional calculus (also known as Weyl quantisation), and to sketch the applications of this calculus both to the theory of pseudodifferential operators. They are mostly for my own benefit (so that I won’t have to redo these particular calculations again), but perhaps they will also be of interest to some readers here. (Of course, this material is also covered in many other places. e.g. Folland’s “harmonic analysis in phase space“.)

Read the rest of this entry »

Given a set {S}, a (simple) point process is a random subset {A} of {S}. (A non-simple point process would allow multiplicity; more formally, {A} is no longer a subset of {S}, but is a Radon measure on {S}, where we give {S} the structure of a locally compact Polish space, but I do not wish to dwell on these sorts of technical issues here.) Typically, {A} will be finite or countable, even when {S} is uncountable. Basic examples of point processes include

  • (Bernoulli point process) {S} is an at most countable set, {0 \leq p \leq 1} is a parameter, and {A} a random set such that the events {x \in A} for each {x \in S} are jointly independent and occur with a probability of {p} each. This process is automatically simple.
  • (Discrete Poisson point process) {S} is an at most countable space, {\lambda} is a measure on {S} (i.e. an assignment of a non-negative number {\lambda(\{x\})} to each {x \in S}), and {A} is a multiset where the multiplicity of {x} in {A} is a Poisson random variable with intensity {\lambda(\{x\})}, and the multiplicities of {x \in A} as {x} varies in {S} are jointly independent. This process is usually not simple.
  • (Continuous Poisson point process) {S} is a locally compact Polish space with a Radon measure {\lambda}, and for each {\Omega \subset S} of finite measure, the number of points {|A \cap \Omega|} that {A} contains inside {\Omega} is a Poisson random variable with intensity {\lambda(\Omega)}. Furthermore, if {\Omega_1,\ldots,\Omega_n} are disjoint sets, then the random variables {|A \cap \Omega_1|, \ldots, |A \cap \Omega_n|} are jointly independent. (The fact that Poisson processes exist at all requires a non-trivial amount of measure theory, and will not be discussed here.) This process is almost surely simple iff all points in {S} have measure zero.
  • (Spectral point processes) The spectrum of a random matrix is a point process in {{\mathbb C}} (or in {{\mathbb R}}, if the random matrix is Hermitian). If the spectrum is almost surely simple, then the point process is almost surely simple. In a similar spirit, the zeroes of a random polynomial are also a point process.

A remarkable fact is that many natural (simple) point processes are determinantal processes. Very roughly speaking, this means that there exists a positive semi-definite kernel {K: S \times S \rightarrow {\mathbb R}} such that, for any {x_1,\ldots,x_n \in S}, the probability that {x_1,\ldots,x_n} all lie in the random set {A} is proportional to the determinant {\det( (K(x_i,x_j))_{1 \leq i,j \leq n} )}. Examples of processes known to be determinantal include non-intersecting random walks, spectra of random matrix ensembles such as GUE, and zeroes of polynomials with gaussian coefficients.

I would be interested in finding a good explanation (even at the heuristic level) as to why determinantal processes are so prevalent in practice. I do have a very weak explanation, namely that determinantal processes obey a large number of rather pretty algebraic identities, and so it is plausible that any other process which has a very algebraic structure (in particular, any process involving gaussians, characteristic polynomials, etc.) would be connected in some way with determinantal processes. I’m not particularly satisfied with this explanation, but I thought I would at least describe some of these identities below to support this case. (This is partly for my own benefit, as I am trying to learn about these processes, particularly in connection with the spectral distribution of random matrices.) The material here is partly based on this survey of Hough, Krishnapur, Peres, and Virág.

Read the rest of this entry »

On Thursday, UCLA hosted a “Fields Medalist Symposium“, in which four of the six University of California-affiliated Fields Medalists (Vaughan Jones (1990), Efim Zelmanov (1994), Richard Borcherds (1998), and myself (2006)) gave talks of varying levels of technical sophistication. (The other two are Michael Freedman (1986) and Steven Smale (1966), who could not attend.) The slides for my own talks are available here.

The talks were in order of the year in which the medal was awarded: we began with Vaughan, who spoke on “Flatland: a great place to do algebra”, then Efim, who spoke on “Pro-finite groups”, Richard, who spoke on “What is a quantum field theory?”, and myself, on “Nilsequences and the primes.” The audience was quite mixed, ranging from mathematics faculty to undergraduates to alumni to curiosity seekers, and I severely doubt that every audience member understood every talk, but there was something for everyone, and for me personally it was fantastic to see some perspectives from first-class mathematicians on some wonderful areas of mathematics outside of my own fields of expertise.

Disclaimer: the summaries below are reconstructed from my notes and from some hasty web research; I don’t vouch for 100% accuracy of the mathematical content, and would welcome corrections.

Read the rest of this entry »

This problem lies in the highly interconnected interface between algebraic combinatorics (esp. the combinatorics of Young tableaux and related objects, including honeycombs and puzzles), algebraic geometry (particularly classical and quantum intersection theory and geometric invariant theory), linear algebra (additive and multiplicative, real and tropical), and the representation theory (classical, quantum, crystal, etc.) of classical groups. (Another open problem in this subject is to find a succinct and descriptive name for the field.) I myself haven’t actively worked in this area for several years, but I still find it a fascinating and beautiful subject. (With respect to the dichotomy between structure and randomness, this subject lies deep within the “structure” end of the spectrum.)

As mentioned above, the problems in this area can be approached from a variety of quite diverse perspectives, but here I will focus on the linear algebra perspective, which is perhaps the most accessible. About nine years ago, Allen Knutson and I introduced a combinatorial gadget, called a honeycomb, which among other things controlled the relationship between the eigenvalues of two arbitrary Hermitian matrices A, B, and the eigenvalues of their sum A+B; this was not the first such gadget that achieved this purpose, but it was a particularly convenient one for studying this problem, in particular it was used to resolve two conjectures in the subject, the saturation conjecture and the Horn conjecture. (These conjectures have since been proven by a variety of other methods.) There is a natural multiplicative version of these problems, which now relates the eigenvalues of two arbitrary unitary matrices U, V and the eigenvalues of their product UV; this led to the “quantum saturation” and “quantum Horn” conjectures, which were proven a couple years ago. However, the quantum analogue of a “honeycomb” remains a mystery; this is the main topic of the current post.

Read the rest of this entry »

Archives