The fundamental notions of calculus, namely differentiation and integration, are often viewed as being the quintessential concepts in mathematical analysis, as their standard definitions involve the concept of a limit. However, it is possible to capture most of the essence of these notions by purely algebraic means (almost completely avoiding the use of limits, Riemann sums, and similar devices), which turns out to be useful when trying to generalise these concepts to more abstract situations in which it becomes convenient to permit the underlying number systems involved to be something other than the real or complex numbers, even if this makes many standard analysis constructions unavailable. For instance, the algebraic notion of a derivation often serves as a substitute for the analytic notion of a derivative in such cases, by abstracting out the key algebraic properties of differentiation, namely linearity and the Leibniz rule (also known as the product rule).

Abstract algebraic analogues of integration are less well known, but can still be developed. To motivate such an abstraction, consider the integration functional {I: {\mathcal S}({\bf R} \rightarrow {\bf C}) \rightarrow {\bf C}} from the space {{\mathcal S}({\bf R} \rightarrow {\bf C})} of complex-valued Schwartz functions {f: {\bf R} \rightarrow {\bf C}} to the complex numbers, defined by

\displaystyle I(f) := \int_{\bf R} f(x)\ dx

where the integration on the right is the usual Lebesgue integral (or improper Riemann integral) from analysis. This functional obeys two obvious algebraic properties. Firstly, it is linear over {{\bf C}}, thus

\displaystyle I(cf) = c I(f) \ \ \ \ \ (1)

and

\displaystyle I(f+g) = I(f) + I(g) \ \ \ \ \ (2)

for all {f,g \in {\mathcal S}({\bf R} \rightarrow {\bf C})} and {c \in {\bf C}}. Secondly, it is translation invariant, thus

\displaystyle I(\tau_h f) = I(f) \ \ \ \ \ (3)

for all {h \in {\bf R}}, where {\tau_h f(x) := f(x-h)} is the translation of {f} by {h}. Motivated by the uniqueness theory of Haar measure, one might expect that these two axioms already uniquely determine {I} after one sets a normalisation, for instance by requiring that

\displaystyle I( x \mapsto e^{-\pi x^2} ) = 1. \ \ \ \ \ (4)

This is not quite true as stated (one can modify the proof of the Hahn-Banach theorem, after first applying a Fourier transform, to create pathological translation-invariant linear functionals on {{\mathcal S}({\bf R} \rightarrow {\bf C})} that are not multiples of the standard integration functional), but if one adds a mild analytical axiom, such as continuity of {I} (using the usual Schwartz topology on {{\mathcal S}({\bf R} \rightarrow {\bf C})}), then the above axioms are enough to uniquely pin down the notion of integration. Indeed, if {I: {\mathcal S}({\bf R} \rightarrow {\bf C}) \rightarrow {\bf C}} is a continuous linear functional that is translation invariant, then from the linearity and translation invariance axioms one has

\displaystyle I( \frac{\tau_h f - f}{h} ) = 0

for all {f \in {\mathcal S}({\bf R} \rightarrow {\bf C})} and non-zero reals {h}. If {f} is Schwartz, then as {h \rightarrow 0}, one can verify that the Newton quotients {\frac{\tau_h f - f}{h}} converge in the Schwartz topology to the derivative {f'} of {f}, so by the continuity axiom one has

\displaystyle I(f') = 0.

Next, note that any Schwartz function of integral zero has an antiderivative which is also Schwartz, and so {I} annihilates all zero-integral Schwartz functions, and thus must be a scalar multiple of the usual integration functional. Using the normalisation (4), we see that {I} must therefore be the usual integration functional, giving the claimed uniqueness.

Motivated by the above discussion, we can define the notion of an abstract integration functional {I: X \rightarrow R} taking values in some vector space {R}, and applied to inputs {f} in some other vector space {X} that enjoys a linear action {h \mapsto \tau_h} (the “translation action”) of some group {V}, as being a functional which is both linear and translation invariant, thus one has the axioms (1), (2), (3) for all {f,g \in X}, scalars {c}, and {h \in V}. The previous discussion then considered the special case when {R = {\bf C}}, {X = {\mathcal S}({\bf R} \rightarrow {\bf C})}, {V = {\bf R}}, and {\tau} was the usual translation action.

Once we have performed this abstraction, we can now present analogues of classical integration which bear very little analytic resemblance to the classical concept, but which still have much of the algebraic structure of integration. Consider for instance the situation in which we keep the complex range {R = {\bf C}}, the translation group {V = {\bf R}}, and the usual translation action {h \mapsto \tau_h}, but we replace the space {{\mathcal S}({\bf R} \rightarrow {\bf C})} of Schwartz functions by the space {Poly_{\leq d}({\bf R} \rightarrow {\bf C})} of polynomials {x \mapsto a_0 + a_1 x + \ldots + a_d x^d} of degree at most {d} with complex coefficients, where {d} is a fixed natural number; note that this space is translation invariant, so it makes sense to talk about an abstract integration functional {I: Poly_{\leq d}({\bf R} \rightarrow {\bf C}) \rightarrow {\bf C}}. Of course, one cannot apply traditional integration concepts to non-zero polynomials, as they are not absolutely integrable. But one can repeat the previous arguments to show that any abstract integration functional must annihilate derivatives of polynomials of degree at most {d}:

\displaystyle I(f') = 0 \hbox{ for all } f \in Poly_{\leq d}({\bf R} \rightarrow {\bf C}). \ \ \ \ \ (5)

Clearly, every polynomial of degree at most {d-1} is thus annihilated by {I}, which makes {I} a scalar multiple of the functional that extracts the top coefficient {a_d} of a polynomial, thus if one sets a normalisation

\displaystyle I( x \mapsto x^d ) = c

for some constant {c}, then one has

\displaystyle I( x \mapsto a_0 + a_1 x + \ldots + a_d x^d ) = c a_d \ \ \ \ \ (6)

for any polynomial {x \mapsto a_0 + a_1 x + \ldots + a_d x^d}. So we see that up to a normalising constant, the operation of extracting the top order coefficient of a polynomial of fixed degree serves as the analogue of integration. In particular, despite the fact that integration is supposed to be the “opposite” of differentiation (as indicated for instance by (5)), we see in this case that integration is basically ({d}-fold) differentiation; indeed, compare (6) with the identity

\displaystyle (\frac{d}{dx})^d ( a_0 + a_1 x + \ldots + a_d x^d ) = d! a_d.

In particular, we see, in contrast to the usual Lebesgue integral, the integration functional (6) can be localised to an arbitrary location: one only needs to know the germ of the polynomial {x \mapsto a_0 + a_1 x + \ldots + a_d x^d} at a single point {x_0} in order to determine the value of the functional (6). This localisation property may initially seem at odds with the translation invariance, but the two can be reconciled thanks to the extremely rigid nature of the class {Poly_{\leq d}({\bf R} \rightarrow {\bf C})}, in contrast to the Schwartz class {{\mathcal S}({\bf R} \rightarrow {\bf C})} which admits bump functions and so can generate local phenomena that can only be detected in small regions of the underlying spatial domain, and which therefore forces any translation-invariant integration functional on such function classes to measure the function at every single point in space.
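As a small sanity check of (5) and (6), here is a minimal Python sketch, assuming a polynomial in {Poly_{\leq d}} is stored as its coefficient list {[a_0, \ldots, a_d]} (a representation chosen purely for illustration; the helper names `translate`, `derivative` and `top_coefficient` are hypothetical). It confirms on one example that extracting the top coefficient is translation invariant and annihilates derivatives.

```python
from math import comb

def translate(coeffs, h):
    """Coefficients of x -> f(x - h), where f(x) = sum_k coeffs[k] * x**k."""
    out = [0] * len(coeffs)
    for k, a in enumerate(coeffs):
        # Expand a * (x - h)**k with the binomial theorem.
        for j in range(k + 1):
            out[j] += a * comb(k, j) * (-h) ** (k - j)
    return out

def derivative(coeffs):
    """Coefficients of f', padded with a zero so the list still has d + 1 entries."""
    return [k * a for k, a in enumerate(coeffs)][1:] + [0]

def top_coefficient(coeffs, c=1):
    """The functional (6): c times the coefficient of x**d."""
    return c * coeffs[-1]

f = [5, -3, 0, 2]           # 5 - 3x + 2x^3, viewed as an element of Poly_{<= 3}
shifted = translate(f, 7)   # x -> f(x - 7)

print(shifted)                               # lower coefficients change ...
print(top_coefficient(shifted))              # ... but the top one does not
print(top_coefficient(derivative(f)))        # derivatives are annihilated
```

Only the single number `coeffs[-1]` is ever read, which is the localisation phenomenon described above in miniature.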

The reversal of the relationship between integration and differentiation is also reflected in the fact that the abstract integration operation on polynomials interacts with the scaling operation {\delta_\lambda f(x) := f(x/\lambda)} in essentially the opposite way from the classical integration operation. Indeed, for classical integration on {{\bf R}^d}, one has

\displaystyle \int_{{\bf R}^d} f(x/\lambda)\ dx = \lambda^d \int f(x)\ dx

for Schwartz functions {f \in {\mathcal S}({\bf R}^d \rightarrow {\bf C})}, and so in this case the integration functional {I(f) := \int_{{\bf R}^d} f(x)\ dx} obeys the scaling law

\displaystyle I( \delta_\lambda f ) = \lambda^d I(f).

In contrast, the abstract integration operation defined in (6) obeys the opposite scaling law

\displaystyle I( \delta_\lambda f ) = \lambda^{-d} I(f). \ \ \ \ \ (7)
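The scaling law (7) can be checked by hand, or with a short sketch such as the following, which again assumes the coefficient-list representation of a polynomial (an illustrative choice, with hypothetical helper names) and uses exact rational arithmetic.

```python
from fractions import Fraction

def dilate(coeffs, lam):
    """Coefficients of x -> f(x / lam), where f(x) = sum_k coeffs[k] * x**k."""
    return [a / lam ** k for k, a in enumerate(coeffs)]

def top_coefficient(coeffs):
    """The functional (6) with normalisation c = 1."""
    return coeffs[-1]

d = 3
f = [Fraction(5), Fraction(-3), Fraction(0), Fraction(2)]
lam = Fraction(2)

lhs = top_coefficient(dilate(f, lam))      # I(delta_lambda f)
rhs = lam ** (-d) * top_coefficient(f)     # lambda^{-d} I(f)
print(lhs, rhs)
```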

Remark 1 One way to interpret what is going on is to view the integration operation (6) as a renormalised version of integration. A polynomial {x \mapsto a_0 + a_1 x + \ldots + a_d x^d} is, in general, not absolutely integrable, and the partial integrals

\displaystyle \int_0^N a_0 + a_1 x + \ldots + a_d x^d\ dx

diverge as {N \rightarrow \infty}. But if one renormalises these integrals by the factor {\frac{1}{N^{d+1}}}, then one recovers convergence,

\displaystyle \lim_{N \rightarrow \infty} \frac{1}{N^{d+1}} \int_0^N a_0 + a_1 x + \ldots + a_d x^d\ dx = \frac{1}{d+1} a_d

thus giving an interpretation of (6) as a renormalised classical integral, with the renormalisation being responsible for the unusual scaling relationship in (7). However, this interpretation is a little artificial, and it seems that it is best to view functionals such as (6) from an abstract algebraic perspective, rather than to try to force an analytic interpretation on them.
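To see the renormalised limit in Remark 1 concretely, one can evaluate the partial integrals exactly. The sketch below assumes the coefficient-list representation of a polynomial and a hypothetical helper `renormalised_integral`; it is an illustration of the limit, not a proof.

```python
from fractions import Fraction

def renormalised_integral(coeffs, N):
    """Exact value of N**-(d+1) * integral_0^N f(x) dx, f(x) = sum_k coeffs[k] * x**k."""
    d = len(coeffs) - 1
    # Integrate term by term: a_k x^k integrates to a_k N^(k+1) / (k+1).
    integral = sum(Fraction(a) * Fraction(N) ** (k + 1) / (k + 1)
                   for k, a in enumerate(coeffs))
    return integral / Fraction(N) ** (d + 1)

f = [5, -3, 0, 2]   # d = 3 and a_d = 2, so the limit should be a_d / (d + 1) = 1/2
values = [renormalised_integral(f, N) for N in (10, 1000, 100000)]
for v in values:
    print(float(v))
```

The lower-order coefficients contribute terms of size {O(1/N)} or smaller after renormalisation, which is why only {a_d} survives.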

Now we return to the classical Lebesgue integral

\displaystyle I(f) := \int_{\bf R} f(x)\ dx. \ \ \ \ \ (8)

As noted earlier, this integration functional has a translation invariance associated to translations along the real line {{\bf R}}, as well as a dilation invariance by real dilation parameters {\lambda>0}. However, if we refine the class {{\mathcal S}({\bf R} \rightarrow {\bf C})} of functions somewhat, we can obtain a stronger family of invariances, in which we allow complex translations and dilations. More precisely, let {\mathcal{SE}({\bf C} \rightarrow {\bf C})} denote the space of all functions {f: {\bf C} \rightarrow {\bf C}} which are entire (or equivalently, are given by a Taylor series with an infinite radius of convergence around the origin) and also admit rapid decay in a sectorial neighbourhood of the real line, or more precisely there exists an {\epsilon>0} such that for every {A > 0} there exists {C_A > 0} such that one has the bound

\displaystyle |f(z)| \leq C_A (1+|z|)^{-A}

whenever {|\hbox{Im}(z)| \leq A + \epsilon |\hbox{Re}(z)|}. For want of a better name, we shall call elements of this space Schwartz entire functions. This is clearly a complex vector space. Typical examples of Schwartz entire functions are the complex gaussians

\displaystyle f(z) := e^{-\pi (az^2 + 2bz + c)}

where {a,b,c} are complex numbers with {\hbox{Re}(a) > 0}. From the Cauchy integral formula (and its derivatives) we see that if {f} lies in {\mathcal{SE}({\bf C} \rightarrow {\bf C})}, then the restriction of {f} to the real line lies in {{\mathcal S}({\bf R} \rightarrow {\bf C})}; conversely, from analytic continuation we see that every function in {{\mathcal S}({\bf R} \rightarrow {\bf C})} has at most one extension in {\mathcal{SE}({\bf C} \rightarrow {\bf C})}. Thus one can identify {\mathcal{SE}({\bf C} \rightarrow {\bf C})} with a subspace of {{\mathcal S}({\bf R} \rightarrow {\bf C})}, and in particular the integration functional (8) is inherited by {\mathcal{SE}({\bf C} \rightarrow {\bf C})}, and by abuse of notation we denote the resulting functional {I: \mathcal{SE}({\bf C} \rightarrow {\bf C}) \rightarrow {\bf C}} as {I} also. Note, in analogy with the situation with polynomials, that this abstract integration functional is somewhat localised; one only needs to evaluate the function {f} on the real line, rather than the entire complex plane, in order to compute {I(f)}. This is consistent with the rigid nature of Schwartz entire functions, as one can uniquely recover the entire function from its values on the real line by analytic continuation.

Of course, the functional {I: \mathcal{SE}({\bf C} \rightarrow {\bf C}) \rightarrow {\bf C}} remains translation invariant with respect to real translation:

\displaystyle I(\tau_h f) = I(f) \hbox{ for all } h \in {\bf R}.

However, thanks to contour shifting, we now also have translation invariance with respect to complex translation:

\displaystyle I(\tau_h f) = I(f) \hbox{ for all } h \in {\bf C},

where of course we continue to define the translation operator {\tau_h} for complex {h} by the usual formula {\tau_h f(x) := f(x-h)}. In a similar vein, we also have the scaling law

\displaystyle I(\delta_\lambda f) = \lambda I(f)

for any {f \in \mathcal{SE}({\bf C} \rightarrow {\bf C})}, if {\lambda} is a complex number sufficiently close to {1} (where “sufficiently close” depends on {f}, and more precisely depends on the sectorial aperture parameter {\epsilon} associated to {f}); again, one can verify that {\delta_\lambda f} lies in {\mathcal{SE}({\bf C} \rightarrow {\bf C})} for {\lambda} sufficiently close to {1}. These invariances (which relocalise the integration functional {I} onto other contours than the real line {{\bf R}}) are very useful for computing integrals, and in particular for computing gaussian integrals. For instance, the complex translation invariance tells us (after shifting by {b/a}) that

\displaystyle I( z \mapsto e^{-\pi (az^2 + 2bz + c) } ) = e^{-\pi (c-b^2/a)} I( z \mapsto e^{-\pi a z^2} )

when {a,b,c \in {\bf C}} with {\hbox{Re}(a) > 0}, and then an application of the complex scaling law (and a continuity argument, observing that there is a compact path connecting {a} to {1} in the right half plane) gives

\displaystyle I( z \mapsto e^{-\pi (az^2 + 2bz + c) } ) = a^{-1/2} e^{-\pi (c-b^2/a)} I( z \mapsto e^{-\pi z^2} )

using the branch of {a^{-1/2}} on the right half-plane for which {1^{-1/2} = 1}. Using the normalisation (4) we thus have

\displaystyle I( z \mapsto e^{-\pi (az^2 + 2bz + c) } ) = a^{-1/2} e^{-\pi (c-b^2/a)}

giving the usual gaussian integral formula

\displaystyle \int_{\bf R} e^{-\pi (ax^2 + 2bx + c)}\ dx = a^{-1/2} e^{-\pi (c-b^2/a)}. \ \ \ \ \ (9)

This is a basic illustration of the power that a large symmetry group (in this case, the complex homothety group) can bring to bear on the task of computing integrals.
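The formula (9) is easy to test numerically. The following sketch compares a trapezoid-rule approximation of the left-hand side with the right-hand side for one arbitrarily chosen triple {a,b,c} with {\hbox{Re}(a) > 0}; the truncation length `L`, the step count, and the sample values are assumptions made for illustration only.

```python
import cmath

def gaussian_integral_numeric(a, b, c, L=10.0, steps=4000):
    """Trapezoid rule for the integral of exp(-pi (a x^2 + 2 b x + c)) over [-L, L]."""
    h = 2 * L / steps
    total = 0j
    for k in range(steps + 1):
        x = -L + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * cmath.exp(-cmath.pi * (a * x * x + 2 * b * x + c))
    return total * h

def gaussian_integral_formula(a, b, c):
    """Right-hand side of (9)."""
    # cmath.sqrt is the principal branch, which on Re(a) > 0 is the branch
    # of a^{1/2} taking the value 1 at a = 1.
    return cmath.exp(-cmath.pi * (c - b * b / a)) / cmath.sqrt(a)

a, b, c = 1.5 + 2.0j, 0.3 - 0.4j, 0.1 + 0.2j
numeric = gaussian_integral_numeric(a, b, c)
exact = gaussian_integral_formula(a, b, c)
print(numeric, exact)
```

The trapezoid rule is used here only because it converges very rapidly for gaussian integrands; any other quadrature would serve equally well.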

One can extend this sort of analysis to higher dimensions. For any natural number {n \geq 1}, let {\mathcal{SE}({\bf C}^n \rightarrow {\bf C})} denote the space of all functions {f: {\bf C}^n \rightarrow {\bf C}} which are jointly entire in the sense that {f(z_1,\ldots,z_n)} can be expressed as a Taylor series in {z_1,\ldots,z_n} which is absolutely convergent for all choices of {z_1,\ldots,z_n}, and such that there exists an {\epsilon > 0} such that for any {A>0} there is {C_A>0} for which one has the bound

\displaystyle |f(z)| \leq C_A (1+|z|)^{-A}

whenever {|\hbox{Im}(z_j)| \leq A + \epsilon |\hbox{Re}(z_j)|} for all {1 \leq j \leq n}, where {z = \begin{pmatrix} z_1 \\ \vdots \\ z_n \end{pmatrix}} and {|z| := (|z_1|^2+\ldots+|z_n|^2)^{1/2}}. Again, we call such functions Schwartz entire functions; a typical example is the function

\displaystyle f(z) := e^{-\pi (z^T A z + 2b^T z + c)}

where {A} is an {n \times n} complex symmetric matrix with positive definite real part, {b} is a vector in {{\bf C}^n}, and {c} is a complex number. We can then define an abstract integration functional {I: \mathcal{SE}({\bf C}^n \rightarrow {\bf C}) \rightarrow {\bf C}} by integration on the real slice {{\bf R}^n}:

\displaystyle I(f) := \int_{{\bf R}^n} f(x)\ dx

where {dx} is the usual Lebesgue measure on {{\bf R}^n}. By contour shifting in each of the {n} variables {z_1,\ldots,z_n} separately, we see that {I} is invariant with respect to complex translations of each of the {z_j} variables, and is thus invariant under translating the joint variable {z} by {{\bf C}^n}. One can also verify the scaling law

\displaystyle I(\delta_A f) = \hbox{det}(A) I(f)

for {n \times n} complex matrices {A} sufficiently close to the identity, where {\delta_A f(z) := f(A^{-1} z)}. This can be seen for shear transformations {A} by Fubini’s theorem and the aforementioned translation invariance, while for diagonal transformations near the identity this can be seen from {n} applications of the one-dimensional scaling law, and the general case then follows by composition. Among other things, these laws then easily lead to the higher-dimensional generalisation

\displaystyle \int_{{\bf R}^n} e^{-\pi (x^T A x + 2 b^T x + c)}\ dx = \hbox{det}(A)^{-1/2} e^{-\pi (c-b^T A^{-1} b)} \ \ \ \ \ (10)

whenever {A} is a complex symmetric matrix with positive definite real part, {b} is a vector in {{\bf C}^n}, and {c} is a complex number, basically by repeating the one-dimensional argument sketched earlier. Here, we choose the branch of {\hbox{det}(A)^{-1/2}} for all matrices {A} in the indicated class for which {\hbox{det}(1)^{-1/2} = 1}.
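As a hedged numerical illustration of (10) in the case {n=2}, the sketch below compares a Riemann sum over a grid with the closed form. The sample matrix, vector, grid size, and helper names are all assumptions chosen for illustration; for this particular {A} the determinant stays in the right half-plane along the straight line from the identity to {A}, so the principal square root gives the required branch.

```python
import cmath

def phase(A, b, c, x1, x2):
    """x^T A x + 2 b^T x + c for a 2x2 matrix A and x = (x1, x2)."""
    (a11, a12), (a21, a22) = A
    return (a11 * x1 * x1 + (a12 + a21) * x1 * x2 + a22 * x2 * x2
            + 2 * (b[0] * x1 + b[1] * x2) + c)

def gaussian_2d_numeric(A, b, c, L=6.0, steps=240):
    """Riemann sum of exp(-pi * phase) over the square [-L, L]^2."""
    h = 2 * L / steps
    total = 0j
    for i in range(steps + 1):
        for j in range(steps + 1):
            total += cmath.exp(-cmath.pi * phase(A, b, c, -L + i * h, -L + j * h))
    return total * h * h

def gaussian_2d_formula(A, b, c):
    """Right-hand side of (10) for n = 2, using the explicit 2x2 inverse."""
    (a11, a12), (a21, a22) = A
    det = a11 * a22 - a12 * a21
    b_Ainv_b = (a22 * b[0] ** 2 - (a12 + a21) * b[0] * b[1] + a11 * b[1] ** 2) / det
    # Principal branch; valid for this sample A (see the remark above).
    return cmath.exp(-cmath.pi * (c - b_Ainv_b)) / cmath.sqrt(det)

A = ((1 + 0.3j, 0.2), (0.2, 1.5 - 0.2j))   # symmetric, positive definite real part
b = (0.1 + 0.2j, -0.3j)
c = 0.05
numeric_2d = gaussian_2d_numeric(A, b, c)
exact_2d = gaussian_2d_formula(A, b, c)
print(numeric_2d, exact_2d)
```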

Now we turn to an integration functional suitable for computing complex gaussian integrals such as

\displaystyle \int_{{\bf C}^n} e^{-2\pi (z^\dagger A z + b^\dagger z + z^\dagger \tilde b + c)}\ dz d\overline{z}, \ \ \ \ \ (11)

where {z} is now a complex variable

\displaystyle z = \begin{pmatrix} z_1 \\ \vdots \\ z_n \end{pmatrix},

{z^\dagger} is the adjoint

\displaystyle z^\dagger := (\overline{z_1},\ldots, \overline{z_n}),

{A} is a complex {n \times n} matrix with positive definite Hermitian part, {b, \tilde b} are column vectors in {{\bf C}^n}, {c} is a complex number, and {dz d\overline{z} = \prod_{j=1}^n 2 d\hbox{Re}(z_j) d\hbox{Im}(z_j)} is {2^n} times Lebesgue measure on {{\bf C}^n}. (The factors of two here turn out to be a natural normalisation, but they can be ignored on a first reading.) As we shall see later, such integrals are relevant when performing computations on the Gaussian Unitary Ensemble (GUE) in random matrix theory. Note that the integrand here is not complex analytic due to the presence of the complex conjugates. However, this can be dealt with by the trick of replacing the complex conjugate {\overline{z}} by a variable {z^*} which is formally conjugate to {z}, but which is allowed to vary independently of {z}. More precisely, let {\mathcal{SA}({\bf C}^n \times {\bf C}^n \rightarrow {\bf C})} be the space of all functions {f: (z,z^*) \mapsto f(z,z^*)} of two independent {n}-tuples

\displaystyle z = \begin{pmatrix} z_1 \\ \vdots \\ z_n \end{pmatrix}, z^* = \begin{pmatrix} z_1^* \\ \vdots \\ z_n^* \end{pmatrix}

of complex variables, which are jointly entire in all {2n} variables (in the sense defined previously, i.e. there is a joint Taylor series that is absolutely convergent for all independent choices of {z, z^* \in {\bf C}^n}), and such that there is an {\epsilon>0} such that for every {A>0} there is {C_A>0} such that one has the bound

\displaystyle |f(z,z^*)| \leq C_A (1 + |z|)^{-A}

whenever {|z^* - \overline{z}| \leq A + \epsilon |z|}. We will call such functions Schwartz analytic. Note that the integrand in (11) is Schwartz analytic when {A} has positive definite Hermitian part, if we reinterpret {z^\dagger} as the transpose of {z^*} rather than as the adjoint of {z} in order to make the integrand entire in {z} and {z^*}. We can then define an abstract integration functional {I: \mathcal{SA}({\bf C}^n \times {\bf C}^n \rightarrow {\bf C}) \rightarrow {\bf C}} by the formula

\displaystyle I(f) := \int_{{\bf C}^n} f(z,\overline{z})\ dz d\overline{z}, \ \ \ \ \ (12)

thus {I} can be localised to the slice {\{ (z,\overline{z}): z \in {\bf C}^n\}} of {{\bf C}^n \times {\bf C}^n} (though, as with previous functionals, one can use contour shifting to relocalise {I} to other slices also.) One can also write this integral as

\displaystyle I(f) = 2^n \int_{{\bf R}^n \times {\bf R}^n} f(x+iy, x-iy)\ dx dy

and note that the integrand here is a Schwartz entire function on {{\bf C}^n \times {\bf C}^n}, thus linking the Schwartz analytic integral with the Schwartz entire integral. Using this connection, one can verify that this functional {I} is invariant with respect to translating {z} and {z^*} by independent shifts in {{\bf C}^n} (thus giving a {{\bf C}^n \times {\bf C}^n} translation symmetry), and one also has the independent dilation symmetry

\displaystyle I(\delta_{A,B} f) = \hbox{det}(A) \hbox{det}(B) I(f)

for {n \times n} complex matrices {A,B} that are sufficiently close to the identity, where {\delta_{A,B} f(z,z^*) := f(A^{-1} z, B^{-1} z^*)}. Arguing as before, we can then compute (11) as

\displaystyle \int_{{\bf C}^n} e^{-2\pi (z^\dagger A z + b^\dagger z + z^\dagger \tilde b + c)}\ dz d\overline{z} = \hbox{det}(A)^{-1} e^{-2\pi (c - b^\dagger A^{-1} \tilde b)}. \ \ \ \ \ (13)

In particular, this gives an integral representation for the determinant-reciprocal {\hbox{det}(A)^{-1}} of a complex {n \times n} matrix with positive definite Hermitian part, in terms of gaussian expressions in which {A} only appears linearly in the exponential:

\displaystyle \hbox{det}(A)^{-1} = \int_{{\bf C}^n} e^{-2\pi z^\dagger A z}\ dz d\overline{z}.
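In the scalar case {n=1}, the matrix {A} is a single complex number {a} with {\hbox{Re}(a) > 0}, and (13) can be tested directly. The following sketch (with arbitrarily chosen sample values, grid parameters, and hypothetical function names) approximates the left-hand side by a Riemann sum, using {dz d\overline{z} = 2\, d\hbox{Re}(z)\, d\hbox{Im}(z)}.

```python
import cmath

def complex_gaussian_numeric(a, b, bt, c, L=5.0, steps=200):
    """Riemann sum for the integral (11) with n = 1 over the square [-L, L]^2."""
    h = 2 * L / steps
    total = 0j
    for i in range(steps + 1):
        for j in range(steps + 1):
            z = complex(-L + i * h, -L + j * h)
            zbar = z.conjugate()
            # z^dagger A z + b^dagger z + z^dagger bt + c in the scalar case.
            exponent = a * zbar * z + b.conjugate() * z + zbar * bt + c
            total += cmath.exp(-2 * cmath.pi * exponent)
    return 2 * total * h * h   # the factor 2 comes from dz dzbar = 2 dx dy

def complex_gaussian_formula(a, b, bt, c):
    """Right-hand side of (13) with n = 1: det(A) = a, b^dagger A^{-1} bt = conj(b) bt / a."""
    return cmath.exp(-2 * cmath.pi * (c - b.conjugate() * bt / a)) / a

a, b, bt, c = 1.2 + 0.5j, 0.2 + 0.1j, -0.1 + 0.3j, 0.05j
numeric_c = complex_gaussian_numeric(a, b, bt, c)
exact_c = complex_gaussian_formula(a, b, bt, c)
print(numeric_c, exact_c)
```

Setting `b = bt = c = 0` in this sketch recovers the integral representation of the reciprocal {1/a}.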

This formula is then convenient for computing statistics such as

\displaystyle \mathop{\bf E} \hbox{det}(W_n-E-i\eta)^{-1}

for random matrices {W_n} drawn from the Gaussian Unitary Ensemble (GUE), and some choice of spectral parameter {E+i\eta} with {\eta>0}; we review this computation later in this post. By the trick of matrix differentiation of the determinant (as reviewed in this recent blog post), one can also use this method to compute matrix-valued statistics such as

\displaystyle \mathop{\bf E} \hbox{det}(W_n-E-i\eta)^{-1} (W_n-E-i\eta)^{-1}.

However, if one restricts attention to classical integrals over real or complex (and in particular, commuting or bosonic) variables, it does not seem possible to easily eradicate the negative determinant factors in such calculations, which is unfortunate because many statistics of interest in random matrix theory do not contain such factors, for instance the expected Stieltjes transform

\displaystyle \mathop{\bf E} \frac{1}{n} \hbox{tr} (W_n-E-i\eta)^{-1},

which is the Stieltjes transform of the density of states. However, it turns out (as I learned recently from Peter Sarnak and Tom Spencer) that it is possible to cancel out these negative determinant factors by balancing the bosonic gaussian integrals with an equal number of fermionic gaussian integrals, in which one integrates over a family of anticommuting variables. These fermionic integrals are closer in spirit to the polynomial integral (6) than to Lebesgue type integrals, and in particular obey a scaling law which is inverse to the Lebesgue scaling (in particular, a linear change of fermionic variables {\zeta \mapsto A \zeta} ends up transforming a fermionic integral by {\hbox{det}(A)} rather than {\hbox{det}(A)^{-1}}), which conveniently cancels out the reciprocal determinants in the previous calculations. Furthermore, one can combine the bosonic and fermionic integrals into a unified integration concept, known as the Berezin integral (or Grassmann integral), in which one integrates functions of supervectors (vectors with both bosonic and fermionic components), and is of particular importance in the theory of supersymmetry in physics. (The prefix “super” in physics means, roughly speaking, that the object or concept that the prefix is attached to contains both bosonic and fermionic aspects.) When one applies this unified integration concept to gaussians, this can lead to quite compact and efficient calculations (provided that one is willing to work with “super”-analogues of various concepts in classical linear algebra, such as the supertrace or superdeterminant).

Abstract integrals of the flavour of (6) arose in quantum field theory, when physicists sought to formally compute integrals of the form

\displaystyle \int F( x_1, \ldots, x_n, \xi_1, \ldots, \xi_m )\ dx_1 \ldots dx_n d\xi_1 \ldots d\xi_m \ \ \ \ \ (14)

where {x_1,\ldots,x_n} are familiar commuting (or bosonic) variables (which, in particular, can often be localised to be scalar variables taking values in {{\bf R}} or {{\bf C}}), while {\xi_1,\ldots,\xi_m} are more exotic anticommuting (or fermionic) variables, taking values in some vector space of fermions. (As we shall see shortly, one can formalise these concepts by working in a supercommutative algebra.) The integrand {F(x_1,\ldots,x_n,\xi_1,\ldots,\xi_m)} was a formally analytic function of {x_1,\ldots,x_n,\xi_1,\ldots,\xi_m}, in that it could be expanded as a (formal, noncommutative) power series in the variables {x_1,\ldots,x_n,\xi_1,\ldots,\xi_m}. For functions {F(x_1,\ldots,x_n)} that depend only on bosonic variables, it is certainly possible for such analytic functions to be in the Schwartz class and thus fall under the scope of the classical integral, as discussed previously. However, functions {F(\xi_1,\ldots,\xi_m)} that depend on fermionic variables {\xi_1,\ldots,\xi_m} behave rather differently. Indeed, a fermionic variable {\xi} must anticommute with itself, so that {\xi^2 = 0}. In particular, any power series in {\xi} terminates after the linear term in {\xi}, so that a function {F(\xi)} can only be analytic in {\xi} if it is a polynomial of degree at most {1} in {\xi}; more generally, an analytic function {F(\xi_1,\ldots,\xi_m)} of {m} fermionic variables {\xi_1,\ldots,\xi_m} must be a polynomial of degree at most {m}, and an analytic function {F(x_1,\ldots,x_n,\xi_1,\ldots,\xi_m)} of {n} bosonic and {m} fermionic variables can be Schwartz in the bosonic variables but will be polynomial in the fermionic variables. As such, to interpret the integral (14), one can use classical (Lebesgue) integration (or the variants discussed above for integrating Schwartz entire or Schwartz analytic functions) for the bosonic variables, but must use abstract integrals such as (6) for the fermionic variables, leading to the concept of Berezin integration mentioned earlier.

In this post I would like to set out some of the basic algebraic formalism of Berezin integration, particularly with regard to integration of gaussian-type expressions, and then show how this formalism can be used to perform computations involving GUE (for instance, one can compute the density of states of GUE by this machinery without recourse to the theory of orthogonal polynomials). The use of supersymmetric gaussian integrals to analyse ensembles such as GUE appears in the work of Efetov (and was also proposed in the slightly earlier works of Parisi-Sourlas and McKane, with a related approach also appearing in the work of Wegner); the material here is adapted from this survey of Mirlin, as well as the later papers of Disertori-Pinson-Spencer and of Disertori.

— 1. Grassmann algebra and Berezin integration —

Berezin integration can be performed on functions defined on the vectors in any supercommutative algebra, or even more generally on a supermanifold, but for the purposes of the applications to random matrix theory discussed here, we will only need to understand Berezin integration for analytic functions {F: \bigwedge(V)_b^n \times \bigwedge(V)_f^m \rightarrow \bigwedge(V)} of {n} bosonic variables and {m} fermionic variables.

We now set up the formal mathematical framework. We will need a space {V} of basic fermions, which can be taken to be any infinite-dimensional abstract complex vector space. The infinite dimensionality of {V} is convenient to avoid certain degeneracies; it may seem dangerous from an analysis perspective to integrate over such spaces, but as we will be performing integration from a purely algebraic viewpoint, this will not be a concern. (Indeed, one could avoid dealing with the individual elements of space {V} altogether, and work instead with certain rings of functions on {V} (thus treating {V} as a noncommutative scheme, rather than as a set of points), but we will not adopt this viewpoint here.)

We then form the {k}-fold exterior powers {\bigwedge^k(V)}, each of which is the universal complex vector space generated by the {k}-fold wedge products {\xi_1 \wedge \ldots \wedge \xi_k} of elements {\xi_1,\ldots,\xi_k} of {V}, subject to the requirement that the wedge product {\wedge} is bilinear, and also antisymmetric on elements of {V}. We then form the exterior algebra {\bigwedge(V) = \bigoplus_{k=0}^\infty \bigwedge^k(V)} of {V} as the direct sum of all these exterior powers. If one endows this algebra with the wedge product {\wedge}, one obtains a complex algebra, since the wedge product is bilinear and associative. By abuse of notation, we will write the wedge product {\Phi \wedge \Psi} simply as {\Phi \Psi}.

We split {\bigwedge(V) = \bigwedge(V)_b + \bigwedge(V)_f} into the space {\bigwedge(V)_b := \bigoplus_{k=0}^\infty \bigwedge^{2k}(V)} of bosons (arising from exterior powers of even order) and the space {\bigwedge(V)_f := \bigoplus_{k=0}^\infty \bigwedge^{2k+1}(V)} of fermions (exterior powers of odd order). Thus, for instance, complex scalars (which make up {\bigwedge^0(V)}) are bosons, while elements of {V} are fermions (i.e. basic fermions are fermions). We observe that the product of two bosons or two fermions is a boson, while the product of a boson and a fermion is a fermion, which gives {\bigwedge(V)} the structure of a superalgebra (i.e. {{\bf Z}_2}-graded algebra, with {\bigwedge(V)_b} and {\bigwedge(V)_f} being the {0} and {1} graded components).

Generally speaking, we will try to use Roman symbols such as {x,z} to denote bosons, and Greek symbols such as {\xi,\zeta} to denote fermions; we will also try to use capital Greek symbols (such as {\Phi, \Psi}) to denote combinations of bosons and fermions.

It is easy to verify (as can be done for instance by using a basis {\epsilon_1,\ldots,\epsilon_n} for {V}, with the attendant basis {\epsilon_{i_1} \ldots \epsilon_{i_k}}, {i_1 < \ldots < i_k} for {\bigwedge^k(V)}), that bosonic elements of {\bigwedge(V)} are central (they commute with both bosons and fermions), while fermionic elements of {\bigwedge(V)} commute with bosonic elements but anticommute with each other. (In other words, the superalgebra {\bigwedge(V)} is supercommutative.)
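The basis verification mentioned above can be mimicked on a computer. In the minimal sketch below, an element of the exterior algebra is stored as a dictionary from strictly increasing tuples of generator indices to coefficients (so `{(1, 3): 2}` stands for {2 \epsilon_1 \epsilon_3}); this representation and the helper names `wedge` and `negate` are assumptions made for illustration. The sign of each product of basis monomials is obtained by counting the transpositions needed to sort the indices.

```python
def wedge(p, q):
    """Product in the exterior algebra; elements are dicts {sorted index tuple: coeff}."""
    out = {}
    for s, a in p.items():
        for t, b in q.items():
            if set(s) & set(t):
                continue  # a repeated generator kills the term
            # s and t are each sorted, so sorting s + t needs one swap per
            # pair (i in s, j in t) with i > j.
            inversions = sum(1 for i in s for j in t if i > j)
            key = tuple(sorted(s + t))
            out[key] = out.get(key, 0) + (-1) ** inversions * a * b
    return {k: v for k, v in out.items() if v != 0}

def negate(p):
    return {k: -v for k, v in p.items()}

x = {(): 3, (1, 2): 1}            # boson: 3 + e1 e2 (even degrees only)
xi = {(1,): 1, (3,): 2}           # fermion: e1 + 2 e3 (odd degrees only)
zeta = {(2,): 1, (1, 3, 4): 1}    # fermion: e2 + e1 e3 e4

print(wedge(x, xi) == wedge(xi, x))                # bosons are central
print(wedge(xi, zeta) == negate(wedge(zeta, xi)))  # fermions anticommute
print(wedge(xi, xi))                               # a fermion squares to zero
```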

A fermionic element {\xi} will commute with all bosonic elements and anticommute with fermionic elements, which in particular implies that

\displaystyle \xi \bigwedge(V) = \bigwedge(V) \xi. \ \ \ \ \ (15)

One corollary of this (and the anticommutativity of {\xi} with itself) is that any product in {\bigwedge(V)} which contains two copies of {\xi} will necessarily vanish. Another corollary is that all elements {\Phi} in {\bigoplus_{k=1}^\infty \bigwedge^k(V)} are nilpotent, so that {\Phi^m=0} for some {m}. In particular, every element in {\bigwedge(V)} can be decomposed as the sum of a scalar and a nilpotent (in fact, this decomposition is unique). A further corollary is the fact that the algebra {\bigwedge(V)} is locally finite dimensional, in the sense that every finite collection of elements in {\bigwedge(V)} generates a finite dimensional subalgebra of {\bigwedge(V)}. Among other things, this implies that every element {\Phi} of {\bigwedge(V)} can be exponentiated by the usual power series

\displaystyle \exp(\Phi) := \sum_{k=0}^\infty \frac{\Phi^k}{k!}.

Thus, for instance, the exponential of a bosonic element is again a bosonic element, while the exponential of a fermion {\xi} is just a linear function, since {\xi} anticommutes with itself and thus squares to zero:

\displaystyle \exp(\xi) = 1+\xi.

As bosonic elements are central, we also see that we have the usual formula

\displaystyle \exp(x+\Phi) = \exp(x) \exp(\Phi) = \exp(\Phi) \exp(x)

whenever {x} is bosonic and {\Phi} is an arbitrary element of {\bigwedge(V)}.
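These facts about the exponential can be illustrated with a small sketch. As an assumption made purely for illustration, elements of {\bigwedge(V)} are stored as dictionaries from increasing index tuples to real coefficients, and the exponential is computed by splitting off the scalar part (which is bosonic, hence central) and summing the power series of the nilpotent remainder, which terminates after finitely many terms. The function names are hypothetical.

```python
import math

def wedge(p, q):
    """Product in the exterior algebra; elements are dicts {sorted index tuple: coeff}."""
    out = {}
    for s, a in p.items():
        for t, b in q.items():
            if set(s) & set(t):
                continue  # a repeated generator kills the term
            inversions = sum(1 for i in s for j in t if i > j)
            key = tuple(sorted(s + t))
            out[key] = out.get(key, 0) + (-1) ** inversions * a * b
    return {k: v for k, v in out.items() if v != 0}

def exp_nilpotent(n):
    """exp of an element with zero scalar part; the series stops once n**k = 0."""
    result = {(): 1.0}
    term = {(): 1.0}
    k = 0
    while term:
        k += 1
        term = {key: v / k for key, v in wedge(term, n).items()}  # n**k / k!
        for key, v in term.items():
            result[key] = result.get(key, 0) + v
    return result

def exp(phi):
    """exp(scalar + nilpotent) = exp(scalar) * exp(nilpotent), as scalars are central."""
    scalar = phi.get((), 0.0)
    nilpotent = {k: v for k, v in phi.items() if k != ()}
    return {k: math.exp(scalar) * v for k, v in exp_nilpotent(nilpotent).items()}

print(exp({(1,): 1.0}))                # exp(xi) = 1 + xi
print(exp({(1, 2): 1.0, (3, 4): 1.0})) # 1 + e1 e2 + e3 e4 + e1 e2 e3 e4
print(exp({(): 2.0, (1, 2): 1.0}))     # e^2 (1 + e1 e2)
```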

We now consider functions {F: \bigwedge(V)_b^n \times \bigwedge(V)_f^m \rightarrow \bigwedge(V)} of {n} bosonic variables {x_1,\ldots,x_n \in \bigwedge(V)_b} and {m} fermionic variables {\xi_1,\ldots,\xi_m \in \bigwedge(V)_f}. We will abbreviate {\bigwedge(V)_b^n \times \bigwedge(V)_f^m} as {\bigwedge(V)^{(n|m)}}, and write

\displaystyle x := \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}; \quad \xi := \begin{pmatrix} \xi_1 \\ \vdots \\ \xi_m \end{pmatrix}; \quad \Phi := \begin{pmatrix} x \\ \xi \end{pmatrix}.

We will restrict attention to functions {F} which are strongly analytic in the sense that they can be written as a strongly convergent noncommutative Taylor series in the variables {\Phi} with coefficients in {\bigwedge(V)}. By strong convergence, we mean that for any given choice of {\Phi \in \bigwedge(V)^{(n|m)}}, all of the terms in the Taylor series lie in a finite dimensional subspace of {\bigwedge(V)}, and the series is absolutely convergent in that finite dimensional subspace. (One could consider more relaxed notions of convergence (and thus of analyticity) here, but this strong notion of analyticity is already obeyed by the functions we will care about in applications, namely supercommutative gaussian functions with polynomial weights, so we will not need to consider more general classes of analytic functions here.)

Let {{\mathcal A}( \bigwedge(V)^{(n|m)} )} denote the space of strongly analytic functions from {\bigwedge(V)^{(n|m)}} to {\bigwedge(V)}. This is clearly a complex algebra, and contains all the polynomials in the variables {\Phi} with coefficients in {\bigwedge(V)}, as well as exponentials of such polynomials. It is also translation invariant in all of the variables {\Phi} (this is a variant of the basic fact in real analysis that if a Taylor series has infinite radius of convergence at the origin, then it is also equal to a Taylor series with infinite radius of convergence at any other point). On the other hand, by collecting terms in {\xi_i} for any {1 \leq i \leq m}, we see that any strongly analytic function {F} can be written in the form

\displaystyle F(x_1,\ldots,x_n,\xi_1,\ldots,\xi_m) = F^{(i)}_{\emptyset}(x_1,\ldots,x_n,\xi_1,\ldots,\xi_{i-1},\xi_{i+1},\ldots,\xi_m) + F^{(i)}_i(x_1,\ldots,x_n,\xi_1,\ldots,\xi_{i-1},\xi_{i+1},\ldots,\xi_m) \xi_i

for some strongly analytic functions {F^{(i)}_\emptyset, F^{(i)}_i: \bigwedge(V)^{(n|m-1)} \rightarrow \bigwedge(V)}. In fact, {F^{(i)}_\emptyset} and {F^{(i)}_i} are uniquely determined from {F}; {F^{(i)}_\emptyset(x_1,\ldots,x_n,\xi_1,\ldots,\xi_{i-1},\xi_{i+1},\ldots,\xi_m)} is necessarily equal to {F(x_1,\ldots,x_n,\xi_1,\ldots,\xi_{i-1},0,\xi_{i+1},\ldots,\xi_m)}, and if {F^{(i)}_i} were not unique, then on subtraction one could find a non-zero element {y \in \bigwedge(V)} with the property that {y \xi_i= 0} for all {\xi_i \in \bigwedge(V)_f}, which is not possible because {V} is infinite dimensional.

We then define the (one-dimensional) Berezin integral

\displaystyle \int F(x_1,\ldots,x_n,\xi_1,\ldots,\xi_m)\ d\xi_i

of a strongly analytic function {F \in {\mathcal A}(\bigwedge(V)^{(n|m)})} with respect to the {\xi_i} variable by the formula

\displaystyle \int F(x_1,\ldots,x_n,\xi_1,\ldots,\xi_m)\ d\xi_i

\displaystyle := \frac{1}{\sqrt{2\pi}} F^{(i)}_i(x_1,\ldots,x_n,\xi_1,\ldots,\xi_{i-1},\xi_{i+1},\ldots,\xi_m);

the normalisation factor {\frac{1}{\sqrt{2\pi}}} is convenient for gaussian integration calculations, as we shall see later, but can be ignored for now. This is a functional from {{\mathcal A}(\bigwedge(V)^{(n|m)})} to {{\mathcal A}(\bigwedge(V)^{(n|m-1)})}, which is an abstract integration functional in the sense discussed in the introduction, because the functional is invariant with respect to translations of the {\xi_i} variable by elements of {\bigwedge(V)_f}. It also obeys the scaling law

\displaystyle \int F(x_1,\ldots,x_n,\xi_1,\ldots,\xi_{i-1},\xi_i/\lambda, \xi_{i+1},\ldots,\xi_m)\ d\xi_i

\displaystyle = \lambda^{-1} \int F(x_1,\ldots,x_n,\xi_1,\ldots,\xi_{i-1},\xi_i, \xi_{i+1},\ldots,\xi_m)\ d\xi_i

for any invertible bosonic element {\lambda \in \bigwedge(V)_b}, as follows immediately from the definitions.

We can iterate the above integration operation. For instance, any {F \in {\mathcal A}(\bigwedge(V)^{(n|m)})} can be fully decomposed in terms of the fermionic variables as

\displaystyle F(x_1,\ldots,x_n,\xi_1,\ldots,\xi_m) = \sum_{1 \leq i_1 < \ldots < i_k \leq m} F^{(1,\ldots,m)}_{i_1,\ldots,i_k}(x_1,\ldots,x_n) \xi_{i_1} \ldots \xi_{i_k} \ \ \ \ \ (16)

where {F^{(1,\ldots,m)}_{i_1,\ldots,i_k}: \bigwedge(V)_b^n \rightarrow \bigwedge(V)} are strongly analytic functions of just the bosonic variables {x_1,\ldots,x_n}, and the sum ranges over all tuples {1 \leq i_1 < \ldots< i_k \leq m} with {0 \leq k \leq m}. We can then define the Berezin integral

\displaystyle \int F(x_1,\ldots,x_n,\xi_1,\ldots,\xi_m)\ d\xi

of a strongly analytic function {F \in {\mathcal A}(\bigwedge(V)^{(n|m)} )} over all the fermionic variables

\displaystyle \xi =\begin{pmatrix} \xi_1 \\ \vdots \\ \xi_m \end{pmatrix}

at once, by the formula

\displaystyle \int F(x_1,\ldots,x_n,\xi_1,\ldots,\xi_m)\ d\xi = \frac{1}{(2\pi)^{m/2}} F^{(1,\ldots,m)}_{1,\ldots,m}(x_1,\ldots,x_n).

This is an abstract integration functional from {{\mathcal A}(\bigwedge(V)^{(n|m)})} to {{\mathcal A}(\bigwedge(V)_b^n)} which is invariant under translations of {\xi} by {\bigwedge(V)_f^m}; it can also be viewed as the iteration of the one-dimensional integrations by the Fubini-type formula

\displaystyle \int F(x_1,\ldots,x_n,\xi_1,\ldots,\xi_m)\ d\xi

\displaystyle = \int \ldots \int F(x_1,\ldots,x_n,\xi_1,\ldots,\xi_m)\ d\xi_m \ldots d\xi_1

(note the reversal of the order of integration here). Much as fermions themselves anticommute with each other, one-dimensional Berezin integrals over fermionic variables also anticommute with each other, thus for instance

\displaystyle \int\int F(\xi_1,\xi_2)\ d\xi_1 d\xi_2 = - \int\int F(\xi_1,\xi_2)\ d\xi_2 d\xi_1

(compare with integration of differential forms, with {d\xi_1 \wedge d\xi_2 = - d\xi_2 \wedge d\xi_1}). One also verifies the scaling law

\displaystyle \int F(x, A^{-1} \xi)\ d\xi = \hbox{det}(A)^{-1} \int F(x,\xi)\ d\xi

for any invertible {m \times m} matrix {A} with bosonic entries, which can be verified for instance by first checking it in the case of diagonal matrices, permutation matrices, and shear matrices, and then observing that these generate all the other invertible matrices.
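Both the anticommutation of one-dimensional integrals and the determinant scaling law can be checked by hand on a small example with {m=2} and no bosonic variables. The Python sketch below does this in exact rational arithmetic. It makes several assumptions for illustration: the generators {\xi_1,\xi_2} are labelled {1,2}, elements are dictionaries from sorted label tuples to coefficients, the normalisation {\frac{1}{\sqrt{2\pi}}} is dropped (it appears equally on both sides of each identity), and the helper names `gmul`, `berezin`, `substitute` are hypothetical.

```python
from fractions import Fraction

def gmul(p, q):
    """Product of two Grassmann polynomials {sorted generator tuple: coeff}."""
    out = {}
    for a, ca in p.items():
        for b, cb in q.items():
            if set(a) & set(b):
                continue  # a repeated generator squares to zero
            merged = a + b
            inv = sum(1 for i in range(len(merged))
                      for j in range(i + 1, len(merged))
                      if merged[i] > merged[j])
            key = tuple(sorted(merged))
            out[key] = out.get(key, 0) + (-1) ** inv * ca * cb
    return {k: v for k, v in out.items() if v != 0}

def berezin(p, i):
    """F_i in the decomposition F = F_0 + F_i xi_i (factor 1/sqrt(2 pi) dropped)."""
    out = {}
    for mono, c in p.items():
        if i in mono:
            pos = mono.index(i)
            sign = (-1) ** (len(mono) - 1 - pos)  # move xi_i to the far right
            rest = mono[:pos] + mono[pos + 1:]
            out[rest] = out.get(rest, 0) + sign * c
    return {k: v for k, v in out.items() if v != 0}

def substitute(p, images):
    """Replace each generator i by the Grassmann polynomial images[i]."""
    out = {}
    for mono, c in p.items():
        term = {(): c}
        for i in mono:
            term = gmul(term, images[i])
        for k, v in term.items():
            out[k] = out.get(k, 0) + v
    return {k: v for k, v in out.items() if v != 0}

# F(xi_1, xi_2) = 3 + 2 xi_1 - xi_2 + 5 xi_1 xi_2
F = {(): Fraction(3), (1,): Fraction(2), (2,): Fraction(-1), (1, 2): Fraction(5)}

# The differential written next to F is integrated first.
first_1_then_2 = berezin(berezin(F, 1), 2)   # int int F d xi_1 d xi_2
first_2_then_1 = berezin(berezin(F, 2), 1)   # int int F d xi_2 d xi_1

# Scaling law with A = [[2, 1], [0, 3]], so det(A) = 6.
A_inv = [[Fraction(1, 2), Fraction(-1, 6)], [Fraction(0), Fraction(1, 3)]]
images = {i + 1: {(j + 1,): A_inv[i][j] for j in range(2) if A_inv[i][j] != 0}
          for i in range(2)}
G = substitute(F, images)                    # G(xi) = F(A^{-1} xi)
lhs = berezin(berezin(G, 2), 1)
rhs = berezin(berezin(F, 2), 1)
```

Only the top-degree coefficient {5} of {\xi_1 \xi_2} survives the double integration, and the substitution {\xi \mapsto A^{-1} \xi} multiplies it by {\hbox{det}(A)^{-1} = 1/6}.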

We can combine integration over fermionic variables with the more familiar integration over bosonic variables. We will focus attention on complex bosonic and fermionic integration rather than real bosonic and fermionic integration, as this will be the integration concept that is relevant for computations involving GUE. Thus, we will now consider strongly analytic functions {F \in {\mathcal A}(\bigwedge(V)^{(n|m)} \times \bigwedge(V)^{(n|m)})} of {2n} bosonic variables {z_1,\ldots,z_n,z^*_1,\ldots,z^*_n \in \bigwedge(V)_b} and {2m} fermionic variables {\zeta_1,\ldots,\zeta_m,\zeta_1^*,\ldots,\zeta_m^* \in \bigwedge(V)_f}. As previously discussed in the integration of Schwartz analytic functions, we allow the {z^*_i} variable to vary independently of the {z_i} variable despite formally being denoted as an adjoint to {z_i}, and similarly for {\zeta_j} and {\zeta_j^*}.

Observe that a strongly analytic function {F(z,z^*)} of purely bosonic variables will have all Taylor coefficients take values in a finite dimensional subspace {W} of {\bigwedge(V)} (otherwise it will not be strongly analytic for complex scalar non-zero {z_1,\ldots,z_n,z_1^*,\ldots,z_n^*}). In particular, if we restrict the bosonic variables {z_1,\ldots,z_n,z_1^*,\ldots,z_n^*} to be complex scalars, then {F(z_1,\ldots,z_n,z_1^*,\ldots,z_n^*)} takes values in this subspace {W}. We then say that {F} is Schwartz analytic if the restriction to {{\bf C}^n \times {\bf C}^n} lies in {\mathcal{SA}({\bf C}^n \times {\bf C}^n \rightarrow W)}, thus every component of this restriction lies in {\mathcal{SA}({\bf C}^n \times {\bf C}^n \rightarrow {\bf C})}. Note that this restriction to {{\bf C}^n \times {\bf C}^n} is sufficient to recover the values of {F} at all other values in {\bigwedge(V)_b^n \times \bigwedge(V)_b^n}, because one can read off all the Taylor coefficients of {F} from this restriction. We denote the space of such Schwartz analytic functions as {\mathcal{SA}(\bigwedge(V)_b^n \times \bigwedge(V)_b^n \rightarrow \bigwedge(V))}. We then use the functionals (12) to define Berezin integration on one or more pairs {z_i, z_i^*} of bosonic variables. For instance, the Berezin integral

\displaystyle \int F(z,z^*)\ dz_i^* dz_i

will, by definition, be the Lebesgue integral

\displaystyle \int_{\bf C} F(z_1,\ldots,z_{i-1},z_i,z_{i+1},\ldots,z_n,z_1^*,\ldots,z_{i-1}^*,\overline{z_i},z_{i+1}^*,\ldots,z_n^*)\ dz_i d\overline{z_i},

recalling that {dz_i d\overline{z_i}} is {2} times Lebesgue measure on the complex plane in the {z_i} variable, and similarly

\displaystyle \int F(z,z^*)\ dz^* dz

is the quantity

\displaystyle \int_{{\bf C}^n} F(z_1,\ldots,z_n,\overline{z_1},\ldots,\overline{z_n})\ \prod_{i=1}^n dz_i d\overline{z_i}.

One easily verifies that Berezin integration with respect to a single pair {z_i, z_i^*} of bosonic variables maps {\mathcal{SA}(\bigwedge(V)_b^n \times \bigwedge(V)_b^n \rightarrow \bigwedge(V))} to {\mathcal{SA}(\bigwedge(V)_b^{n-1} \times \bigwedge(V)_b^{n-1} \rightarrow \bigwedge(V))}, and integration with respect to all the bosonic variables {z, z^*} maps {\mathcal{SA}(\bigwedge(V)_b^n \times \bigwedge(V)_b^n \rightarrow \bigwedge(V))} to {\bigwedge(V)}.

As discussed in the introduction, a bosonic integral is invariant with respect to independent translations of the {z} and {z^*} by any complex shifts. It turns out that these integrals are in fact also invariant under independent translations of {z,z^*} by arbitrary bosonic shifts. For sake of notation we will just illustrate this in the {n=1} case. From the invariance under complex shifts we have

\displaystyle \int F(z + w,z^* + w^*)\ dz^* dz = \int F(z,z^*)\ dz^* dz

for any complex {w,w^* \in {\bf C}}. But both sides of this equation are entire in both variables {w,w^*}, so this identity must also hold on the level of (commutative) formal power series. Specialising {w,w^*} from formal variables to bosonic variables we obtain the claim. For similar reasons, we have the scaling law

\displaystyle \int F( A^{-1} z, B^{-1} z^*)\ dz dz^* = \hbox{det}(A) \hbox{det}(B) \int F(z,z^*)\ dz dz^*

for all invertible {n \times n} matrices {A, B} with bosonic entries and scalar part sufficiently close to the identity, because the claim was already shown to be true for complex entries, and both sides are analytic in {A,B}.

A function {F \in {\mathcal A}(\bigwedge(V)^{(n|m)} \times \bigwedge(V)^{(n|m)} \rightarrow \bigwedge(V))} of {n} bosonic and {m} fermionic variables {z,\zeta} and their formal adjoints {z^*,\zeta^*} will be called Schwartz analytic if each of its components under the decomposition (16) is Schwartz analytic, and the space of such functions will be denoted {\mathcal{SA}(\bigwedge(V)^{(n|m)} \times \bigwedge(V)^{(n|m)} \rightarrow \bigwedge(V))}. One can then perform Berezin integration with respect to a pair {z_i,z_i^*} of bosonic variables by integrating each term in (16) separately; this creates an integration functional from {\mathcal{SA}(\bigwedge(V)^{(n|m)} \times \bigwedge(V)^{(n|m)} \rightarrow \bigwedge(V))} to {\mathcal{SA}(\bigwedge(V)^{(n-1|m)} \times \bigwedge(V)^{(n-1|m)} \rightarrow \bigwedge(V))}. Similarly, one can integrate out all the bosonic variables at once, creating an integration functional from {\mathcal{SA}(\bigwedge(V)^{(n|m)} \times \bigwedge(V)^{(n|m)} \rightarrow \bigwedge(V))} to {\mathcal{SA}(\bigwedge(V)^{(0|m)} \times \bigwedge(V)^{(0|m)} \rightarrow \bigwedge(V))}. Meanwhile, fermionic integration in a pair {\zeta_i, \zeta_i^*} can be verified to map {\mathcal{SA}(\bigwedge(V)^{(n|m)} \times \bigwedge(V)^{(n|m)} \rightarrow \bigwedge(V))} to {\mathcal{SA}(\bigwedge(V)^{(n|m-1)} \times \bigwedge(V)^{(n|m-1)} \rightarrow \bigwedge(V))}, and integrating out all pairs at once leads to a functional from {\mathcal{SA}(\bigwedge(V)^{(n|m)} \times \bigwedge(V)^{(n|m)} \rightarrow \bigwedge(V))} to {\mathcal{SA}(\bigwedge(V)^{(n|0)} \times \bigwedge(V)^{(n|0)} \rightarrow \bigwedge(V))}. Finally, one can check that bosonic integration commutes with both fermionic and bosonic integration, and fermionic integration anticommutes with fermionic integration; in particular, integrating a pair of {\zeta_i, \zeta_i^*} is an operation that commutes with other such operations or with bosonic integration.
Because of this, one can now define the full Berezin integral

\displaystyle \int F(\Phi,\Phi^*)\ d\Phi^* d\Phi = \int F(z,z^*,\zeta,\zeta^*)\ dz^* dz d\zeta^* d\zeta

of a Schwartz analytic function {F \in \mathcal{SA}(\bigwedge(V)^{(n|m)} \times \bigwedge(V)^{(n|m)} \rightarrow \bigwedge(V))} by integrating out all the pairs {z_i, z_i^*} and {\zeta_i, \zeta_i^*} (with the order in which these pairs are integrated being irrelevant). This gives an integration functional from {\mathcal{SA}(\bigwedge(V)^{(n|m)} \times \bigwedge(V)^{(n|m)} \rightarrow \bigwedge(V))} to {\bigwedge(V)}. From the translation invariance properties of the individual bosonic and fermionic integrals, we see that this functional is invariant with respect to independent translations of {\Phi} and {\Phi^*} by elements of {\bigwedge(V)^{(n|m)}}.

Example 1 Take {n=m=1}. If {a,b} are bosons with the real scalar part of {a} being positive, then the gaussian function

\displaystyle (z,\zeta) \mapsto e^{-2\pi (z^* a z + \zeta^* b \zeta)}

can be expanded (using the nilpotent nature of {\zeta^* b \zeta}) as

\displaystyle e^{-2\pi z^* a z} (1 - 2\pi \zeta^* b \zeta )

or equivalently

\displaystyle e^{-2\pi z^* a z} + 2\pi b e^{-2\pi z^* a z} \zeta \zeta^*

and this is a Schwartz analytic function on {\bigwedge(V)^{(1|1)}}. Performing the bosonic integrals (using (13)) we then get

\displaystyle \int e^{-2\pi (z^* a z + \zeta^* b \zeta)}\ dz^* dz = a^{-1} + 2\pi a^{-1} b \zeta \zeta^*

and then on performing the fermionic integrals we obtain

\displaystyle \int e^{-2\pi (z^* a z + \zeta^* b \zeta)}\ dz^* dz d\zeta^* d\zeta = a^{-1} b.

If instead one performs the fermionic integral first, one obtains

\displaystyle \int e^{-2\pi (z^* a z + \zeta^* b \zeta)}\ d\zeta^* d\zeta = b e^{-2\pi z^* a z},

and then on performing the bosonic integrals one ends up at the same place:

\displaystyle \int e^{-2\pi (z^* a z + \zeta^* b \zeta)}\ d\zeta^* d\zeta dz^* dz = a^{-1} b.

Note how the parameters {a} and {b} scale in the opposite way in this integral.
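As a sanity check on the normalisations in Example 1, one can specialise {a,b} to complex scalars and evaluate both integrals numerically. The Python sketch below is an illustration under that assumption: the bosonic integral is computed by a radial trapezoid rule (using the fact that {dz d\overline{z}} is twice Lebesgue measure), the fermionic integral is read off from the expanded form {1 + 2\pi b \zeta \zeta^*} of the fermionic gaussian, and the labels {0,1} for {\zeta,\zeta^*} and the helper name `berezin` are hypothetical.

```python
import cmath
import math

NORM = 1 / math.sqrt(2 * math.pi)  # normalisation of a one-dimensional Berezin integral

def berezin(p, i):
    """(1/sqrt(2 pi)) F_i in the decomposition F = F_0 + F_i xi_i."""
    out = {}
    for mono, c in p.items():
        if i in mono:
            pos = mono.index(i)
            sign = (-1) ** (len(mono) - 1 - pos)  # move xi_i to the far right
            rest = mono[:pos] + mono[pos + 1:]
            out[rest] = out.get(rest, 0) + sign * c * NORM
    return out

a = 2 + 0.5j   # scalar boson with positive real part
b = 3 - 1j     # scalar boson

# Bosonic integral of exp(-2 pi a |z|^2) against dz dzbar = 2 * Lebesgue measure;
# in polar coordinates this is 4 pi * int_0^infinity r exp(-2 pi a r^2) dr.
h, R = 1e-4, 4.0
steps = int(R / h)
radial = sum(k * h * cmath.exp(-2 * math.pi * a * (k * h) ** 2)
             for k in range(1, steps)) * h
bosonic = 4 * math.pi * radial

# Fermionic gaussian exp(-2 pi zeta^* b zeta) = 1 + 2 pi b zeta zeta^*,
# with zeta labelled 0 and zeta^* labelled 1.
fermionic_gaussian = {(): 1.0, (0, 1): 2 * math.pi * b}
# d zeta^* is written next to the integrand, so it is integrated first.
fermionic = berezin(berezin(fermionic_gaussian, 1), 0)[()]

total = bosonic * fermionic
```

With these choices `bosonic` is close to {a^{-1}}, `fermionic` is close to {b}, and `total` is close to {a^{-1} b}, in agreement with the example.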

We now derive the general scaling law for Berezin integrals

\displaystyle \int F(\Phi,\Phi^*)\ d\Phi^* d\Phi

in which we scale {\Phi} by a matrix that suitably respects the bosonic and fermionic components of {\Phi}. More precisely, define an {(n|m) \times (n|m)} supermatrix to be an {(n+m) \times (n+m)} block matrix of the form

\displaystyle \Sigma = \begin{pmatrix} A & \sigma \\ \rho & B \end{pmatrix}

where {A = \Sigma_{bb}} is a {n \times n} matrix with bosonic entries, {\sigma = \Sigma_{bf}} is an {n \times m} matrix with fermionic entries, {\rho = \Sigma_{fb}} is an {m \times n} matrix with fermionic entries, and {B = \Sigma_{ff}} is an {m \times m} matrix with bosonic entries. Observe that if {\Phi} is an {(n|m)}-dimensional column supervector and {\Phi^\dagger} is an {(n|m)}-dimensional row supervector then

\displaystyle \Phi^\dagger \Sigma \Phi = z^\dagger A z + z^\dagger \sigma \zeta + \zeta^\dagger \rho z + \zeta^\dagger B \zeta.

Proposition 1 (Scaling law) Let {F \in \mathcal{SA}(\bigwedge(V)^{(n|m)} \times \bigwedge(V)^{(n|m)} \rightarrow\bigwedge(V))} be a Schwartz analytic function, and let {\Sigma} be an {(n|m) \times (n|m)} supermatrix. If the scalar part of {\Sigma} is sufficiently close to the identity (or equivalently, the scalar parts of {A,B} are sufficiently close to the identity), then we have

\displaystyle \int F( \Sigma^{-1} \Phi, \Phi^* )\ d\Phi^* d\Phi = \hbox{Sdet}(\Sigma) \int F( \Phi, \Phi^* )\ d\Phi^* d\Phi

where {\hbox{Sdet}(\Sigma)} is the superdeterminant (also known as the Berezinian) of {\Sigma}, defined by the formula

\displaystyle \hbox{Sdet}(\Sigma) = \hbox{det}( A - \sigma B^{-1} \rho ) \hbox{det}(B)^{-1}

(in particular, this quantity is bosonic).

The formula for the superdeterminant should be compared with the Schur complement formula

\displaystyle \hbox{det} \begin{pmatrix} A & S \\ R & B \end{pmatrix} = \hbox{det}( A - S B^{-1} R ) \hbox{det}(B)

for ordinary {(n+m) \times (n+m)} block matrices (which was discussed in this previous blog post).

Proof: When {\Sigma} is a block-diagonal matrix (so that {\sigma} and {\rho} vanish, and the superdeterminant simplifies to {\hbox{det}(A) \hbox{det}(B)^{-1}}), the claim follows from the separate scaling laws for bosonic and fermionic integration obtained previously. When {\Sigma} is a shear matrix (so that {A} and {B} are the identity, and one of {\sigma,\rho} vanishes, and the superdeterminant simplifies to {1}) the claim follows from the translation invariance of either the fermionic or bosonic integral (after performing these two integrals in a suitable order). For the general case, we use the factorisation

\displaystyle \Sigma = \begin{pmatrix} 1 & \sigma B^{-1} \\ 0 & 1 \end{pmatrix} \begin{pmatrix} A-\sigma B^{-1} \rho & 0 \\ 0 & B \end{pmatrix} \begin{pmatrix} 1 & 0 \\ B^{-1} \rho & 1 \end{pmatrix},

noting that the two shear matrices have superdeterminant {1}, while the block-diagonal matrix has the same superdeterminant as {\Sigma}, to deduce the general case from the two special cases previously mentioned. \Box

One consequence of this scaling law (and the nontrivial nature of the Berezin integral) is that one has the multiplication law

\displaystyle \hbox{Sdet}(\Sigma \Gamma) = \hbox{Sdet}(\Sigma) \hbox{Sdet}(\Gamma)

for any two supermatrices {\Sigma,\Gamma}, at least if their scalar parts are sufficiently close to the identity. This in turn implies that the superdeterminant is the multiplicative analogue of the supertrace

\displaystyle \hbox{Str}(\Sigma) := \hbox{tr}(A) - \hbox{tr}(B)

in the sense that

\displaystyle \hbox{Sdet}(\exp(\Sigma)) = \exp( \hbox{Str}(\Sigma))

for any supermatrix {\Sigma} (at least if its scalar part is sufficiently small). Note also that the supertrace obeys the expected cyclic property

\displaystyle \hbox{Str}(\Sigma \Gamma ) = \hbox{Str}(\Gamma \Sigma)

which can also be deduced from the previous identities by matrix differentiation, as indicated in this previous post.
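The multiplication law and the cyclic property are easy to test on a single instance in the simplest case {n=m=1}, where all four blocks are single elements of {\bigwedge(V)}. The Python sketch below is an illustration only, under the following assumptions: finitely many generators are labelled by integers, elements are dictionaries from sorted label tuples to rational coefficients, a {(1|1) \times (1|1)} supermatrix is stored as a tuple `(A, sigma, rho, B)`, and all function names are hypothetical. Bosonic elements are inverted by the finite geometric series for a scalar plus a nilpotent.

```python
from fractions import Fraction

def gmul(p, q):
    """Product of two Grassmann polynomials {sorted generator tuple: coeff}."""
    out = {}
    for a, ca in p.items():
        for b, cb in q.items():
            if set(a) & set(b):
                continue  # a repeated generator squares to zero
            merged = a + b
            inv = sum(1 for i in range(len(merged))
                      for j in range(i + 1, len(merged))
                      if merged[i] > merged[j])
            key = tuple(sorted(merged))
            out[key] = out.get(key, 0) + (-1) ** inv * ca * cb
    return {k: v for k, v in out.items() if v != 0}

def gadd(p, q, s=1):
    """p + s * q, dropping zero coefficients."""
    out = dict(p)
    for k, v in q.items():
        out[k] = out.get(k, 0) + s * v
    return {k: v for k, v in out.items() if v != 0}

def ginv(p):
    """Inverse of (non-zero scalar) + (nilpotent) by a finite geometric series."""
    s = p[()]
    n = {m: -c / s for m, c in p.items() if m != ()}   # -N/s
    result, term = {(): Fraction(1)}, {(): Fraction(1)}
    while term:
        term = gmul(term, n)
        result = gadd(result, term)
    return {m: c / s for m, c in result.items()}

def const(c):
    return {(): Fraction(c)}

def gen(i):
    return {(i,): Fraction(1)}

def smul(S, T):
    """Product of two (1|1) x (1|1) supermatrices stored as (A, sigma, rho, B)."""
    A, s, r, B = S
    C, t, u, D = T
    return (gadd(gmul(A, C), gmul(s, u)), gadd(gmul(A, t), gmul(s, D)),
            gadd(gmul(r, C), gmul(B, u)), gadd(gmul(r, t), gmul(B, D)))

def sdet(S):
    """Sdet = (A - sigma B^{-1} rho) B^{-1} for 1 x 1 blocks."""
    A, s, r, B = S
    Binv = ginv(B)
    return gmul(gadd(A, gmul(gmul(s, Binv), r), -1), Binv)

def strace(S):
    """Str = A - B for 1 x 1 blocks."""
    A, _, _, B = S
    return gadd(A, B, -1)

Sigma = (gadd(const(2), gmul(gen(1), gen(2))), gadd(gen(1), gen(3)),
         gen(2), gadd(const(3), gmul(gen(3), gen(4))))
Gamma = (gadd(const(1), gmul(gen(3), gen(4))), gen(2),
         gadd(gen(4), gen(1), -1), const(5))

sdet_of_product = sdet(smul(Sigma, Gamma))
product_of_sdets = gmul(sdet(Sigma), sdet(Gamma))
```

The scalar part of `sdet_of_product` is {\frac{2}{3} \cdot \frac{1}{5} = \frac{2}{15}}, the product of the ordinary ratios {\hbox{det}(A)/\hbox{det}(B)} of the scalar parts; the nilpotent corrections then have to match term by term.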

By repeating the derivation of (13) (reducing to integrals that are basically higher dimensional versions of Example 1), one has the Grassmann gaussian integral formula

\displaystyle \int e^{-2\pi (\Phi^\dagger \Sigma \Phi + \Psi^\dagger \Phi + \Phi^\dagger \tilde \Psi + c)}\ d\Phi^* d\Phi = \hbox{Sdet}(\Sigma)^{-1} e^{-2\pi (c - \Psi^\dagger \Sigma^{-1} \tilde \Psi)} \ \ \ \ \ (17)

whenever {\Sigma} is an {(n|m) \times (n|m)} supermatrix whose bosonic part {\Sigma_{bb}} has positive definite scalar Hermitian part, {\Psi, \tilde \Psi} are {(n|m)}-dimensional supervectors, and {c \in {\bf C}}, with {\Phi^\dagger} being the transpose of {\Phi^*}. In particular, one has

\displaystyle \int e^{-2\pi \Phi^\dagger \Sigma \Phi}\ d\Phi^* d\Phi = \hbox{Sdet}(\Sigma)^{-1}. \ \ \ \ \ (18)

We can isolate the bosonic and fermionic special cases of this identity, namely

\displaystyle \int e^{-2\pi z^\dagger A z}\ dz^* dz = \hbox{det}(A)^{-1} \ \ \ \ \ (19)

and

\displaystyle \int e^{-2\pi \zeta^\dagger B \zeta}\ d\zeta^* d\zeta = \hbox{det}(B) \ \ \ \ \ (20)

whenever {A}, {B} are {n \times n} and {m \times m} matrices with bosonic entries respectively. For comparison, we also observe the real fermionic analogue of these identities, namely

\displaystyle \int e^{-\pi \xi^T A \xi}\ d\xi = \hbox{Pf}(A)

where the Berezin integral is now over {m} fermionic variables {\xi_1,\ldots,\xi_m}, and {A} is an {m \times m} antisymmetric bosonic matrix, with {\hbox{Pf}(A)} being the Pfaffian of {A}. This can be seen by directly Taylor expanding {e^{-\pi \xi^T A \xi}} and isolating the {\xi_1 \ldots \xi_m} term. One can then develop a theory of superpfaffians in analogy to that of superdeterminants, which among other things may be helpful for manipulating the Gaussian Orthogonal Ensemble (GOE) (or at least the skew-symmetric analogue of GOE), but we will not do so here.
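The complex fermionic identity (20) can be confirmed by brute force when {m=2} and {B} has scalar entries, by expanding the exponential and extracting the top-degree coefficient. The following Python sketch does this in floating point. It is an illustration under stated assumptions: the variables {\zeta_1,\zeta_1^*,\zeta_2,\zeta_2^*} are labelled {0,1,2,3}, the matrix `B` is an arbitrary choice, and the helper names are hypothetical.

```python
import math

NORM = 1 / math.sqrt(2 * math.pi)

def gmul(p, q):
    """Product of two Grassmann polynomials {sorted generator tuple: coeff}."""
    out = {}
    for a, ca in p.items():
        for b, cb in q.items():
            if set(a) & set(b):
                continue  # a repeated generator squares to zero
            merged = a + b
            inv = sum(1 for i in range(len(merged))
                      for j in range(i + 1, len(merged))
                      if merged[i] > merged[j])
            key = tuple(sorted(merged))
            out[key] = out.get(key, 0) + (-1) ** inv * ca * cb
    return {k: v for k, v in out.items() if v != 0}

def gadd(p, q):
    out = dict(p)
    for k, v in q.items():
        out[k] = out.get(k, 0) + v
    return {k: v for k, v in out.items() if v != 0}

def gexp(p):
    """exp(p) for nilpotent p; terminates because high powers vanish identically."""
    result, term, k = {(): 1.0}, {(): 1.0}, 0
    while term:
        k += 1
        term = {m: c / k for m, c in gmul(term, p).items()}
        result = gadd(result, term)
    return result

def berezin(p, i):
    """(1/sqrt(2 pi)) F_i in the decomposition F = F_0 + F_i xi_i."""
    out = {}
    for mono, c in p.items():
        if i in mono:
            pos = mono.index(i)
            sign = (-1) ** (len(mono) - 1 - pos)
            rest = mono[:pos] + mono[pos + 1:]
            out[rest] = out.get(rest, 0) + sign * c * NORM
    return out

B = [[2.0, 1.0], [-1.0, 3.0]]          # det(B) = 7
zeta, zeta_star = [0, 2], [1, 3]       # labels of zeta_i and zeta_i^*

# exponent = -2 pi * sum_{i,j} zeta_i^* B_ij zeta_j
exponent = {}
for i in range(2):
    for j in range(2):
        term = gmul({(zeta_star[i],): -2 * math.pi * B[i][j]}, {(zeta[j],): 1.0})
        exponent = gadd(exponent, term)

result = gexp(exponent)
for i in range(2):
    result = berezin(result, zeta_star[i])   # d zeta_i^* is integrated first
    result = berezin(result, zeta[i])        # then d zeta_i
value = result[()]
```

The four factors of {\frac{1}{\sqrt{2\pi}}} cancel the factor {(2\pi)^2} produced by the quadratic term of the exponential, which is the sense in which this normalisation is convenient for gaussian integrals.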

As noted in this previous blog post, one can often start with an identity involving a determinant and apply matrix differentiation to obtain further useful identities. If we start with (18), replace {\Sigma} by an infinitesimal perturbation {\Sigma+\epsilon \Lambda} for an arbitrary {(n|m) \times (n|m)} supermatrix {\Lambda}, and extract the linear component in {\Lambda}, one arrives at the identity

\displaystyle \int e^{-2\pi \Phi^\dagger \Sigma \Phi} (-2\pi \Phi^\dagger \Lambda \Phi)\ d\Phi^* d\Phi = - \hbox{Sdet}(\Sigma)^{-1} \hbox{Str}(\Sigma^{-1} \Lambda).

In particular, if we set {\Lambda} to be the elementary matrix

\displaystyle \Lambda := \begin{pmatrix} e_k e_j^\dagger & 0 \\ 0 & 0 \end{pmatrix}

then we have

\displaystyle \frac{((\Sigma^{-1})_{bb})_{jk}}{\hbox{Sdet}(\Sigma)} = 2\pi \int e^{-2\pi \Phi^\dagger \Sigma \Phi} z^*_k z_j\ d\Phi^* d\Phi \ \ \ \ \ (21)

and in particular (if we have no fermionic elements, so that {\Sigma = A})

\displaystyle \frac{(A^{-1})_{jk}}{\hbox{det}(A)} = 2\pi \int_{{\bf C}^n} e^{-2\pi z^\dagger A z} z^*_k z_j\ dz d\overline{z} \ \ \ \ \ (22)

— 2. Application to GUE statistics —

We now use the above Gaussian integral identities to compute some GUE statistics. These statistics are initially rather complicated looking integrals over {O(n^2)} variables, but after some application of the above identities, we can cut the number of variables of integration down to {O(n)}, and by a further use of these gaussian identities we can reduce this number down to just {O(1)}, at which point it becomes feasible to obtain asymptotics for such integrals by techniques such as the method of steepest descent (also known as the saddle point method).

To illustrate this general phenomenon, we begin with a simple example which only requires classical (or bosonic) integration.

Proposition 2 Let {W_n} be a GUE matrix, thus {W_n = \frac{1}{\sqrt{n}} (\xi_{ij})_{1 \leq i,j \leq n}} where {\xi_{ij}} has the standard real normal distribution {N(0,1)_{\bf R}} (i.e. density function {\frac{1}{\sqrt{2\pi}} e^{-x^2/2}\ dx}) when {i=j}, and the standard complex normal distribution {N(0,1)_{\bf C}} (i.e. density function {\frac{1}{\pi} e^{-|z|^2}\ dz}) when {i \neq j}, with {\xi_{ij}} being jointly independent for {i \geq j}. Let {E_\eta := E+i\eta} be a complex number with positive imaginary part {\eta}. Then

\displaystyle {\bf E} \frac{1}{\hbox{det}(W_n - E_\eta)} = i^n \int_{{\bf C}^n} e^{2\pi i E_\eta |z|^2 - \frac{2\pi^2}{n} |z|^4}\ dz d\overline{z}. \ \ \ \ \ (23)

Proof: From (19) (or (13)) applied to {i(W_n-E_\eta)} (which has Hermitian part {\eta}) we have

\displaystyle \frac{1}{\hbox{det}(W_n-E_\eta)} = i^n \int_{{\bf C}^n} e^{-2\pi i z^\dagger (W_n-E_\eta) z}\ dz d\overline{z}

and so by Fubini’s theorem (which can be easily justified in view of all the exponential decay in the integrand) we have

\displaystyle {\bf E} \frac{1}{\hbox{det}(W_n-E_\eta)} = i^n \int_{{\bf C}^n} {\bf E} e^{-2\pi i z^\dagger (W_n-E_\eta) z}\ dz d\overline{z}

\displaystyle = i^n \int_{{\bf C}^n} e^{2\pi i E_\eta |z|^2} {\bf E} e^{-2\pi i z^\dagger W_n z}\ dz d\overline{z}.

Now observe that the top left {e_1^\dagger W_n e_1} coordinate of {W_n} is {\frac{1}{\sqrt{n}} \xi_{11}}, and {\xi_{11}} has the standard normal distribution {\frac{1}{\sqrt{2\pi}} e^{-x^2/2}}. Thus we have

\displaystyle {\bf E} e^{-t e_1^\dagger W_n e_1} = \frac{1}{\sqrt{2\pi}} \int_{\bf R} e^{-tx/\sqrt{n}} e^{-x^2/2}\ dx

\displaystyle = e^{t^2/2n}

for any {t \in {\bf C}} with non-negative real part, thanks to (9). By the unitary invariance of the GUE ensemble {W_n}, we thus have

\displaystyle \mathop{\bf E} e^{-2\pi i z^\dagger W_n z} = e^{-2\pi^2 |z|^4/n} \ \ \ \ \ (24)

since we can use that invariance to reduce to the case {z=te_1}, and the claim follows. \Box
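The one-dimensional gaussian computation at the heart of this proof is easy to confirm numerically. The Python sketch below is a sanity check under illustrative assumptions (the values of {n} and {|z|^2} are arbitrary choices, and the integral is truncated to a large interval): it evaluates {{\bf E} e^{-t \xi_{11}/\sqrt{n}}} by a Riemann sum for the purely imaginary value {t = 2\pi i |z|^2} that arises when {z} is a multiple of {e_1}, and compares it with the right-hand side of (24).

```python
import cmath
import math

n = 3
s = 0.7                      # s = |z|^2 for z a multiple of e_1
t = 2j * math.pi * s         # then 2 pi i z^dagger W_n z = t * e_1^dagger W_n e_1

# Riemann sum for (1/sqrt(2 pi)) * int exp(-t x / sqrt(n)) exp(-x^2 / 2) dx,
# truncated to [-L, L]; the gaussian tail beyond L is negligible.
h, L = 1e-3, 12.0
steps = int(2 * L / h)
lhs = sum(cmath.exp(-t * (-L + k * h) / math.sqrt(n) - (-L + k * h) ** 2 / 2)
          for k in range(steps + 1)) * h / math.sqrt(2 * math.pi)

rhs = cmath.exp(t * t / (2 * n))                        # exp(t^2 / 2n)
expected = math.exp(-2 * math.pi ** 2 * s ** 2 / n)     # right-hand side of (24)
```

Since {t^2 = -4\pi^2 |z|^4}, the quantity `rhs` coincides with `expected`, which is how the quartic exponent {-2\pi^2 |z|^4/n} arises from a single gaussian entry.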

The right-hand side of (23) is simpler than the left-hand side, as the integration is only over {{\bf C}^n} (as opposed to the {n^2}-dimensional space of Hermitian matrices), and there are no determinant factors. The integral can be simplified further by the following trick (known to physicists as the Hubbard-Stratonovich transformation). As the gaussian {e^{-\pi x^2}} is its own Fourier transform, we have

\displaystyle e^{-\pi x^2} = \int_{\bf R} e^{-\pi t^2} e^{2\pi i xt}\ dt \ \ \ \ \ (25)

for any {x \in {\bf R}} (and also for {x \in {\bf C}}, by analytic continuation). The point here is that a quadratic exponential in {x} can be replaced with a combination of linear exponentials in {x}. Applying this identity with {x} replaced by {\frac{\sqrt{2\pi} |z|^2}{\sqrt{n}}}, we conclude that

\displaystyle e^{-\frac{2\pi^2}{n} |z|^4} = \int_{\bf R} e^{-\pi t^2} e^{2\pi i \frac{\sqrt{2\pi}}{n^{1/2}} t |z|^2}\ dt, \ \ \ \ \ (26)

thus replacing a quartic exponential by a combination of quadratic ones. By Fubini’s theorem, the right-hand side of (23) can be written

\displaystyle i^n \int_{\bf R} e^{-\pi t^2} \int_{{\bf C}^n} e^{2\pi i E_\eta |z|^2} e^{2\pi i \frac{\sqrt{2\pi}}{n^{1/2}} t |z|^2}\ dz d\overline{z} dt.

Applying (10) one has

\displaystyle \int_{{\bf C}^n} e^{2\pi i E_\eta |z|^2} e^{2\pi i \frac{\sqrt{2\pi}}{n^{1/2}} t |z|^2}\ dz d\overline{z} = \frac{1}{(-iE_\eta - i \sqrt{\frac{2\pi}{n}} t)^n}

and so (23) simplifies further to

\displaystyle {\bf E} \frac{1}{\hbox{det}(W_n - E_\eta)} = (-1)^n \int_{\bf R} (E_\eta + \sqrt{\frac{2\pi}{n}} t)^{-n} e^{-\pi t^2}\ dt.

We thus see that the expression {{\bf E} \frac{1}{\hbox{det}(W_n - E_\eta)}} has now been reduced to a one-dimensional integral, which can be estimated by a variety of techniques, such as the method of steepest descent (also known as the saddle point method).
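To see the reduction in action, one can compare the final one-dimensional integral with a direct Monte Carlo simulation in the smallest nontrivial case {n=2}. The Python sketch below is an illustration under several assumptions: the value of {E_\eta}, the sample size, the random seed and the quadrature parameters are arbitrary choices, and the agreement is only up to Monte Carlo error. It also checks the Fourier identity (25) at one real value of {x}.

```python
import cmath
import math
import random

random.seed(0)
n = 2
E_eta = 0.3 + 1.0j        # positive imaginary part, so |det(W_n - E_eta)| >= 1

def sample_inverse_det():
    """1/det(W_2 - E_eta) for one sample of the 2 x 2 GUE matrix W_2."""
    x1, x2 = random.gauss(0, 1), random.gauss(0, 1)      # N(0,1)_R diagonal entries
    g = complex(random.gauss(0, math.sqrt(0.5)),
                random.gauss(0, math.sqrt(0.5)))         # N(0,1)_C off-diagonal entry
    d1 = x1 / math.sqrt(n) - E_eta
    d2 = x2 / math.sqrt(n) - E_eta
    return 1 / (d1 * d2 - abs(g) ** 2 / n)

N = 100000
monte_carlo = sum(sample_inverse_det() for _ in range(N)) / N

# (-1)^n * int (E_eta + sqrt(2 pi / n) t)^(-n) exp(-pi t^2) dt by a Riemann sum
h, L = 1e-3, 6.0
steps = int(2 * L / h)
c = math.sqrt(2 * math.pi / n)
ts = [-L + k * h for k in range(steps + 1)]
one_dim = (-1) ** n * sum((E_eta + c * t) ** (-n) * math.exp(-math.pi * t * t)
                          for t in ts) * h

# Fourier identity (25) at a sample point x
x = 0.8
fourier = sum(cmath.exp(-math.pi * t * t + 2j * math.pi * x * t) for t in ts) * h
```

The two quantities `monte_carlo` and `one_dim` should agree to within a few multiples of {N^{-1/2}}, while the one-dimensional integral is far cheaper to evaluate to high accuracy.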

The equation (22) allows one to manipulate components of an inverse {A^{-1}} of a matrix, so long as this component is weighted by a reciprocal determinant. For instance, applying it with {A = i(W_n-E_\eta)} (so that {A^{-1} = -i (W_n-E_\eta)^{-1}} and {\hbox{det}(A) = i^n \hbox{det}(W_n-E_\eta)}), we see that

\displaystyle \frac{1}{\hbox{det}(W_n-E_\eta)} ((W_n-E_\eta)^{-1})_{jk} = 2\pi i^{n+1} \int_{{\bf C}^n} e^{-2\pi i z^\dagger (W_n-E_\eta) z} z^*_k z_j\ dz d\overline{z} \ \ \ \ \ (27)

for any {1 \leq j,k \leq n}, and so by repeating the proof of Proposition 2 one has

\displaystyle {\bf E} \frac{1}{\hbox{det}(W_n - E_\eta)} ((W_n-E_\eta)^{-1})_{jk}

\displaystyle = 2\pi i^{n+1} \int_{{\bf C}^n} e^{2\pi i E_\eta |z|^2 - \frac{2\pi^2}{n} |z|^4} z^*_k z_j\ dz d\overline{z}.

We can use (26) to write the above expression as

\displaystyle 2\pi i^{n+1} \int_{\bf R} e^{-\pi t^2} \int_{{\bf C}^n} e^{2\pi i E_\eta |z|^2} e^{2\pi i \sqrt{\frac{2\pi}{n}} t |z|^2} z^*_k z_j\ dz d\overline{z} dt.

By (22) one has

\displaystyle \int_{{\bf C}^n} e^{2\pi i E_\eta |z|^2} e^{2\pi i \sqrt{\frac{2\pi}{n}} t |z|^2} z^*_k z_j\ dz d\overline{z}= \frac{1}{2\pi (-iE_\eta - i \sqrt{\frac{2\pi}{n}} t)^{n+1}} \delta_{jk}

where {\delta_{jk}} is the Kronecker delta function. We thus have

\displaystyle {\bf E} \frac{1}{\hbox{det}(W_n - E_\eta)} ((W_n-E_\eta)^{-1})_{jk}

\displaystyle = \delta_{jk} \int_{\bf R} (-E_\eta - \sqrt{\frac{2\pi}{n}} t)^{-n-1} e^{-\pi t^2}\ dt.

By introducing {n} fermionic variables {\zeta_1,\ldots,\zeta_n} and their formal adjoints {\zeta_1^*,\ldots,\zeta_n^*}, one can now eliminate the reciprocal determinant. Indeed, from (20) one has

\displaystyle \hbox{det}(W_n-E_\eta) = i^{-n} \int e^{-2\pi i \zeta^\dagger (W_n-E_\eta) \zeta}\ d\zeta^* d\zeta;

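combining this with (27) (or equivalently, applying (21) to the block-diagonal supermatrix with both diagonal blocks equal to {i(W_n-E_\eta)}, which has superdeterminant {1}) one has

\displaystyle ((W_n-E_\eta)^{-1})_{jk} = 2\pi i \int e^{-2\pi i \Phi^\dagger (W_n-E_\eta) \Phi} z^*_k z_j\ d\Phi^* d\Phi, \ \ \ \ \ (28)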

where {\Phi} consists of {n} bosonic and {n} fermionic variables, and we have abused notation by identifying the {n \times n} matrix {W_n - E_\eta} with the supermatrix

\displaystyle \begin{pmatrix} W_n - E_\eta & 0 \\ 0 & W_n - E_\eta \end{pmatrix}.

Now we compute the expectation

\displaystyle \mathop{\bf E} e^{-2\pi i \Phi^\dagger W_n \Phi}.

It is convenient here to realise the Wigner ensemble {W_n} as the Hermitian part {W_n = \frac{G+G^*}{\sqrt{2n}}} of a complex gaussian matrix (with entries being iid copies of {N(0,1)_{\bf C}}). Then the above expression is

\displaystyle \int e^{-2\pi i \Phi^\dagger (G +G^*) \Phi/ \sqrt{2n}} e^{-\hbox{tr}(GG^*)}\ dG d G^*

where {dG} is a suitable Haar measure on the space of complex matrices, normalised so that

\displaystyle \int e^{-\hbox{tr}(GG^*)}\ dG dG^* = 1.

We can expand this as

\displaystyle \int e^{-2\pi i \hbox{tr}( (G+G^*) ( z z^\dagger - \zeta \zeta^\dagger) )/\sqrt{2n} } e^{-\hbox{tr}(GG^*)}\ dG dG^*,

and rearrange this as

\displaystyle e^{-\frac{2\pi^2}{n} \hbox{tr}( ( z z^\dagger - \zeta \zeta^\dagger)^2 ) }

\displaystyle \int e^{-\hbox{tr}((G+ 2\pi i ( z z^\dagger - \zeta \zeta^\dagger) / \sqrt{2n}) (G^*+ 2\pi i ( z z^\dagger - \zeta \zeta^\dagger) / \sqrt{2n}))}\ dG dG^*.

By contour shifting {G} and {G^*} separately, we see that the integral here is still {1}. One can also compute that

\displaystyle \hbox{tr}( (z z^\dagger - \zeta \zeta^\dagger)^2 ) = (z^\dagger z)^2 - (\zeta^\dagger \zeta)^2 + 2(\zeta^\dagger z) (z^\dagger \zeta)

and so we can write

\displaystyle {\bf E} ((W_n-E_\eta)^{-1})_{jk} \ \ \ \ \ (29)

as

\displaystyle 2\pi i \int e^{2\pi i \Phi^\dagger E_\eta \Phi} e^{-\frac{2\pi^2}{n} ((z^\dagger z)^2 - (\zeta^\dagger \zeta)^2 + 2(\zeta^\dagger z) (z^\dagger \zeta))} z^*_k z_j\ d\Phi^* d\Phi. \ \ \ \ \ (30)

To simplify this {O(n)}-dimensional integral, we use the Hubbard-Stratonovich transformation. From (25) (which can be extended by analyticity from complex {x} to bosonic {x}) we have

\displaystyle e^{-\frac{2\pi^2}{n} (z^\dagger z)^2} = \int_{\bf R} e^{-\pi a^2} e^{-2\pi i a \sqrt{2\pi/n} z^\dagger z}\ da

and

\displaystyle e^{\frac{2\pi^2}{n} (\zeta^\dagger \zeta)^2} = \int_{\bf R} e^{-\pi b^2} e^{2\pi b \sqrt{2\pi/n} \zeta^\dagger \zeta}\ db

while from (17) (with {n=0,m=1}, {\Sigma=1}, {c=0}, {\Psi^\dagger = \sqrt{2\pi/n} i \zeta^\dagger z}, and {\tilde \Psi := \sqrt{2\pi/n} i z^\dagger \zeta}) one has

\displaystyle e^{-\frac{4\pi^2}{n} (\zeta^\dagger z) (z^\dagger \zeta)} = \int e^{-2\pi (\rho^* \rho + \sqrt{2\pi/n} i \zeta^\dagger z \rho + \sqrt{2\pi/n} i \rho^* z^\dagger \zeta)}\ d\rho^* d\rho

and so the expression (30) becomes

\displaystyle 2\pi i \int e^{2\pi i \Phi^\dagger E_\eta \Phi} e^{-\pi (a^2+b^2 + 2\rho^* \rho)} e^{-2\pi i \sqrt{2\pi/n} \Phi^\dagger R_n \Phi} z^*_k z_j\ d\Phi^* d\Phi da db d\rho^* d\rho

where {R_n} is the {(n|n) \times (n|n)} supermatrix

\displaystyle R_n := \begin{pmatrix} a I_n & \rho^* I_n \\ \rho I_n & ibI_n \end{pmatrix}

and {I_n} is the {n \times n} identity matrix.

One can perform the {\Phi, \Phi^*} integrals using (21) (applied to the supermatrix {-i(E_\eta - \sqrt{2\pi/n} R_n)}, whose inverse is {i(E_\eta - \sqrt{2\pi/n} R_n)^{-1}} and whose superdeterminant equals that of {E_\eta - \sqrt{2\pi/n} R_n}), yielding

\displaystyle - \int \frac{ (((E_\eta - \sqrt{2\pi/n} R_n)^{-1})_{bb})_{jk} }{\hbox{Sdet}(E_\eta - \sqrt{2\pi/n} R_n)} e^{-\pi (a^2+b^2 + 2\rho^* \rho)}\ da db d\rho^* d\rho.

This expression may look somewhat complicated, but there are now only four variables of integration (two bosonic and two fermionic), and this can be evaluated exactly by a tedious but straightforward computation; see this paper of Disertori for details. After evaluating the fermionic integrals and performing some rescalings, one eventually arrives at the exact expression

\displaystyle - \delta_{jk} \frac{n}{2\pi} \int_{{\bf R}^2} e^{-n(a^2+b^2)/2} \frac{(E_\eta - ib)^n}{(E_\eta-a)^{n+1}} (1 - \frac{n+1}{n} \frac{1}{(E_\eta-a)(E_\eta-ib)})\ da db

for (29), which can then be estimated by a (somewhat involved) application of the method of steepest descent; again, see the paper of Disertori for details.