The fundamental notions of calculus, namely differentiation and integration, are often viewed as being the quintessential concepts in mathematical analysis, as their standard definitions involve the concept of a limit. However, it is possible to capture most of the essence of these notions by purely algebraic means (almost completely avoiding the use of limits, Riemann sums, and similar devices), which turns out to be useful when trying to generalise these concepts to more abstract situations in which it becomes convenient to permit the underlying number systems involved to be something other than the real or complex numbers, even if this makes many standard analysis constructions unavailable. For instance, the algebraic notion of a derivation often serves as a substitute for the analytic notion of a derivative in such cases, by abstracting out the key algebraic properties of differentiation, namely linearity and the Leibniz rule (also known as the product rule).
Abstract algebraic analogues of integration are less well known, but can still be developed. To motivate such an abstraction, consider the integration functional $I: \mathcal{S}(\mathbf{R} \to \mathbf{C}) \to \mathbf{C}$ from the space $\mathcal{S}(\mathbf{R} \to \mathbf{C})$ of complex-valued Schwartz functions $f: \mathbf{R} \to \mathbf{C}$ to the complex numbers, defined by
$$ I(f) := \int_{\mathbf{R}} f(x)\, dx $$
where the integration on the right is the usual Lebesgue integral (or improper Riemann integral) from analysis. This functional obeys two obvious algebraic properties. Firstly, it is linear over $\mathbf{C}$, thus
$$ I(cf) = c I(f) \qquad (1) $$
and
$$ I(f+g) = I(f) + I(g) \qquad (2) $$
for all $f, g \in \mathcal{S}(\mathbf{R} \to \mathbf{C})$ and $c \in \mathbf{C}$. Secondly, it is translation invariant, thus
$$ I(\tau_h f) = I(f) \qquad (3) $$
for all $h \in \mathbf{R}$, where $\tau_h f(x) := f(x-h)$ is the translation of $f$ by $h$. Motivated by the uniqueness theory of Haar measure, one might expect that these two axioms already uniquely determine $I$ after one sets a normalisation, for instance by requiring that
$$ I( x \mapsto e^{-\pi x^2} ) = 1. \qquad (4) $$
This is not quite true as stated (one can modify the proof of the Hahn-Banach theorem, after first applying a Fourier transform, to create pathological translation-invariant linear functionals on $\mathcal{S}(\mathbf{R} \to \mathbf{C})$ that are not multiples of the standard integration functional), but if one adds a mild analytical axiom, such as continuity of $I$ (using the usual Schwartz topology on $\mathcal{S}(\mathbf{R} \to \mathbf{C})$), then the above axioms are enough to uniquely pin down the notion of integration. Indeed, if $I: \mathcal{S}(\mathbf{R} \to \mathbf{C}) \to \mathbf{C}$ is a continuous linear functional that is translation invariant, then from the linearity and translation invariance axioms one has
$$ I\left( \frac{\tau_{-h} f - f}{h} \right) = 0 $$
for all $f \in \mathcal{S}(\mathbf{R} \to \mathbf{C})$ and non-zero reals $h$. If $f$ is Schwartz, then as $h \to 0$, one can verify that the Newton quotients $\frac{\tau_{-h} f - f}{h}$ (that is, $x \mapsto \frac{f(x+h)-f(x)}{h}$) converge in the Schwartz topology to the derivative $f'$ of $f$, so by the continuity axiom one has
$$ I(f') = 0. \qquad (5) $$
Next, note that any Schwartz function of integral zero has an antiderivative which is also Schwartz, and so $I$ annihilates all zero-integral Schwartz functions, and thus must be a scalar multiple of the usual integration functional. Using the normalisation (4), we see that $I$ must therefore be the usual integration functional, giving the claimed uniqueness.
Motivated by the above discussion, we can define the notion of an abstract integration functional $I: V \to X$ taking values in some vector space $X$, and applied to inputs $f$ in some other vector space $V$ that enjoys a linear action $h \mapsto \tau_h$ (the “translation action”) of some group $G$, as being a functional which is both linear and translation invariant, thus one has the axioms (1), (2), (3) for all $f, g \in V$, scalars $c$, and $h \in G$. The previous discussion then considered the special case when $X = \mathbf{C}$, $V = \mathcal{S}(\mathbf{R} \to \mathbf{C})$, $G = \mathbf{R}$, and $\tau$ was the usual translation action.
Once we have performed this abstraction, we can now present analogues of classical integration which bear very little analytic resemblance to the classical concept, but which still have much of the algebraic structure of integration. Consider for instance the situation in which we keep the complex range $X = \mathbf{C}$, the translation group $G = \mathbf{R}$, and the usual translation action $h \mapsto \tau_h$, but we replace the space of Schwartz functions by the space $\mathrm{Poly}_{\leq d}(\mathbf{R} \to \mathbf{C})$ of polynomials $x \mapsto a_d x^d + \dots + a_0$ of degree at most $d$ with complex coefficients, where $d$ is a fixed natural number; note that this space is translation invariant, so it makes sense to talk about an abstract integration functional $I: \mathrm{Poly}_{\leq d}(\mathbf{R} \to \mathbf{C}) \to \mathbf{C}$. Of course, one cannot apply traditional integration concepts to non-zero polynomials, as they are not absolutely integrable. But one can repeat the previous arguments (continuity now being automatic, as the space is finite-dimensional) to show that any abstract integration functional must annihilate derivatives of polynomials of degree at most $d$:
$$ I(P') = 0 \hbox{ for all } P \in \mathrm{Poly}_{\leq d}(\mathbf{R} \to \mathbf{C}). $$
Clearly, every polynomial of degree at most $d-1$ is thus annihilated by $I$, which makes $I$ a scalar multiple of the functional that extracts the top coefficient $a_d$ of a polynomial, thus if one sets a normalisation
$$ I( x \mapsto x^d ) = c $$
for some constant $c$, then one has
$$ I( x \mapsto a_d x^d + a_{d-1} x^{d-1} + \dots + a_0 ) = c a_d \qquad (6) $$
for any polynomial $x \mapsto a_d x^d + a_{d-1} x^{d-1} + \dots + a_0$. So we see that up to a normalising constant, the operation of extracting the top order coefficient of a polynomial of fixed degree serves as the analogue of integration. In particular, despite the fact that integration is supposed to be the “opposite” of differentiation (as indicated for instance by (5)), we see in this case that integration is basically ($d$-fold) differentiation; indeed, compare (6) with the identity
$$ \left(\frac{d}{dx}\right)^d ( a_d x^d + a_{d-1} x^{d-1} + \dots + a_0 ) = d!\, a_d. $$
In particular, we see that, in contrast to the usual Lebesgue integral, the integration functional (6) can be localised to an arbitrary location: one only needs to know the germ of the polynomial $P$ at a single point $x_0$ in order to determine the value $I(P)$ of the functional (6). This localisation property may initially seem at odds with the translation invariance, but the two can be reconciled thanks to the extremely rigid nature of the class $\mathrm{Poly}_{\leq d}(\mathbf{R} \to \mathbf{C})$ of polynomials of degree at most $d$. This is in contrast to the Schwartz class $\mathcal{S}(\mathbf{R} \to \mathbf{C})$, which admits bump functions and so can generate local phenomena that can only be detected in small regions of the underlying spatial domain; this forces any translation-invariant integration functional on such function classes to measure the function at every single point in space.
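As a concrete illustration of the discussion above, here is a minimal sketch that represents a polynomial of degree at most $d$ by its list of coefficients and checks, in exact rational arithmetic, that the top-coefficient functional is translation invariant, annihilates derivatives, and agrees with $d$-fold differentiation up to the factor $d!$. The helper names (`translate`, `derivative`, `integrate_abstract`) and the sample polynomial are illustrative choices, not notation from the text.

```python
from fractions import Fraction
from math import comb, factorial

def translate(coeffs, h):
    """Coefficients of x -> P(x - h), where coeffs[k] is the coefficient of x^k."""
    out = [Fraction(0)] * len(coeffs)
    for k, a in enumerate(coeffs):
        # (x - h)^k = sum_j C(k, j) x^j (-h)^(k - j)
        for j in range(k + 1):
            out[j] += a * comb(k, j) * (-h) ** (k - j)
    return out

def derivative(coeffs):
    """Coefficients of P', padded with a zero so the list length is unchanged."""
    return [k * a for k, a in enumerate(coeffs)][1:] + [Fraction(0)]

def integrate_abstract(coeffs, c=Fraction(1)):
    """The functional (6): c times the coefficient of x^d, with d = len(coeffs) - 1."""
    return c * coeffs[-1]

d = 3
P = [Fraction(5), Fraction(-2), Fraction(0), Fraction(7)]   # 7x^3 - 2x + 5
h = Fraction(3, 2)

top_original = integrate_abstract(P)
top_translated = integrate_abstract(translate(P, h))     # translation invariance
top_of_derivative = integrate_abstract(derivative(P))    # derivatives are annihilated

# d-fold differentiation leaves the constant polynomial d! * a_d
Q = P
for _ in range(d):
    Q = derivative(Q)
dfold = Q[0]
```

Only the coefficient of $x^d$ ever enters the functional, which is the algebraic form of the localisation property: the same number can be read off from the Taylor expansion of the polynomial at any base point.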
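The reversal of the relationship between integration and differentiation is also reflected in the fact that the abstract integration operation on polynomials interacts with the scaling operation $\delta_\lambda f(x) := f(\lambda x)$ in essentially the opposite way from the classical integration operation. Indeed, for classical integration on $\mathbf{R}$, one has
$$ \int_{\mathbf{R}} f(\lambda x)\, dx = \frac{1}{|\lambda|} \int_{\mathbf{R}} f(x)\, dx $$
for Schwartz functions $f$ and non-zero reals $\lambda$, and so in this case the integration functional $I$ obeys the scaling law
$$ I( \delta_\lambda f ) = |\lambda|^{-1} I(f). $$
In contrast, the abstract integration operation defined in (6) obeys the opposite scaling law
$$ I( \delta_\lambda f ) = \lambda^d I(f). \qquad (7) $$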
Remark 1 One way to interpret what is going on is to view the integration operation (6) as a renormalised version of integration. A polynomial $P(x) = a_d x^d + \dots + a_0$ is, in general, not absolutely integrable, and the partial integrals
$$ \int_0^N P(x)\, dx $$
diverge as $N \to \infty$. But if one renormalises these integrals by the factor $\frac{1}{N^{d+1}}$, then one recovers convergence,
$$ \lim_{N \to \infty} \frac{1}{N^{d+1}} \int_0^N P(x)\, dx = \frac{1}{d+1} a_d, $$
thus giving an interpretation of (6) as a renormalised classical integral, with the renormalisation being responsible for the unusual scaling relationship in (7). However, this interpretation is a little artificial, and it seems that it is best to view functionals such as (6) from an abstract algebraic perspective, rather than to try to force an analytic interpretation on them.
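The scaling behaviour and the renormalisation in Remark 1 can both be checked by hand on a small example. The sketch below (with illustrative helper names and an arbitrarily chosen cubic, and the normalisation $c = 1$ assumed) verifies that dilating the variable by $\lambda$ multiplies the top coefficient by $\lambda^d$, and that the partial integrals over $[0,N]$, divided by $N^{d+1}$, approach $a_d/(d+1)$.

```python
from fractions import Fraction

def dilate(coeffs, lam):
    """Coefficients of x -> P(lam * x), where coeffs[k] is the coefficient of x^k."""
    return [a * lam ** k for k, a in enumerate(coeffs)]

def top_coefficient(coeffs):
    """The functional (6) with the normalisation c = 1 (an assumption of this sketch)."""
    return coeffs[-1]

def renormalised_partial_integral(coeffs, N):
    """(1 / N^(d+1)) times the integral of P over [0, N], computed exactly."""
    d = len(coeffs) - 1
    integral = sum(a * Fraction(N) ** (k + 1) / (k + 1) for k, a in enumerate(coeffs))
    return integral / Fraction(N) ** (d + 1)

P = [Fraction(5), Fraction(-2), Fraction(0), Fraction(7)]   # 7x^3 - 2x + 5, so d = 3
d = len(P) - 1
lam = Fraction(2)

scaled = top_coefficient(dilate(P, lam))     # should equal lam^d * a_d
limit_value = P[-1] / (d + 1)                # a_d / (d + 1)
gaps = [abs(renormalised_partial_integral(P, N) - limit_value) for N in (10, 100, 1000)]
```

The list `gaps` shrinks as $N$ grows, since the lower-order coefficients only contribute terms of size $O(1/N)$ or smaller after renormalisation.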
Now we return to the classical Lebesgue integral
$$ I(f) := \int_{\mathbf{R}} f(x)\, dx. \qquad (8) $$
As noted earlier, this integration functional has a translation invariance associated to translations along the real line $\mathbf{R}$, as well as a dilation invariance by real dilation parameters $\lambda \in \mathbf{R} \setminus \{0\}$. However, if we refine the class of functions somewhat, we can obtain a stronger family of invariances, in which we allow complex translations and dilations. More precisely, let $\mathcal{SE}(\mathbf{C} \to \mathbf{C})$ denote the space of all functions $f: \mathbf{C} \to \mathbf{C}$ which are entire (or equivalently, are given by a Taylor series with an infinite radius of convergence around the origin) and also admit rapid decay in a sectorial neighbourhood of the real line, or more precisely there exists an $\epsilon > 0$ such that for every $k > 0$ there exists $C_k > 0$ such that one has the bound
$$ |f(z)| \leq C_k (1+|z|)^{-k} $$
whenever $|\mathrm{Im}(z)| \leq k + \epsilon |\mathrm{Re}(z)|$. For want of a better name, we shall call elements of this space Schwartz entire functions. This is clearly a complex vector space. Typical examples of Schwartz entire functions are the complex gaussians
$$ z \mapsto e^{-\pi (a z^2 + 2 b z + c)} $$
where $a, b, c$ are complex numbers with $\mathrm{Re}(a) > 0$. From the Cauchy integral formula (and its derivatives) we see that if $f$ lies in $\mathcal{SE}(\mathbf{C} \to \mathbf{C})$, then the restriction of $f$ to the real line lies in $\mathcal{S}(\mathbf{R} \to \mathbf{C})$; conversely, from analytic continuation we see that every function in $\mathcal{S}(\mathbf{R} \to \mathbf{C})$ has at most one extension in $\mathcal{SE}(\mathbf{C} \to \mathbf{C})$. Thus one can identify $\mathcal{SE}(\mathbf{C} \to \mathbf{C})$ with a subspace of $\mathcal{S}(\mathbf{R} \to \mathbf{C})$, and in particular the integration functional (8) is inherited by $\mathcal{SE}(\mathbf{C} \to \mathbf{C})$, and by abuse of notation we denote the resulting functional $I: \mathcal{SE}(\mathbf{C} \to \mathbf{C}) \to \mathbf{C}$ as $I$ also. Note, in analogy with the situation with polynomials, that this abstract integration functional is somewhat localised; one only needs to evaluate the function $f$ on the real line, rather than the entire complex plane, in order to compute $I(f)$. This is consistent with the rigid nature of Schwartz entire functions, as one can uniquely recover the entire function from its values on the real line by analytic continuation.
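Of course, the functional $I: \mathcal{SE}(\mathbf{C} \to \mathbf{C}) \to \mathbf{C}$ remains translation invariant with respect to real translation:
$$ I(\tau_h f) = I(f) \hbox{ for all } h \in \mathbf{R}. $$
However, thanks to contour shifting, we now also have translation invariance with respect to complex translation:
$$ I(\tau_h f) = I(f) \hbox{ for all } h \in \mathbf{C}, $$
where of course we continue to define the translation operator $\tau_h$ for complex $h$ by the usual formula $\tau_h f(z) := f(z-h)$. In a similar vein, we also have the scaling law
$$ I(\delta_\lambda f) = \lambda^{-1} I(f) $$
for any $f \in \mathcal{SE}(\mathbf{C} \to \mathbf{C})$, where $\delta_\lambda f(z) := f(\lambda z)$, if $\lambda$ is a complex number sufficiently close to $1$ (where “sufficiently close” depends on $f$, and more precisely depends on the sectoral aperture parameter $\epsilon$ associated to $f$); again, one can verify that $\delta_\lambda f$ lies in $\mathcal{SE}(\mathbf{C} \to \mathbf{C})$ for $\lambda$ sufficiently close to $1$. These invariances (which relocalise the integration functional $I$ onto other contours than the real line $\mathbf{R}$) are very useful for computing integrals, and in particular for computing gaussian integrals. For instance, the complex translation invariance tells us (after shifting by $b/a$) that
$$ I( z \mapsto e^{-\pi (a z^2 + 2 b z + c)} ) = e^{-\pi (c - b^2/a)} I( z \mapsto e^{-\pi a z^2} ) $$
when $a, b, c \in \mathbf{C}$ with $\mathrm{Re}(a) > 0$, and then an application of the complex scaling law (and a continuity argument, observing that there is a compact path connecting $a$ to $1$ in the right half plane) gives
$$ I( z \mapsto e^{-\pi (a z^2 + 2 b z + c)} ) = a^{-1/2} e^{-\pi (c - b^2/a)} I( z \mapsto e^{-\pi z^2} ) $$
using the branch of $a^{-1/2}$ on the right half-plane for which $1^{-1/2} = 1$. Using the normalisation (4) we thus have
$$ I( z \mapsto e^{-\pi (a z^2 + 2 b z + c)} ) = a^{-1/2} e^{-\pi (c - b^2/a)} $$
giving the usual gaussian integral formula
$$ \int_{\mathbf{R}} e^{-\pi (a x^2 + 2 b x + c)}\, dx = a^{-1/2} e^{-\pi (c - b^2/a)}. $$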
This is a basic illustration of the power that a large symmetry group (in this case, the complex homothety group) can bring to bear on the task of computing integrals.
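The gaussian integral formula obtained from these symmetries is easy to test numerically. The following sketch assumes the standard closed form $\int_{\mathbf{R}} e^{-\pi(ax^2+2bx+c)}\, dx = a^{-1/2} e^{-\pi(c - b^2/a)}$ for $\mathrm{Re}(a) > 0$ and compares it against a plain trapezoid-rule approximation; the function names, the truncation window, the step size and the sample parameters are all arbitrary choices made for illustration.

```python
import cmath

def gaussian_integral_numeric(a, b, c, half_width=8.0, step=0.01):
    """Trapezoid-rule approximation of the integral of exp(-pi (a x^2 + 2 b x + c)) over the real line."""
    n = int(round(2 * half_width / step))
    total = 0j
    for k in range(n + 1):
        x = -half_width + k * step
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * cmath.exp(-cmath.pi * (a * x * x + 2 * b * x + c))
    return total * step

def gaussian_integral_formula(a, b, c):
    """a^(-1/2) exp(-pi (c - b^2 / a)), using the branch of a^(-1/2) that equals 1 at a = 1."""
    # cmath.sqrt is the principal branch, which is the required one when Re(a) > 0.
    return cmath.exp(-cmath.pi * (c - b * b / a)) / cmath.sqrt(a)

# Sample complex parameters with Re(a) > 0 (chosen arbitrarily for this check).
a, b, c = 1.0 + 2.0j, 0.3 - 0.4j, 0.1 + 0.2j
numeric = gaussian_integral_numeric(a, b, c)
exact = gaussian_integral_formula(a, b, c)
error = abs(numeric - exact)
```

The trapezoid rule is used here because it converges extremely quickly for rapidly decaying analytic integrands, so even this naive loop reproduces the closed form to high accuracy.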
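One can extend this sort of analysis to higher dimensions. For any natural number $n \geq 1$, let $\mathcal{SE}(\mathbf{C}^n \to \mathbf{C})$ denote the space of all functions $f: \mathbf{C}^n \to \mathbf{C}$ which are jointly entire in the sense that $f(z_1,\dots,z_n)$ can be expressed as a Taylor series in $z_1,\dots,z_n$ which is absolutely convergent for all choices of $z_1,\dots,z_n$, and such that there exists an $\epsilon > 0$ such that for any $k > 0$ there is $C_k > 0$ for which one has the bound
$$ |f(z)| \leq C_k \langle z \rangle^{-k} $$
whenever $|\mathrm{Im}(z_j)| \leq k + \epsilon |\mathrm{Re}(z_j)|$ for all $1 \leq j \leq n$, where $z = (z_1,\dots,z_n)^T$ and $\langle z \rangle := (1 + |z_1|^2 + \dots + |z_n|^2)^{1/2}$. Again, we call such functions Schwartz entire functions; a typical example is the function
$$ z \mapsto e^{-\pi (z^T A z + 2 b^T z + c)} $$
where $A$ is an $n \times n$ complex symmetric matrix with positive definite real part, $b$ is a vector in $\mathbf{C}^n$, and $c$ is a complex number. We can then define an abstract integration functional $I: \mathcal{SE}(\mathbf{C}^n \to \mathbf{C}) \to \mathbf{C}$ by integration on the real slice $\mathbf{R}^n$:
$$ I(f) := \int_{\mathbf{R}^n} f(x)\, dx $$
where $dx$ is the usual Lebesgue measure on $\mathbf{R}^n$. By contour shifting in each of the $n$ variables $z_1,\dots,z_n$ separately, we see that $I$ is invariant with respect to complex translations of each of the $z_j$ variables, and is thus invariant under translating the joint variable $z$ by $\mathbf{C}^n$. One can also verify the scaling law
$$ I(\delta_A f) = \det(A)^{-1} I(f) $$
for $n \times n$ complex matrices $A$ sufficiently close to the identity, where $\delta_A f(z) := f(Az)$. This can be seen for shear transformations by Fubini’s theorem and the aforementioned translation invariance, while for diagonal transformations near the identity this can be seen from $n$ applications of the one-dimensional scaling law, and the general case then follows by composition. Among other things, these laws then easily lead to the higher-dimensional generalisation
$$ \int_{\mathbf{R}^n} e^{-\pi (x^T A x + 2 b^T x + c)}\, dx = \det(A)^{-1/2} e^{-\pi (c - b^T A^{-1} b)} $$
whenever $A$ is a complex symmetric matrix with positive definite real part, $b$ is a vector in $\mathbf{C}^n$, and $c$ is a complex number, basically by repeating the one-dimensional argument sketched earlier. Here, we choose the branch of $\det(A)^{-1/2}$ for all matrices $A$ in the indicated class for which $\det(1)^{-1/2} = 1$.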
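For a quick sanity check of the higher-dimensional formula, one can take $n = 2$ and compare a grid approximation of the integral over $\mathbf{R}^2$ with the closed form $\det(A)^{-1/2} e^{-\pi(c - b^T A^{-1} b)}$, which is the form assumed in this sketch. The matrix, vector and constant below are arbitrary sample values. The principal square root is used for $\det(A)^{-1/2}$; that happens to be the correct branch for this particular $A$ (the determinant stays in the right half-plane along the straight line from the identity to $A$), but it should not be relied on for general matrices in the class.

```python
import cmath

def integrand(A, b, c, x, y):
    """exp(-pi (z^T A z + 2 b^T z + c)) at the real point z = (x, y)."""
    quad = A[0][0] * x * x + (A[0][1] + A[1][0]) * x * y + A[1][1] * y * y
    lin = 2 * (b[0] * x + b[1] * y)
    return cmath.exp(-cmath.pi * (quad + lin + c))

def numeric_integral(A, b, c, half_width=5.0, step=0.1):
    """Riemann sum on a square grid; the integrand is negligible at the boundary of the window."""
    n = int(round(2 * half_width / step))
    pts = [-half_width + k * step for k in range(n + 1)]
    return sum(integrand(A, b, c, x, y) for x in pts for y in pts) * step * step

def closed_form(A, b, c):
    """det(A)^(-1/2) exp(-pi (c - b^T A^{-1} b)) for a 2 x 2 matrix A (principal square root)."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    # b^T A^{-1} b computed from the explicit 2 x 2 inverse
    bAb = (A[1][1] * b[0] * b[0] - (A[0][1] + A[1][0]) * b[0] * b[1] + A[0][0] * b[1] * b[1]) / det
    return cmath.exp(-cmath.pi * (c - bAb)) / cmath.sqrt(det)

# Sample data: complex symmetric A with positive definite real part.
A = [[1.0 + 0.3j, 0.2], [0.2, 1.5 - 0.2j]]
b = [0.1 + 0.2j, -0.3j]
c = 0.05

error = abs(numeric_integral(A, b, c) - closed_form(A, b, c))
```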
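Now we turn to an integration functional suitable for computing complex gaussian integrals such as
$$ \int_{\mathbf{C}^n} e^{-2\pi (z^\dagger A z + b^\dagger z + z^\dagger \tilde b + c)}\, dz\, d\overline{z} \qquad (11) $$
where $z$ is now a complex variable
$$ z = (z_1,\dots,z_n)^T, $$
$z^\dagger$ is the adjoint
$$ z^\dagger := (\overline{z_1},\dots,\overline{z_n}), $$
$A$ is a complex $n \times n$ matrix with positive definite Hermitian part, $b, \tilde b$ are column vectors in $\mathbf{C}^n$, $c$ is a complex number, and $dz\, d\overline{z} := \prod_{j=1}^n 2\, d\mathrm{Re}(z_j)\, d\mathrm{Im}(z_j)$ is $2^n$ times Lebesgue measure on $\mathbf{C}^n$. (The factors of two here turn out to be a natural normalisation, but they can be ignored on a first reading.) As we shall see later, such integrals are relevant when performing computations on the Gaussian Unitary Ensemble (GUE) in random matrix theory. Note that the integrand here is not complex analytic due to the presence of the complex conjugates. However, this can be dealt with by the trick of replacing the complex conjugate $\overline{z}$ by a variable $z^*$ which is formally conjugate to $z$, but which is allowed to vary independently of $z$. More precisely, let $\mathcal{SA}(\mathbf{C}^n \times \mathbf{C}^n \to \mathbf{C})$ be the space of all functions $f: (z, z^*) \mapsto f(z, z^*)$ of two independent $n$-tuples
$$ z = (z_1,\dots,z_n)^T, \quad z^* = (z_1^*,\dots,z_n^*)^T $$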
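of complex variables, which are jointly entire in all $2n$ variables (in the sense defined previously, i.e. there is a joint Taylor series that is absolutely convergent for all independent choices of $z, z^* \in \mathbf{C}^n$), and such that there is an $\epsilon > 0$ such that for every $k > 0$ there is $C_k > 0$ such that one has the bound
$$ |f(z, z^*)| \leq C_k \langle z \rangle^{-k} $$
whenever $|z^* - \overline{z}| \leq k + \epsilon |z|$. We will call such functions Schwartz analytic. Note that the integrand in (11) is Schwartz analytic when $A$ has positive definite Hermitian part, if we reinterpret $z^\dagger$ as the transpose of $z^*$ rather than as the adjoint of $z$ in order to make the integrand entire in $z$ and $z^*$. We can then define an abstract integration functional $I: \mathcal{SA}(\mathbf{C}^n \times \mathbf{C}^n \to \mathbf{C}) \to \mathbf{C}$ by the formula
$$ I(f) := \int_{\mathbf{C}^n} f(z, \overline{z})\, dz\, d\overline{z}, $$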
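thus $I$ can be localised to the slice $\{ (z, \overline{z}): z \in \mathbf{C}^n \}$ of $\mathbf{C}^n \times \mathbf{C}^n$ (though, as with previous functionals, one can use contour shifting to relocalise $I$ to other slices also). One can also write this integral as
$$ I(f) = 2^n \int_{\mathbf{R}^n \times \mathbf{R}^n} f(x+iy, x-iy)\, dx\, dy $$
and note that the integrand here is a Schwartz entire function on $\mathbf{C}^n \times \mathbf{C}^n$, thus linking the Schwartz analytic integral with the Schwartz entire integral. Using this connection, one can verify that this functional $I$ is invariant with respect to translating $z$ and $z^*$ by independent shifts in $\mathbf{C}^n$ (thus giving a $\mathbf{C}^n \times \mathbf{C}^n$ translation symmetry), and one also has the independent dilation symmetry
$$ I(\delta_{A,B} f) = \det(A)^{-1} \det(B)^{-1} I(f) $$
for $n \times n$ complex matrices $A, B$ that are sufficiently close to the identity, where $\delta_{A,B} f(z, z^*) := f(Az, Bz^*)$. Arguing as before, we can then compute (11) as
$$ \int_{\mathbf{C}^n} e^{-2\pi (z^\dagger A z + b^\dagger z + z^\dagger \tilde b + c)}\, dz\, d\overline{z} = \det(A)^{-1} e^{-2\pi (c - b^\dagger A^{-1} \tilde b)}. $$
In particular, this gives an integral representation for the determinant-reciprocal $\det(A)^{-1}$ of a complex $n \times n$ matrix $A$ with positive definite Hermitian part, in terms of gaussian expressions in which $A$ only appears linearly in the exponential:
$$ \det(A)^{-1} = \int_{\mathbf{C}^n} e^{-2\pi z^\dagger A z}\, dz\, d\overline{z}. $$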
This formula is then convenient for computing statistics such as
$$ \mathbf{E} \det(M - z)^{-1} $$
for random matrices $M$ drawn from the Gaussian Unitary Ensemble (GUE), and some choice of spectral parameter $z$ with $\mathrm{Im}(z) > 0$; we review this computation later in this post. By the trick of matrix differentiation of the determinant (as reviewed in this recent blog post), one can also use this method to compute matrix-valued statistics such as
$$ \mathbf{E} \det(M - z)^{-1} (M - z)^{-1}. $$
However, if one restricts attention to classical integrals over real or complex (and in particular, commuting or bosonic) variables, it does not seem possible to easily eradicate the negative determinant factors in such calculations. This is unfortunate, because many statistics of interest in random matrix theory do not involve such factors, for instance the expected Stieltjes transform
$$ \mathbf{E} \frac{1}{n} \mathrm{tr} (M - z)^{-1}, $$
which is the Stieltjes transform of the density of states. However, it turns out (as I learned recently from Peter Sarnak and Tom Spencer) that it is possible to cancel out these negative determinant factors by balancing the bosonic gaussian integrals with an equal number of fermionic gaussian integrals, in which one integrates over a family of anticommuting variables. These fermionic integrals are closer in spirit to the polynomial integral (6) than to Lebesgue type integrals, and in particular obey a scaling law which is inverse to the Lebesgue scaling (in particular, a linear change of fermionic variables $\xi \mapsto A \xi$ ends up transforming a fermionic integral by $\det(A)$ rather than $\det(A)^{-1}$), which conveniently cancels out the reciprocal determinants in the previous calculations. Furthermore, one can combine the bosonic and fermionic integrals into a unified integration concept, known as the Berezin integral (or Grassmann integral), in which one integrates functions of supervectors (vectors with both bosonic and fermionic components), and which is of particular importance in the theory of supersymmetry in physics. (The prefix “super” in physics means, roughly speaking, that the object or concept that the prefix is attached to contains both bosonic and fermionic aspects.) When one applies this unified integration concept to gaussians, this can lead to quite compact and efficient calculations (provided that one is willing to work with “super”-analogues of various concepts in classical linear algebra, such as the supertrace or superdeterminant).
Abstract integrals of the flavour of (6) arose in quantum field theory, when physicists sought to formally compute integrals of the form
$$ \int f(x_1,\dots,x_n,\xi_1,\dots,\xi_m)\, dx_1 \dots dx_n\, d\xi_1 \dots d\xi_m \qquad (14) $$
where $x_1,\dots,x_n$ are familiar commuting (or bosonic) variables (which, in particular, can often be localised to be scalar variables taking values in $\mathbf{R}$ or $\mathbf{C}$), while $\xi_1,\dots,\xi_m$ were more exotic anticommuting (or fermionic) variables, taking values in some vector space of fermions. (As we shall see shortly, one can formalise these concepts by working in a supercommutative algebra.) The integrand $f(x_1,\dots,x_n,\xi_1,\dots,\xi_m)$ was a formally analytic function of $x_1,\dots,x_n,\xi_1,\dots,\xi_m$, in that it could be expanded as a (formal, noncommutative) power series in the variables $x_1,\dots,x_n,\xi_1,\dots,\xi_m$. For functions $f(x_1,\dots,x_n)$ that depend only on bosonic variables, it is certainly possible for such analytic functions to be in the Schwartz class and thus fall under the scope of the classical integral, as discussed previously. However, functions $f(\xi_1,\dots,\xi_m)$ that depend on fermionic variables $\xi_1,\dots,\xi_m$ behave rather differently. Indeed, a fermionic variable $\xi$ must anticommute with itself, so that $\xi^2 = 0$. In particular, any power series in $\xi$ terminates after the linear term in $\xi$, so that a function $f(\xi)$ can only be analytic in $\xi$ if it is a polynomial of degree at most $1$ in $\xi$; more generally, an analytic function $f(\xi_1,\dots,\xi_m)$ of $m$ fermionic variables $\xi_1,\dots,\xi_m$ must be a polynomial of degree at most $m$, and an analytic function $f(x_1,\dots,x_n,\xi_1,\dots,\xi_m)$ of $n$ bosonic and $m$ fermionic variables can be Schwartz in the bosonic variables but will be polynomial in the fermionic variables. As such, to interpret the integral (14), one can use classical (Lebesgue) integration (or the variants discussed above for integrating Schwartz entire or Schwartz analytic functions) for the bosonic variables, but must use abstract integrals such as (6) for the fermionic variables, leading to the concept of Berezin integration mentioned earlier.
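The algebra of anticommuting variables is small enough to model directly. The sketch below is a toy implementation, not the formalism developed later in the post: elements are dictionaries mapping a sorted tuple of generator indices to a coefficient, and the fermionic integral is taken, as an assumed convention, to be the coefficient of the top monomial $\xi_1 \xi_2 \cdots \xi_m$ (sign and ordering conventions vary between references). It checks that $\xi^2 = 0$, that distinct generators anticommute, and that a linear change of variables multiplies the top coefficient by the determinant rather than its reciprocal.

```python
from itertools import product

def sort_with_sign(indices):
    """Sort generator indices by adjacent swaps; return (sorted tuple, sign), or (None, 0) on a repeat."""
    idx = list(indices)
    if len(set(idx)) < len(idx):
        return None, 0          # a repeated generator gives xi_i * xi_i = 0
    sign = 1
    for i in range(len(idx)):
        for j in range(len(idx) - 1 - i):
            if idx[j] > idx[j + 1]:
                idx[j], idx[j + 1] = idx[j + 1], idx[j]
                sign = -sign    # each swap of anticommuting generators flips the sign
    return tuple(idx), sign

def multiply(f, g):
    """Product of two elements, each a dict {sorted index tuple: coefficient}."""
    out = {}
    for (mf, cf), (mg, cg) in product(f.items(), g.items()):
        mono, sign = sort_with_sign(mf + mg)
        if sign:
            out[mono] = out.get(mono, 0) + sign * cf * cg
    return {m: v for m, v in out.items() if v != 0}

def berezin(f, m):
    """Assumed convention: the integral is the coefficient of xi_1 xi_2 ... xi_m."""
    return f.get(tuple(range(1, m + 1)), 0)

def xi(i):
    """The generator xi_i."""
    return {(i,): 1}

def substitute_top_form(A):
    """(A xi)_1 (A xi)_2 ... (A xi)_m, where (A xi)_i = sum_j A[i][j] xi_j."""
    m = len(A)
    result = {(): 1}
    for i in range(m):
        row = {(j + 1,): A[i][j] for j in range(m) if A[i][j] != 0}
        result = multiply(result, row)
    return result

square = multiply(xi(1), xi(1))                        # {} : xi_1^2 = 0
swapped = multiply(xi(2), xi(1))                       # {(1, 2): -1}
det2 = berezin(substitute_top_form([[2, 1], [1, 3]]), 2)
det3 = berezin(substitute_top_form([[1, 2, 0], [0, 1, 3], [4, 0, 1]]), 3)
```

Since every element of the algebra on $m$ generators has degree at most $m$ in this representation, the dictionary never has more than $2^m$ entries, which is the finite-dimensional counterpart of the statement that power series in fermionic variables terminate.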
In this post I would like to set out some of the basic algebraic formalism of Berezin integration, particularly with regards to integration of gaussian-type expressions, and then show how this formalism can be used to perform computations involving GUE (for instance, one can compute the density of states of GUE by this machinery without recourse to the theory of orthogonal polynomials). The use of supersymmetric gaussian integrals to analyse ensembles such as GUE appears in the work of Efetov (and was also proposed in the slightly earlier works of Parisi-Sourlas and McKane, with a related approach also appearing in the work of Wegner); the material here is adapted from this survey of Mirlin, as well as the later papers of Disertori-Pinson-Spencer and of Disertori.