You are currently browsing the tag archive for the ‘Schur complement’ tag.
The determinant of a square matrix obeys a large number of important identities, the most basic of which is the multiplicativity property
whenever are square matrices of the same dimension. This identity then generates many other important identities. For instance, if is an matrix and is an matrix, then by applying the previous identity to equate the determinants of and (where we will adopt the convention that denotes an identity matrix of whatever dimension is needed to make sense of the expressions being computed, and similarly for ) we obtain the Sylvester determinant identity
This identity, which relates an determinant with an determinant, is very useful in random matrix theory (a point emphasised in particular by Deift), particularly in regimes in which is much smaller than .
Another identity generated from (1) arises when trying to compute the determinant of a block matrix
where is an matrix, is an matrix, is an matrix, and is an matrix. If is invertible, then we can manipulate this matrix via block Gaussian elimination as
and on taking determinants using (1) we obtain the Schur determinant identity
relating the determinant of a block-diagonal matrix with the determinant of the Schur complement of the upper left block . This identity can be viewed as the correct way to generalise the determinant formula
It is also possible to use determinant identities to deduce other matrix identities that do not involve the determinant, by the technique of matrix differentiation (or equivalently, matrix linearisation). The key observation is that near the identity, the determinant behaves like the trace, or more precisely one has
for any bounded square matrix and infinitesimal . (If one is uncomfortable with infinitesimals, one can interpret this sort of identity as an asymptotic as .) Combining this with (1) we see that for square matrices of the same dimension with invertible and invertible, one has
for infinitesimal . To put it another way, if is a square matrix that depends in a differentiable fashion on a real parameter , then
Let us see some examples of this differentiation method. If we take the Sylvester identity (2) and multiply one of the rectangular matrices by an infinitesimal , we obtain
applying (4) and extracting the linear term in (or equivalently, differentiating at and then setting ) we conclude the cyclic property of trace:
To manipulate derivatives and inverses, we begin with the Neumann series approximation
for bounded square and infinitesimal , which then leads to the more general approximation
for square matrices of the same dimension with bounded. To put it another way, we have
whenever depends in a differentiable manner on and is invertible.
We can then differentiate (or linearise) the Schur identity (3) in a number of ways. For instance, if we replace the lower block by for some test matrix , then by (4), the left-hand side of (3) becomes (assuming the invertibility of the block matrix)
while the right-hand side becomes
extracting the linear term in (after dividing through by (3)), we conclude that
As was an arbitrary matrix, we conclude from duality that the lower right block of is given by the inverse of the Schur complement:
One can also compute the other components of this inverse in terms of the Schur complement by a similar method (although the formulae become more complicated). As a variant of this method, we can perturb the block matrix in (3) by an infinitesimal multiple of the identity matrix giving
By (4), the left-hand side is
From (5), we have
which relates the trace of the inverse of a block matrix, with the trace of the inverse of one of its blocks. This particular identity turns out to be useful in random matrix theory; I hope to elaborate on this in a later post.
As a final example of this method, we can analyse low rank perturbations of a large () matrix , where is an matrix and is an matrix for some . (This type of situation is also common in random matrix theory, for instance it arose in this previous paper of mine on outliers to the circular law.) If is invertible, then from (1) and (2) one has the matrix determinant lemma
if one then perturbs by an infinitesimal matrix , we have
Extracting the linear component in as before, one soon arrives at
assuming that and are both invertible; as is arbitrary, we conclude (after using the cyclic property of trace) the Sherman-Morrison formula
for the inverse of a low rank perturbation of a matrix . While this identity can be easily verified by direct algebraic computation, it is somewhat difficult to discover this identity by such algebraic manipulation; thus we see that the “determinant first” approach to matrix identities can make it easier to find appropriate matrix identities (particularly those involving traces and/or inverses), even if the identities one is ultimately interested in do not involve determinants. (As differentiation typically makes an identity lengthier, but also more “linear” or “additive”, the determinant identity tends to be shorter (albeit more nonlinear and more multiplicative) than the differentiated identity, and can thus be slightly easier to derive.)
where is an matrix, is an matrix, is an matrix, and is an matrix, assuming that , and are all invertible.
Exercise 2 Let be invertible matrices. Establish the identity
and differentiate this in to deduce the identity
(assuming that all inverses exist) and hence
Rotating by then gives
which is useful for inverting a matrix that has been split into a self-adjoint component and a skew-adjoint component .