I recently learned about a curious operation on square matrices known as sweeping, which is used in numerical linear algebra (particularly in applications to statistics), as a useful and more robust variant of the usual Gaussian elimination operations seen in undergraduate linear algebra courses. Given an matrix (with, say, complex entries) and an index , with the entry non-zero, the sweep of at is the matrix given by the formulae
The inverse sweep operation is given by a nearly identical set of formulae:
for all . One can check that these operations invert each other. Actually, each sweep turns out to have order , so that : an inverse sweep performs the same operation as three forward sweeps. Sweeps also preserve the space of symmetric matrices (allowing one to cut down computational run time in that case by a factor of two), and behave well with respect to principal minors; a sweep of a principal minor is a principal minor of a sweep, after adjusting indices appropriately.
Remarkably, the sweep operators all commute with each other: . If and we perform the first sweeps (in any order) to a matrix
with a minor, a matrix, a matrix, and a matrix, one obtains the new matrix
Note the appearance of the Schur complement in the bottom right block. Thus, for instance, one can essentially invert a matrix by performing all sweeps:
If a matrix has the form
for a minor , column vector , row vector , and scalar , then performing the first sweeps gives
and all the components of this matrix are usable for various numerical linear algebra applications in statistics (e.g. in least squares regression). Given that sweeps behave well with inverses, it is perhaps not surprising that sweeps also behave well under determinants: the determinant of can be factored as the product of the entry and the determinant of the matrix formed from by removing the row and column. As a consequence, one can compute the determinant of fairly efficiently (so long as the sweep operations don’t come close to dividing by zero) by sweeping the matrix for in turn, and multiplying together the entry of the matrix just before the sweep for to obtain the determinant.
It turns out that there is a simple geometric explanation for these seemingly magical properties of the sweep operation. Any matrix creates a graph (where we think of as the space of column vectors). This graph is an -dimensional subspace of . Conversely, most subspaces of arises as graphs; there are some that fail the vertical line test, but these are a positive codimension set of counterexamples.
We use to denote the standard basis of , with the standard basis for the first factor of and the standard basis for the second factor. The operation of sweeping the entry then corresponds to a ninety degree rotation in the plane, that sends to (and to ), keeping all other basis vectors fixed: thus we have
for generic (more precisely, those with non-vanishing entry ). For instance, if and is of the form (1), then is the set of tuples obeying the equations
The image of under is . Since we can write the above system of equations (for ) as
we see from (2) that is the graph of . Thus the sweep operation is a multidimensional generalisation of the high school geometry fact that the line in the plane becomes after applying a ninety degree rotation.
It is then an instructive exercise to use this geometric interpretation of the sweep operator to recover all the remarkable properties about these operations listed above. It is also useful to compare the geometric interpretation of sweeping as rotation of the graph to that of Gaussian elimination, which instead shears and reflects the graph by various elementary transformations (this is what is going on geometrically when one performs Gaussian elimination on an augmented matrix). Rotations are less distorting than shears, so one can see geometrically why sweeping can produce fewer numerical artefacts than Gaussian elimination.