I recently learned about a curious operation on square matrices known as sweeping, which is used in numerical linear algebra (particularly in applications to statistics), as a useful and more robust variant of the usual Gaussian elimination operations seen in undergraduate linear algebra courses. Given an $n \times n$ matrix $A = (a_{ij})_{1 \leq i,j \leq n}$ (with, say, complex entries) and an index $1 \leq k \leq n$, with the entry $a_{kk}$ non-zero, the sweep $\hbox{Sweep}_k[A] = (\hat a_{ij})_{1 \leq i,j \leq n}$ of $A$ at $k$ is the matrix given by the formulae

$\displaystyle \hat a_{ij} := a_{ij} - \frac{a_{ik} a_{kj}}{a_{kk}}$

$\displaystyle \hat a_{ik} := \frac{a_{ik}}{a_{kk}}$

$\displaystyle \hat a_{kj} := \frac{a_{kj}}{a_{kk}}$

$\displaystyle \hat a_{kk} := \frac{-1}{a_{kk}}$

for all $i,j \in \{1,\dots,n\} \backslash \{k\}$. Thus for instance if $k=1$, and $A$ is written in block form as

$\displaystyle A = \begin{pmatrix} a_{11} & X \\ Y & B \end{pmatrix} \ \ \ \ \ (1)$

for some $1 \times (n-1)$ row vector $X$, $(n-1) \times 1$ column vector $Y$, and $(n-1) \times (n-1)$ minor $B$, one has

$\displaystyle \hbox{Sweep}_1[A] = \begin{pmatrix} -1/a_{11} & X/a_{11} \\ Y/a_{11} & B - YX/a_{11} \end{pmatrix}. \ \ \ \ \ (2)$
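For readers who want to experiment, here is a minimal NumPy sketch of the sweep operation as just defined (the function name `sweep` and the zero-based indexing are conventions of this sketch, not anything standard):

```python
import numpy as np

def sweep(A, k):
    """Sweep the square matrix A at (zero-based) index k.

    Implements the formulae above: the pivot becomes -1/a_kk, the pivot
    row and column are divided by a_kk, and every other entry receives
    the rank-one correction -a_ik * a_kj / a_kk.
    """
    A = np.asarray(A, dtype=float)
    a_kk = A[k, k]
    if np.isclose(a_kk, 0.0):
        raise ValueError("cannot sweep on a (near-)zero pivot")
    S = A - np.outer(A[:, k], A[k, :]) / a_kk  # entries with i, j != k
    S[:, k] = A[:, k] / a_kk   # pivot column
    S[k, :] = A[k, :] / a_kk   # pivot row
    S[k, k] = -1.0 / a_kk      # pivot entry
    return S
```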
The inverse sweep operation $\hbox{Sweep}_k^{-1}[A] = (\check a_{ij})_{1 \leq i,j \leq n}$ is given by a nearly identical set of formulae:

$\displaystyle \check a_{ij} := a_{ij} - \frac{a_{ik} a_{kj}}{a_{kk}}$

$\displaystyle \check a_{ik} := -\frac{a_{ik}}{a_{kk}}$

$\displaystyle \check a_{kj} := -\frac{a_{kj}}{a_{kk}}$

$\displaystyle \check a_{kk} := \frac{-1}{a_{kk}}$

for all $i,j \in \{1,\dots,n\} \backslash \{k\}$. One can check that these operations invert each other. Actually, each sweep turns out to have order $4$, so that $\hbox{Sweep}_k^{-1} = \hbox{Sweep}_k^3$: an inverse sweep performs the same operation as three forward sweeps. Sweeps also preserve the space of symmetric matrices (allowing one to cut down computational run time in that case by a factor of two), and behave well with respect to principal minors; a sweep of a principal minor is a principal minor of a sweep, after adjusting indices appropriately.
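Continuing the sketch above, the order-$4$ and symmetry-preservation claims can be checked numerically on a small test matrix (the symmetric matrix `A` below is an arbitrary example):

```python
A = np.array([[4.0, 1.0, 2.0],
              [1.0, 3.0, 0.5],
              [2.0, 0.5, 5.0]])

S = A.copy()
for _ in range(4):
    S = sweep(S, 0)                     # four sweeps at the same index...
assert np.allclose(S, A)                # ...return the original matrix
assert np.allclose(sweep(A, 0), sweep(A, 0).T)  # symmetry is preserved
```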
Remarkably, the sweep operators all commute with each other: $\hbox{Sweep}_j \hbox{Sweep}_k = \hbox{Sweep}_k \hbox{Sweep}_j$. If $1 \leq k \leq n$ and we perform the first $k$ sweeps (in any order) to a matrix

$\displaystyle A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}$

with $A_{11}$ a $k \times k$ minor, $A_{12}$ a $k \times (n-k)$ matrix, $A_{21}$ a $(n-k) \times k$ matrix, and $A_{22}$ a $(n-k) \times (n-k)$ matrix, one obtains the new matrix

$\displaystyle \hbox{Sweep}_1 \dots \hbox{Sweep}_k[A] = \begin{pmatrix} -A_{11}^{-1} & A_{11}^{-1} A_{12} \\ A_{21} A_{11}^{-1} & A_{22} - A_{21} A_{11}^{-1} A_{12} \end{pmatrix}.$

Note the appearance of the Schur complement in the bottom right block. Thus, for instance, one can essentially invert a matrix by performing all $n$ sweeps:

$\displaystyle \hbox{Sweep}_1 \dots \hbox{Sweep}_n[A] = -A^{-1}.$
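Again continuing the sketch, sweeping on every index in turn should reproduce the negative inverse, and sweeps at different indices should commute:

```python
S = A.copy()
for k in range(A.shape[0]):
    S = sweep(S, k)
assert np.allclose(S, -np.linalg.inv(A))       # all n sweeps give -A^{-1}

# sweeps at different indices commute:
assert np.allclose(sweep(sweep(A, 0), 1), sweep(sweep(A, 1), 0))
```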
If an $n \times n$ matrix $A$ has the form

$\displaystyle A = \begin{pmatrix} B & X \\ Y & c \end{pmatrix}$

for a $(n-1) \times (n-1)$ minor $B$, $(n-1) \times 1$ column vector $X$, $1 \times (n-1)$ row vector $Y$, and scalar $c$, then performing the first $n-1$ sweeps gives

$\displaystyle \hbox{Sweep}_1 \dots \hbox{Sweep}_{n-1}[A] = \begin{pmatrix} -B^{-1} & B^{-1} X \\ Y B^{-1} & c - Y B^{-1} X \end{pmatrix}$

and all the components of this matrix are usable for various numerical linear algebra applications in statistics (e.g. in least squares regression). Given that sweeps behave well with inverses, it is perhaps not surprising that sweeps also behave well under determinants: the determinant of $A$ can be factored as the product of the entry $a_{kk}$ and the determinant of the $(n-1) \times (n-1)$ matrix formed from $\hbox{Sweep}_k[A]$ by removing the $k^{\mathrm{th}}$ row and column. As a consequence, one can compute the determinant of $A$ fairly efficiently (so long as the sweep operations don’t come close to dividing by zero) by sweeping the matrix for $k = 1, \dots, n$ in turn, and multiplying together the $kk$ entries of the matrix just before each sweep to obtain the determinant.
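A sketch of this determinant algorithm, reusing `sweep` from above (no pivoting is done, so it fails if an intermediate pivot vanishes):

```python
def det_by_sweeping(A):
    """Multiply together the pivots a_kk seen just before each sweep."""
    S = np.array(A, dtype=float)
    det = 1.0
    for k in range(S.shape[0]):
        det *= S[k, k]       # the kk entry just before the k-th sweep
        S = sweep(S, k)
    return det

assert np.isclose(det_by_sweeping(A), np.linalg.det(A))
```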
It turns out that there is a simple geometric explanation for these seemingly magical properties of the sweep operation. Any $n \times n$ matrix $A$ creates a graph $\hbox{Graph}[A] := \{ (X, AX): X \in {\bf C}^n \}$ (where we think of ${\bf C}^n$ as the space of column vectors). This graph is an $n$-dimensional subspace of ${\bf C}^n \times {\bf C}^n$. Conversely, most $n$-dimensional subspaces of ${\bf C}^n \times {\bf C}^n$ arise as graphs; there are some that fail the “vertical line test”, but these form a positive codimension set of counterexamples.
We use $e_1, \dots, e_n, f_1, \dots, f_n$ to denote the standard basis of ${\bf C}^n \times {\bf C}^n$, with $e_1, \dots, e_n$ the standard basis for the first factor of ${\bf C}^n$ and $f_1, \dots, f_n$ the standard basis for the second factor. The operation of sweeping the $k^{\mathrm{th}}$ entry then corresponds to a ninety degree rotation $R_k$ in the $e_k, f_k$ plane, that sends $f_k$ to $e_k$ (and $e_k$ to $-f_k$), keeping all other basis vectors fixed: thus we have

$\displaystyle \hbox{Graph}[\hbox{Sweep}_k[A]] = R_k \hbox{Graph}[A] \ \ \ \ \ (3)$

for generic $A$ (more precisely, those $A$ with non-vanishing entry $a_{kk}$). For instance, if $k=1$ and $A$ is of the form (1), then $\hbox{Graph}[A]$ is the set of tuples $(r, {\bf r}, s, {\bf s}) \in {\bf C} \times {\bf C}^{n-1} \times {\bf C} \times {\bf C}^{n-1}$ obeying the equations

$\displaystyle s = a_{11} r + X {\bf r}$

$\displaystyle {\bf s} = Y r + B {\bf r}.$

The image of $(r, {\bf r}, s, {\bf s})$ under $R_1$ is $(s, {\bf r}, -r, {\bf s})$. Since we can write the above system of equations (for $a_{11} \neq 0$) as

$\displaystyle -r = -\frac{1}{a_{11}} s + \frac{X}{a_{11}} {\bf r}$

$\displaystyle {\bf s} = \frac{Y}{a_{11}} s + (B - \frac{YX}{a_{11}}) {\bf r}$

we see from (2) that $R_1 \hbox{Graph}[A]$ is the graph of $\hbox{Sweep}_1[A]$. Thus the sweep operation is a multidimensional generalisation of the high school geometry fact that the line $y = mx$ in the plane becomes $y = -\frac{1}{m} x$ after applying a ninety degree rotation.
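The rotation identity (3) can also be tested numerically. In the sketch below, the columns of `G` span $\hbox{Graph}[A]$, and each rotated basis vector is checked against the graph equations of the swept matrix:

```python
n, k = A.shape[0], 0
G = np.vstack([np.eye(n), A])   # columns (x, Ax) span Graph[A] in R^{2n}

R = np.eye(2 * n)               # the rotation R_k: f_k -> e_k, e_k -> -f_k
R[k, k] = R[n + k, n + k] = 0.0
R[k, n + k] = 1.0
R[n + k, k] = -1.0

RG = R @ G                      # a basis of the rotated graph
S = sweep(A, k)
assert np.allclose(RG[n:], S @ RG[:n])  # rotated graph = Graph[Sweep_k[A]]
```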
It is then an instructive exercise to use this geometric interpretation of the sweep operator to recover all the remarkable properties about these operations listed above. It is also useful to compare the geometric interpretation of sweeping as rotation of the graph to that of Gaussian elimination, which instead shears and reflects the graph by various elementary transformations (this is what is going on geometrically when one performs Gaussian elimination on an augmented matrix). Rotations are less distorting than shears, so one can see geometrically why sweeping can produce fewer numerical artefacts than Gaussian elimination.
26 comments
7 October, 2015 at 11:44 am
grpaseman
Interesting! This reminds me of an analogous operation for when a_{11} is 0: multiply the first row by -1, add it to all the other rows, and then multiply the changed columns by -1. For arbitrary matrices this looks like a mess, but I was working with 0-1 matrices and wanted to preserve absolute value of the determinant. I might consider looking at those matrices again from a geometric viewpoint. Thanks for the post.
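A small sketch of one reading of this operation, checking that it preserves the absolute value of the determinant of a 0–1 matrix (interpreting “the changed columns” as those where the first row is nonzero, which is an assumption of this sketch):

```python
M0 = np.array([[0.0, 1.0, 1.0],
               [1.0, 0.0, 1.0],
               [1.0, 1.0, 0.0]])   # a 0-1 matrix with M0[0, 0] == 0
T = M0.copy()
T[1:] -= T[0]                      # first row times -1, added to the others
T[:, M0[0] != 0] *= -1.0           # flip the sign of the changed columns
assert np.isclose(abs(np.linalg.det(T)), abs(np.linalg.det(M0)))
```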
7 October, 2015 at 11:51 am
Anonymous
may want to add an “expository” tag
[Added, thanks – T.]
7 October, 2015 at 2:38 pm
A
YX in (2) instead of XY.
[Corrected, thanks – T.]
7 October, 2015 at 3:49 pm
Anonymous
This seems to be Gauss-Jordan elimination written with 1 table rather than 2, where you re-use the columns of the first table to hold the emerging columns of the 2nd table? That is, you eliminate both below and above the main diagonal before you proceed to the next column. Because of this you can re-use the columns of A (or U) to hold the columns of A^{-1}. This also explains why the sweeps commute (zeros in the right place of the columns you have already eliminated).
In LU factorization terms you are interleaving the elementary factors of L^{-1} with that of U^{-1}. Not sure if my verbal description makes sense to anyone….
In any case this does not seem to be a good idea numerically without incorporating some form of pivoting.
7 October, 2015 at 10:09 pm
Dr. Seppo
Sweep seems to be the same as gyration, aside from a sign change. Gyrations have been studied or used in linear programming where dependent and independent variables are swapped, Tucker has been mentioned in that context. The sign change is clever though as it preserves symmetry, so the “new name” sweep is perhaps justified — but you can find old references with the search term “gyration”.
8 October, 2015 at 1:04 am
kante
The sweep operation is also known as Principal Pivot Transform, (as far as I know) introduced by Tucker [A Combinatorial Equivalence of Matrices. In R. Bellman and M. Hall Jr, editors, Combinatorial Analysis, pages 129–140. AMS, Providence, 1960] (see the survey by Tsatsomeros [M.J. Tsatsomeros. Principal Pivot Transforms: Properties and Applications. Linear Algebra and its Applications 307(1-3):151–165, 2000]). It is funny because it is related to minors in (Delta-)matroids (and so Graph Minor), see for instance [S. Oum. Rank-Width and Well-Quasi-Ordering of Skew-Symmetric or Symmetric Matrices. Linear Algebra and its Applications 436(7):2008–2036, 2012.]
8 October, 2015 at 3:51 am
tomcircle
Reblogged this on Math Online Tom Circle and commented:
Interesting!
8 October, 2015 at 4:58 am
MatjazG
Very interesting! Didn’t encounter it before but it just so happens that your post comes at a good time, since I think it could be useful for my work. :)
I have a question, though: would an obvious generalization to a “fractional sweep” $\hbox{Sweep}_k^\alpha$ be a useful concept? Geometrically it would correspond to a rotation by an angle $\alpha \pi / 2$ in the $e_k, f_k$ plane and hence algebraically to a fractional power $R_k^\alpha$ of the unitary (under the canonical scalar product on ${\bf C}^n \times {\bf C}^n$ from your post) operator $R_k$.
[I ask because for example the other fractional transforms many times turn out to be quite useful. For example, the fractional Fourier transform, which is just a rotation by the angle $\alpha \pi / 2$ of the Wigner quasiprobability function (if I remember correctly), is useful in many areas and also turns out to be the evolution operator for a particle in a linear harmonic oscillator potential in quantum mechanics, where $\alpha$ changes linearly with time.]
Fractional sweeps would still be commutative, $\hbox{Sweep}_j^\alpha \hbox{Sweep}_k^\beta = \hbox{Sweep}_k^\beta \hbox{Sweep}_j^\alpha$, and should thus, for example, give a simple algorithmic way to calculate the fractional power $A^\alpha$ of a $n \times n$ matrix $A$ when applied in succession for all $k = 1, \dots, n$ with the same $\alpha$, if my thinking is correct: If $\alpha$ is rational then applying the fractional sweeps enough times reproduces an integer number of full sweep cycles, and thus the claimed identity follows from the $\alpha = 1$ case. Since this is a continuous operation in $\alpha$ and rational numbers are dense in the reals, the same relation should hold for all $\alpha$. (Perhaps not completely rigorous, I guess, but I think it should hold.)
And I’m sure there are other examples where fractional sweeps could be useful.
P.S. I’m a physicist, not a mathematician, but I’ve been reading your blog for the past two years and it inspires me to learn more things from pure mathematics and also continually gives me fresh ideas for problem solving when dealing with my physical models.
P.P.S. Also, congratulations on solving the Erdos discrepancy problem! :)
8 October, 2015 at 11:36 am
Terence Tao
It seems $\hbox{Sweep}_k^\alpha$ is a bit more complicated than this; rotation acts linearly on a graph, but not on the function defining the graph. One can see this even in one dimension: a line $y = mx$ rotated by $\theta$ becomes a line with slope $\frac{m + \tan \theta}{1 - m \tan \theta}$ (tangent addition rule). The corresponding formula in higher dimensions should be computable but looks a bit messy.
8 October, 2015 at 11:43 am
Avi Levy
Does the formula become simpler if we identify the graph space ${\bf R}^n \times {\bf R}^n$ with ${\bf C}^n$, pairing up the real coordinates to produce complex coordinates, and think in terms of complex multiplication?
8 October, 2015 at 12:59 pm
MatjazG
Thank you for the reply! :) Ah, yes, I see the mistake. I was a bit hasty there; should’ve been more careful … I will try (<- emphasis) to compute the correct formula anyway, as an exercise. :p
8 October, 2015 at 2:30 pm
MatjazG
Not that hard after all, but I see I made another mistake above: to have the notation that a fractional sweep $\hbox{Sweep}_k^\alpha$, meant as the power $\alpha$ of $\hbox{Sweep}_k$, corresponds to a rotation by $\theta = \alpha \pi / 2$ (not $\theta = \alpha$) of the graph in the $e_k, f_k$ plane.
A rotation of the matrix graph $\hbox{Graph}[A]$ by $\theta$ in the $e_1, f_1$ plane means that we send $(r, {\bf r}, s, {\bf s})$ to $(r', {\bf r}, s', {\bf s})$, where:

$r' = \cos \theta \, r + \sin \theta \, s$

$s' = -\sin \theta \, r + \cos \theta \, s.$

So we want to express the original equations:

$s = a_{11} r + X {\bf r}$

${\bf s} = Y r + B {\bf r}$

as equations for $(s', {\bf s})$ in terms of $(r', {\bf r})$.
We can do this easily by:
– using the orthogonality of rotations to express $(r, s)$ in terms of $(r', s')$ (of course, the equations look the same as for the other way around, just with $-\theta$ instead of $\theta$),
– substituting that into both original equations,
– expressing $s'$ in terms of $(r', {\bf r})$ from the first original equation, and finally,
– putting the expression for $s'$ into the second original equation.
We get:

$s' = \hat a_{11} r' + \hat X {\bf r}$

${\bf s} = \hat Y r' + \hat B {\bf r}$

where:

$\hat a_{11} = \frac{a_{11} \cos \theta - \sin \theta}{\cos \theta + a_{11} \sin \theta}, \quad \hat X = \frac{X}{\cos \theta + a_{11} \sin \theta}, \quad \hat Y = \frac{Y}{\cos \theta + a_{11} \sin \theta}, \quad \hat B = B - \frac{\sin \theta}{\cos \theta + a_{11} \sin \theta} Y X,$

which together form the matrix $\hbox{Sweep}_1^\alpha[A]$ (where $\theta = \alpha \pi / 2$). Explicitly:

$\hbox{Sweep}_1^\alpha[A] = \begin{pmatrix} \hat a_{11} & \hat X \\ \hat Y & \hat B \end{pmatrix}$

and analogously for $\hbox{Sweep}_k^\alpha$ for other $k$ (just switch the order of the basis or apply permutation matrices around this). :)
Here $A$ and $\hbox{Sweep}_1^\alpha[A]$ are $n \times n$ matrices, $B, \hat B$ are $(n-1) \times (n-1)$ matrices, $X, \hat X$ are $1 \times (n-1)$ row vectors, $Y, \hat Y$ are $(n-1) \times 1$ column vectors and $a_{11}, \hat a_{11}$ are scalars.
Let’s do a sanity check: verifying the limit $\alpha \to 0$ ($\theta \to 0$) we get the expected identity transformation $\hbox{Sweep}_1^0[A] = A$, and verifying the limit $\alpha \to 1$ ($\theta \to \pi/2$) we get the normal sweep $\hbox{Sweep}_1^1[A] = \hbox{Sweep}_1[A]$.
The condition $\cos \theta + a_{11} \sin \theta \neq 0$ (unless $\sin \theta = 0$, when it holds automatically) also passes the sanity check for $\alpha = 0$ (no restriction on $a_{11}$) and $\alpha = 1$ (the condition $a_{11} \neq 0$), of course. :)
Of course, $\hbox{Sweep}_1^\alpha[A]$ only works when $\cos \theta + a_{11} \sin \theta \neq 0$, since otherwise we get zero denominators, necessarily giving an undefined (at least) $\hat a_{11}$ (unless the numerators vanish as well).
The equations for a fractional sweep $\hbox{Sweep}_k^\alpha$ might look a bit messy, but still not that much. They are basically almost as easy to implement on a computer as those for regular sweeps, and for many consecutive sweeps with the same $\alpha$ you basically need only a few more additions and multiplications per sweep to get the end result, since you can pre-compute $\cos \theta$ and $\sin \theta$ only once, at the beginning of the computation.
For example, if you want to compute a fractional power of a matrix $A$, as mentioned in my previous comment, you can use this algorithm as an in-place computation, without the need for e.g. an eigenvalue decomposition. (My first guess to compute $A^\alpha$ would be to compute the eigenvalues, put them to some power and then recompose the matrix; with fractional sweeps this is unnecessary – but perhaps there is an even better way?)
I hope someone finds this useful. Anyway, it was nice thinking about this. :) I just hope I haven’t made any more mistakes … :p
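A minimal sketch of these fractional-sweep formulas, reusing the `sweep` function and test matrix `A` from the sketches earlier on this page (the name `frac_sweep` and the general-index form of the block formulas are choices of the sketch, not of the comment):

```python
def frac_sweep(A, k, alpha):
    """Fractional sweep at index k: rotate Graph[A] by theta = alpha*pi/2
    in the e_k, f_k plane; requires cos(theta) + a_kk*sin(theta) != 0."""
    A = np.asarray(A, dtype=float)
    theta = alpha * np.pi / 2
    c, s = np.cos(theta), np.sin(theta)
    d = c + A[k, k] * s
    if np.isclose(d, 0.0):
        raise ValueError("zero denominator: cos(theta) + a_kk*sin(theta) = 0")
    S = A - s * np.outer(A[:, k], A[k, :]) / d  # entries with i, j != k
    S[:, k] = A[:, k] / d
    S[k, :] = A[k, :] / d
    S[k, k] = (A[k, k] * c - s) / d
    return S

assert np.allclose(frac_sweep(A, 0, 0.0), A)            # alpha = 0: identity
assert np.allclose(frac_sweep(A, 0, 1.0), sweep(A, 0))  # alpha = 1: full sweep
assert np.allclose(frac_sweep(frac_sweep(A, 0, 0.3), 0, 0.5),
                   frac_sweep(A, 0, 0.8))               # fractions add up
```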
P.S.
While writing this I noticed a mistake in your original blog post: in the last equation that has its own line (expressing ${\bf s}$), after $\frac{Y}{a_{11}} s$ there is an $\frac{XY}{a_{11}}$ term (which is scalar), which should instead be $\frac{YX}{a_{11}}$ (matrix/tensor).
[Corrected, thanks – T.]
8 October, 2015 at 11:06 pm
MatjazG
P.S. Never mind about calculating $A^\alpha$ by fractional sweeps. Fractional sweeps (of course) don’t produce that as a result. :(
P.P.S. I’m also sorry for spamming the comments so much in the last day.
8 October, 2015 at 11:38 pm
Anonymous
Which $\alpha$ values have the above commutative property (independent of the matrix graph!) for all their corresponding sweep operators? It seems that this (smallest) set of $\alpha$ values can be larger for certain symmetries of the matrix graph.
9 October, 2015 at 4:40 am
MatjazG
Commutativity should hold for all $\alpha, \beta$ and all $j, k$ in the sense that $\hbox{Sweep}_j^\alpha \hbox{Sweep}_k^\beta = \hbox{Sweep}_k^\beta \hbox{Sweep}_j^\alpha$.
This is because the only thing that $\hbox{Sweep}_k^\alpha$ does is that it rotates the graph of an arbitrary matrix in the $e_k, f_k$ plane by an angle $\alpha \pi / 2$, leaving the rest of the matrix graph (i.e. all other planes $e_j, f_j$ for $j \neq k$) completely unchanged.
So we have two cases to check: either $j \neq k$ and we have proven commutativity by the fact that they operate on different planes independently, or $j = k$ and then commutativity holds since rotations in a single plane are always commutative (i.e. the group $SO(2)$ is commutative).
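A quick numerical spot-check of this, using the `frac_sweep` sketch from above (the fractions 0.3 and 0.7 are arbitrary):

```python
lhs = frac_sweep(frac_sweep(A, 0, 0.3), 1, 0.7)
rhs = frac_sweep(frac_sweep(A, 1, 0.7), 0, 0.3)
assert np.allclose(lhs, rhs)  # fractional sweeps at different indices commute
```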
P.S.
If I am posting yet another comment, let me just throw these more elegant (but equivalent) expressions for the fractional sweep $\hbox{Sweep}_1^\alpha[A]$ in here, restoring a bit of symmetry and simplicity:

$\hbox{Sweep}_1^\alpha[A] = \frac{1}{\cos \theta + a_{11} \sin \theta} \begin{pmatrix} a_{11} \cos \theta - \sin \theta & X \\ Y & (\cos \theta + a_{11} \sin \theta) B - \sin \theta \, Y X \end{pmatrix}$

where $\theta = \alpha \pi / 2$.
Let me also add that I’ve checked these formulas, so that when you apply a fractional sweep first with $\alpha$ and then do another one with $\beta$ after it, the result is indeed the same as doing a single fractional sweep with $\alpha + \beta$ (I checked the general case with Mathematica and separately by hand for general $\alpha, \beta$ and also for $\alpha = \beta$). In other words, I did the sanity check that $\hbox{Sweep}_1^\alpha \hbox{Sweep}_1^\beta = \hbox{Sweep}_1^{\alpha+\beta}$, which also proves commutativity when $j = k$.
So the above expressions for a fractional sweep should really be correct now! :)
P.P.S.
A note about commutativity:
We could always conceive of more contrived generalized operations which do not commute, like for example a “fractional multi-sweep” $\hbox{Sweep}_k^{\vec \alpha}[T]$, where $T$ is now a multi-linear map with $m$ input vectors from ${\bf C}^n$ and one output vector in ${\bf C}^n$, and where $\vec \alpha$ is a vector with $m(m+1)/2$ components. A fractional multi-sweep would act on the graph $\hbox{Graph}[T]$ of the multi-linear map $T$, with canonical basis vectors $e_k^{(1)}, \dots, e_k^{(m)}, f_k$, by rotating it in the $e_k^{(1)}, \dots, e_k^{(m)}, f_k$ subspace with a rotation from the group $SO(m+1)$ described by the vector $\vec \alpha$. (The vector $\vec \alpha$ has the same number of components as there are independent generators of $SO(m+1)$, so we have precisely the correct number of degrees of freedom to describe rotations in the $(m+1)$-dimensional subspace spanned by $e_k^{(1)}, \dots, e_k^{(m)}, f_k$.) The regular fractional sweep is just the special case when $m = 1$.
Now, fractional multi-sweeps also commute when $j \neq k$ (i.e. it still holds that $\hbox{Sweep}_j^{\vec \alpha} \hbox{Sweep}_k^{\vec \beta} = \hbox{Sweep}_k^{\vec \beta} \hbox{Sweep}_j^{\vec \alpha}$ in that case) since subspaces for different $k$ are independent, but when $j = k$ they in general do not commute any more for arbitrary $\vec \alpha, \vec \beta$, since the group $SO(m+1)$ is non-abelian for $m \geq 2$. Whether or not “fractional multi-sweeps” would be useful for anything though, I have no idea (I don’t know that even about regular fractional sweeps now :p).
P.P.P.S.
Generalizing further in another direction: the fractional Fourier transformation is a special case of a linear canonical transformation (LCT; also known as the Collins diffraction integral when used to describe light beam propagation through linear elements in paraxial optics), which is a general linear transformation in the time-frequency domain of a function (for example, a fractional Fourier transform with the fraction $\alpha$ is a rotation by an angle $\alpha \pi / 2$ in the time-frequency domain).
Analogously, would it then be interesting to consider what happens to the matrix $A$ when we apply a more general linear transformation to its graph $\hbox{Graph}[A]$ (either restricted to a $e_k, f_k$ plane or not)?
Anyway, you can always generalize almost anything, but it would be good if it could be used for something non-trivial at some point … :p
9 October, 2015 at 8:18 am
MatjazG
I just can’t seem to leave this topic alone, it just fascinates me too much. :p What I’m worried about, though, is if I am derailing the discussion under this post too much. Would it be better to move/post all of this somewhere else instead, so it doesn’t take up so much space on this blog? If that would be better, I will write these things down somewhere else; I just feel the urge to write down my findings somewhere … :p Also, any input is greatly appreciated.
—
So, I have obtained the general formula for what happens to an $n \times n$ matrix $A$ when we apply a linear transformation on its graph $\hbox{Graph}[A]$ in the $e_1, f_1$ plane, leaving the rest of the graph unmodified. Specifically, let’s take $k = 1$ and express the coordinates $(r', s')$ after a linear transformation $M$ in the $e_1, f_1$ plane as:

$r' = m_{11} r + m_{12} s$

$s' = m_{21} r + m_{22} s$

where $M = (m_{ij})$ is a $2 \times 2$ matrix, $m_{11}, m_{12}, m_{21}, m_{22}$ are scalars and $(r, s)^T, (r', s')^T$ are $2 \times 1$ column vectors.
Then we want the following:

$s = a_{11} r + X {\bf r}$

${\bf s} = Y r + B {\bf r},$

where $A = \begin{pmatrix} a_{11} & X \\ Y & B \end{pmatrix}$, to be equivalent to:

$s' = \hat a_{11} r' + \hat X {\bf r}$

${\bf s} = \hat Y r' + \hat B {\bf r},$

where $\hat A = \begin{pmatrix} \hat a_{11} & \hat X \\ \hat Y & \hat B \end{pmatrix}$ is the matrix corresponding to the transformed graph. Here, $a_{11}, \hat a_{11}$ are scalars, $X, \hat X$ are $1 \times (n-1)$ row vectors, $Y, \hat Y$ are $(n-1) \times 1$ column vectors, $B, \hat B$ are $(n-1) \times (n-1)$ matrices and $A, \hat A$ are $n \times n$ matrices. The formulas I obtained for $\hat A$ are the following:

$\hat a_{11} = \frac{m_{21} + m_{22} a_{11}}{d}, \quad \hat X = \frac{\det(M)}{d} X, \quad \hat Y = \frac{1}{d} Y, \quad \hat B = B - \frac{m_{12}}{d} Y X,$

where $d = m_{11} + m_{12} a_{11}$ and $\det(M) = m_{11} m_{22} - m_{12} m_{21}$ is the determinant of the matrix $M$. This works when: $d = m_{11} + m_{12} a_{11} \neq 0$ (the matrix $M$ should probably also be non-degenerate = invertible = with $\det(M) \neq 0$).
This neatly generalizes the fractional sweep, which is just a special case when: $M = \begin{pmatrix} \cos \theta & \sin \theta \\ -\sin \theta & \cos \theta \end{pmatrix}$, the special case of which is the regular (full) sweep with: $M = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$, i.e. $\theta = \pi/2$ in the fractional sweep.
For convenience, let’s define the above obtained $\hat A$ as the “general linear sweep” of the matrix $A$ under the matrix $M$ at $k = 1$, or for an arbitrary $k$ (just relabel the basis vectors) as: $\hbox{Sweep}_k^M[A]$.
Some properties of this operation (where we assume that the condition $m_{11} + m_{12} a_{kk} \neq 0$ is satisfied wherever relevant):
1.) Algebraic definition:

$\hbox{Sweep}_1^M[A] = \begin{pmatrix} \frac{m_{21} + m_{22} a_{11}}{d} & \frac{\det(M)}{d} X \\ \frac{1}{d} Y & B - \frac{m_{12}}{d} Y X \end{pmatrix}$

where $d = m_{11} + m_{12} a_{11}$.
2.) Geometric definition:

$\hbox{Graph}[\hbox{Sweep}_k^M[A]] = R_k^M \, \hbox{Graph}[A]$

where $R_k^M$ acts as an identity in all subspaces orthogonal to the $e_k, f_k$ plane, and on the $e_k, f_k$ plane acts as the matrix $M$ (explicitly, $R_k^M$ sends the coordinates $(r, s)$ of that plane to $(m_{11} r + m_{12} s, m_{21} r + m_{22} s)$).
3.) Homomorphism with matrix multiplication:

$\hbox{Sweep}_k^{M_1}[\hbox{Sweep}_k^{M_2}[A]] = \hbox{Sweep}_k^{M_1 M_2}[A]$

where $M_1, M_2$ are arbitrary non-degenerate $2 \times 2$ matrices. (I checked this with Mathematica explicitly.)
4.) Commutativity:

$\hbox{Sweep}_j^{M_1} \hbox{Sweep}_k^{M_2} = \hbox{Sweep}_k^{M_2} \hbox{Sweep}_j^{M_1}$

if and only if: either $j \neq k$ (from the geometric definition) or $j = k$ and the $2 \times 2$ matrices $M_1, M_2$ commute, that is, if: $M_1 M_2 = M_2 M_1$ (from homomorphism with matrix multiplication).
5.) Inverse:
From homomorphism with matrix multiplication it follows that the inverse of $\hbox{Sweep}_k^M$ is obtained by using the inverse of the matrix $M$ for the general linear sweep:

$(\hbox{Sweep}_k^M)^{-1} = \hbox{Sweep}_k^{M^{-1}}$

where $M$ is a non-degenerate (invertible) $2 \times 2$ matrix.
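A sketch of this general linear sweep (the name `general_sweep` and the general-index form are choices of the sketch, based on the formulas reconstructed above), with a numerical spot-check of the homomorphism property 3.):

```python
def general_sweep(A, k, M):
    """Act on Graph[A] by the 2x2 matrix M = [[m11, m12], [m21, m22]]
    in the e_k, f_k plane; requires m11 + m12*a_kk != 0."""
    A = np.asarray(A, dtype=float)
    (m11, m12), (m21, m22) = M
    d = m11 + m12 * A[k, k]
    if np.isclose(d, 0.0):
        raise ValueError("zero denominator: m11 + m12*a_kk = 0")
    S = A - m12 * np.outer(A[:, k], A[k, :]) / d      # entries with i, j != k
    S[:, k] = A[:, k] / d                             # hat-Y
    S[k, :] = (m11 * m22 - m12 * m21) * A[k, :] / d   # hat-X (times det M)
    S[k, k] = (m21 + m22 * A[k, k]) / d               # hat-pivot
    return S

M1 = np.array([[1.0, 2.0], [0.5, 3.0]])
M2 = np.array([[2.0, -1.0], [1.0, 1.0]])
assert np.allclose(general_sweep(general_sweep(A, 0, M2), 0, M1),
                   general_sweep(A, 0, M1 @ M2))      # property 3.)
```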
Ok, what still remains to be done at some point is:
a) Write down the explicit formula for $\hbox{Sweep}_k^M[A]$ at a general index $k$, instead of hiding it under the rug, so to speak, by only explicitly deriving $\hbox{Sweep}_1^M[A]$.
b) See what happens when we compose sweeps with the same matrix $M$ in sequence from $k = 1$ to some $k = j$. That is, what does the matrix $\hbox{Sweep}_1^M \cdots \hbox{Sweep}_j^M[A]$ look like? This should also address what the correct generalization of matrix inversion is by performing sweeps for all $k$ from $1$ to $n$ in sequence with the same $M$.
c) What is the connection or generalization of the determinant-calculating algorithm from regular (full) sweeps?
x) Is this useful for anything?
[My commenting policy is at https://terrytao.wordpress.com/about/ . Thus far I see no violation of these policies. -T.]
9 October, 2015 at 9:47 am
MatjazG
An obvious property, implicit in the algebraic and geometric definitions and used implicitly in the inversion property:
6.) Identity: $\hbox{Sweep}_k^I[A] = A$ for all $k$ and all $n \times n$ matrices $A$, where $I$ is the $2 \times 2$ identity matrix.
Also:
7.) Scaling:
For $\lambda \neq 0$, if we use $\lambda M$ instead of $M$, we get for $\hat a_{11}$ and $\hat B$:

$\hat a_{11}(\lambda M) = \hat a_{11}(M), \quad \hat B(\lambda M) = \hat B(M),$

but:

$\hat X(\lambda M) = \lambda \, \hat X(M), \quad \hat Y(\lambda M) = \lambda^{-1} \, \hat Y(M).$

In other words, only $\hat X$ and $\hat Y$ change when we multiply the matrix $M$ with $\lambda$, and they do this by scaling with $\lambda$ and $\lambda^{-1}$. This property is also obvious from the algebraic definition.
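A quick numerical check of this scaling property, continuing the `general_sweep` sketch above (with `A` and `M1` as before):

```python
lam = 2.5
S1 = general_sweep(A, 0, M1)
S2 = general_sweep(A, 0, lam * M1)
assert np.isclose(S2[0, 0], S1[0, 0])            # pivot unchanged
assert np.allclose(S2[1:, 1:], S1[1:, 1:])       # block unchanged
assert np.allclose(S2[0, 1:], lam * S1[0, 1:])   # row scales by lambda
assert np.allclose(S2[1:, 0], S1[1:, 0] / lam)   # column scales by 1/lambda
```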
8 October, 2015 at 7:00 am
Anonymous
There is a “the the” but it does not hurt. With this comment there are two so perhaps this comment should be deleted also.
[Corrected, thanks – T.]
8 October, 2015 at 7:40 am
Terence Tao
Thanks to the commenters who noted the alternative names for sweeping in the literature. Based on those, I was able to find a survey of Tsatsomeros at http://www.math.wsu.edu/math/faculty/tsat/files/t2.pdf who attributes the operation independently to a 1960 paper of Tucker (under the name of “Principal Pivot Transform”), a 1960 paper of Efroymson (under the name of “sweeping”), and to a 1966 paper of Duffin, Hazony, and Morrison (under the name of “gyration”), and later called “exchange” in a 1998 paper of Stewart and Stewart. It’s curious that such a widespread notion does not seem to have a consensus name (for instance, none of these terms appear as matrix operations on Wikipedia).
In statistics, it appears that sweeping is primarily applied to matrices that are either positive definite, or have a large positive definite minor. In those cases it appears that the problem of having the pivot too close to zero is largely eliminated, as the pivots are all nonnegative and one can pick the largest one at each stage. Of course, there is still a difficulty if the matrix is close to singular, but presumably there are ways to deal with this case in applications.
10 October, 2015 at 8:17 pm
victor
Is there a connection between the sweeping operation described here and the balayage (sweeping) operator in potential theory?
https://en.wikipedia.org/wiki/Balayage
10 October, 2015 at 9:13 pm
anon
Is there a standard definition of a graph as used in this post? This differs from a graph as a set of vertices and edges.
11 October, 2015 at 1:45 pm
Terence Tao
See https://en.wikipedia.org/wiki/Graph_of_a_function .
12 October, 2015 at 5:07 am
arch1
second vector -> second factor
[Corrected, thanks – T.]
14 October, 2015 at 9:39 am
Yi HUANG
Hi, Terry,
At first glance, this sweeping reminds me of the
Auscher-Axelsson-McIntosh first order formalism
for second order elliptic equations.
The article I mean is
http://link.springer.com/article/10.1007/s11512-009-0108-2
The sweeping operation appears in the formula above the equation (11) on page 264.
Best regards, Yi
19 October, 2015 at 8:02 am
francisrlb
The formula for the sweep of a matrix looks similar to the formula for the cluster mutation of a matrix. Do you know of any connection between the two?
24 October, 2015 at 12:09 pm
Ammar Husain
@francisrlb, That was the first instinct I had too. But there is no mention of whether this matrix is totally positive, so maybe one would have to check how this transformation behaves with minors. (I haven’t done this.) If so, then you could see the Poisson structure and win: usefulness in integrability gives usefulness in numerical analysis.