You are currently browsing the tag archive for the ‘linear algebra’ tag.

A popular way to visualise relationships between some finite number of sets is via Venn diagrams, or more generally Euler diagrams. In these diagrams, a set is depicted as a two-dimensional shape such as a disk or a rectangle, and the various Boolean relationships between these sets (e.g., that one set is contained in another, or that the intersection of two of the sets is equal to a third) is represented by the Boolean algebra of these shapes; Venn diagrams correspond to the case where the sets are in “general position” in the sense that all non-trivial Boolean combinations of the sets are non-empty. For instance to depict the general situation of two sets ${A,B}$ together with their intersection ${A \cap B}$ and ${A \cup B}$ one might use a Venn diagram such as

(where we have given each region depicted a different color, and moved the edges of each region a little away from each other in order to make them all visible separately), but if one wanted to instead depict a situation in which the intersection ${A \cap B}$ was empty, one could use an Euler diagram such as

One can use the area of various regions in a Venn or Euler diagram as a heuristic proxy for the cardinality ${|A|}$ (or measure ${\mu(A)}$) of the set ${A}$ corresponding to such a region. For instance, the above Venn diagram can be used to intuitively justify the inclusion-exclusion formula

$\displaystyle |A \cup B| = |A| + |B| - |A \cap B|$

for finite sets ${A,B}$, while the above Euler diagram similarly justifies the special case

$\displaystyle |A \cup B| = |A| + |B|$

for finite disjoint sets ${A,B}$.

While Venn and Euler diagrams are traditionally two-dimensional in nature, there is nothing preventing one from using one-dimensional diagrams such as

or even three-dimensional diagrams such as this one from Wikipedia:

Of course, in such cases one would use length or volume as a heuristic proxy for cardinality or measure, rather than area.

With the addition of arrows, Venn and Euler diagrams can also accommodate (to some extent) functions between sets. Here for instance is a depiction of a function ${f: A \rightarrow B}$, the image ${f(A)}$ of that function, and the image ${f(A')}$ of some subset ${A'}$ of ${A}$:

Here one can illustrate surjectivity of ${f: A \rightarrow B}$ by having ${f(A)}$ fill out all of ${B}$; one can similarly illustrate injectivity of ${f}$ by giving ${f(A)}$ exactly the same shape (or at least the same area) as ${A}$. So here for instance might be how one would illustrate an injective function ${f: A \rightarrow B}$:

Cartesian product operations can be incorporated into these diagrams by appropriate combinations of one-dimensional and two-dimensional diagrams. Here for instance is a diagram that illustrates the identity ${(A \cup B) \times C = (A \times C) \cup (B \times C)}$:

In this blog post I would like to propose a similar family of diagrams to illustrate relationships between vector spaces (over a fixed base field ${k}$, such as the reals) or abelian groups, rather than sets. The categories of (${k}$-)vector spaces and abelian groups are quite similar in many ways; the former consists of modules over a base field ${k}$, while the latter consists of modules over the integers ${{\bf Z}}$; also, both categories are basic examples of abelian categories. The notion of a dimension in a vector space is analogous in many ways to that of cardinality of a set; see this previous post for an instance of this analogy (in the context of Shannon entropy). (UPDATE: I have learned that an essentially identical notation has also been proposed in an unpublished manuscript of Ravi Vakil.)

Alice Guionnet, Assaf Naor, Gilles Pisier, Sorin Popa, Dimitri Shylakhtenko, and I are organising a three month program here at the Institute for Pure and Applied Mathematics (IPAM) on the topic of Quantitative Linear Algebra.  The purpose of this program is to bring together mathematicians and computer scientists (both junior and senior) working in various quantitative aspects of linear operators, particularly in large finite dimension.  Such aspects include, but are not restricted to discrepancy theory, spectral graph theory, random matrices, geometric group theory, ergodic theory, von Neumann algebras, as well as specific research directions such as the Kadison-Singer problem, the Connes embedding conjecture and the Grothendieck inequality.  There will be several workshops and tutorials during the program (for instance I will be giving a series of introductory lectures on random matrix theory).

While we already have several confirmed participants, we are still accepting applications for this program until Dec 4; details of the application process may be found at this page.

I just learned (from Emmanuel Kowalski’s blog) that the AMS has just started a repository of open-access mathematics lecture notes.  There are only a few such sets of notes there at present, but hopefully it will grow in the future; I just submitted some old lecture notes of mine from an undergraduate linear algebra course I taught in 2002 (with some updating of format and fixing of various typos).

[Update, Dec 22: my own notes are now on the repository.]

One of the most basic theorems in linear algebra is that every finite-dimensional vector space has a finite basis. Let us give a statement of this theorem in the case when the underlying field is the rationals:

Theorem 1 (Finite generation implies finite basis, infinitary version) Let ${V}$ be a vector space over the rationals ${{\mathbb Q}}$, and let ${v_1,\ldots,v_n}$ be a finite collection of vectors in ${V}$. Then there exists a collection ${w_1,\ldots,w_k}$ of vectors in ${V}$, with ${1 \leq k \leq n}$, such that

• (${w}$ generates ${v}$) Every ${v_j}$ can be expressed as a rational linear combination of the ${w_1,\ldots,w_k}$.
• (${w}$ independent) There is no non-trivial linear relation ${a_1 w_1 + \ldots + a_k w_k = 0}$, ${a_1,\ldots,a_k \in {\mathbb Q}}$ among the ${w_1,\ldots,w_m}$ (where non-trivial means that the ${a_i}$ are not all zero).

In fact, one can take ${w_1,\ldots,w_m}$ to be a subset of the ${v_1,\ldots,v_n}$.

Proof: We perform the following “rank reduction argument”. Start with ${w_1,\ldots,w_k}$ initialised to ${v_1,\ldots,v_n}$ (so initially we have ${k=n}$). Clearly ${w}$ generates ${v}$. If the ${w_i}$ are linearly independent then we are done. Otherwise, there is a non-trivial linear relation between them; after shuffling things around, we see that one of the ${w_i}$, say ${w_k}$, is a rational linear combination of the ${w_1,\ldots,w_{k-1}}$. In such a case, ${w_k}$ becomes redundant, and we may delete it (reducing the rank ${k}$ by one). We repeat this procedure; it can only run for at most ${n}$ steps and so terminates with ${w_1,\ldots,w_m}$ obeying both of the desired properties. $\Box$

In additive combinatorics, one often wants to use results like this in finitary settings, such as that of a cyclic group ${{\mathbb Z}/p{\mathbb Z}}$ where ${p}$ is a large prime. Now, technically speaking, ${{\mathbb Z}/p{\mathbb Z}}$ is not a vector space over ${{\mathbb Q}}$, because one only multiply an element of ${{\mathbb Z}/p{\mathbb Z}}$ by a rational number if the denominator of that rational does not divide ${p}$. But for ${p}$ very large, ${{\mathbb Z}/p{\mathbb Z}}$ “behaves” like a vector space over ${{\mathbb Q}}$, at least if one restricts attention to the rationals of “bounded height” – where the numerator and denominator of the rationals are bounded. Thus we shall refer to elements of ${{\mathbb Z}/p{\mathbb Z}}$ as “vectors” over ${{\mathbb Q}}$, even though strictly speaking this is not quite the case.

On the other hand, saying that one element of ${{\mathbb Z}/p{\mathbb Z}}$ is a rational linear combination of another set of elements is not a very interesting statement: any non-zero element of ${{\mathbb Z}/p{\mathbb Z}}$ already generates the entire space! However, if one again restricts attention to rational linear combinations of bounded height, then things become interesting again. For instance, the vector ${1}$ can generate elements such as ${37}$ or ${\frac{p-1}{2}}$ using rational linear combinations of bounded height, but will not be able to generate such elements of ${{\mathbb Z}/p{\mathbb Z}}$ as ${\lfloor\sqrt{p}\rfloor}$ without using rational numbers of unbounded height.

For similar reasons, the notion of linear independence over the rationals doesn’t initially look very interesting over ${{\mathbb Z}/p{\mathbb Z}}$: any two non-zero elements of ${{\mathbb Z}/p{\mathbb Z}}$ are of course rationally dependent. But again, if one restricts attention to rational numbers of bounded height, then independence begins to emerge: for instance, ${1}$ and ${\lfloor\sqrt{p}\rfloor}$ are independent in this sense.

Thus, it becomes natural to ask whether there is a “quantitative” analogue of Theorem 1, with non-trivial content in the case of “vector spaces over the bounded height rationals” such as ${{\mathbb Z}/p{\mathbb Z}}$, which asserts that given any bounded collection ${v_1,\ldots,v_n}$ of elements, one can find another set ${w_1,\ldots,w_k}$ which is linearly independent “over the rationals up to some height”, such that the ${v_1,\ldots,v_n}$ can be generated by the ${w_1,\ldots,w_k}$ “over the rationals up to some height”. Of course to make this rigorous, one needs to quantify the two heights here, the one giving the independence, and the one giving the generation. In order to be useful for applications, it turns out that one often needs the former height to be much larger than the latter; exponentially larger, for instance, is not an uncommon request. Fortunately, one can accomplish this, at the cost of making the height somewhat large:

Theorem 2 (Finite generation implies finite basis, finitary version) Let ${n \geq 1}$ be an integer, and let ${F: {\mathbb N} \rightarrow {\mathbb N}}$ be a function. Let ${V}$ be an abelian group which admits a well-defined division operation by any natural number of size at most ${C(F,n)}$ for some constant ${C(F,n)}$ depending only on ${F,n}$; for instance one can take ${V = {\mathbb Z}/p{\mathbb Z}}$ for ${p}$ a prime larger than ${C(F,n)}$. Let ${v_1,\ldots,v_n}$ be a finite collection of “vectors” in ${V}$. Then there exists a collection ${w_1,\ldots,w_k}$ of vectors in ${V}$, with ${1 \leq k \leq n}$, as well an integer ${M \geq 1}$, such that

• (Complexity bound) ${M \leq C(F,n)}$ for some ${C(F,n)}$ depending only on ${F, n}$.
• (${w}$ generates ${v}$) Every ${v_j}$ can be expressed as a rational linear combination of the ${w_1,\ldots,w_k}$ of height at most ${M}$ (i.e. the numerator and denominator of the coefficients are at most ${M}$).
• (${w}$ independent) There is no non-trivial linear relation ${a_1 w_1 + \ldots + a_k w_k = 0}$ among the ${w_1,\ldots,w_k}$ in which the ${a_1,\ldots,a_k}$ are rational numbers of height at most ${F(M)}$.

In fact, one can take ${w_1,\ldots,w_k}$ to be a subset of the ${v_1,\ldots,v_n}$.

Proof: We perform the same “rank reduction argument” as before, but translated to the finitary setting. Start with ${w_1,\ldots,w_k}$ initialised to ${v_1,\ldots,v_n}$ (so initially we have ${k=n}$), and initialise ${M=1}$. Clearly ${w}$ generates ${v}$ at this height. If the ${w_i}$ are linearly independent up to rationals of height ${F(M)}$ then we are done. Otherwise, there is a non-trivial linear relation between them; after shuffling things around, we see that one of the ${w_i}$, say ${w_k}$, is a rational linear combination of the ${w_1,\ldots,w_{k-1}}$, whose height is bounded by some function depending on ${F(M)}$ and ${k}$. In such a case, ${w_k}$ becomes redundant, and we may delete it (reducing the rank ${k}$ by one), but note that in order for the remaining ${w_1,\ldots,w_{k-1}}$ to generate ${v_1,\ldots,v_n}$ we need to raise the height upper bound for the rationals involved from ${M}$ to some quantity ${M'}$ depending on ${M, F(M), k}$. We then replace ${M}$ by ${M'}$ and continue the process. We repeat this procedure; it can only run for at most ${n}$ steps and so terminates with ${w_1,\ldots,w_m}$ and ${M}$ obeying all of the desired properties. (Note that the bound on ${M}$ is quite poor, being essentially an ${n}$-fold iteration of ${F}$! Thus, for instance, if ${F}$ is exponential, then the bound on ${M}$ is tower-exponential in nature.) $\Box$

(A variant of this type of approximate basis lemma was used in my paper with Van Vu on the singularity probability of random Bernoulli matrices.)

Looking at the statements and proofs of these two theorems it is clear that the two results are in some sense the “same” result, except that the latter has been made sufficiently quantitative that it is meaningful in such finitary settings as ${{\mathbb Z}/p{\mathbb Z}}$. In this note I will show how this equivalence can be made formal using the language of non-standard analysis. This is not a particularly deep (or new) observation, but it is perhaps the simplest example I know of that illustrates how nonstandard analysis can be used to transfer a quantifier-heavy finitary statement, such as Theorem 2, into a quantifier-light infinitary statement, such as Theorem 1, thus lessening the need to perform “epsilon management” duties, such as keeping track of unspecified growth functions such as ${F}$. This type of transference is discussed at length in this previous blog post of mine.

In this particular case, the amount of effort needed to set up the nonstandard machinery in order to reduce Theorem 2 from Theorem 1 is too great for this transference to be particularly worthwhile, especially given that Theorem 2 has such a short proof. However, when performing a particularly intricate argument in additive combinatorics, in which one is performing a number of “rank reduction arguments”, “energy increment arguments”, “regularity lemmas”, “structure theorems”, and so forth, the purely finitary approach can become bogged down with all the epsilon management one needs to do to organise all the parameters that are flying around. The nonstandard approach can efficiently hide a large number of these parameters from view, and it can then become worthwhile to invest in the nonstandard framework in order to clean up the rest of a lengthy argument. Furthermore, an advantage of moving up to the infinitary setting is that one can then deploy all the firepower of an existing well-developed infinitary theory of mathematics (in this particular case, this would be the theory of linear algebra) out of the box, whereas in the finitary setting one would have to painstakingly finitise each aspect of such a theory that one wished to use (imagine for instance trying to finitise the rank-nullity theorem for rationals of bounded height).

The nonstandard approach is very closely related to use of compactness arguments, or of the technique of taking ultralimits and ultraproducts; indeed we will use an ultrafilter in order to create the nonstandard model in the first place.

I will also discuss a two variants of both Theorem 1 and Theorem 2 which have actually shown up in my research. The first is that of the regularity lemma for polynomials over finite fields, which came up when studying the equidistribution of such polynomials (in this paper with Ben Green). The second comes up when is dealing not with a single finite collection ${v_1,\ldots,v_n}$ of vectors, but rather with a family ${(v_{h,1},\ldots,v_{h,n})_{h \in H}}$ of such vectors, where ${H}$ ranges over a large set; this gives rise to what we call the sunflower lemma, and came up in this recent paper of myself, Ben Green, and Tamar Ziegler.

This post is mostly concerned with nonstandard translations of the “rank reduction argument”. Nonstandard translations of the “energy increment argument” and “density increment argument” were briefly discussed in this recent post; I may return to this topic in more detail in a future post.

In my discussion of the Oppenheim conjecture in my recent post on Ratner’s theorems, I mentioned in passing the simple but crucial fact that the (orthochronous) special orthogonal group $SO(Q)^+$ of an indefinite quadratic form on ${\Bbb R}^3$ can be generated by unipotent elements. This is not a difficult fact to prove, as one can simply diagonalise Q and then explicitly write down some unipotent elements (the magic words here are “null rotations“). But this is a purely algebraic approach; I thought it would also be instructive to show the geometric (or dynamic) reason for why unipotent elements appear in the orthogonal group of indefinite quadratic forms in three dimensions. (I’ll give away the punch line right away: it’s because the parabola is a conic section.) This is not a particularly deep or significant observation, and will not be surprising to the experts, but I would like to record it anyway, as it allows me to review some useful bits and pieces of elementary linear algebra.

I’ve had a number of people ask me (especially in light of some recent publicity) exactly what “compressed sensing” means, and how a “single pixel camera” could possibly work (and how it might be advantageous over traditional cameras in certain circumstances). There is a large literature on the subject, but as the field is relatively recent, there does not yet appear to be a good non-technical introduction to the subject. So here’s my stab at the topic, which should hopefully be accessible to a non-mathematical audience.

For sake of concreteness I’ll primarily discuss the camera application, although compressed sensing is a more general measurement paradigm which is applicable to other contexts than imaging (e.g. astronomy, MRI, statistical selection, etc.), as I’ll briefly remark upon at the end of this post.