You are currently browsing the monthly archive for March 2009.

Jean-Pierre Serre (whose papers are, of course, always worth reading) recently posted a lovely lecture on the arXiv entitled “How to use finite fields for problems concerning infinite fields”. In it, he describes several ways in which algebraic statements over fields of zero characteristic, such as ${{\mathbb C}}$, can be deduced from their positive characteristic counterparts such as ${F_{p^m}}$, despite the fact that there is no non-trivial field homomorphism between the two types of fields. In particular finitary tools, including such basic concepts as cardinality, can now be deployed to establish infinitary results. This leads to some simple and elegant proofs of non-trivial algebraic results which are not easy to establish by other means.

One deduction of this type is based on the idea that positive characteristic fields can partially model zero characteristic fields, and proceeds like this: if a certain algebraic statement failed over (say) ${{\mathbb C}}$, then there should be a “finitary algebraic” obstruction that “witnesses” this failure over ${{\mathbb C}}$. Because this obstruction is both finitary and algebraic, it must also be definable in some (large) finite characteristic, thus leading to a comparable failure over a finite characteristic field. Taking contrapositives, one obtains the claim.

Algebra is definitely not my own field of expertise, but it is interesting to note that similar themes have also come up in my own area of additive combinatorics (and more generally arithmetic combinatorics), because the combinatorics of addition and multiplication on finite sets is definitely of a “finitary algebraic” nature. For instance, a recent paper of Vu, Wood, and Wood establishes a finitary “Freiman-type” homomorphism from (finite subsets of) the complex numbers to large finite fields that allows them to pull back many results in arithmetic combinatorics in finite fields (e.g. the sum-product theorem) to the complex plane. (Van Vu and I also used a similar trick to control the singularity property of random sign matrices by first mapping them into finite fields in which cardinality arguments became available.) And I have a particular fondness for correspondences between finitary and infinitary mathematics; the correspondence Serre discusses is slightly different from the one I discuss for instance in here or here, although there seems to be a common theme of “compactness” (or of model theory) tying these correspondences together.

As one of his examples, Serre cites one of my own favourite results in algebra, discovered independently by Ax and by Grothendieck (and then rediscovered many times since). Here is a special case of that theorem:

Theorem 1 (Ax-Grothendieck theorem, special case) Let ${P: {\mathbb C}^n \rightarrow {\mathbb C}^n}$ be a polynomial map from a complex vector space to itself. If ${P}$ is injective, then ${P}$ is bijective.

The full version of the theorem allows one to replace ${{\mathbb C}^n}$ by an algebraic variety ${X}$ over any algebraically closed field, and for ${P}$ to be an morphism from the algebraic variety ${X}$ to itself, but for simplicity I will just discuss the above special case. This theorem is not at all obvious; it is not too difficult (see Lemma 4 below) to show that the Jacobian of ${P}$ is non-degenerate, but this does not come close to solving the problem since one would then be faced with the notorious Jacobian conjecture. Also, the claim fails if “polynomial” is replaced by “holomorphic”, due to the existence of Fatou-Bieberbach domains.

In this post I would like to give the proof of Theorem 1 based on finite fields as mentioned by Serre, as well as another elegant proof of Rudin that combines algebra with some elementary complex variable methods. (There are several other proofs of this theorem and its generalisations, for instance a topological proof by Borel, which I will not discuss here.)

Update, March 8: Some corrections to the finite field proof. Thanks to Matthias Aschenbrenner also for clarifying the relationship with Tarski’s theorem and some further references.

This is a continuation of the 700-799 thread of the polymath1 project, which is now full.  During the course of that thread, we have made significant progress on the three problems being focused on:

1.  Upper and lower bounds for $c_n$ for small n.

Let $c_n$ be the largest size of a set in $[3]^n$ without a combinatorial line.   We now have both human and computer-assisted proofs of the first few values of this sequence:

$c_0=1; c_1=2; c_2=6; c_3=18; c_4=52; c_5 = 150; c_6 = 450$.

The current best-known bounds for $c_7$ are $1302 \leq c_7 \leq 1348$.  Given the gap involved here, and the rate at which the complexity of the problem has increased with n, it seems unlikely that we will be able to compute $c_7$ exactly any time soon, but it is possible that some improvement can still be made here.

2. A hyper-optimistic conjecture

Consider a variant of the above problem in which each element of $[3]^n$ with a 1s, b 2s, and c 3s is weighted by the factor $\frac{a! b! c!}{n!}$; this gives $[3]^n$ a total weight of $\frac{(n+1)(n+2)}{2}$. Let $c^\mu_n$ be the largest weight of a line-free set of $[3]^n$, and let $\overline{c}^\mu_n$ be the largest size of a subset of

$\Delta_n := \{ (a,b,c) \in {\Bbb Z}_+^3: a+b+c=n \}$

which contains no upward-pointing equilateral triangles $(a+r,b,c), (a,b+r,c), (a,b,c+r)$ with r>0. It is known that $c^\mu_n \geq \overline{c}^\mu_n$; the “hyper-optimistic conjecture” is that one in fact has $c^\mu_n = \overline{c}^\mu_n$. This would imply density Hales-Jewett for k=3.

Currently, the conjecture is verified for $n \leq 5$, where the values of $c^\mu_n = \overline{c}^\mu_n$ for $n=0,1,2,3,4,5$ are 1,2,4,6,9,12 respectively; see this page and this page for details.  It seems feasible to handle $n=6$.  Currently we know that $15 \leq \overline{c}^\mu_6 \leq 17$ and $\overline{c}^\mu_6 \leq c^\mu_6 \leq 28$.

3.  Asymptotics for Moser’s cube problem

Moser’s cube problem asks to compute the largest size $c'_n$ of a subset of the cube $[3]^n$ without geometric lines.  The first few values of $c'_n$ are known:

$c'_0=1; c'_1 = 2; c'_2 = 6; c'_3 = 16; c'_4 = 43$.

The best asymptotic lower bound known is still of the order of $3^n/\sqrt{n}$.  Improving this bound seems related to the well-known problem of improving the bounds in Behrend’s construction of an AP-3 free set of integers.

We are quite close now to pinning down $c'_5$; we know that it is equal to either 124 or 125, and it is looking increasingly unlikely that it is 125.

Van Vu and I have just uploaded to the arXiv our paper “Random matrices: The distribution of the smallest singular values“, submitted to Geom. Func. Anal..   This paper concerns the least singular value $\sigma_n(M)$ of a random $n \times n$ matrix $M_n = (a_{ij})_{1 \leq i,j \leq n}$ with iid entries, which for simplicity we will take to be real (we also have analogues for complex random matrices), with mean zero and variance one.  A typical model to keep in mind here is the Bernoulli model, when each $a_{ij}$ is equal to +1 or -1 with an equal probability of each, while a privileged role will be planed by the gaussian model, when each $a_{ij} \equiv N(0,1)$ is given the standard gaussian distribution.

The distribution of the least singular value $\sigma_n(M_n)$, which is of importance in smoothed analysis and also has intrinsic interest within the field of random matrices, has been intensively studied in recent years.  For instance, in the Bernoulli case, there have been several recent papers on the singularity probability ${\Bbb P}( \sigma_n(M_n) = 0 )$; it is not hard to obtain a lower bound of $(\frac{1}{2}+o(1))^n$, and this is conjectured to be the correct asymptotic.  The best upper bound so far is by Bourgain, Vu, and Wood, who obtain $(\frac{1}{\sqrt{2}}+o(1))^n$.

Upper and lower tail bounds have also been obtained, starting with the breakthrough paper of Rudelson (building upon some earlier work on rectangular matrices by Litvak, Pajor, Rudelson, and Tomczak-Jaegermann), with subsequent papers by Van and myself, by Rudelson, and also by Rudelson and Vershynin.  To oversimplify somewhat, the conclusion of this work is that the least singular value $\sigma_n(M_n)$ has size comparable to $1/\sqrt{n}$ with high probability.  The techniques are based in part on inverse Littlewood-Offord theorems.

However, in the case of the gaussian ensemble, we know more than just the expected size of the least singular value; we know its asymptotic distribution.  Indeed, it was shown by Edelman in this case that one has

${\Bbb P}( \sigma_n(M_n) \leq t/\sqrt{n} ) = 1 - e^{-t^2/2 - t} + o(1)$ (1)

for any fixed $t > 0$.  This computation was highly algebraic in nature, relying on special identities that are available only for extremely symmetric random matrix ensembles, such as the gaussian random matrix model; in particular, it is not obvious at all that the Bernoulli ensemble necessarily obeys the same distribution as the gaussian one.  Nevertheless, motivated in part by this computation, Spielman and Teng conjectured that the bound

${\Bbb P}( \sigma_n(M_n) \leq t/\sqrt{n} ) \leq t + c^n$

should hold for some $c < 1$ for, say, the Bernoulli ensemble.    This conjecture was verified up to losses of a multiplicative constant by Rudelson and Vershynin.

The main result of our paper is to show that the distribution of the least singular value is in fact universal, being asymptotically the same for all iid (real) random matrix models with the same mean and variance, and with a sufficiently high number of moment conditions.  In particular, the asymptotic (1) for the gaussian ensemble is also true for the Bernoulli ensemble. Furthermore the error term o(1) can be shown to be of the shape $O(n^{-c})$ for some c > 0, which in turn confirms the Spielman-Teng conjecture (without a loss of constant) in the polynomial size regime $t \geq n^{-c'}$ for some $c' > 0$.  We also have some further results for other low singular values (e.g. $\sigma_{n-k}(M_n)$ for fixed k) but they are harder to state, and we will not do so here.

To our knowledge, this is the first universality result for the “hard edge” of the spectrum (i.e. the least few singular values) for iid square matrix models.  [For rectangular matrices, where the hard edge is bounded away from zero, universality was recently established by Feldheim and Sodin.] The bulk distribution for the singular values of such matrices has been known for some time (it is governed by the famous Marchenko-Pastur law), while the distribution at the “soft edge” of the spectrum (i.e. the largest few singular values) was established to be universal by Soshnikov (here the distribution is governed by the Tracy-Widom law for the top singular value, and by the Airy kernel for the next few singular values).  Both of these results are basically obtained by the moment method (or close relatives of this method, such as the resolvent method).  However, the moment method is not effective for discerning the hard edge of the spectrum, since the singular values here are very small compared with the bulk and so have a negligible influence on the moments.  [In the rectangular case, where the hard edge is bounded away from zero, the moment method becomes available again, though the application of it is quite delicate; see the Feldheim-Sodin paper for details.] Instead, we proceed by observing a certain central limit theorem type behaviour for the geometry of the columns of $M_n$, which is enough to give us the desired universality; more details on the proof lie below the fold.

A key theme in real analysis is that of studying general functions ${f: X \rightarrow {\bf R}}$ or ${f: X \rightarrow {\bf C}}$ by first approximating them by “simpler” or “nicer” functions. But the precise class of “simple” or “nice” functions may vary from context to context. In measure theory, for instance, it is common to approximate measurable functions by indicator functions or simple functions. But in other parts of analysis, it is often more convenient to approximate rough functions by continuous or smooth functions (perhaps with compact support, or some other decay condition), or by functions in some algebraic class, such as the class of polynomials or trigonometric polynomials.

In order to approximate rough functions by more continuous ones, one of course needs tools that can generate continuous functions with some specified behaviour. The two basic tools for this are Urysohn’s lemma, which approximates indicator functions by continuous functions, and the Tietze extension theorem, which extends continuous functions on a subdomain to continuous functions on a larger domain. An important consequence of these theorems is the Riesz representation theorem for linear functionals on the space of compactly supported continuous functions, which describes such functionals in terms of Radon measures.

Sometimes, approximation by continuous functions is not enough; one must approximate continuous functions in turn by an even smoother class of functions. A useful tool in this regard is the Stone-Weierstrass theorem, that generalises the classical Weierstrass approximation theorem to more general algebras of functions.

As an application of this theory (and of many of the results accumulated in previous lecture notes), we will present (in an optional section) the commutative Gelfand-Neimark theorem classifying all commutative unital ${C^*}$-algebras.