You are currently browsing the monthly archive for November 2010.

One of the key difficulties in performing analysis in infinite-dimensional function spaces, as opposed to finite-dimensional vector spaces, is that the Bolzano-Weierstrass theorem no longer holds: a bounded sequence in an infinite-dimensional function space need not have any convergent subsequences (when viewed using the strong topology). To put it another way, the closed unit ball in an infinite-dimensional function space usually fails to be (sequentially) compact.
As compactness is such a useful property to have in analysis, various tools have been developed over the years to try to salvage some sort of substitute for the compactness property in infinite-dimensional spaces. One of these tools is concentration compactness, which was discussed previously on this blog. This can be viewed as a compromise between weak compactness (which is true in very general circumstances, but is often too weak for applications) and strong compactness (which would be very useful in applications, but is usually false), in which one obtains convergence in an intermediate sense that involves a group of symmetries acting on the function space in question.
Concentration compactness is usually stated and proved in the language of standard analysis: epsilons and deltas, limits and supremas, and so forth. In this post, I wanted to note that one could also state and prove the basic foundations of concentration compactness in the framework of nonstandard analysis, in which one now deals with infinitesimals and ultralimits instead of epsilons and ordinary limits. This is a fairly mild change of viewpoint, but I found it to be informative to view this subject from a slightly different perspective. The nonstandard proofs require a fair amount of general machinery to set up, but conversely, once all the machinery is up and running, the proofs become slightly shorter, and can exploit tools from (standard) infinitary analysis, such as orthogonal projections in Hilbert spaces, or the continuous-pure point decomposition of measures. Because of the substantial amount of setup required, nonstandard proofs tend to have significantly more net complexity than their standard counterparts when it comes to basic results (such as those presented in this post), but the gap between the two narrows when the results become more difficult, and for particularly intricate and deep results it can happen that nonstandard proofs end up being simpler overall than their standard analogues, particularly if the nonstandard proof is able to tap the power of some existing mature body of infinitary mathematics (e.g. ergodic theory, measure theory, Hilbert space theory, or topological group theory) which is difficult to directly access in the standard formulation of the argument.
Read the rest of this entry »

Many structures in mathematics are incomplete in one or more ways. For instance, the field of rationals ${{\bf Q}}$ or the reals ${{\bf R}}$ are algebraically incomplete, because there are some non-trivial algebraic equations (such as ${x^2=2}$ in the case of the rationals, or ${x^2=-1}$ in the case of the reals) which could potentially have solutions (because they do not imply a necessarily false statement, such as ${1=0}$, just using the laws of algebra), but do not actually have solutions in the specified field.

Similarly, the rationals ${{\bf Q}}$, when viewed now as a metric space rather than as a field, are also metrically incomplete, beause there exist sequences in the rationals (e.g. the decimal approximations ${3, 3.1, 3.14, 3.141, \ldots}$ of the irrational number ${\pi}$) which could potentially converge to a limit (because they form a Cauchy sequence), but do not actually converge in the specified metric space.

A third type of incompleteness is that of logical incompleteness, which applies now to formal theories rather than to fields or metric spaces. For instance, Zermelo-Frankel-Choice (ZFC) set theory is logically incomplete, because there exist statements (such as the consistency of ZFC) which could potentially be provable by the theory (because it does not lead to a contradiction, or at least so we believe, just from the axioms and deductive rules of the theory), but is not actually provable in this theory.

A fourth type of incompleteness, which is slightly less well known than the above three, is what I will call elementary incompleteness (and which model theorists call the failure of the countable saturation property). It applies to any structure that is describable by a first-order language, such as a field, a metric space, or a universe of sets. For instance, in the language of ordered real fields, the real line ${{\bf R}}$ is elementarily incomplete, because there exists a sequence of statements (such as the statements ${0 < x < 1/n}$ for natural numbers ${n=1,2,\ldots}$) in this language which are potentially simultaneously satisfiable (in the sense that any finite number of these statements can be satisfied by some real number ${x}$) but are not actually simultaneously satisfiable in this theory.

In each of these cases, though, it is possible to start with an incomplete structure and complete it to a much larger structure to eliminate the incompleteness. For instance, starting with an arbitrary field ${k}$, one can take its algebraic completion (or algebraic closure) ${\overline{k}}$; for instance, ${{\bf C} = \overline{{\bf R}}}$ can be viewed as the algebraic completion of ${{\bf R}}$. This field is usually significantly larger than the original field ${k}$, but contains ${k}$ as a subfield, and every element of ${\overline{k}}$ can be described as the solution to some polynomial equation with coefficients in ${k}$. Furthermore, ${\overline{k}}$ is now algebraically complete (or algebraically closed): every polynomial equation in ${\overline{k}}$ which is potentially satisfiable (in the sense that it does not lead to a contradiction such as ${1=0}$ from the laws of algebra), is actually satisfiable in ${\overline{k}}$.

Similarly, starting with an arbitrary metric space ${X}$, one can take its metric completion ${\overline{X}}$; for instance, ${{\bf R} = \overline{{\bf Q}}}$ can be viewed as the metric completion of ${{\bf Q}}$. Again, the completion ${\overline{X}}$ is usually much larger than the original metric space ${X}$, but contains ${X}$ as a subspace, and every element of ${\overline{X}}$ can be described as the limit of some Cauchy sequence in ${X}$. Furthermore, ${\overline{X}}$ is now a complete metric space: every sequence in ${\overline{X}}$ which is potentially convergent (in the sense of being a Cauchy sequence), is now actually convegent in ${\overline{X}}$.

In a similar vein, we have the Gödel completeness theorem, which implies (among other things) that for any consistent first-order theory ${T}$ for a first-order language ${L}$, there exists at least one completion ${\overline{T}}$ of that theory ${T}$, which is a consistent theory in which every sentence in ${L}$ which is potentially true in ${\overline{T}}$ (because it does not lead to a contradiction in ${\overline{T}}$) is actually true in ${\overline{T}}$. Indeed, the completeness theorem provides at least one model (or structure) ${{\mathfrak U}}$ of the consistent theory ${T}$, and then the completion ${\overline{T} = \hbox{Th}({\mathfrak U})}$ can be formed by interpreting every sentence in ${L}$ using ${{\mathfrak U}}$ to determine its truth value. Note, in contrast to the previous two examples, that the completion is usually not unique in any way; a theory ${T}$ can have multiple inequivalent models ${{\mathfrak U}}$, giving rise to distinct completions of the same theory.

Finally, if one starts with an arbitrary structure ${{\mathfrak U}}$, one can form an elementary completion ${{}^* {\mathfrak U}}$ of it, which is a significantly larger structure which contains ${{\mathfrak U}}$ as a substructure, and such that every element of ${{}^* {\mathfrak U}}$ is an elementary limit of a sequence of elements in ${{\mathfrak U}}$ (I will define this term shortly). Furthermore, ${{}^* {\mathfrak U}}$ is elementarily complete; any sequence of statements that are potentially simultaneously satisfiable in ${{}^* {\mathfrak U}}$ (in the sense that any finite number of statements in this collection are simultaneously satisfiable), will actually be simultaneously satisfiable. As we shall see, one can form such an elementary completion by taking an ultrapower of the original structure ${{\mathfrak U}}$. If ${{\mathfrak U}}$ is the standard universe of all the standard objects one considers in mathematics, then its elementary completion ${{}^* {\mathfrak U}}$ is known as the nonstandard universe, and is the setting for nonstandard analysis.

As mentioned earlier, completion tends to make a space much larger and more complicated. If one algebraically completes a finite field, for instance, one necessarily obtains an infinite field as a consequence. If one metrically completes a countable metric space with no isolated points, such as ${{\bf Q}}$, then one necessarily obtains an uncountable metric space (thanks to the Baire category theorem). If one takes a logical completion of a consistent first-order theory that can model true arithmetic, then this completion is no longer describable by a recursively enumerable schema of axioms, thanks to Gödel’s incompleteness theorem. And if one takes the elementary completion of a countable structure, such as the integers ${{\bf Z}}$, then the resulting completion ${{}^* {\bf Z}}$ will necessarily be uncountable.

However, there are substantial benefits to working in the completed structure which can make it well worth the massive increase in size. For instance, by working in the algebraic completion of a field, one gains access to the full power of algebraic geometry. By working in the metric completion of a metric space, one gains access to powerful tools of real analysis, such as the Baire category theorem, the Heine-Borel theorem, and (in the case of Euclidean completions) the Bolzano-Weierstrass theorem. By working in a logically and elementarily completed theory (aka a saturated model) of a first-order theory, one gains access to the branch of model theory known as definability theory, which allows one to analyse the structure of definable sets in much the same way that algebraic geometry allows one to analyse the structure of algebraic sets. Finally, when working in an elementary completion of a structure, one gains a sequential compactness property, analogous to the Bolzano-Weierstrass theorem, which can be interpreted as the foundation for much of nonstandard analysis, as well as providing a unifying framework to describe various correspondence principles between finitary and infinitary mathematics.

In this post, I wish to expand upon these above points with regard to elementary completion, and to present nonstandard analysis as a completion of standard analysis in much the same way as, say, complex algebra is a completion of real algebra, or real metric geometry is a completion of rational metric geometry.

Combinatorial incidence geometry is the study of the possible combinatorial configurations between geometric objects such as lines and circles. One of the basic open problems in the subject has been the Erdős distance problem, posed in 1946:

Problem 1 (Erdős distance problem) Let ${N}$ be a large natural number. What is the least number ${\# \{ |x_i-x_j|: 1 \leq i < j \leq N \}}$ of distances that are determined by ${N}$ points ${x_1,\ldots,x_N}$ in the plane?

Erdős called this least number ${g(N)}$. For instance, one can check that ${g(3)=1}$ and ${g(4)=2}$, although the precise computation of ${g}$ rapidly becomes more difficult after this. By considering ${N}$ points in arithmetic progression, we see that ${g(N) \leq N-1}$. By considering the slightly more sophisticated example of a ${\sqrt{N} \times \sqrt{N}}$ lattice grid (assuming that ${N}$ is a square number for simplicity), and using some analytic number theory, one can obtain the slightly better asymptotic bound ${g(N) = O( N / \sqrt{\log N} )}$.

On the other hand, lower bounds are more difficult to obtain. As observed by Erdős, an easy argument, ultimately based on the incidence geometry fact that any two circles intersect in at most two points, gives the lower bound ${g(N) \gg N^{1/2}}$. The exponent ${1/2}$ has been slowly increasing over the years by a series of increasingly intricate arguments combining incidence geometry facts with other known results in combinatorial incidence geometry (most notably the Szemerédi-Trotter theorem) and also some tools from additive combinatorics; however, these methods seemed to fall quite short of getting to the optimal exponent of ${1}$. (Indeed, previously to last week, the best lower bound known was approximately ${N^{0.8641}}$, due to Katz and Tardos.)

Very recently, though, Guth and Katz have obtained a near-optimal result:

Theorem 2 One has ${g(N) \gg N / \log N}$.

The proof neatly combines together several powerful and modern tools in a new way: a recent geometric reformulation of the problem due to Elekes and Sharir; the polynomial method as used recently by Dvir, Guth, and Guth-Katz on related incidence geometry problems (and discussed previously on this blog); and the somewhat older method of cell decomposition (also discussed on this blog). A key new insight is that the polynomial method (and more specifically, the polynomial Ham Sandwich theorem, also discussed previously on this blog) can be used to efficiently create cells.

In this post, I thought I would sketch some of the key ideas used in the proof, though I will not give the full argument here (the paper itself is largely self-contained, well motivated, and of only moderate length). In particular I will not go through all the various cases of configuration types that one has to deal with in the full argument, but only some illustrative special cases.

To simplify the exposition, I will repeatedly rely on “pigeonholing cheats”. A typical such cheat: if I have ${n}$ objects (e.g. ${n}$ points or ${n}$ lines), each of which could be of one of two types, I will assume that either all ${n}$ of the objects are of the first type, or all ${n}$ of the objects are of the second type. (In truth, I can only assume that at least ${n/2}$ of the objects are of the first type, or at least ${n/2}$ of the objects are of the second type; but in practice, having ${n/2}$ instead of ${n}$ only ends up costing an unimportant multiplicative constant in the type of estimates used here.) A related such cheat: if one has ${n}$ objects ${A_1,\ldots,A_n}$ (again, think of ${n}$ points or ${n}$ circles), and to each object ${A_i}$ one can associate some natural number ${k_i}$ (e.g. some sort of “multiplicity” for ${A_i}$) that is of “polynomial size” (of size ${O(N^{O(1)})}$), then I will assume in fact that all the ${k_i}$ are in a fixed dyadic range ${[k,2k]}$ for some ${k}$. (In practice, the dyadic pigeonhole principle can only achieve this after throwing away all but about ${n/\log N}$ of the original ${n}$ objects; it is this type of logarithmic loss that eventually leads to the logarithmic factor in the main theorem.) Using the notation ${X \sim Y}$ to denote the assertion that ${C^{-1} Y \leq X \leq CY}$ for an absolute constant ${C}$, we thus have ${k_i \sim k}$ for all ${i}$, thus ${k_i}$ is morally constant.

I will also use asymptotic notation rather loosely, to avoid cluttering the exposition with a certain amount of routine but tedious bookkeeping of constants. In particular, I will use the informal notation ${X \lll Y}$ or ${Y \ggg X}$ to denote the statement that ${X}$ is “much less than” ${Y}$ or ${Y}$ is “much larger than” ${X}$, by some large constant factor.

[Some advertising on behalf of my department.  The inaugural 2009 scholarship was announced on this blog last year. – T.]

Last year, the UCLA mathematics department launched a scholarship opportunity for entering freshman students with exceptional background and promise in mathematics. We intend to offer one new scholarship every year.

The UCLA Math Undergraduate Merit Scholarship provides for full tuition, and a room and board allowance for 4 years. In addition, scholarship recipients follow an individualized accelerated program of study, as determined after consultation with UCLA faculty.  [For instance, this year’s scholarship recipient is currently taking my graduate real analysis class – T.] The program of study leads to a Masters degree in Mathematics in four years.

More information and an application form for the scholarship can be found on the web at:
To be considered for Fall 2011, candidates must apply for the scholarship and also for admission to UCLA on or before November 30, 2010.

Hans Lindblad and I have just uploaded to the arXiv our joint paper “Asymptotic decay for a one-dimensional nonlinear wave equation“, submitted to Analysis & PDE.  This paper, to our knowledge, is the first paper to analyse the asymptotic behaviour of the one-dimensional defocusing nonlinear wave equation

${}-u_{tt}+u_{xx} = |u|^{p-1} u$ (1)

where $u: {\bf R} \times {\bf R} \to {\bf R}$ is the solution and $p>1$ is a fixed exponent.  Nowadays, this type of equation is considered a very simple example of a non-linear wave equation (there is only one spatial dimension, the equation is semilinear, the conserved energy is positive definite and coercive, and there are no derivatives in the nonlinear term), and indeed it is not difficult to show that any solution whose conserved energy

$E[u] := \int_{{\bf R}} \frac{1}{2} |u_t|^2 + \frac{1}{2} |u_x|^2 + \frac{1}{p+1} |u|^{p+1}\ dx$

is finite, will exist globally for all time (and remain finite energy, of course).  In particular, from the one-dimensional Gagliardo-Nirenberg inequality (a variant of the Sobolev embedding theorem), such solutions will remain uniformly bounded in $L^\infty_x({\bf R})$ for all time.

However, this leaves open the question of the asymptotic behaviour of such solutions in the limit as $t \to \infty$.  In higher dimensions, there are a variety of scattering and asymptotic completeness results which show that solutions to nonlinear wave equations such as (1) decay asymptotically in various senses, at least if one is in the perturbative regime in which the solution is assumed small in some sense (e.g. small energy).  For instance, a typical result might be that spatial norms such as $\|u(t)\|_{L^q({\bf R})}$ might go to zero (in an average sense, at least).   In general, such results for nonlinear wave equations are ultimately based on the fact that the linear wave equation in higher dimensions also enjoys an analogous decay as $t \to +\infty$, as linear waves in higher dimensions spread out and disperse over time.  (This can be formalised by decay estimates on the fundamental solution of the linear wave equation, or by basic estimates such as the (long-time) Strichartz estimates and their relatives.)  The idea is then to view the nonlinear wave equation as a perturbation of the linear one.

On the other hand, the solution to the linear one-dimensional wave equation

$-u_{tt} + u_{xx} = 0$ (2)

does not exhibit any decay in time; as one learns in an undergraduate PDE class, the general (finite energy) solution to such an equation is given by the superposition of two travelling waves,

$u(t,x) = f(x+t) + g(x-t)$ (3)

where $f$ and $g$ also have finite energy, so in particular norms such as $\|u(t)\|_{L^\infty_x({\bf R})}$ cannot decay to zero as $t \to \infty$ unless the solution is completely trivial.

Nevertheless, we were able to establish a nonlinear decay effect for equation (1), caused more by the nonlinear right-hand side of (1) than by the linear left-hand side, to obtain $L^\infty_x({\bf R})$ decay on the average:

Theorem 1. (Average $L^\infty_x$ decay) If $u$ is a finite energy solution to (1), then $\frac{1}{2T} \int_{-T}^T \|u(t)\|_{L^\infty_x({\bf R})}$ tends to zero as $T \to \infty$.

Actually we prove a slightly stronger statement than Theorem 1, in that the decay is uniform among all solutions with a given energy bound, but I will stick to the above formulation of the main result for simplicity.

Informally, the reason for the nonlinear decay is as follows.  The linear evolution tries to force waves to move at constant velocity (indeed, from (3) we see that linear waves move at the speed of light $c=1$).  But the defocusing nature of the nonlinearity will spread out any wave that is propagating along a constant velocity worldline.  This intuition can be formalised by a Morawetz-type energy estimate that shows that the nonlinear potential energy must decay along any rectangular slab of spacetime (that represents the neighbourhood of a constant velocity worldline).

Now, just because the linear wave equation propagates along constant velocity worldlines, this does not mean that the nonlinear wave equation does too; one could imagine that a wave packet could propagate along a more complicated trajectory $t \mapsto x(t)$ in which the velocity $x'(t)$ is not constant.  However, energy methods still force the solution of the nonlinear wave equation to obey finite speed of propagation, which in the wave packet context means (roughly speaking) that the nonlinear trajectory $t \mapsto x(t)$ is a Lipschitz continuous function (with Lipschitz constant at most $1$).

And now we deploy a trick which appears to be new to the field of nonlinear wave equations: we invoke the Rademacher differentiation theorem (or Lebesgue differentiation theorem), which asserts that Lipschitz continuous functions are almost everywhere differentiable.  (By coincidence, I am teaching this theorem in my current course, both in one dimension (which is the case of interest here) and in higher dimensions.)  A compactness argument allows one to extract a quantitative estimate from this theorem (cf. this earlier blog post of mine) which, roughly speaking, tells us that there are large portions of the trajectory $t \mapsto x(t)$ which behave approximately linearly at an appropriate scale.  This turns out to be a good enough control on the trajectory that one can apply the Morawetz inequality and rule out the existence of persistent wave packets over long periods of time, which is what leads to Theorem 1.

There is still scope for further work to be done on the asymptotics.  In particular, we still do not have a good understanding of what the asymptotic profile of the solution should be, even in the perturbative regime; standard nonlinear geometric optics methods do not appear to work very well due to the extremely weak decay.

This is the third in a series of posts on the “no self-defeating object” argument in mathematics – a powerful and useful argument based on formalising the observation that any object or structure that is so powerful that it can “defeat” even itself, cannot actually exist.   This argument is used to establish many basic impossibility results in mathematics, such as Gödel’s theorem that it is impossible for any sufficiently sophisticated formal axiom system to prove its own consistency, Turing’s theorem that it is impossible for any sufficiently sophisticated programming language to solve its own halting problem, or Cantor’s theorem that it is impossible for any set to enumerate its own power set (and as a corollary, the natural numbers cannot enumerate the real numbers).

As remarked in the previous posts, many people who encounter these theorems can feel uneasy about their conclusions, and their method of proof; this seems to be particularly the case with regard to Cantor’s result that the reals are uncountable.   In the previous post in this series, I focused on one particular aspect of the standard proofs which one might be uncomfortable with, namely their counterfactual nature, and observed that many of these proofs can be largely (though not completely) converted to non-counterfactual form.  However, this does not fully dispel the sense that the conclusions of these theorems – that the reals are not countable, that the class of all sets is not itself a set, that truth cannot be captured by a predicate, that consistency is not provable, etc. – are highly unintuitive, and even objectionable to “common sense” in some cases.

How can intuition lead one to doubt the conclusions of these mathematical results?  I believe that one reason is because these results are sensitive to the amount of vagueness in one’s mental model of mathematics.  In the formal mathematical world, where every statement is either absolutely true or absolutely false with no middle ground, and all concepts require a precise definition (or at least a precise axiomatisation) before they can be used, then one can rigorously state and prove Cantor’s theorem, Gödel’s theorem, and all the other results mentioned in the previous posts without difficulty.  However, in the vague and fuzzy world of mathematical intuition, in which one’s impression of the truth or falsity of a statement may be influenced by recent mental reference points, definitions are malleable and blurry with no sharp dividing lines between what is and what is not covered by such definitions, and key mathematical objects may be incompletely specified and thus “moving targets” subject to interpretation, then one can argue with some degree of justification that the conclusions of the above results are incorrect; in the vague world, it seems quite plausible that one can always enumerate all the real numbers “that one needs to”, one can always justify the consistency of one’s reasoning system, one can reason using truth as if it were a predicate, and so forth.    The impossibility results only kick in once one tries to clear away the fog of vagueness and nail down all the definitions and mathematical statements precisely.  (To put it another way, the no-self-defeating object argument relies very much on the disconnected, definite, and absolute nature of the boolean truth space $\{\hbox{true},\hbox{ false}\}$ in the rigorous mathematical world.)