Many structures in mathematics are *incomplete* in one or more ways. For instance, the field of rationals or the reals are *algebraically incomplete*, because there are some non-trivial algebraic equations (such as in the case of the rationals, or in the case of the reals) which could *potentially* have solutions (because they do not imply a necessarily false statement, such as , just using the laws of algebra), but do not *actually* have solutions in the specified field.

Similarly, the rationals , when viewed now as a metric space rather than as a field, are also *metrically incomplete*, beause there exist sequences in the rationals (e.g. the decimal approximations of the irrational number ) which could *potentially* converge to a limit (because they form a Cauchy sequence), but do not *actually* converge in the specified metric space.

A third type of incompleteness is that of *logical incompleteness*, which applies now to formal theories rather than to fields or metric spaces. For instance, Zermelo-Frankel-Choice (ZFC) set theory is logically incomplete, because there exist statements (such as the consistency of ZFC) which could *potentially* be provable by the theory (because it does not lead to a contradiction, or at least so we believe, just from the axioms and deductive rules of the theory), but is not *actually* provable in this theory.

A fourth type of incompleteness, which is slightly less well known than the above three, is what I will call *elementary incompleteness* (and which model theorists call the failure of the *countable saturation property*). It applies to any structure that is describable by a first-order language, such as a field, a metric space, or a universe of sets. For instance, in the language of ordered real fields, the real line is elementarily incomplete, because there exists a sequence of statements (such as the statements for natural numbers ) in this language which are *potentially* simultaneously satisfiable (in the sense that any finite number of these statements can be satisfied by some real number ) but are not *actually* simultaneously satisfiable in this theory.

In each of these cases, though, it is possible to start with an incomplete structure and *complete* it to a much larger structure to eliminate the incompleteness. For instance, starting with an arbitrary field , one can take its algebraic completion (or algebraic closure) ; for instance, can be viewed as the algebraic completion of . This field is usually significantly larger than the original field , but contains as a subfield, and every element of can be described as the solution to some polynomial equation with coefficients in . Furthermore, is now algebraically complete (or algebraically closed): every polynomial equation in which is potentially satisfiable (in the sense that it does not lead to a contradiction such as from the laws of algebra), is actually satisfiable in .

Similarly, starting with an arbitrary metric space , one can take its metric completion ; for instance, can be viewed as the metric completion of . Again, the completion is usually much larger than the original metric space , but contains as a subspace, and every element of can be described as the limit of some Cauchy sequence in . Furthermore, is now a complete metric space: every sequence in which is potentially convergent (in the sense of being a Cauchy sequence), is now actually convegent in .

In a similar vein, we have the Gödel completeness theorem, which implies (among other things) that for any consistent first-order theory for a first-order language , there exists at least one *completion* of that theory , which is a consistent theory in which every sentence in which is potentially true in (because it does not lead to a contradiction in ) is actually true in . Indeed, the completeness theorem provides at least one model (or structure) of the consistent theory , and then the completion can be formed by interpreting every sentence in using to determine its truth value. Note, in contrast to the previous two examples, that the completion is usually not unique in any way; a theory can have multiple inequivalent models , giving rise to distinct completions of the same theory.

Finally, if one starts with an arbitrary structure , one can form an *elementary completion* of it, which is a significantly larger structure which contains as a substructure, and such that every element of is an elementary limit of a sequence of elements in (I will define this term shortly). Furthermore, is elementarily complete; any sequence of statements that are potentially simultaneously satisfiable in (in the sense that any finite number of statements in this collection are simultaneously satisfiable), will actually be simultaneously satisfiable. As we shall see, one can form such an elementary completion by taking an ultrapower of the original structure . If is the *standard universe* of all the *standard* objects one considers in mathematics, then its elementary completion is known as the *nonstandard universe*, and is the setting for nonstandard analysis.

As mentioned earlier, completion tends to make a space much larger and more complicated. If one algebraically completes a finite field, for instance, one necessarily obtains an infinite field as a consequence. If one metrically completes a countable metric space with no isolated points, such as , then one necessarily obtains an uncountable metric space (thanks to the Baire category theorem). If one takes a logical completion of a consistent first-order theory that can model true arithmetic, then this completion is no longer describable by a recursively enumerable schema of axioms, thanks to Gödel’s incompleteness theorem. And if one takes the elementary completion of a countable structure, such as the integers , then the resulting completion will necessarily be uncountable.

However, there are substantial benefits to working in the completed structure which can make it well worth the massive increase in size. For instance, by working in the algebraic completion of a field, one gains access to the full power of algebraic geometry. By working in the metric completion of a metric space, one gains access to powerful tools of real analysis, such as the Baire category theorem, the Heine-Borel theorem, and (in the case of Euclidean completions) the Bolzano-Weierstrass theorem. By working in a logically and elementarily completed theory (aka a *saturated model*) of a first-order theory, one gains access to the branch of model theory known as *definability theory*, which allows one to analyse the structure of definable sets in much the same way that algebraic geometry allows one to analyse the structure of algebraic sets. Finally, when working in an elementary completion of a structure, one gains a *sequential compactness* property, analogous to the Bolzano-Weierstrass theorem, which can be interpreted as the foundation for much of nonstandard analysis, as well as providing a unifying framework to describe various correspondence principles between finitary and infinitary mathematics.

In this post, I wish to expand upon these above points with regard to elementary completion, and to present nonstandard analysis as a completion of standard analysis in much the same way as, say, complex algebra is a completion of real algebra, or real metric geometry is a completion of rational metric geometry.

** — 1. Elementary convergence — **

In order to understand the concept of a metric completion of a metric space , one needs to know about the distinction between a Cauchy sequence and a convergent sequence. Similarly, to talk about the elementary completion of a structure , one needs the notion of an elementarily Cauchy sequence and an elementarily convergent sequence.

Let us set out some notation. We assume that we have some first-order language , which allows one to form sentences involving the first-order logical symbols (, , , , , , etc.), variables of one or more types, the equality symbol , some constant symbols, and some operations and relations. For instance:

- could be the language of multiplicative groups, in which there is only one type of object (a group element), a constant symbol , a binary operation from pairs of group elements to group elements, and a unary operation from group elements to group elements.
- could be the language of real ordered fields, in which there is one type of object (a field element), constant symbols , binary operations , and unary operations (with the latter only being defined for non-zero elements), and the order relation .
- could be the language of (formal) metric spaces, in which there are two types of objects (points in the space, and real numbers), the constants, operations and relations of a real ordered field, and a metric operation from pairs of points in the space to real numbers.
- could be the language of sets, in which there is one type of object (a set) and one relation .
- etc., etc.

We assume that the language has at most countably many types, constants, operations, and relations. In particular, there are at most countably many sentences in .

A structure for a language is a way of interpreting each of the object classes in as a set, and each of the constants, operations, and relations as an elements, functions, and relations on those sets respectively. For instance, a structure for the language of groups would be a set , together with a constant symbol , a binary function , and a unary operation . In particular, groups are structures for the language of groups, but so are many non-groups. Each structure can be used to *interpret* any given sentence in , giving it a truth value of true or false. We write if is interpreted to be true by . For instance, the axioms of a group can be expressed as a single sentence , and a structure for the language of groups is a group if and only if .

Now we introduce the notion of elementary convergence.

Definition 1 (Elementary convergence)Let be a structure for a language , and let be a sequence of objects in (all of the same type). Let be another object in of the same type as the .

- We say that the sequence is
elementarily Cauchyif, for every predicate that takes one variable of the same type as the as input, the truth value of becomes eventually constant (i.e. either is true for all sufficiently large , or is false for all sufficiently large ). We write this eventual truth value as .- We say that the sequence is
elementarily convergentto if we have for every predicate that takes one variable of the same type as the or as input.

Remark 1One can view the predicates (or more precisely, the sets ) as generating a topology on (or more precisely, on the domain of one of the object types of in ), in which case elementary convergence can be interpreted as convergence in this topology. Indeed, as there are only countably many predicates, this topology is metrisable.

To give an example, let us use the language of ordered fields , with the model , and pick a transcendental number , e.g. . Then the sequence is elementarily convergent to . The reason for this is that the language is fairly limited in nature, and as such it can only define a fairly small number of sets; in particular, if is a predicate of one variable, then the Tarski-Seidenberg theorem tells us that the set cut out by that set has to be a semi-algebraic set over the algebraic reals, i.e. a finite union of (possibly unbounded) intervals (which can be open, closed, or half-open) whose endpoints are algebraic reals. In particular, a transcendental number , if it lies in such a set, lies in the interior of such a set, and so will also lie in such a set for large enough, and similarly if lies outside such a set.

In contrast, if one picks an algebraic number for , such as , then does *not* converge in an elementary sense to , because one can find a predicate such as which is true for but not true for any of the . So the language has sufficiently “poor vision” that it cannot easily distinguish a transcendental number such as from nearby numbers such as , but its vision is significantly better at algebraic numbers, and in particular can distinguish from easily. So we see that elementary convergence is, in this case, a slightly stronger concept than the usual topological or metric notion of convergence on .

In the case of the real model of ordered fields, elementary limits are unique, but this is not the case in general. For instance, in the language of fields, and using the complex model , any given complex number is elementarily indistinguishable from its complex conjugate , and so any sequence of complex numbers that would converge elementarily to , would also converge elementarily to . (In fact, there is an enormous Galois group , the action of which is completely undetectable with regards to elementary convergence.)

A related problem is that the operations on a structure are not necessarily continuous with respect to these elementary limits. For instance, if are sequences of real numbers that converge elementarily to respectively, it is not necessarily the case that converge to (consider for instance the case when and ).

One way to partially resolve these problem is to consider the convergence not just of sequences of individual objects , but of sequences of families of objects:

Definition 2 (Joint elementary convergence)Let be a structure a language , let be a set, and for each natural number , let be a tuple of of elements in , and let be another tuple in , with each having the same type as .

- We say that the tuples are
jointly elementarily Cauchyif, for every natural number , every predicate of variables in of the appropriate type, and every , the truth value of is eventually constant.- We say that the tuples are
jointly elementarily convergentto if, for every natural number , every predicate of variables in of the appropriate type, and every , the truth value of converges to the truth value of as .

For instance, using the complex model of the language of fields, if converges elementarily to (say) , then we cannot prevent from also converging elementarily to . (Indeed, it is not hard to see that converges elementarily to if and only for all sufficiently large .) But if we ask that jointly converges to , then will not also jointly converge to (though it does jointly converge to ).

In a similar fashion, if are reals that converge *jointly* elementarily to , then will converge elemntarily to also.

Now we give a more sophisticated example. Here, is the language of set theory, and is a model of ZFC. In ZFC set theory, we can of course construct most of the objects we are used to in modern mathematics, such as a copy of the real line, a copy of the natural numbers, and so forth. Note that ‘s interpretation of the natural numbers may be different from the “true” natural numbers ; in particular, in non-standard models of set theory, may be much larger than (e.g. it may be an ultrapower of ). Because of this, we will be careful to subscript ‘s copies of such objects in order to distinguish them from their true counterparts, though it will not make much difference for this immediate example.

We can also define in all the formal apparatus needed for probability theory, such as a probability space and a real-valued random variable on that space.

Now suppose that inside we have a sequence of probability spaces, and a sequence of random variables on these probability spaces. Now suppose the quintuple is jointly elementarily convergent to a limit . The axioms of being a probability space can be encoded inside the first order language of set theory, so the limit is also a probability space (as viewed inside ). Similarly, is a random variable on this probability space.

Now let be rational numbers. If , then by the definition of elementary convergence (and the fact that rational numbers can be defined using an expression of finite length in the language ), we see that holds for all sufficiently large . From this, one can deduce that the converge in distribution to . Thus we see that in this case, joint elementary convergence is at least as strong as convergence in distribution, though (much as with the example with elementary convergence in using the language of ordered fields) the two notions of convergence are not equivalent.

We included in the quintuple due to the use of real numbers such as in the above discussion, but it is not strictly necessary, because one can construct uniquely in from the axioms of set theory by using one of the standard constructions of the real numbers. But note that while we may use the set of real numbers in the above elementary convergence, one cannot invoke specific real numbers unless they are “constructible” in the sense that they can be uniquely specified in the language . If one wished to be able to use arbitrary real numbers as constants, one would not only place into the quintuple, but also place in every element of into the tuple (thus making the tuple quite large, and most likely uncountable, though note from Skolem’s paradox that it is possible for to be (externally) countable even as it is uncountable from the internal perspective of ).

As we see from the above discussion, joint elementary convergence is a useful notion even when some of the elements in the tuple are constant. We isolate this case with another definition:

Definition 3 (Joint relative elementary convergence)Let be a structure for a language , let be sets, and for each natural number , let be a tuple of of elements in , and let and be further tuples in . We say that the tuples arejointly elementarily convergenttorelative to the constantsif the disjoint union is jointly elementarily convergent to .We define the notion of are

jointly elementarily Cauchy relative to the constantssimilarly.

Informally, if is jointly elementarily convergent to relative to if the language is unable to asymptotically distinguish the from the , even if it “knows” about the constants .

** — 2. Elementary completion — **

Not every sequence in a structure is elementarily Cauchy or elementarily convergent. However, we have the following simple fact:

Proposition 4 (Arzelá-Ascoli)Let be a structure for a language , and let be a sequence of elements in (all of the same type). Then there is a subsequence of the that is elementarily Cauchy.

*Proof:* There are at most countably many predicates of a single variable of the right type in . By the infinite pigeonhole principle, we can find a subsequence of the such that is eventually constant. We can find a further subsequence of that sequence for which is eventually constant. We continue extracting subsequences of this nature for each , and then the diagonal sequence is elementarily Cauchy, as desired.

The same argument works when considering countably many variables and countably many constants (as there are still only countably many predicates to deal with):

Proposition 5 (Arzelá-Ascoli)Let be a structure for a language , let be at most countable sets, and for each natural number , let be a tuple of of elements in , and let be another tuple in . Then there is a subsequence which is jointly elementarily Cauchy relative to .

As in the metric case, not every elementarily Cauchy sequence is elementarily convergent, and not every sequence has an elementarily convergent subsequence. For instance, in the language of ordered fields, using the structure , the sequence has no elementarily convergent subsequence (because any limit of such a subsequence would be positive but also less than for arbitrarily large , contradicting the Archimedean property of the reals). From Proposition 4, we conclude that the reals are elementarily incomplete; there must exist some subsequence of that is elementarily Cauchy, but not elementarily convergent.

However, we can always complete any structure by passing to the ultrapower . This concept has been discussed in previous blog posts, so we give only a quick review here.

For the rest of this post, we fix a single non-principal ultrafilter on the (standard) natural numbers . (See this previous blog post for some basic discussion of what non-principal ultrafilters are, and how they are used in non-standard analysis.) A property of a natural number is said to hold *for all sufficiently close to * if the set of for which holds lies in the ultrafilter .

Definition 6 (Ultrapower)Let be a structure for some language . Given two sequences and of objects in , we say that the sequences areequivalentif one has for all sufficiently close to . The equivalence class associated to a given sequence will be called theultralimitof the and denoted . Theultrapowerof is the collection of all ultralimits of sequences of objects in . By identifying with for every object in , we see that every object in can be identified with an object in . We refer to elements of asstandard objects, and elements of asnon-standard objects.Every relation and operation in can be extended to by taking ultralimits. For instance, given a -ary relation , and non-standard objects for , we say that holds in if and only if holds in for all sufficiently close to . Similarly, given a -ary operation and non-standard objects , we define the non-standard object to be the ultralimit of the standard objects .

Ultrapowers are also discussed in more detail in this previous blog post. A fundamental theorem of Los asserts that the ultrapower is elementarily equivalent to : any sentence in which is true in , is also true in , and vice versa; this fact is also known as the transfer principle for nonstandard analysis. For instance, the ultrapower of an ordered field is an ordered field, the ultrapower of an algebraically closed field is an algebraically closed field, and so forth. One must be slightly careful, though, with models that involve standard objects such as a copy of the natural numbers, or a copy of the real numbers; the ultrapower will have their own non-standard copy and of these objects, which are considerably larger than their standard counterparts, in the sense that they contain many more elements. Thus, for instance, if one is taking the ultrapower of a standard probability space , in which the probability measure takes values in the standard reals, the ultrapower is a non-standard probability space, in which the non-standard probability measure now takes values in the non-standard reals.

One can view the ultrapower as the completion of , in much the same way as the reals are a completion of the rationals:

Theorem 7 (Elementary completeness)Every elementarily Cauchy sequence in an ultrapower is elementarily convergent.

This property is also known as countable saturation.

*Proof:* We can write for each natural number and a sequence of standard objects in . As before, we enumerate the predicates of one variable. For each natural number , the truth value of becomes eventually constant; we will call this constant .

Now let be a standard natural number. By construction, there exists an such that

for all . As is the ultralimit of the , there thus exists a set such that

for all . By replacing each with if necessary, we may assume that the are decreasing: .

For each , let be the largest integer in such that , or if no such integer exists. By construction, we see that for any , we haved

whenever and . If we then set to be the non-standard object , we thus have

for each , and thus converges elementarily to as required.

Combining this theorem with Proposition 4 we conclude an analogue of the Bolzano-Weierstrass theorem for ultrapowers:

Corollary 8 (Bolzano-Weierstrass for ultrapowers)In an ultrapower , every sequence of non-standard objects in has an elementarily convergent subsequence .

The same argument works (but with more complicated notation) for countable families of objects, and with countably many constants:

Theorem 9 (Bolzano-Weierstrass for ultrapowers, II)Let be an ultrapower, let be at most countable, let be a sequence of tuples of nonstandard objects in , and let be another sequence of tuples of nonstandard objects. Then there is a subsequence which converges jointly elementarily to a limit relative to the constants .

The proof of this theorem proceeds almost exactly as in the single variable case, the key point being that the number of predicates that one has to stabilise remains countable.

Remark 2If one took the ultrafilter over a larger set than the natural numbers , then one could make the sets larger as well. Such larger saturation properties, beyond countable saturation, are useful in model theory (particularly when combined with the use of large cardinals, such as inaccessible cardinals), but we will not need them here.

Conversely, every nonstandard object can be viewed as the elementary limit of standard objects:

Proposition 10Let be a nonstandard object. Then there is a sequence of standard objects that converges elementarily to .

*Proof:* Let be an enumeration of the predicates of one variable. For any natural number , there exists a nonstandard object such that has the same truth value as for all , namely . By transfer, there must therefore exist a standard object such that has the same truth value as . Thus converges elementarily to , and the claim follows.

Exercise 1If is the ultralimit of a sequence of standard objects, show that there is a subsequence that converges elementarily to .

Exercise 2 (Heine-Borel theorem for structures)Given any structure , show that the following four statements are equivalent:

- (Countable saturation) If are a countable family of predicates, such that if any finite number of are simultaneously satisfiable in (i.e. for each there exists such that holds for all ), then the entire family of are simultaneously satisfiable (i.e. there exists such that holds for all ).
- (Countable compactness) Every countable cover of by sets of the form for some predicate , has a finite subcover.
- (Elementary completeness) Every elementarily Cauchy sequence in has an elementarily convergent subsequence.
- (Bolzano-Weierstrass property) Every elementary sequence in has an elementarily convergent subsequence.

From Proposition 10 and Theorem 7 we see that can be viewed as an elementary completion of , though the analogy with metric completion is not perfect because elementary limits are not unique.

The Bolzano-Weierstrass theorem for ultrapowers can then be used to derive the foundational properties of nonstandard analysis. For instance, consider the standard natural numbers in , and hence in . Applying the Bolzano-Weierstrass theorem for ultraproducts, we conclude that some subsequence of natural numbers will converge elementarily to a non-standard natural number . For any standard natural number , we have for all sufficiently large , and hence on taking elementary limits we have (since is constructible). Thus we have constructed an *unbounded* nonstandard natural number, i.e. a number which is larger than all standard natural numbers.

In a similar spirit, we can also use the Bolzano-Weierstrass theorem construct *infinitesimal* nonstandard real numbers which are positive, but less than every standard positive real number (and in particular less than for any standard ).

More generally, we have the overspill principle: if is a predicate involving some non-standard constants , such that is true for arbitrarily large standard natural numbers , then it must also be true for at least one unbounded nonstandard natural number . Indeed, one simply takes a sequence of standard natural numbers for which , and extracts a subsequence of these which converges elementarily to a non-standard limit (relative to ), which must then be unbounded. Contrapositively, if holds for all unbounded , then it must also hold for all sufficiently large standard .

Similarly, we have the *underspill principle*: if a predicate is true for arbitrarily small positive standard real , then it must also be true for at least one infinitesimal positive non-standard real ; and contrapositively, if it is true for all infinitesimal positive non-standard real , then it is also true for all sufficiently small standard real .

A typical application of these principles is in the nonstandard formulation of continuity:

Proposition 11 (Nonstandard formulation of continuity)Let be a standard function, which can then be extended by ultralimits to the nonstandard completion . Let . Then the following are equivalent:

- is continuous at .
- One has whenever is a nonstandard real such that .

*Proof:* If is continuous, then the “epsilon-delta” definition implies that whenever (so that for every standard ), one has for every standard (by transfer), and thus . Conversely, if is discontinuous at , then there exists a sequence of standard reals converging to such that for some standard ; taking ultralimits using Bolzano-Weierstrass to extract a subsequence that is elementarily convergent relative to , we thus have a non-standard with , so that .

Exercise 3With the notation as above, show that is uniformly continuous if and only if whenever are such that .

Exercise 4With the notation as above, show that is differentiable at a standard real with derivative if and only if for all nonstandard reals .

** — 3. The correspondence principle — **

One can use the Bolzano-Weierstrass theorem for ultrapowers to establish various versions of the correspondence principle, as discussed previously on this blog. A simple example occurs when demonstrating the equivalence of colouring theorems, such as the following:

Theorem 12 (van der Waerden theorem, infinitary version)Suppose the integers are coloured by finitely many colours. Then there exist arbitrarily long monochromatic arithmetic progressions.

Theorem 13 (van der Waerden theorem, finitary version)For every and there exists such that whenever is coloured by colours, there exists a monochromatic arithmetic progression of length .

It is clear that Theorem 13 implies Theorem 12. To deduce Theorem 12 from Theorem 13, we can argue as follows. Suppose Theorem 13 fails, then there exists and , and arbitrarily large standard natural number for which there exists a -colouring of without any monochromatic arithmetic progressions of length . Applying the overspill principle (or the Bolzano-Weierstrass theorem), there must then also exist an unbounded nonstandard natural number for which there exists a -colouring of without any monochromatic arithmetic progressions of length . But the nonstandard interval contains the standard integers as a subset, thus the integers can now also be -coloured without any monochromatic arithmetic progressions of length , contradicting Theorem 12.

As another example, we can relate qualitative and quantitative results in algebraic geometry. For instance, as noted previously on this blog, the following basic result in algebraic geometry,

Theorem 14 (Qualitative decomposition into varieties)Every algebraic set over an algebraically closed field can be decomposed into finitely many algebraic varieties.

is equivalent to the following more quantitative version:

Theorem 15 (Quantitative decomposition into varieties)Every algebraic set of complexity at most over an algebraically closed field can be decomposed at most algebraic varieties, each of complexity at most , where depends only on .

(For a definition of complexity see the previous blog post.)

Clearly Theorem 15 implies Theorem 14. To show the converse implication, suppose that Theorem 15 failed, then there exists such that for every standard natural number there exists an algebraic set of complexity at most over some field that cannot be decomposed into fewer than algebraic varieties. We use the Bolzano-Weierstrass theorem for ultrapowers to extract a subsequence that converges jointly elementarily to some limit . As each of the are algebraically closed fields, the elementary limit is also. As the were algebraic sets over of uniformly bounded complexity, is an algebraic set over , and thus by Theorem 14 is decomposable into at most algebraic varieties for some finite . The property of being decomposable into at most algebraic varieties can be phrased in an elementarily open manner, i.e. as the disjunction of sentences in first-order logic; this is essentially established in the previous blog post, with the key point being that any top-dimensional component of an algebraic set has a lesser degree than that of the original set, and so has a uniform bound on complexity. Thus, we see that for all sufficiently large , must also be decomposable into at most algebraic varieties, a contradiction.

Our final example, namely the Furstenberg correspondence principle, is a bit more sophisticated. Here, we are demonstrating the equivalence of the following two statements:

Theorem 16 (Furstenberg recurrence theorem)Let be a measure-preserving system, and let have positive measure. Let . Then there exists such that has positive measure.

Theorem 17 (Szemerédi’s theorem)Every set of integers of positive upper density contains arbitrarily long arithmetic progressions.

It is easy to use Theorem 17 to show Theorem 16, so we focus on the reverse inclusion. Suppose that Theorem 17 failed, then there exists a standard integer , a standard real , and a standard set of integers of positive upper density at least that has no arithmetic progressions of length . In particular, for arbitrarily large standard , there exists a subset of of density at least without any progressions of length . Applying the overspill principle, there thus exists an unbounded nonstandard and a nonstandard subset of of density at least without any progressions of length .

Let be the collection of all nonstandard subsets of . (Note that not every subset of a nonstandard set remains nonstandard; it may instead merely be an external subset. See this previous blog post for further discussion.) This is a Boolean algebra. From countable saturation we see that this Boolean algebra has the following special property: if any set in this Boolean algebra is partitioned into an (externally) countable family of further elements in this Boolean algebra, then all but finitely many of the are empty. For if this were not the case, then by the axiom of choice, one could find a subsequence and a set of elements of . Passing to a further subsequence, we can assume that the converge elementarily to a limit . But then this limit lies in but not in any of the , a contradiction.

From the above property, we see that any (external) finitely additive measure on is automatically a premeasure, and thus by the Hahn-Kolmogorov extension theorem, can be extended to a countably additive measure on the measure-theoretic completion of the (external) -algebra generated by .

In particular, if we consider the nonstandard normalised counting measure

on and take its standard part,

this is a finitely additive probability measure on , and hence extends to a probability measure in , which we will continue to call . This measure is known as the *Loeb measure* on the nonstandard set . Observe that any nonstandard subset of of infinitesimal density will have Loeb measure zero. On the other hand, the set had density at least , and so will have Loeb measure at least also.

Next, we define the shift by , leaving undefined for . But observe that has an infinitesimal density, hence has Loeb measure zero. So is defined almost everywhere, and is easily seen to be measurable and measure-preserving; it has an (almost everywhere defined) inverse that is also measurable and measure-preserving. Thus with Loeb measure and the shift becomes a measure-preserving system. Applying Theorem 16, we can thus find a standard such that has positive Loeb measure, so contains a -term arithmetic progression, a contradiction.

Remark 3As stated, the measure space structure on is not separable (i.e. countably generated) or regular (coming from a metric space). However, this can be fixed by restricting attention to the much smaller -algebra generated by and its shifts (after dealing with the null sets on which is not defined, e.g. by cutting out the neighbourhood of ). We omit the details.

** — 4. The Szemerédi regularity lemma — **

Finally, we use nonstandard analysis to give a proof of the Szemerédi regularity lemma, which we phrase as follows:

Lemma 18 (Szemerédi regularity lemma)Let be a finite graph, and let . Then there exists a vertex partition with such that for all pairs outside of a bad set with , there exists a density such thatfor all and , where , viewing as a symmetric subset of .

Here we do not make the cells in the partition of equal size, as is customary, but it is not too difficult to obtain this additional property from the above formulation of the lemma. The ability of nonstandard analysis to establish regularity lemmas was first observed by Elek and Szegedy.

An application of the Bolzano-Weierstrass theorem for ultraproducts shows that this lemma is equivalent to the following nonstandard version:

Lemma 19 (Szemerédi regularity lemma, nonstandard formulation)Let be a nonstandard finite graph, and let . Then there exists a vertex partition , where is a standard natural number and are nonstandard subsets of such that for all pairs outside of a bad set with , there exists a standard density such thatfor all and .

To see why Lemma 19 implies Lemma 18, suppose that Lemma 18 failed. Then there is a standard such that for every standard one could find a standard finite graph which could not be regularised into or fewer cells as required by the lemma. Applying the Bolzano-Weierstrass theorem for ultraproducts, we can assume that converges elementarily (relative to ) to a limit , which is then a nonstandard finite graph which cannot be regularised into any standard finite number of cells. But this contradicts Lemma 19.

It remains to prove Lemma 19. Let be Loeb measure on (as constructed in the previous section), and be Loeb measure on . It is easy to see that contains the product -algebra as a subalgebra, and that the product measure is the restriction of to . The edge set , viewed as a symmetric nonstandard subset of , is measurable in , but is not necessarily measurable in . One can then form the conditional expectation , which is a -measurable function that is defined up to -almost everywhere equivalence, and takes values in .

The -algebra is generated by product sets of -measurable functions, which in turn can be approximated in measure to arbitrary accuracy by product sets of nonstandard sets. As is -measurable, we can approximate it to less than in norm by a finite linear combination of indicator functions of products of nonstandard sets. Organising these products, we thus see that

for some finite partition of into nonstandard sets and some standard real numbers . By Markov’s inequality, we thus see that

for all outside of a bad set with

Now let and be nonstandard sets, with outside of . Then

On the other hand, is orthogonal to all functions, and in particular to , and thus

Since

and

and

we thus see that

Thus has been regularised using a finite number of cells, as required.

## 53 comments

Comments feed for this article

27 November, 2010 at 2:10 am

noneThanks for this great post, it demystifies a topic I’ve wondered about for while. There are several typos that are a bit hard for me to enter here since I can only have one large window on this screen. But it says R_M in a few places where it should say R_{\mathfrak U}, I think.

“(In fact, there is an enormous Galois group {\hbox{Gal}({\bf C}/\overline{{\bf Q}})}, the action of which is completely undetectable with regards to absolute elementary convergence is completely unable to detect.) ” is garbled.

One or two others that I remember but am not seeing now. I’ll try to find them on a re-read.

[Corrected, thanks – T.]27 November, 2010 at 3:18 am

gowersI’m very much enjoying the post. A very minor typo: in the fourth paragraph you meant to say slightly

lesswell known.[Corrected, thanks – T.]27 November, 2010 at 7:04 am

Anonymoushow do we know that is not converging to a rational number?

[Because is known to be irrational, and limits in the real numbers are unique – T.]28 November, 2010 at 6:30 am

Greg GravitonConcerning Furstenberg’s recurrence theorem, I take it that you define the topology on as follows: a basis of open sets is given by the nonstandard subsets of (actually, these sets are also closed).

But: a measure preserving system is usually assumed to be second countable, whereas there are uncountably many nonstandard subsets of .

If I remember correctly, your usual remedy to this problem was to restrict attention to the sigma-algebra generated by the shifts of .

28 November, 2010 at 9:28 am

Terence TaoThis is correct: at present the measure space is neither seperable nor metrisable, but as you say this can be fixed by cutting the sigma algebra down to size. Technically, the statement of the Furstenberg recurrence theorem is valid for arbitrary measure-preserving systems, though for technical reasons, the proof of the theorem usually proceeds by first reducing to the case of regular measure-preserving systems in which the sigma algebra can be identified with the Borel sigma algebra of a Polish space, in order to use tools such as disintegration. I’ve added a remark to indicate this.

29 November, 2010 at 12:00 pm

Concentration compactness via nonstandard analysis « What’s new[…] in this previous blog post (and see also this later post on ultralimit analysis, as well as the most recent post on this topic). Very briefly, we will need to fix a non-principal ultrafilter on the natural numbers. Once one […]

30 November, 2010 at 10:26 am

Jonathan WeinsteinI had some trouble understanding your non-standard definition of uniform continuity (Exercise 3.) I tried to write down my own non-standard versions of continuity and uniform continuity: Let be the set of non-negative infinitesimals (less than every positive standard real.) Then continuity is, in shortest form,

or in longer form

where the epsilon and delta quantifiers come in a non-standard, but perhaps more intuitive order: "If x only moves a tiny bit, so does f(x)."

Now, if I did this right, uniform continuity would arise by switching the order of quantifiers, similarly to standard except the delta instead of epsilon quantifier gets switched with x:

By contrast, I can't see where your definition distinguishes the order of quantifiers. I understand to be shorthand for but I feel like this shorthand is obscuring what the infinitesimal is allowed to depend on. Assuming your intended meaning is correct, maybe there is just some convention here I don't know; I have read about non-standard analysis but never worked with it. I will highly appreciate hearing about any mistakes I made here.

30 November, 2010 at 10:39 am

Terence TaoIn general, nonstandard analysis largely eliminates the need to carefully order one’s quantifiers as is the case in standard analysis, but in return, one must now carefully distinguish between what objects are standard and what objects are nonstandard.

Continuity is equivalent to the property that for all

standardreals and non-standard reals . Uniform continuity is equivalent to the same property , but now where and arebothpermitted to be non-standard.It may help to think of a nonstandard real as an (ultra)limit of a sequence of standard reals , with a non-negative infinitesimal in being the (ultra)limit of a sequence of non-negative standard reals that converges (in the standard sense) to zero. In standard analysis, continuity is equivalent to the claim that if is a sequence of (standard) real numbers and is a (standard) real number such that as , then as . Uniform continuity is equivalent to the claim that if are two sequences of (standard) real numbers such that as , then as . Viewed in this way, we see that the nonstandard definitions of continuity and uniform continuity are basically the ultralimits of the standard definitions.

More generally, a rough first approximation to nonstandard analysis can be obtained by introducing an additional parameter , and interpreting “standard” as “independent of ” and “nonstandard” as “allowed to depend on “. (Similarly, “infinitesimal” becomes “going to zero as “, and so forth.) This is only a rough approximation because one also needs an ultrafilter to decide what sets of are more “important” than others in order to decide whether a given nonstandard object has a given property. (e.g. is the quantity positive or negative in the nonstandard universe? It depends on whether the even numbers or the odd numbers are contained in the ultrafilter.)

In any case, it is an excellent exercise to try to formally establish the equivalence of the standard and nonstandard formulations of both concepts from first principles (i.e. using the definition of ultralimit); using underflow or countable saturation is the slickest way to proceed, but it is instructive to do it “with bare hands” as well, to see how the quantifiers are being automatically absorbed by the ultralimit process.

30 November, 2010 at 11:46 am

Jonathan WeinsteinThanks, that’s extremely helpful! Part of the reason my intuition was messed up was that I forgot non-standard reals can be infinite as well as infinitesimal. So I didn’t notice the key idea that a prototypical non-uniformly continuous function like exp(x) is actually not even continuous at infinite non-standard reals. Likewise, I see that f(x)=1/x restricted to (0,1] is continuous at standard reals but discontinuous at infinitesimals.

I look forward to trying to read the rest of the article…

30 November, 2010 at 11:54 am

Terence TaoYes, this is correct. On a compact (metric space) domain, every nonstandard point is infinitesimally close to a standard point (this is basically the Heine-Borel theorem), and so uniform continuity and continuity become the same concept on such domains (as is already well known in standard analysis); but on non-compact domains this is no longer the case.

(One little subtlety though: while functions such as and are not uniformly continuous in the sense that holds for all nonstandard , they are still continuous in the

internalsense, in that for every nonstandard and nonstandard there exists a nonstandard such that whenever is nonstandard with . Indeed, from the transfer principle, any continuous function f will become internally continuous when transferred to the nonstandard world. But this notion of continuity is distinct from the more “external” notion of continuity that involves the external concept of an infinitesimal.)2 December, 2010 at 11:07 am

katzThanks for a great post. There is a small detail I don’t follow: you require countability of the _constants_ which appears to make the construction inapplicable to R, an important case :)

2 December, 2010 at 11:23 am

Terence TaoUnfortunately I believe a restriction of this form is necessary in order to get elementary completeness (which, once one allows uncountably many constants, is subtly different from countable saturation, which does indeed hold with arbitrary constants). For instance, consider the sequence in the hyperreals . With at most countably many constants, one can extract a subsequence that is elementarily convergent to some (unbounded) hyperreal limit . But if one allows every single hyperreal as a constant, then it is impossible for any subsequence to converge elementarily to a limit , because the predicate (which uses as a constant) would be false for each of the but true for .

(I do not know, though, if the hyperreals are elementarily complete if one allows all the standard reals (but not the hyperreals) as constants.)

2 December, 2010 at 7:38 pm

katzCan one view the discussion of elementary completeness as taking place with Q* as the backdrop, or must the constants be left unspecified and dependent on the application? Incidentally, as per the discussion of continuity versus uniform continuity, there is a parallel distinction between convergence and uniform convergence which happens to be the background necessary to understand Cauchy’s 1853 paper on the sum theorem.

3 December, 2010 at 1:20 pm

John RoodCan you say anything useful about nilpotent infinitesimals, aka differential forms etc?

http://en.wikipedia.org/wiki/Differential_%28infinitesimal%29

Maybe these are simply outside your development here?

3 December, 2010 at 1:35 pm

Terence TaoOne can certainly use the hyperreals to construct rings with nilpotent infinitesimals via a quotient space construction. For instance, starting with a nonstandard infinitesimal in the hyperreals , we can construct the space of hyperreals of magnitude bounded by for every standard . This is an ideal of , the space of bounded hyperreals, and so one can form the quotient space , which is a commutative ring containing the standard reals as a subring for which . Any standard differentiable function extends to a function , obeying the Newton approximation for any standard . One can use this as a rigorous basis for differential infinitesimal calculus if desired.

3 December, 2010 at 2:46 pm

John RoodHmmm. Very good. But let me stick my neck out a bit farther.

Are we then talking about something like embedding the exterior algebra into some multivariable version of the nonstandard reals?

One point I wish to make is that (cf the wikipedia link in my previous post) the standard topos models my involve intuitionistic logic in which, say, the law of the excluded middle fails. I presume you don’t have such models in mind … ?

5 December, 2010 at 10:03 am

katzJohn: the hyperreals are commutative, so you won’t get very far with the exterior algebra here. Also, the hyperreals are built in the context of ZFC with classical logic, and in this sense is consevative. Lawvere’s approach, while it has its advantages, necessitates jettisoning traditional set theory as well as classical logic, but results in a surprisingly “clean” presentation of certain topics whose classical treatment is more cumbersome.

6 December, 2010 at 6:41 am

John RoodThank you for this reply. Of course they’re commutative, sorry.

6 December, 2010 at 7:20 am

John RoodDear katz, thinking about this a bit more, I find I still have a question. But, this leads me to a question of “protocol” on this blog, namely, is there some way I can take this conversation out of this context, say by contacting you directly by email or something. Or you can contact me (my email address is known to the blog, of course).

It occurs to me that one should (?) (or might?–where does the following go wrong) have vectors over the nonstandard reals, then vector spaces, then algebras like the tensor algebra, Lie algebras AND the exterior algebra. All of these things would be carrying along with them “little differentials”, i.e. infinitesimals. Does this make any sense? Then one might hope to recover the “Lie Bracket” by a standard geometric argument (about going around the sides of an “infinitesimal” parallelogram which doesn’t commute) which produces a new vector (field) from two input vector (fields) as a sort of first order approximation (infinitesimally) of this geometric non-commutativity.

As I say, I’d be interested to hear what you have to say / know about this, and where it goes wrong … ?

6 December, 2010 at 7:53 am

katzI think all of the items you mentioned should go through without problem. The hyperreal approach to Lie groups and algebras is already in Robinson I think. Actually I was unable to find your email address via this site. You can contact me by looking for first name Mikhail. I prefer not to leave an email address here so as to minimize robot-generated mail. Otherwise we can continue here; I am sure Terry will let us know if we overuse his hospitality :)

6 December, 2010 at 10:43 am

John RoodOk, tyvm. Let me mention that I do now have your email address with a tip o’ the hat to Terry and I presume that you also have mine.

So (and please forgive me if I am making superficial errors here) the “obvious” question is: If all this Lie Theory goes through using Robinson’s nonstandard analysis, what does one get from Lawvere’s approach? I am aware that Lie Groups are about as well behaved as one can expect any spaces to be. Maybe more general spaces … bla bla bla … ?

Is there a simple answer to this question of what is Lawvere up to? (Even a non-simple answer might be of interest …)

6 December, 2010 at 10:51 am

katzLawvere’s approach allows for a certain additional elegance (some would say, simplicity) due to the presence of nilsquare infinitesimals. To take an elementary example, if y=f(x) is differentiable then for nilsquare dx one would have dy=f'(x)dx without a higher order error term. Similarly, one can define the tangent plane as everything proportional to nilsquare infinitesimal displacements. A planetary orbit can be thought of literally as made up of infinitesimal straight line segments.

As I see it, the disadvantage of the approach is an absence of the full power of the transfer principle which is available in the hyperreal framework. I hereby challenge John L. Bell to produce an infinitesimal proof of the invariant subspace conjecture :)

6 December, 2010 at 3:00 pm

John RoodOk, very good. But I find I still have a point of confusion. Are all infinitesimals’ squares zero? Do Leibniz’ higher order infinitesimals make sense? Which kind is dx?

6 December, 2010 at 7:25 pm

katzNo, some infinitesimals are invertible and therefore not nilsquare. To have dy=f'(x)dx we must start with a nilsquare dx. Leibniz would not hear of nilsquare ones. Rather, they are in the spirit of his critic Nieuwentijdt. A great account of all this is in John L. Bell’s article in the Stanford Encyclopedia of Philosophy.

7 December, 2010 at 5:55 am

John RoodYes! Very nice way into this topic, apparently:

http://plato.stanford.edu/entries/continuity/

Here is a quotation from the section about Smooth Infinitesimal Analysis:

for all infinitesimal ε, not ε ≠ 0 [endquote]

Evidently not compatible with the Law of the Excluded Middle …

Amazing how the “intuitive” history of mathematics (analysis) can be formalized! And the internet makes all of this easier to access.

katz, thank you very much for your discussion here.

14 December, 2010 at 11:43 am

noneSo I wonder to what extent this nonstandard stuff breaks the “standard” proofs in calculus, that depend on the least-upper-bound axiom, which fails in this non-Archimedian version of the reals. The nonstandard reals are elementarily equivalent to the standard reals, i.e. they satisfy the same first-order sentences, but the least-upper-bound property is second-order and can tell the two models apart. I didn’t know any logic when I took calculus but I guess this means that my “rigorous” calculus course was actually done in second-order logic, which means(?) that its proofs relied on a tacit, informal version of set theory under the covers. That gap got taken care of later in logic class, where we went over an explicit encoding of the reals in ZFC, which meant that the calculus proofs worked because we could treat second-order statements about reals as first-order statements about sets. But to develop calculus with nonstandard reals, it looks like you have to do all the proofs over again; you don’t get to add new, easier proofs while still keeping the old ones. Does that look right? I’ve never really thought about this before.

15 December, 2010 at 10:05 pm

katzIn response to the comment by “none”: second order statements remain valid if interpreted as applying to internal sets, in particular natural extensions of real sets. Thus, the least upper bound will exist, etc. For instance, the natural extension of the open interval (0,1) will contain all decimal expansions starting with infinitely many 9s, and the least upper bound is 1.

15 December, 2010 at 11:57 am

AlbaniusIt would be very helpful to see this as a pdf.

My attempts to print from this page or copy to MS-Word did not capture the symbols.

15 December, 2010 at 12:23 pm

Jonathan WeinsteinAlbanius,

Prof. Tao provides valuable, informative and entertaining content, with higher quality than most published material, for free. I would hate to complain about the file format.

15 December, 2010 at 1:03 pm

AlbaniusIt is precsiely because the content is so valuable that I would like to be able to print it out to study.

15 December, 2010 at 4:00 pm

AnonymousAlbanius, the AMS has been publish Prof. Tao’s in printed book form once per year. See for example here: http://www.ams.org/bookstore-getitem/item=MBK-59

18 December, 2010 at 10:10 am

anonymous2In Firefox choose File > Print and it formats the blog post nicely and includes the “special” characters (i.e. TeX). WordPress does that by default.

18 December, 2010 at 10:24 am

AlbaniusNewbie mistake, sorry.

After clicking the PRINT button between the essay and the comments,

the next window had an option for PDF, which created a pdf file I could save and print, just what I wanted.

16 December, 2010 at 9:44 am

alexTerry Tao thank you for all the beautiful mathematics, I am learning a lot reading your blog.

You said that the model of real numbers for the language of ordered fields can see algebraic numbers precisely but see transcendental numbers fuzzily. I was wondering if anyone has applied this method to prove/construct a number transcendental?

17 December, 2010 at 7:37 pm

Quinn CulverTypo in the proof of Proposition 4?:

“…is elementarily convergent, as desired.” Should be “is elementarily Cauchy, as desired.”?

[Corrected, thanks – T.]17 December, 2010 at 8:20 pm

Quinn CulverAnother potential typo: in Definition 6, “…for all sufficiently close to .” should be “…for all sufficiently close to .”

(If you’d prefer I not point out minor typos like this, let me know.)

[Corrected, thanks – T.]19 January, 2011 at 7:15 pm

Matthew N. PetersenIs there a nonstandard definition of Absolute Continuity? I haven’t been able to discover one yet.

20 January, 2011 at 12:16 pm

Terence TaoA function is absolutely continuous iff *f maps nonstandard elementary sets (i.e. a union of nonstandard-finitely many nonstandard intervals) of infinitesimal length to nonstandard elementary sets of infinitesimal length (or subsets thereof).

21 January, 2011 at 12:39 pm

Matthew N. PetersenThank you!

9 April, 2011 at 11:18 am

Non-standard analysis link round-up « Matthew’s Math Blog[…] Two posts on Terry Tao’s blog: [1] […]

15 October, 2011 at 10:58 am

254A, Notes 6: Ultraproducts as a bridge between hard analysis and soft analysis « What’s new[…] some extent the two constructions are at least partially interchangeable in this setting. (See also these previous posts for the use of ultralimits as a substitute for topological limits.) In the theory of […]

2 April, 2012 at 4:54 pm

A cheap version of nonstandard analysis « What’s new[…] of a countable family of nested satisfiable formulae is simultaneously satisfiable. (See this previous blog post for more on the analogy between the use of nonstandard analysis and the use of metric completions.) […]

9 April, 2012 at 3:58 am

sıcak videolarTerry Tao thank you for all the beautiful mathematics, I am learning a lot reading your blog.

25 October, 2012 at 10:11 am

Walsh’s ergodic theorem, metastability, and external Cauchy convergence « What’s new[…] the Loeb -algebra, called Loeb measure, such that for any internal subset of , one has . See e.g. this previous blog post for the details of the […]

27 March, 2013 at 7:34 pm

An informal version of the Furstenberg correspondence principle | What's new[…] leads to the Furstenberg correspondence principle, discussed for instance in these previous blog posts. Such principles allow one to rigorously pass back and forth between the combinatorics of strings […]

7 December, 2013 at 4:06 pm

Ultraproducts as a Bridge Between Discrete and Continuous Analysis | What's new[…] space as a “completion” of the standard space , analogous to metric completion; see this previous blog post for further discussion of this perspective. A related perspective is to view as a double-precision […]

17 July, 2014 at 12:56 am

Costas DrossosIs there any relationship of your ultralimit concept with Kochen’s ultralimits.

Do you know any concrete development of ultraroots?

I might state a conjecture: A appropriate model for Nelson’s IST is given through ultraroots in contradistinction with Robinson which uses ultrapowers. What is your opinion?

17 July, 2014 at 1:09 am

Mikhail KatzI note that ultraroots have not been mentioned on this post previously. It might be easier for other participants to follow this discussion if you explain what they are.

17 July, 2014 at 3:39 am

Costas DrossosAn ultraroot of a structure U is a structure Ultraroot(U) such that its ultrapower is U. The thing is that I have seen the definition but not a technical development. Thus the ultraroot of R will give not an extension of R like e.g. *R, but a vague deformation of it.

17 July, 2014 at 4:08 am

Mikhail KatzAt first sight it seems as though will not have any ultraroot because the ultrapower of such an would necessarily contain infinite elements which are not to be found in .

17 July, 2014 at 4:47 am

Costas DrossosTake a diameter of poincare disk, (-n,n)\cap R. then teke the ultralimits of this. then we have R! Then (-n,n)\cap R. is the ultraroot of R.

28 October, 2015 at 1:56 pm

Two old-ish Terry Tao posts about model theory | sylvy's mathsy blog[…] this which talks about nonstandard analysis, notions of `elementary convergence’ and `elementary […]

9 December, 2018 at 2:32 pm

254A, Supplemental: Weak solutions from the perspective of nonstandard analysis (optional) | What's new[…] has similar properties to the standard one, but also some additional properties. As discussed in this previous post, the relationship between the standard universe and the nonstandard universe is somewhat […]