The standard modern foundation of mathematics is constructed using set theory. With these foundations, the mathematical universe of objects one studies contains not only the “primitive” mathematical objects such as numbers and points, but also sets of these objects, sets of sets of objects, and so forth. (In a *pure* set theory, the primitive objects would themselves be sets as well; this is useful for studying the foundations of mathematics, but for most mathematical purposes it is more convenient, and less conceptually confusing, to refrain from modeling primitive objects as sets.) One has to carefully impose a suitable collection of axioms on these sets, in order to avoid paradoxes such as Russell’s paradox; but with a standard axiom system such as Zermelo-Fraenkel-Choice (ZFC), all actual paradoxes that we know of are eliminated. Still, one might be somewhat unnerved by the presence in set theory of statements which, while not genuinely paradoxical in a strict sense, are still highly unintuitive; Cantor’s theorem on the uncountability of the reals, and the Banach-Tarski paradox, are perhaps the two most familiar examples of this.

One may suspect that the reason for this unintuitive behaviour is the presence of infinite sets in one’s mathematical universe. After all, if one deals solely with finite sets, then there is no need to distinguish between countable and uncountable infinities, and Banach-Tarski type paradoxes cannot occur.

On the other hand, many statements in infinitary mathematics can be reformulated into equivalent statements in finitary mathematics (involving only finitely many points or numbers, etc.); I have explored this theme in a number of previous blog posts. So, one may ask: what is the finitary analogue of statements such as Cantor’s theorem or the Banach-Tarski paradox?

The finitary analogue of Cantor’s theorem is well-known: it is the assertion that for every natural number , or equivalently that the power set of a finite set of elements cannot be enumerated by itself. Though this is not quite the end of the story; after all, one also has for every natural number , or equivalently that the union of a finite set and an additional element cannot be enumerated by itself, but the former statement extends to the infinite case, while the latter one does not. What causes these two outcomes to be distinct?

On the other hand, it is less obvious what the finitary version of the Banach-Tarski paradox is. Note that this paradox is available only in three and higher dimensions, but not in one or two dimensions; so presumably a finitary analogue of this paradox should also make the same distinction between low and high dimensions.

I therefore set myself the exercise of trying to phrase Cantor’s theorem and the Banach-Tarski paradox in a more “finitary” language. It seems that the easiest way to accomplish this is to avoid the use of set theory, and replace sets by some other concept. Taking inspiration from theoretical computer science, I decided to replace concepts such as functions and sets by the concepts of algorithms and oracles instead, with various constructions in set theory being replaced instead by computer language pseudocode. The point of doing this is that one can now add a new parameter to the universe, namely the amount of computational resources one is willing to allow one’s algorithms to use. At one extreme, one can enforce a “strict finitist” viewpoint where the total computational resources available (time and memory) are bounded by some numerical constant, such as ; roughly speaking, this causes any mathematical construction to break down once its complexity exceeds this number. Or one can take the slightly more permissive “finitist” or “constructivist” viewpoint, where any finite amount of computational resource is permitted; or one can then move up to allowing any construction indexed by a countable ordinal, or the storage of any array of countable size. Finally one can allow constructions indexed by arbitrary ordinals (i.e. transfinite induction) and arrays of arbitrary infinite size, at which point the theory becomes more or less indistinguishable from standard set theory.

I describe this viewpoint, and how statements such as Cantor’s theorem and Banach-Tarski are interpreted with this viewpoint, below the fold. I should caution that this is a conceptual exercise rather than a rigorous one; I have not attempted to formalise these notions to the same extent that set theory is formalised. Thus, for instance, I have no explicit system of axioms that algorithms and oracles are supposed to obey. Of course, these formal issues have been explored in great depth by logicians over the past century or so, but I do not wish to focus on these topics in this post.

A second caveat is that the actual semantic content of this post is going to be extremely low. I am not going to provide any genuinely new proof of Cantor’s theorem, or give a new construction of Banach-Tarski type; instead, I will be reformulating the standard proofs and constructions in a different language. Nevertheless I believe this viewpoint is somewhat clarifying as to the nature of these paradoxes, and as to how they are not as fundamentally tied to the nature of sets or the nature of infinity as one might first expect.

** — 1. A computational perspective on mathematics — **

The great advantage of using set theory in mathematics is that all objects in a given set (e.g. all numbers in the real line ) are available to you at all times; one can take one, many, or all objects in a set and manipulate them as one wishes (cf. the axiom schema of replacement); similarly, one can assign a truth value to a statement that quantifies over an arbitrary number of objects. If one removes sets from the picture, then one no longer has immediate access to arbitrary elements of a set, and one can no longer perform operations *en masse* on all the elements of a set at once; instead, one must use some (possibly more restrictive) protocol for manipulating objects in a class, or verifying whether a given statement is true or false. (In more philosophical terms, we are focusing more on an epistemological approach to mathematics, based on what we can measure and query, as opposed to an ontological approach, based on what we believe to exist.)

For this, it is convenient to use the conceptual framework that is familiar to us through modern computer languages, such as C. In this paradigm, when dealing with a class of objects (e.g. integers), we do not get access to the entire set of integers directly. Instead, we have to declare a integer variable, such as , and set it equal to some value, e.g. ; or, if one is creating a routine that takes input, might be initialised to one of the unspecified inputs of that routine. Later on, we can use existing variables to define new ones, or to redefine existing ones, e.g. one might define to equal , or perhaps one can increment to . One can then set up various loops and iterations to explore more of the parameter space; for instance, if countably infinite loops are permitted as a computational resource, then one can exhaust the positive integers by starting at and incrementing by indefinitely; one can similarly exhaust the negative integers, and by alternating between the two (and also passing through ) one can exhaust the entire integers by a countably infinite loop. This of course is just the standard demonstration that the integers are countable.

Real-world computers, of course, have finite limits of precision; they cannot represent arbitrarily large integers, but only integers up to a certain size (e.g. or ). One could think about computational models with such a strict finitary limitation, but let us assume that we are in a more idealised setting in which there are no limitations on how large an integer one can store. Let us then make the even more idealised assumption that we can also store real numbers with unlimited precision; our computer never makes any roundoff errors. (Note that there are indeed arbitrary precision models of computation that can do this, though the catch is that speed of computation depends heavily on how complicated it is to describe any given number.)

Remark 1A technical point: it may be that the computational model of the real numbers is different from the standard real line ; for instance, perhaps the computer only implements “constructible” real numbers, which is for instance the case in physical arbitrary precision computers. We will touch upon this point again later.

Note that if one were to expand out a given real number, say , as a decimal expansion, then one would obtain an infinite string of digits. But, just as we do not have direct access to the set of all integers, we will not have direct access to the entire decimal expansion of . If we have a natural number , we are allowed to inspect the digit of by making a suitable function call (e.g. ), and if we are allowed to set up an infinite loop in which starts at and increments indefinitely, one can exhaust all the digits of , which is good enough for most “practical” mathematical purposes. (For instance, if one were allowed to run programs of countable length using real arithmetic, one could make a program that determined whether was normal or not, or to determine the truth of the Riemann hypothesis, or more generally to compute the truth-value of any first-order sentence in the theory of the real line.)

We assume that the usual arithmetic operations can be performed on real numbers in reasonable amounts of time. For instance, given two real numbers , one can determine whether they are equal or not in finite time (consulting some sort of “equality oracle” for the reals if necessary). Note that equality can be a subtle issue; if one thinks of real numbers as infinite strings of digits, their equality can only be verified directly by using a countable amount of computing power. But we will sidestep the issue of how exactly the reals are implemented by simply assuming that enough oracles exist to perform real arithmetic at an acceptable computational cost.

As already hinted at in the above discussion, we are assuming that our computer has access to a certain amount of computing resources (e.g. time, memory, random number generation, oracles). We will be rather vague on exactly how to formalise the concept of a resource, but basically the standard definitions used in computer science would be a good approximation here, at least when the resources are finite. But one can consider allowing for certain computational resources to be infinite in some carefully controlled manner; for instance, one could consider a situation in which countably infinite loops are permitted (provided that all variables in the loop that one wants to retain “converge” in some sense at the end of the loop), but for which uncountable loops are not allowed. We will not formalise such concepts here (but roughly speaking, they correspond to allowing transfinite induction up to some specified ordinal, and no further).

We will not be using sets or set-theoretic functions in this computer language. However, we will use as a substitute the concept of an oracle – a “black box” routine that takes zero or more variables of a given class as input, and returns zero or more variables in various classes as output (usually we will have just a single output, but it will be convenient to allow multiple outputs). Being a black box, we do not know how the oracle obtains the output from the input, but we are able to use the oracle anyway. Let us assume that each invocation of an oracle takes some acceptable amount of time (e.g. bounded time when the computational resources are finite, or countably infinite time if countable time resources are allowed, etc.). All our oracles are *consistent* in the sense that they always produce the same output for a fixed choice of input; thus, if one calls the oracle again at some later time with the same input, then the oracle will return the same output as it did before. It is important to note that consistency is a subtly weaker assumption than requiring the oracle is *non-adaptive*; we allow the oracle to “remember” previous queries, and to use that memory to formulate answers to later queries, as long as it does not contradict the outputs it gave previously.

We will be concerned primarily with *membership oracles* – an oracle that takes a variable in a given class and returns an answer that is either “Yes” or “No”. Informally, the oracle is describing some subset of this class, and is answering questions regarding whether any given variable lies in this set or not. Note, however, that this set only exists in a “virtual” or “potential” sense; if the oracle is adaptive, the set of inputs that will give a “Yes” answer may not yet be fixed, but could depend on future queries to the oracle. If the class can be exhausted within the computational resources permitted (e.g. if the parameter space can be countably enumerated by the computer, and countably infinite loops are permitted), then one can query the oracle for every single element and thus pin down the set completely (though one may not be able to *store* this set if one does not have sufficient memory resources!), but if the parameter space is too large to be exhausted with the available resources, then the set that the oracle is describing will never be completely described.

To illustrate this, let us briefly return to the traditional language of set theory and recall the following textbook example of a non-measurable subset of the reals . This set is constructed by partitioning into cosets of the rationals , and then using the axiom of choice to selecting a single representative of each coset; the set is the collection of all such representatives. Thus the rational translates , partition , and it is not hard to then deduce (using the properties of Lebesgue measure) that cannot be Lebesgue measurable.

We can simulate this non-measurable set by an *adaptive* oracle , which remembers all prior queries to itself, and works as follows:

- takes a real number as input.
- If has previously answered the question of whether lies in , repeat whatever answer was given previously.
- If has not been queried before, and if furthermore no rational translate of has been queried before either (i.e. differs from all previous queries by an irrational number), then answer with “yes”.
- Finally, if has not been queried before, but has been queried before for some rational , then answer with “no”.
- Store (and ) in memory for use in future queries.

Assuming one has a rationality oracle that can tell (in bounded time) whether a given real number is rational or not, then is a perfectly programmable oracle, which will run in finite time whenever one asks only a finite number of queries of it. (After querying the oracle an infinite number of times, though, it will require an infinite search loop in order to make sure that any subsequent answer it gives is consistent with all previous answers, unless one is allowed to use an infinite amount of memory, e.g. an array indexed by the quotient space .) It is even completely deterministic – it requries no arbitrary choices on the part of the oracle. If one had the patience to query this oracle for every single real number (which would, of course, require an uncountable number of queries), the oracle would eventually describe completely the non-measurable set . But if one is only permitted a finite or countable number of queries, then the non-measurable set only exists in some “virtual” sense.

More generally, non-adaptive oracles tend to generate measurable sets, while adaptive oracles are likely to generate non-measurable sets. So we see that non-measurability does not have to be viewed as a quirk arising from the nature of infinity, or from the axiom of choice; it can be viewed instead as the freedom to adapt to previous queries to membership of the set, which is a concept that makes sense even in a strictly finitist setting.

Remark 2One can think of an non-adaptive oracle as being like a truthful observer, reporting on some objective set that exists independently of the queries, while an adaptive oracle is more like a pathological liar, inventing a previously non-existent set on the fly as needed in order to consistently answer the questions posed of it. It is then not so surprising that the set thus invented is likely to be non-measurable. We thus see that the ability of oracles to adapt is somewhat analogous to the role the axiom of choice plays in traditional set theory.

We will discuss measurability and non-measurability in more detail a little later in this post.

We have seen that membership oracles are sort of like “virtual” sets. Many set operations can be simulated for membership oracles. For instance, given two membership oracles that apply to variables in some class (e.g. the real numbers), one can form the union oracle , which works by querying both and and returning “Yes” if at least one of and was “Yes”, and “No” otherwise. More generally, any finite boolean operation of membership oracles gives another membership oracle, and (if countable computational resources are available) the same is true for countable boolean operations also. As one increases the amount of computational resources available, more and more set-theoretic operations become available, and when one allows unlimited resources (or more precisely, transfinite induction up to any ordinal, and storage of arbitrarily sized infinite sets), then all the standard operations in set theory (e.g. invocations of the axiom schema of replacement, the power set axiom, the axiom of choice, etc.) become available.

Remark 3Membership oracles only describe a raw, unstructured set. If one wants to place additional structure on a set (e.g. measure structure, topological structure, smooth structure, etc.) then additional oracles would be needed. Of course, the same is true in traditional set theory; for instance, to place a topology on a set one also needs to specify a collection of open sets on obeying the axioms of a topology, and similarly for other structures one can place on sets. We will see an example of these additional oracles later in this post, when we revisit the concept of measurability.

Remark 4Membership oracles are weaker than sets in many ways. One of these comes from a breakdown of the law of the excluded middle. In set theory, a statement about a set is either true or false; for instance, is either finite or infinite. If one is instead given an adaptive membership oracle , questions such as whether describes a finite set or not are undecidable if one only has finite computational resources. However, one can imagine strengthening the membership oracle to a oracle that, in addition to answering questions about membership of individual elements, will also answer more general questions about (such as whether is finite), in such a fashion that all the answers given are consistent with each other. In such a way, the law of the excluded middle can be restored; but then programming an adaptive oracle in a way that keeps all the answers consistent becomes quite a challenge. (Note though that the Gödel completeness theorem asserts, in some sense, that this is always possible, provided that one’s initial constraints on are originally consistent.)

** — 2. Cantor’s theorem — **

Cantor’s theorem (and its proof) transfers easily enough from sets to oracles. The analogue of a (proposed) enumeration of the real numbers is an “enumeration oracle” that takes a natural number as input, and returns a real number as output; we allow repetitions. The Cantor diagonal argument shows that given any such putative enumeration , and given access to a countable amount of computing resources, one can construct a real number which is guaranteed not to be covered by the enumeration oracle; for instance, one can construct to be a string of decimals , with chosen to be the first digit in (say) not equal to the first digit of , chosen to be the first digit of not equal to the second digit of , and so forth. (I exclude the digits and to avoid the technical and irrelevant issue.)

This is of course virtually identical to the usual proof of Cantor’s theorem. But the proof highlights that in order to exhibit a counterexample to the claim that enumerates the reals, one needs a countably infinite amount of computational resources. And indeed if one works in a finitary computational model in which one is only allowed to run programs that are guaranteed to halt, and if one limits one’s real number system to the “finitely constructible” reals – those reals that can be generated by halting programs – then one can indeed enumerate the “real numbers” in this system, by enumerating all the possible programs to generate real numbers as , and computing to be the output of , where is the first integer such that halts in less than steps, while outputting a number different from (setting if no such integer exists). In this model, the real number given by Cantor’s argument is not finitely constructible, but only countably constructible. It is only because one has access to countably infinite computational resources (in particular, the ability to sum convergent infinite series such as ), that the reals are no longer countable.

Let us now consider a slightly different version of Cantor’s theorem, which in the language of set theory asserts that for a given set (e.g. the natural numbers ), that one cannot enumerate the power set by itself. In order to phrase this using the language of oracles rather than sets, one needs to be able to treat oracles themselves as variables; note that many modern computer languages (particularly functional languages such as Lisp and Haskell) already do this. In particular, we allow for the existence of oracles that generate further oracles as output, or themselves take oracles as input.

A proposed enumeration of the power set of the natural numbers (say) by the natural numbers themselves can then be viewed as an enumeration oracle which takes a natural number as input, and returns a membership oracle in the natural numbers as output; thus itself is able to accept a natural number as input and return a yes-or-no answer as output.

Cantor’s diagonal argument then asserts that given any proposed enumeration oracle of the above form, one can always find a membership oracle which is not enumerated by . Indeed, one simply sets to be the negation of for any given natural number input .

With this version of Cantor’s argument, we no longer need a countably infinite amount of computational resources; the above argument is valid even in the finitary computational model. In that model, the argument shows that the class of oracles that terminate in finite time cannot itself be enumerated by an oracle that terminates in finite time; this is essentially Turing’s halting theorem. Thus we see that Turing’s theorem and Cantor’s theorem are simply different cases of a single general theorem, the former in the case of finitary computational resources and the latter in the case of unlimited computational resources. (See also my previous blog post for further discussion of these “no self-defeating object” arguments.)

Also observe that this version of Cantor’s argument works if the natural numbers are replaced by any other class of variable; for instance, in the finite class of integers between and , the argument demonstrates that the membership oracles in this class cannot be enumerated by the numbers from to itself, thus .

Let us now discuss the analogous situation with the inequality . The claim is now that if is a finite class, and is a strictly larger class (containing at least one additional element outside of ), and one is given an enumeration oracle that takes an variable in as input and returns a variable in as output, then one should be able to find a variable in or equal to which is not covered by the enumeration oracle.

One way to find such a is by the following algorithm, which is just a painfully disguised version of the proof that using induction. Let be an element in that is not in . We first search through all the elements of to see if there is a solution to the equation . If no solution exists, then we are done; we can just take . So suppose instead that we have some found some for which . Then we can delete , and all other solutions to , from the domain of the oracle , and also delete from the output of the oracle, giving rise to a modified oracle which takes as input a variable with , and returns the output . Note that if we ever find a variable in the range of that is not enumerated by , then will not be enumerated by either. So we can descend from to the oracle , which has a smaller domain and range. Also note that the range remains strictly larger than the domain, as lies in the range but not in the domain. We can thus keep iterating this procedure; since we cannot have an infinite descent, the algorithm must eventually terminate. Unpacking the termination condition, we have indeed produced an element in the range of which was not enumerated by .

We observe that this algorithm only halts due to the principle of infinite descent, which is only valid when the initial class one starts with was finite. For infinite classes, which admit infinite descents, one can of course find surjective enumerations between such classes and strictly larger classes; for instance, the enumeration oracle that sends a natural number to is an enumeration of by . In contrast, the proof of Cantor’s theorem does not rely on facts specific to finite classes, such as the principle of infinite descent, and is thus valid in arbitrary classes.

** — 3. Measurability revisited — **

In a previous section, we gave an example of a membership oracle in the real line which was non-measurable, in the sense that the set that it was (virtually) describing necessarily failed to be Lebesgue measurable. But the concept of Lebesgue measurability was still defined in the context of set theory, and not in a purely oracle-theoretic language. It is then natural to ask whether one can define Lebesgue measurability purely in the context of computational models, and in particular whether one can discuss this concept in a finitary computational model.

For simplicity let us now work in a bounded subset of , such as the unit interval , so that we can work with a finite measure rather than a -finite measure.

From classical measure theory, we recall the following characterisation of Lebesgue measurability: a subset of the unit interval is Lebesgue measurable if for every , one can find an elementary subset of the unit interval (i.e. a finite union of open intervals) whose set-theoretic difference with has outer measure less than , or equivalently that it can be covered by a countable collection of intervals whose length adds up to less than .

Thus, if one wants a membership oracle to “certify” its measurability, one way to do so is to provide an additional “measurability oracle” , which takes a positive real number as input, and returns three outputs:

- A description of an elementary set (which can be done in a finite amount of time and space, by specifying the number of intervals used to form , together with their endpoints);
- An interval oracle , that for each natural number input returns an interval ;
- A covering oracle , which for each real number in , returns a natural number for which . (Actually, this oracle is redundant, as it can be simulated from the previous two outputs by a finite brute force search.)

With these oracles, one can then verify the measurability of by selecting an , and verifying firstly that after selecting any natural number , the sum of the lengths of remains less than , and secondly that after selecting any real number , that if lies in but not in or vice versa, that indeed lies in . In principle, if one performed this verification procedure an uncountable number of times (once for each choice of ) one would fully demonstrate the measurability of ; but if one instead only had access to finite or countable computational resources, then one could only verify measurability on an “as needed” basis.

So, in the oracle model, a measurable set is not simply a membership oracle ; it must also be supplemented with an additional measurability oracle that “witnesses” the measurability. This is analogous to how sets must be augmented with (say) topological structure if one wants to perform topology on that set, or algebraic structure if one wants to perform algebra on that set, etc.

If one possesses a measurability oracle for a set (or more precisely, a membership oracle ), then can estimate the Lebesgue measure of to within accuracy by calling to obtain an approximant , and then computing the measure of (which can be done in a finite amount of time, as is simply a finite union of intervals). A key fact (which, not coincidentally, is crucial in the standard construction of Lebesgue measure) is that these approximations to the Lebesgue measure of are compatible with each other in the following sense: if one calls one measurability oracle for at accuracy to get one estimate for the Lebesgue measure of , and if one then calls another (possibly different) measurability oracle for the same set at another accuracy to get another estimate for , then these two estimates can only differ by at most ; in particular, sending , one obtains a Cauchy sequence and (after a countable number of operations) one can then compute the Lebesgue measure of to infinite precision.

This key fact boils down (after some standard manipulations) to the fact that an interval such as has outer measure at least ; in our oracle based model, this means that if one is given an interval oracle that generates open intervals for each natural number , in such a way that the total length of these intervals is less than , then one can find a point in which is not covered by any of the .

This can be done using a countable amount of computational power (basically, the ability to run a single infinite loop; this is roughly equivalent to the theory in reverse mathematics). The point is that for each finite , the set is a computable non-empty finite union of closed intervals in , which decreases in . The infimum can be computed infinite time for each , and increases in ; the limit is then an element in that lies in but is outside all of the . Thus, given a countable amount of computational power, one can consistently define Lebesgue measure of a measurable set, and verify its basic properties.

It is instructive to apply the above discussion to the non-measurable membership oracle given in previous sections (trivially modified to lie in rather than in ). If one is given a purported measurability oracle for this oracle , one can eventually show that this oracle does not actually certify the measurability of as claimed, but this requires a countably infinite amount of computation to establish. (Basically, there are two cases, based on whether asserts that has positive Lebesgue measure or not (which can be decided after a countable amount of computation). If it has positive measure, then by invoking with less than half this measure, one can soon find an interval in which has density greater than (or more precisely, the complement of in has outer measure strictly less than ) and then one can run a variant of the above Bolzano-Weierstrass argument to find two points in and in which differ by a rational, contradicting the construction of . If instead asserts that has zero measure, then can cover by intervals of arbitrarily small total measure, and then can do the same for the union of all the rational shifts of , and one can then find an element such that no rational shift of lies in .)

On the other hand, if one is only allowed to query and finitely many times, then one can show that one can adaptively build and in response in these queries in such a way that one never obtains a contradiction, while retaining the properties that the shifts of by the rationals partition . So a pathological liar can build a non-measurable set but claim that it is measurable; the deception can sustain any finite number of queries about the set and its measurability, but given a countable amount of queries one can eventually demonstrate that there is an inconsistency (at least for non-measurable sets coming from the coset partition of by the rationals).

Remark 5Interestingly, the type of adaptive oracles one uses to create non-measurable sets are not compatible with random number generation. If one has access to a source of random real numbers (say in ), then in principle one can (almost surely) compute the Lebesgue measure in countable time of any subset of accessed through a random oracle by the Monte Carlo method: randomly sampling points in , counting the proportion that lie in , and then sending . However this only works properly if the oracle does not adapt to the various random inputs it is given, and is instead independent of these inputs. (For instance, if one were to use Monte Carlo methods to measure the non-measurable set described above, one would almost surely get that has measure , as each random number is almost certain to lie in a different coset of !.)

** — 4. The Banach-Tarski paradox — **

We now turn to the Banach-Tarski paradox. The usual formulation of this paradox involves a partition of the unit ball into pieces that can be rearranged (after rotations and translation) to form two copies of the ball. To avoid some minor technicalities, we will work instead on the unit sphere with an explicit countable set removed, and establish the following version of the paradox:

Proposition 1 (Banach-Tarski paradox, reduced version)There exists a countable subset of and partition of into four disjoint pieces , such that and can be rotated to cover , and and can also be rotated to cover .

Of course, from the rotation-invariant nature of Lebesgue measure on the sphere, such a partition can only occur if at least one of are not Lebesgue measurable.

We return briefly to set theory and give the standard proof of this proposition. The first step is to locate two rotations in the orthogonal group which generate the free group . This can be done explicitly; for instance one can take

(See for instance this post from the Secret Blogging Seminar for more discussion of this example.) Each rotation in has two fixed antipodal points in ; we let be the union of all these points. Then the group acts freely on the remainder .

Using the axiom of choice, we can then build a (non-measurable) subset of which consists of a single point from each orbit of . For each , we then define to be the set of all points of the form , where and is a word such that

- is the identity, or begins with (if );
- begins with (if );
- begins with (if );
- begins with (if ).

It is then clear that partition , while and both cover , as claimed.

Now let us interpret this example using oracles. The free group is countably enumerable, and so with a countable amount of time and memory, one can construct and store without difficulty. A membership oracle for can then be constructed by an adaptive oracle much as in the previous sections; returns yes if is the first point in its orbit that is queried, and returns no otherwise. The oracles can then be defined by querying for all the points in ‘s orbit in some arbitrary order until one finds the point which lies in , and then deciding membership in depending on the first symbol of . (For instance, if ‘s orbit has not been queried before, the first point in the orbit that one queries for will lie in ; thus one sees that the order in which one decides to search through the orbit will in fact influence quite strongly what and the will look like.)

It is then not hard to show that do indeed partition in the sense that for each , exactly one of will return “yes”. Also, for any , at least one of will return “yes”, and at least one of will return “yes”. Thus, after performing an uncountable number of queries to fully complete all of these sets, we obtain sets obeying the properties in Proposition 1; but the oracles that do this are perfectly deterministic, and run in only a countable amount of time, at least for the first few queries (and one can probably even trim it down to a finite amount of time, if one has an efficient way of deciding whether two points lie in the same orbit of ).

Now we discuss how a Banach-Tarski type construction does not work in one or two dimensions. Let us just consider the one-dimensional case:

Proposition 2There does not exist a decomposition of the unit interval into finitely many sets such that some translates of these sets cover .

We briefly review the proof of this proposition. Suppose for contradiction that we have such a decomposition. We can colour the lattice in colours , by giving each -tuple the colour if . This is clearly a -colouring of . Furthermore, for every -tuple , we see that at has colour for at least two values of , where is the standard basis for . If one then uses double-counting to get two different estimates for the number of coloured points in a box for a large enough , one obtains a contradiction.

Note that this proof is quite finitary; given some real numbers and some membership oracles , one could convert the above argument into an algorithm that would be able to demonstrate in finite time that either the oracles fail to partition , or that the translates fail to cover ; we leave this as an exercise to the reader.

(The key distinction here between the low and high dimensional cases is that the free group is not amenable, whereas is amenable. See these previous blog posts for further discussion.)

** — 5. Summary — **

The above discussion suggests that it is possible to retain much of the essential mathematical content of set theory without the need for explicitly dealing with large sets (such as uncountable sets), but there is a significant price to pay in doing so, namely that one has to deal with sets on a “virtual” or “incomplete” basis, rather than with the “completed infinities” that one is accustomed to in the standard modern framework of mathematics. Conceptually, this marks quite a different approach to mathematical objects, and assertions about such objects; such assertions are not simply true or false, but instead require a certain computational cost to be paid before their truth can be ascertained. This approach makes the mathematical reasoning process look rather strange compared to how it is usually presented, but I believe it is still a worthwhile exercise to try to translate mathematical arguments into this computational framework, as it illustrates how some parts of mathematics are in some sense “more infinitary” than others, in that they require a more infinite amount of computational power in order to model in this fashion. It also illustrates why we adopt the conveniences of infinite set theory in the first place; while it is technically possible to do mathematics without infinite sets, it can be significantly more tedious and painful to do so.

**Update, Mar 21: Several edits and additions, including a summary.**

## 18 comments

Comments feed for this article

20 March, 2010 at 10:56 am

Qiaochu YuanVery enlightening! As you say, one way to distinguish Cantor’s theorem from other set-theoretic statements is by just how few of the properties of Set you need to prove it. In fact, a Cantor-like statement makes sense in any Cartesian closed category. First, recall that in a category with a terminal object , a point of an object is a map .

Theorem (Lawvere and Schanuel,Suppose is an object in a Cartesian closed category such that there exists an object and a map with the property that for every map there exists a point of such that names . Then every map has a fixed point.Conceptual Mathematics, p. 316):To recover the usual statement of Cantor’s theorem one takes in Set. From this perspective, Cantor’s theorem should be thought of as a statement about typed lambda calculus.

20 March, 2010 at 9:09 pm

Martin CohenTwo comments:

On a somewhat facetious note, in sci.math, Archimedes Plutonium has been describing his reconstruction of mathematics where the maximum integer is 10^500.

The discussion of n+1 > n reminds me of some work I did in the 1970’s as part of a group at USC’s ISI doing program verification using Pascal. I constructed two versions of routines that implemented the pigeon-hole principle in the form if f maps 0 to n (integer args and results) into 1 to n, then there are two distinct values i and j such that f(i) = f(j). The first version had nested loops on i and j and stopped when f(i) = f(j). The second version, to avoid the O(n^2) time of the first, stored f(0 to i) in an array and stopped when f(i) matched a previous value (the traditional time-space tradeoff). As I recall, I managed to prove the two routines were correct, but it was surprisingly difficult.

21 March, 2010 at 6:29 am

Sune Kristian JakobsenFirst of all, thanks for yet another great post!

About finding an element in not covered by , where the lengths of the intervals sums to less that :

“This can be done, but it requires a nontrivial amount of computational power – basically the ability to run a double infinite loop (indexed by the ordinal .”

Why doesn’t the following work: For each n, approximate the smallest element in (as defined in the post) with n bits precision. For each n, this can be done in finite time, so after -step we have a real number contained in .

Do anyone know some good references about infinite computations?

21 March, 2010 at 8:39 am

Terence TaoHmm, that’s a good point. I was misled by the fact that the Heine-Borel theorem is known to be equivalent (in the reverse-mathematics sense) to the (weak) Konig lemma, which requires omega^2 worth of computational resources to prove; but you are indeed right that one can show that the intersection of nested closed elementary sets is non-empty using only omega worth of computation (reverse mathematics says that this is an RCA_0 result rather than a WKL_0 result). Somehow the point is that a “lim sup” or “lim inf” usually requires two infinite operations (an inf of a sup, or a sup of an inf) but when dealing with elementary sets rather than arbitrary closed sets, the first infinite operation can in fact be done in finite time, saving one power of omega. I’ve rewritten this portion of the post to reflect the simplification.

I don’t know of good references for the computational point of view, though it is perhaps implicit in the reverse mathematics literature. (But their use of ordinals to measure the strength of a logical system is slightly different from the use here to measure the computational power, though there is clearly a relationship between the two.)

2 April, 2010 at 2:53 am

asdfSune, you might look at the papers about “infinite time Turing machines” by Joel David Hamkins. They are Turing machines whose steps are indexed by potentially transfinite ordinals. If I remember correctly they can decide all arithmetic statements and maybe some analytic ones. See: http://jdh.hamkins.org/Publications

21 March, 2010 at 4:29 pm

Top Posts — WordPress.com[...] A computational perspective on set theory The standard modern foundation of mathematics is constructed using set theory. With these foundations, the [...] [...]

22 March, 2010 at 2:16 am

guestProfessor Tao’s “computational” framework may well have some merit from an ultrafinitist point of view, but it otherwise seems fairly involved and counterintuitive. It’s well known that mathematics can be interpreted in a constructivist framework, and that Cantor’s diagonal argument carries through to a proof that the set of computable real numbers is itself uncomputable (See e.g. this post on the Math. and Computation weblog). Nevertheless, most constructivist mathematicians use foundations other than Zermelo Fraenkel set theory or its constructive equivalents. (one issue with set theoretical foundations for constructive mathematics is that they treat sets, functions and quotient sets as extensional, whereas extensional equality of such objects may not even be algorithmically decidable.)

A speculative question is whether constructivist foundations of mathematics can be built which represent computations of bounded computational complexity. The field of descriptive complexity may be of some help here. Alternately, a well-known isomorphism between algorithms and constructive proofs allows us to use formal systems based on linear logic to encode algorithms with polynomial-time complexity; other systems exists which can encode different complexity classes.

23 March, 2010 at 4:15 pm

András SalamonBounded computational resources have so far been used for specific applications, such as game theory. I like your vision of applying this perspective to the foundations of mathematics. Starting with bounded computation seems quite distinct from constructivism.

In your comments on Cantor’s Theorem, did you perhaps mean to use the -th digit of ?

Also, are you sure that in an oracle set theory, the difference between and 1 is necessarily irrelevant? Identifying the equivalent strings seems to require access to the global overview that oracle set theory can’t take for granted.

24 March, 2010 at 6:19 pm

Terence TaoThanks for the correction!

If one assumes that one is able to apply the comparison operators <, > to real numbers in finite time, then one can extract out each digit of a real number in finite time (or, in the case that there are two different representations, both possible digits can be extracted).

24 March, 2010 at 1:32 am

Tibor FöldesFor a reliable framework in this directions see:

Alexander Stepanov and Paul McJones (2009).

Elements of Programming. Addison-Wesley. ISBN 978-0-321-63537-2.

24 March, 2010 at 3:35 am

András SalamonI don’t understand how this relates to Terry’s post? There is a single reference to the concept “set” in the index of the book. Chapter 2 appears to be exactly what I would expect in a programming language text with some of the links to algebra made slightly more explicit than is usual. What am I missing?

24 March, 2010 at 5:30 am

Tibor FöldesIMHO this is not a programming language text with some references to algebra but it is algebra . This is a demonstration of the possibility of a pure algorithmic build up of algebra. Its Templates and Concepts ( like Terry’s algorithms and oracles) make algebraic structures programmable without too much concerns about the infiniteness of the domain on they work. This is maybe an example how could be realized a net of such adapting and non-adapting oracles.

Its ground idea is a net of Concepts with minimal dependencies among each other and with the Domain on they are applicable.

It is worthy to skim through the book.

24 March, 2010 at 6:52 am

anonymous“At one extreme, one can enforce a “strict finitist” viewpoint where the total computational resources available (time and memory) are bounded by some numerical constant, such as 10^100; roughly speaking, this causes any mathematical construction to break down once its complexity exceeds this number”.

I think frequently about how to obtain such a constant in a natural way so that computational complexity which is now a purely formal tool transforms into a tool which can be applied to natural sciences. I think about a ressource measure so that no matter how much additional ressource your natural system has beyond this it would not achieve much more. For instance the number of units packed in the most compact physical way that can broadcast a message in a given time (i.e. a milisecond second). If your genes in the cell takes more than this time to comunicate and react, you can not survive in any environment. If the cells in your brain takes more than this time, you can not perceive.

With such a natural constant, the P versus NP question for scientific problems will not be relevant, since nobody will worry about asymptotics (Big O) nor if the algorithm is polynomial or exponential. Just substitute the n by the constant and see which one is faster. I´ve always considered weird and unsatisfactory in computational complexity the fact that you can always increase the value of the degree of the polynomial. It would be nice if we could bound it in any natural way.

25 March, 2010 at 7:51 pm

The definition of a function « Annoying Precision[...] this way, and it won’t always be done this way. For example, consider Terence Tao’s recent blog post where he replaces sets and functions with oracles and algorithms. There’s no reason to expect [...]

13 April, 2010 at 8:11 pm

CosmonutThe above discussion suggests that it is possible to retain much of the essential mathematical content of set theory without the need for explicitly dealing with large sets (such as uncountable sets), but there is a significant price to pay in doing so, namely that one has to deal with sets on a “virtual” or “incomplete” basis, rather than with the “completed infinities” that one is accustomed to in the standard modern framework of mathematics

………………………………

This reminds me of the ancient Greek idea of ‘potential infinities’ versus ‘actual infinities’.

Great post !

13 April, 2010 at 8:16 pm

Is there a countable certificate for connectedness? « What’s new[...] somehow be “verified in countable time” in a suitable oracle model, as discussed in my previous post, though I have not tried to make this vague specification more [...]

16 April, 2010 at 11:30 am

Chad GroftThis article reminds me a bit of the way Paul Cohen described forcing, in some of his last lectures. Consider that a theory

Tis consistent (in the normal logical sense) iff there is a consistent oracleE(consistent in the article’s sense) which responds “yes” or “no” to inputs, and* always responds “yes” to sentences in

T;* follows the rules of logic,

e.g., if _E_ has already said “yes” to φ and ψ then it must say “yes” to φ ∧ ψ, andvice versa; and in particular* always responds “no” to contradictions.

Cohen’s idea essentially was that

Ecould be adaptive.Let’s say you want to pretend that there is a set

Gof natural numbers which is not constructable. If somebody asks the oracle “Is 5 ∈G?”, thenEmust come up with (and remember) the answer. After finitely many such queries,Eremembers a finite amount of information.There are more complicated queries, such as “Is

G=X?” (whereXis some actual set), or “Is the complement ofGfinite?”, or “Is the smallest element ofGprime?”. The oracleEcan handle these by looking “ahead” to larger amounts of information, determining how simpler queries would be answered, and answering the complex query “now” in a way which creates the least amount of trouble. The first two queries above would be answered “no” with no extra memorization, the first because it determines every element ofG, the second because it fixes a cofinite (if indeterminate) amount of information. For the third,Ewould first fix enough information that it could tell what the smallest elementnofGis, then answer truthfully whethernis prime or not.When this is done correctly, considering all the sets which can be named with

Gand actual sets as parameters and how they would relate, and carefully defining how queries should be decomposed as smaller queries, it happens thatEis a consistent oracle which always says “yes” to sentences in ZFC + “there is a nonconstructible subset of ω”.More complicated math, usually involving the combinatorics of partial orders, is required for other results, but this is the basic idea. I wish I knew of a good written source for this perspective; most introductions to forcing, even Cohen’s own works, talk about countable transitive models instead.

19 November, 2010 at 9:07 am

rgrigIn Section 2, there is a bug in the algorithm that generates all finitely constructible reals.

I’m repeating here how I understood the problem, because it might be my misunderstanding rather than a bug. We are given procedures , , , . Some of them output a real in finite time, others keep running. The task is to find an algorithm , such that if is the output of some , then for some .

Now assume that and with run in time steps. The output of is skipped.

The simplest algorithm I could thing of works as follows. Enumerate the points with non-negative coordinates. If takes

exactlysteps then output its result, otherwise output .[Corrected, thanks - T.]