The standard modern foundation of mathematics is constructed using set theory. With these foundations, the mathematical universe of objects one studies contains not only the “primitive” mathematical objects such as numbers and points, but also sets of these objects, sets of sets of objects, and so forth. (In a pure set theory, the primitive objects would themselves be sets as well; this is useful for studying the foundations of mathematics, but for most mathematical purposes it is more convenient, and less conceptually confusing, to refrain from modeling primitive objects as sets.) One has to carefully impose a suitable collection of axioms on these sets, in order to avoid paradoxes such as Russell’s paradox; but with a standard axiom system such as Zermelo-Fraenkel-Choice (ZFC), all actual paradoxes that we know of are eliminated. Still, one might be somewhat unnerved by the presence in set theory of statements which, while not genuinely paradoxical in a strict sense, are still highly unintuitive; Cantor’s theorem on the uncountability of the reals, and the Banach-Tarski paradox, are perhaps the two most familiar examples of this.
One may suspect that the reason for this unintuitive behaviour is the presence of infinite sets in one’s mathematical universe. After all, if one deals solely with finite sets, then there is no need to distinguish between countable and uncountable infinities, and Banach-Tarski type paradoxes cannot occur.
On the other hand, many statements in infinitary mathematics can be reformulated into equivalent statements in finitary mathematics (involving only finitely many points or numbers, etc.); I have explored this theme in a number of previous blog posts. So, one may ask: what is the finitary analogue of statements such as Cantor’s theorem or the Banach-Tarski paradox?
The finitary analogue of Cantor’s theorem is well-known: it is the assertion that for every natural number
, or equivalently that the power set of a finite set
of
elements cannot be enumerated by
itself. Though this is not quite the end of the story; after all, one also has
for every natural number
, or equivalently that the union
of a finite set
and an additional element
cannot be enumerated by
itself, but the former statement extends to the infinite case, while the latter one does not. What causes these two outcomes to be distinct?
On the other hand, it is less obvious what the finitary version of the Banach-Tarski paradox is. Note that this paradox is available only in three and higher dimensions, but not in one or two dimensions; so presumably a finitary analogue of this paradox should also make the same distinction between low and high dimensions.
I therefore set myself the exercise of trying to phrase Cantor’s theorem and the Banach-Tarski paradox in a more “finitary” language. It seems that the easiest way to accomplish this is to avoid the use of set theory, and replace sets by some other concept. Taking inspiration from theoretical computer science, I decided to replace concepts such as functions and sets by the concepts of algorithms and oracles instead, with various constructions in set theory being replaced instead by computer language pseudocode. The point of doing this is that one can now add a new parameter to the universe, namely the amount of computational resources one is willing to allow one’s algorithms to use. At one extreme, one can enforce a “strict finitist” viewpoint where the total computational resources available (time and memory) are bounded by some numerical constant, such as ; roughly speaking, this causes any mathematical construction to break down once its complexity exceeds this number. Or one can take the slightly more permissive “finitist” or “constructivist” viewpoint, where any finite amount of computational resource is permitted; or one can then move up to allowing any construction indexed by a countable ordinal, or the storage of any array of countable size. Finally one can allow constructions indexed by arbitrary ordinals (i.e. transfinite induction) and arrays of arbitrary infinite size, at which point the theory becomes more or less indistinguishable from standard set theory.
I describe this viewpoint, and how statements such as Cantor’s theorem and Banach-Tarski are interpreted with this viewpoint, below the fold. I should caution that this is a conceptual exercise rather than a rigorous one; I have not attempted to formalise these notions to the same extent that set theory is formalised. Thus, for instance, I have no explicit system of axioms that algorithms and oracles are supposed to obey. Of course, these formal issues have been explored in great depth by logicians over the past century or so, but I do not wish to focus on these topics in this post.
A second caveat is that the actual semantic content of this post is going to be extremely low. I am not going to provide any genuinely new proof of Cantor’s theorem, or give a new construction of Banach-Tarski type; instead, I will be reformulating the standard proofs and constructions in a different language. Nevertheless I believe this viewpoint is somewhat clarifying as to the nature of these paradoxes, and as to how they are not as fundamentally tied to the nature of sets or the nature of infinity as one might first expect.
— 1. A computational perspective on mathematics —
The great advantage of using set theory in mathematics is that all objects in a given set (e.g. all numbers in the real line ) are available to you at all times; one can take one, many, or all objects in a set and manipulate them as one wishes (cf. the axiom schema of replacement); similarly, one can assign a truth value to a statement that quantifies over an arbitrary number of objects. If one removes sets from the picture, then one no longer has immediate access to arbitrary elements of a set, and one can no longer perform operations en masse on all the elements of a set at once; instead, one must use some (possibly more restrictive) protocol for manipulating objects in a class, or verifying whether a given statement is true or false. (In more philosophical terms, we are focusing more on an epistemological approach to mathematics, based on what we can measure and query, as opposed to an ontological approach, based on what we believe to exist.)
For this, it is convenient to use the conceptual framework that is familiar to us through modern computer languages, such as C. In this paradigm, when dealing with a class of objects (e.g. integers), we do not get access to the entire set of integers directly. Instead, we have to declare a integer variable, such as
, and set it equal to some value, e.g.
; or, if one is creating a routine that takes input,
might be initialised to one of the unspecified inputs of that routine. Later on, we can use existing variables to define new ones, or to redefine existing ones, e.g. one might define
to equal
, or perhaps one can increment
to
. One can then set up various loops and iterations to explore more of the parameter space; for instance, if countably infinite loops are permitted as a computational resource, then one can exhaust the positive integers by starting at
and incrementing
by
indefinitely; one can similarly exhaust the negative integers, and by alternating between the two (and also passing through
) one can exhaust the entire integers by a countably infinite loop. This of course is just the standard demonstration that the integers are countable.
Real-world computers, of course, have finite limits of precision; they cannot represent arbitrarily large integers, but only integers up to a certain size (e.g. or
). One could think about computational models with such a strict finitary limitation, but let us assume that we are in a more idealised setting in which there are no limitations on how large an integer one can store. Let us then make the even more idealised assumption that we can also store real numbers with unlimited precision; our computer never makes any roundoff errors. (Note that there are indeed arbitrary precision models of computation that can do this, though the catch is that speed of computation depends heavily on how complicated it is to describe any given number.)
Remark 1 A technical point: it may be that the computational model of the real numbers is different from the standard real line
; for instance, perhaps the computer only implements “constructible” real numbers, which is for instance the case in physical arbitrary precision computers. We will touch upon this point again later.
Note that if one were to expand out a given real number, say , as a decimal expansion, then one would obtain an infinite string of digits. But, just as we do not have direct access to the set
of all integers, we will not have direct access to the entire decimal expansion of
. If we have a natural number
, we are allowed to inspect the
digit of
by making a suitable function call (e.g.
), and if we are allowed to set up an infinite loop in which
starts at
and increments indefinitely, one can exhaust all the digits of
, which is good enough for most “practical” mathematical purposes. (For instance, if one were allowed to run programs of countable length using real arithmetic, one could make a program that determined whether
was normal or not, or to determine the truth of the Riemann hypothesis, or more generally to compute the truth-value of any first-order sentence in the theory of the real line.)
We assume that the usual arithmetic operations can be performed on real numbers in reasonable amounts of time. For instance, given two real numbers , one can determine whether they are equal or not in finite time (consulting some sort of “equality oracle” for the reals if necessary). Note that equality can be a subtle issue; if one thinks of real numbers as infinite strings of digits, their equality can only be verified directly by using a countable amount of computing power. But we will sidestep the issue of how exactly the reals are implemented by simply assuming that enough oracles exist to perform real arithmetic at an acceptable computational cost.
As already hinted at in the above discussion, we are assuming that our computer has access to a certain amount of computing resources (e.g. time, memory, random number generation, oracles). We will be rather vague on exactly how to formalise the concept of a resource, but basically the standard definitions used in computer science would be a good approximation here, at least when the resources are finite. But one can consider allowing for certain computational resources to be infinite in some carefully controlled manner; for instance, one could consider a situation in which countably infinite loops are permitted (provided that all variables in the loop that one wants to retain “converge” in some sense at the end of the loop), but for which uncountable loops are not allowed. We will not formalise such concepts here (but roughly speaking, they correspond to allowing transfinite induction up to some specified ordinal, and no further).
We will not be using sets or set-theoretic functions in this computer language. However, we will use as a substitute the concept of an oracle – a “black box” routine that takes zero or more variables of a given class as input, and returns zero or more variables in various classes as output (usually we will have just a single output, but it will be convenient to allow multiple outputs). Being a black box, we do not know how the oracle obtains the output from the input, but we are able to use the oracle anyway. Let us assume that each invocation of an oracle takes some acceptable amount of time (e.g. bounded time when the computational resources are finite, or countably infinite time if countable time resources are allowed, etc.). All our oracles are consistent in the sense that they always produce the same output for a fixed choice of input; thus, if one calls the oracle again at some later time with the same input, then the oracle will return the same output as it did before. It is important to note that consistency is a subtly weaker assumption than requiring the oracle is non-adaptive; we allow the oracle to “remember” previous queries, and to use that memory to formulate answers to later queries, as long as it does not contradict the outputs it gave previously.
We will be concerned primarily with membership oracles – an oracle that takes a variable
in a given class and returns an answer
that is either “Yes” or “No”. Informally, the oracle
is describing some subset of this class, and is answering questions regarding whether any given variable
lies in this set or not. Note, however, that this set only exists in a “virtual” or “potential” sense; if the oracle is adaptive, the set of inputs that will give a “Yes” answer may not yet be fixed, but could depend on future queries to the oracle. If the class can be exhausted within the computational resources permitted (e.g. if the parameter space can be countably enumerated by the computer, and countably infinite loops are permitted), then one can query the oracle for every single element and thus pin down the set completely (though one may not be able to store this set if one does not have sufficient memory resources!), but if the parameter space is too large to be exhausted with the available resources, then the set that the oracle is describing will never be completely described.
To illustrate this, let us briefly return to the traditional language of set theory and recall the following textbook example of a non-measurable subset of the reals
. This set is constructed by partitioning
into cosets
of the rationals
, and then using the axiom of choice to selecting a single representative of each coset; the set
is the collection of all such representatives. Thus the rational translates
,
partition
, and it is not hard to then deduce (using the properties of Lebesgue measure) that
cannot be Lebesgue measurable.
We can simulate this non-measurable set by an adaptive oracle , which remembers all prior queries to itself, and works as follows:
-
takes a real number
as input.
- If
has previously answered the question
of whether
lies in
, repeat whatever answer was given previously.
- If
has not been queried before, and if furthermore no rational translate
of
has been queried before either (i.e.
differs from all previous queries by an irrational number), then answer
with “yes”.
- Finally, if
has not been queried before, but
has been queried before for some rational
, then answer
with “no”.
- Store
(and
) in memory for use in future queries.
Assuming one has a rationality oracle that can tell (in bounded time) whether a given real number
is rational or not, then
is a perfectly programmable oracle, which will run in finite time whenever one asks only a finite number of queries of it. (After querying the oracle an infinite number of times, though, it will require an infinite search loop in order to make sure that any subsequent answer it gives is consistent with all previous answers, unless one is allowed to use an infinite amount of memory, e.g. an array indexed by the quotient space
.) It is even completely deterministic – it requries no arbitrary choices on the part of the oracle. If one had the patience to query this oracle for every single real number
(which would, of course, require an uncountable number of queries), the oracle would eventually describe completely the non-measurable set
. But if one is only permitted a finite or countable number of queries, then the non-measurable set only exists in some “virtual” sense.
More generally, non-adaptive oracles tend to generate measurable sets, while adaptive oracles are likely to generate non-measurable sets. So we see that non-measurability does not have to be viewed as a quirk arising from the nature of infinity, or from the axiom of choice; it can be viewed instead as the freedom to adapt to previous queries to membership of the set, which is a concept that makes sense even in a strictly finitist setting.
Remark 2 One can think of an non-adaptive oracle as being like a truthful observer, reporting on some objective set that exists independently of the queries, while an adaptive oracle is more like a pathological liar, inventing a previously non-existent set on the fly as needed in order to consistently answer the questions posed of it. It is then not so surprising that the set thus invented is likely to be non-measurable. We thus see that the ability of oracles to adapt is somewhat analogous to the role the axiom of choice plays in traditional set theory.
We will discuss measurability and non-measurability in more detail a little later in this post.
We have seen that membership oracles are sort of like “virtual” sets. Many set operations can be simulated for membership oracles. For instance, given two membership oracles that apply to variables
in some class (e.g. the real numbers), one can form the union oracle
, which works by querying both
and
and returning “Yes” if at least one of
and
was “Yes”, and “No” otherwise. More generally, any finite boolean operation of membership oracles gives another membership oracle, and (if countable computational resources are available) the same is true for countable boolean operations also. As one increases the amount of computational resources available, more and more set-theoretic operations become available, and when one allows unlimited resources (or more precisely, transfinite induction up to any ordinal, and storage of arbitrarily sized infinite sets), then all the standard operations in set theory (e.g. invocations of the axiom schema of replacement, the power set axiom, the axiom of choice, etc.) become available.
Remark 3 Membership oracles
only describe a raw, unstructured set. If one wants to place additional structure on a set (e.g. measure structure, topological structure, smooth structure, etc.) then additional oracles would be needed. Of course, the same is true in traditional set theory; for instance, to place a topology on a set
one also needs to specify a collection
of open sets on
obeying the axioms of a topology, and similarly for other structures one can place on sets. We will see an example of these additional oracles later in this post, when we revisit the concept of measurability.
Remark 4 Membership oracles are weaker than sets in many ways. One of these comes from a breakdown of the law of the excluded middle. In set theory, a statement about a set
is either true or false; for instance,
is either finite or infinite. If one is instead given an adaptive membership oracle
, questions such as whether
describes a finite set or not are undecidable if one only has finite computational resources. However, one can imagine strengthening the membership oracle
to a oracle that, in addition to answering questions about membership of individual elements, will also answer more general questions about
(such as whether
is finite), in such a fashion that all the answers given are consistent with each other. In such a way, the law of the excluded middle can be restored; but then programming an adaptive oracle in a way that keeps all the answers consistent becomes quite a challenge. (Note though that the Gödel completeness theorem asserts, in some sense, that this is always possible, provided that one’s initial constraints on
are originally consistent.)
— 2. Cantor’s theorem —
Cantor’s theorem (and its proof) transfers easily enough from sets to oracles. The analogue of a (proposed) enumeration of the real numbers is an “enumeration oracle” that takes a natural number
as input, and returns a real number
as output; we allow repetitions. The Cantor diagonal argument shows that given any such putative enumeration
, and given access to a countable amount of computing resources, one can construct a real number
which is guaranteed not to be covered by the enumeration oracle; for instance, one can construct
to be a string of decimals
, with
chosen to be the first digit in
(say) not equal to the first digit of
,
chosen to be the first digit of
not equal to the second digit of
, and so forth. (I exclude the digits
and
to avoid the technical and irrelevant
issue.)
This is of course virtually identical to the usual proof of Cantor’s theorem. But the proof highlights that in order to exhibit a counterexample to the claim that enumerates the reals, one needs a countably infinite amount of computational resources. And indeed if one works in a finitary computational model in which one is only allowed to run programs that are guaranteed to halt, and if one limits one’s real number system to the “finitely constructible” reals – those reals that can be generated by halting programs – then one can indeed enumerate the “real numbers” in this system, by enumerating all the possible programs to generate real numbers as
, and computing
to be the output of
, where
is the first integer such that
halts in less than
steps, while outputting a number different from
(setting
if no such integer exists). In this model, the real number
given by Cantor’s argument is not finitely constructible, but only countably constructible. It is only because one has access to countably infinite computational resources (in particular, the ability to sum convergent infinite series such as
), that the reals are no longer countable.
Let us now consider a slightly different version of Cantor’s theorem, which in the language of set theory asserts that for a given set (e.g. the natural numbers
), that one cannot enumerate the power set
by
itself. In order to phrase this using the language of oracles rather than sets, one needs to be able to treat oracles themselves as variables; note that many modern computer languages (particularly functional languages such as Lisp and Haskell) already do this. In particular, we allow for the existence of oracles that generate further oracles as output, or themselves take oracles as input.
A proposed enumeration of the power set of the natural numbers (say) by the natural numbers themselves can then be viewed as an enumeration oracle which takes a natural number
as input, and returns a membership oracle
in the natural numbers as output; thus
itself is able to accept a natural number
as input and return a yes-or-no answer
as output.
Cantor’s diagonal argument then asserts that given any proposed enumeration oracle of the above form, one can always find a membership oracle
which is not enumerated by
. Indeed, one simply sets
to be the negation of
for any given natural number input
.
With this version of Cantor’s argument, we no longer need a countably infinite amount of computational resources; the above argument is valid even in the finitary computational model. In that model, the argument shows that the class of oracles that terminate in finite time cannot itself be enumerated by an oracle that terminates in finite time; this is essentially Turing’s halting theorem. Thus we see that Turing’s theorem and Cantor’s theorem are simply different cases of a single general theorem, the former in the case of finitary computational resources and the latter in the case of unlimited computational resources. (See also my previous blog post for further discussion of these “no self-defeating object” arguments.)
Also observe that this version of Cantor’s argument works if the natural numbers are replaced by any other class of variable; for instance, in the finite class of integers between and
, the argument demonstrates that the membership oracles in this class cannot be enumerated by the numbers from
to
itself, thus
.
Let us now discuss the analogous situation with the inequality . The claim is now that if
is a finite class, and
is a strictly larger class (containing at least one additional element
outside of
), and one is given an enumeration oracle
that takes an variable
in
as input and returns a variable
in
as output, then one should be able to find a variable
in
or equal to
which is not covered by the enumeration oracle.
One way to find such a is by the following algorithm, which is just a painfully disguised version of the proof that
using induction. Let
be an element in
that is not in
. We first search through all the elements
of
to see if there is a solution to the equation
. If no solution exists, then we are done; we can just take
. So suppose instead that we have some found some
for which
. Then we can delete
, and all other solutions to
, from the domain
of the oracle
, and also delete
from the output
of the oracle, giving rise to a modified oracle
which takes as input a variable
with
, and returns the output
. Note that if we ever find a variable
in the range of
that is not enumerated by
, then
will not be enumerated by
either. So we can descend from
to the oracle
, which has a smaller domain and range. Also note that the range remains strictly larger than the domain, as
lies in the range but not in the domain. We can thus keep iterating this procedure; since we cannot have an infinite descent, the algorithm must eventually terminate. Unpacking the termination condition, we have indeed produced an element
in the range of
which was not enumerated by
.
We observe that this algorithm only halts due to the principle of infinite descent, which is only valid when the initial class one starts with was finite. For infinite classes, which admit infinite descents, one can of course find surjective enumerations between such classes and strictly larger classes; for instance, the enumeration oracle that sends a natural number to
is an enumeration of
by
. In contrast, the proof of Cantor’s theorem does not rely on facts specific to finite classes, such as the principle of infinite descent, and is thus valid in arbitrary classes.
— 3. Measurability revisited —
In a previous section, we gave an example of a membership oracle in the real line which was non-measurable, in the sense that the set that it was (virtually) describing necessarily failed to be Lebesgue measurable. But the concept of Lebesgue measurability was still defined in the context of set theory, and not in a purely oracle-theoretic language. It is then natural to ask whether one can define Lebesgue measurability purely in the context of computational models, and in particular whether one can discuss this concept in a finitary computational model.
For simplicity let us now work in a bounded subset of , such as the unit interval
, so that we can work with a finite measure rather than a
-finite measure.
From classical measure theory, we recall the following characterisation of Lebesgue measurability: a subset of the unit interval is Lebesgue measurable if for every
, one can find an elementary subset
of the unit interval (i.e. a finite union of open intervals) whose set-theoretic difference
with
has outer measure less than
, or equivalently that it can be covered by a countable collection of intervals
whose length adds up to less than
.
Thus, if one wants a membership oracle to “certify” its measurability, one way to do so is to provide an additional “measurability oracle”
, which takes a positive real number
as input, and returns three outputs:
- A description of an elementary set
(which can be done in a finite amount of time and space, by specifying the number of intervals used to form
, together with their endpoints);
- An interval oracle
, that for each natural number input
returns an interval
;
- A covering oracle
, which for each real number
in
, returns a natural number
for which
. (Actually, this oracle is redundant, as it can be simulated from the previous two outputs by a finite brute force search.)
With these oracles, one can then verify the measurability of by selecting an
, and verifying firstly that after selecting any natural number
, the sum of the lengths of
remains less than
, and secondly that after selecting any real number
, that if
lies in
but not in
or vice versa, that
indeed lies in
. In principle, if one performed this verification procedure an uncountable number of times (once for each choice of
) one would fully demonstrate the measurability of
; but if one instead only had access to finite or countable computational resources, then one could only verify measurability on an “as needed” basis.
So, in the oracle model, a measurable set is not simply a membership oracle ; it must also be supplemented with an additional measurability oracle
that “witnesses” the measurability. This is analogous to how sets must be augmented with (say) topological structure if one wants to perform topology on that set, or algebraic structure if one wants to perform algebra on that set, etc.
If one possesses a measurability oracle for a set
(or more precisely, a membership oracle
), then can estimate the Lebesgue measure of
to within accuracy
by calling
to obtain an approximant
, and then computing the measure
of
(which can be done in a finite amount of time, as
is simply a finite union of intervals). A key fact (which, not coincidentally, is crucial in the standard construction of Lebesgue measure) is that these approximations to the Lebesgue measure of
are compatible with each other in the following sense: if one calls one measurability oracle
for
at accuracy
to get one estimate
for the Lebesgue measure of
, and if one then calls another (possibly different) measurability oracle
for the same set
at another accuracy
to get another estimate
for
, then these two estimates can only differ by at most
; in particular, sending
, one obtains a Cauchy sequence and (after a countable number of operations) one can then compute the Lebesgue measure of
to infinite precision.
This key fact boils down (after some standard manipulations) to the fact that an interval such as has outer measure at least
; in our oracle based model, this means that if one is given an interval oracle
that generates open intervals
for each natural number
, in such a way that the total length
of these intervals is less than
, then one can find a point in
which is not covered by any of the
.
This can be done using a countable amount of computational power (basically, the ability to run a single infinite loop; this is roughly equivalent to the theory in reverse mathematics). The point is that for each finite
, the set
is a computable non-empty finite union of closed intervals in
, which decreases in
. The infimum
can be computed infinite time for each
, and increases in
; the limit
is then an element in
that lies in
but is outside all of the
. Thus, given a countable amount of computational power, one can consistently define Lebesgue measure of a measurable set, and verify its basic properties.
It is instructive to apply the above discussion to the non-measurable membership oracle given in previous sections (trivially modified to lie in
rather than in
). If one is given a purported measurability oracle
for this oracle
, one can eventually show that this oracle
does not actually certify the measurability of
as claimed, but this requires a countably infinite amount of computation to establish. (Basically, there are two cases, based on whether
asserts that
has positive Lebesgue measure or not (which can be decided after a countable amount of computation). If it has positive measure, then by invoking
with
less than half this measure, one can soon find an interval
in which
has density greater than
(or more precisely, the complement of
in
has outer measure strictly less than
) and then one can run a variant of the above Bolzano-Weierstrass argument to find two points
in
and in
which differ by a rational, contradicting the construction of
. If instead
asserts that
has zero measure, then
can cover
by intervals of arbitrarily small total measure, and then
can do the same for the union of all the rational shifts of
, and one can then find an element
such that no rational shift
of
lies in
.)
On the other hand, if one is only allowed to query and
finitely many times, then one can show that one can adaptively build
and
in response in these queries in such a way that one never obtains a contradiction, while retaining the properties that the shifts of
by the rationals partition
. So a pathological liar can build a non-measurable set but claim that it is measurable; the deception can sustain any finite number of queries about the set and its measurability, but given a countable amount of queries one can eventually demonstrate that there is an inconsistency (at least for non-measurable sets coming from the coset partition of
by the rationals).
Remark 5 Interestingly, the type of adaptive oracles one uses to create non-measurable sets are not compatible with random number generation. If one has access to a source of random real numbers (say in
), then in principle one can (almost surely) compute the Lebesgue measure in countable time of any subset
of
accessed through a random oracle
by the Monte Carlo method: randomly sampling
points in
, counting the proportion that lie in
, and then sending
. However this only works properly if the oracle
does not adapt to the various random inputs it is given, and is instead independent of these inputs. (For instance, if one were to use Monte Carlo methods to measure the non-measurable set
described above, one would almost surely get that
has measure
, as each random number is almost certain to lie in a different coset of
!.)
— 4. The Banach-Tarski paradox —
We now turn to the Banach-Tarski paradox. The usual formulation of this paradox involves a partition of the unit ball into pieces that can be rearranged (after rotations and translation) to form two copies of the ball. To avoid some minor technicalities, we will work instead on the unit sphere with an explicit countable set
removed, and establish the following version of the paradox:
Proposition 1 (Banach-Tarski paradox, reduced version) There exists a countable subset
of
and partition of
into four disjoint pieces
, such that
and
can be rotated to cover
, and
and
can also be rotated to cover
.
Of course, from the rotation-invariant nature of Lebesgue measure on the sphere, such a partition can only occur if at least one of are not Lebesgue measurable.
We return briefly to set theory and give the standard proof of this proposition. The first step is to locate two rotations in the orthogonal group
which generate the free group
. This can be done explicitly; for instance one can take
(See for instance this post from the Secret Blogging Seminar for more discussion of this example.) Each rotation in has two fixed antipodal points in
; we let
be the union of all these points. Then the group
acts freely on the remainder
.
Using the axiom of choice, we can then build a (non-measurable) subset of
which consists of a single point from each orbit of
. For each
, we then define
to be the set of all points of the form
, where
and
is a word such that
-
is the identity, or begins with
(if
);
-
begins with
(if
);
-
begins with
(if
);
-
begins with
(if
).
It is then clear that partition
, while
and
both cover
, as claimed.
Now let us interpret this example using oracles. The free group is countably enumerable, and so with a countable amount of time and memory, one can construct and store
without difficulty. A membership oracle
for
can then be constructed by an adaptive oracle much as in the previous sections;
returns yes if
is the first point in its orbit that is queried, and returns no otherwise. The oracles
can then be defined by querying
for all the points in
‘s orbit in some arbitrary order until one finds the point
which lies in
, and then deciding membership in
depending on the first symbol of
. (For instance, if
‘s orbit has not been queried before, the first point in the orbit that one queries
for will lie in
; thus one sees that the order in which one decides to search through the orbit will in fact influence quite strongly what
and the
will look like.)
It is then not hard to show that do indeed partition
in the sense that for each
, exactly one of
will return “yes”. Also, for any
, at least one of
will return “yes”, and at least one of
will return “yes”. Thus, after performing an uncountable number of queries to fully complete all of these sets, we obtain sets obeying the properties in Proposition 1; but the oracles that do this are perfectly deterministic, and run in only a countable amount of time, at least for the first few queries (and one can probably even trim it down to a finite amount of time, if one has an efficient way of deciding whether two points lie in the same orbit of
).
Now we discuss how a Banach-Tarski type construction does not work in one or two dimensions. Let us just consider the one-dimensional case:
Proposition 2 There does not exist a decomposition of the unit interval
into finitely many sets
such that some translates
of these sets cover
.
We briefly review the proof of this proposition. Suppose for contradiction that we have such a decomposition. We can colour the lattice in
colours
, by giving each
-tuple
the colour
if
. This is clearly a
-colouring of
. Furthermore, for every
-tuple
, we see that at
has colour
for at least two values of
, where
is the standard basis for
. If one then uses double-counting to get two different estimates for the number of coloured points in a box
for a large enough
, one obtains a contradiction.
Note that this proof is quite finitary; given some real numbers and some membership oracles
, one could convert the above argument into an algorithm that would be able to demonstrate in finite time that either the oracles
fail to partition
, or that the translates
fail to cover
; we leave this as an exercise to the reader.
(The key distinction here between the low and high dimensional cases is that the free group is not amenable, whereas
is amenable. See these previous blog posts for further discussion.)
— 5. Summary —
The above discussion suggests that it is possible to retain much of the essential mathematical content of set theory without the need for explicitly dealing with large sets (such as uncountable sets), but there is a significant price to pay in doing so, namely that one has to deal with sets on a “virtual” or “incomplete” basis, rather than with the “completed infinities” that one is accustomed to in the standard modern framework of mathematics. Conceptually, this marks quite a different approach to mathematical objects, and assertions about such objects; such assertions are not simply true or false, but instead require a certain computational cost to be paid before their truth can be ascertained. This approach makes the mathematical reasoning process look rather strange compared to how it is usually presented, but I believe it is still a worthwhile exercise to try to translate mathematical arguments into this computational framework, as it illustrates how some parts of mathematics are in some sense “more infinitary” than others, in that they require a more infinite amount of computational power in order to model in this fashion. It also illustrates why we adopt the conveniences of infinite set theory in the first place; while it is technically possible to do mathematics without infinite sets, it can be significantly more tedious and painful to do so.
Update, Mar 21: Several edits and additions, including a summary.

41 comments
Comments feed for this article
20 March, 2010 at 10:56 am
Qiaochu Yuan
Very enlightening! As you say, one way to distinguish Cantor’s theorem from other set-theoretic statements is by just how few of the properties of Set you need to prove it. In fact, a Cantor-like statement makes sense in any Cartesian closed category. First, recall that in a category with a terminal object
, a point of an object
is a map
.
Theorem (Lawvere and Schanuel, Conceptual Mathematics, p. 316): Suppose
is an object in a Cartesian closed category such that there exists an object
and a map
with the property that for every map
there exists a point
of
such that
names
. Then every map
has a fixed point.
To recover the usual statement of Cantor’s theorem one takes
in Set. From this perspective, Cantor’s theorem should be thought of as a statement about typed lambda calculus.
20 March, 2010 at 9:09 pm
Martin Cohen
Two comments:
On a somewhat facetious note, in sci.math, Archimedes Plutonium has been describing his reconstruction of mathematics where the maximum integer is 10^500.
The discussion of n+1 > n reminds me of some work I did in the 1970’s as part of a group at USC’s ISI doing program verification using Pascal. I constructed two versions of routines that implemented the pigeon-hole principle in the form if f maps 0 to n (integer args and results) into 1 to n, then there are two distinct values i and j such that f(i) = f(j). The first version had nested loops on i and j and stopped when f(i) = f(j). The second version, to avoid the O(n^2) time of the first, stored f(0 to i) in an array and stopped when f(i) matched a previous value (the traditional time-space tradeoff). As I recall, I managed to prove the two routines were correct, but it was surprisingly difficult.
21 March, 2010 at 6:29 am
Sune Kristian Jakobsen
First of all, thanks for yet another great post!
About finding an element in
not covered by
, where the lengths of the intervals sums to less that
:
.”
(as defined in the post) with n bits precision. For each n, this can be done in finite time, so after
-step we have a real number contained in
.
“This can be done, but it requires a nontrivial amount of computational power – basically the ability to run a double infinite loop (indexed by the ordinal
Why doesn’t the following work: For each n, approximate the smallest element in
Do anyone know some good references about infinite computations?
21 March, 2010 at 8:39 am
Terence Tao
Hmm, that’s a good point. I was misled by the fact that the Heine-Borel theorem is known to be equivalent (in the reverse-mathematics sense) to the (weak) Konig lemma, which requires omega^2 worth of computational resources to prove; but you are indeed right that one can show that the intersection of nested closed elementary sets is non-empty using only omega worth of computation (reverse mathematics says that this is an RCA_0 result rather than a WKL_0 result). Somehow the point is that a “lim sup” or “lim inf” usually requires two infinite operations (an inf of a sup, or a sup of an inf) but when dealing with elementary sets rather than arbitrary closed sets, the first infinite operation can in fact be done in finite time, saving one power of omega. I’ve rewritten this portion of the post to reflect the simplification.
I don’t know of good references for the computational point of view, though it is perhaps implicit in the reverse mathematics literature. (But their use of ordinals to measure the strength of a logical system is slightly different from the use here to measure the computational power, though there is clearly a relationship between the two.)
2 April, 2010 at 2:53 am
asdf
Sune, you might look at the papers about “infinite time Turing machines” by Joel David Hamkins. They are Turing machines whose steps are indexed by potentially transfinite ordinals. If I remember correctly they can decide all arithmetic statements and maybe some analytic ones. See: http://jdh.hamkins.org/Publications
21 March, 2010 at 4:29 pm
Top Posts — WordPress.com
[…] A computational perspective on set theory The standard modern foundation of mathematics is constructed using set theory. With these foundations, the […] […]
22 March, 2010 at 2:16 am
guest
Professor Tao’s “computational” framework may well have some merit from an ultrafinitist point of view, but it otherwise seems fairly involved and counterintuitive. It’s well known that mathematics can be interpreted in a constructivist framework, and that Cantor’s diagonal argument carries through to a proof that the set of computable real numbers is itself uncomputable (See e.g. this post on the Math. and Computation weblog). Nevertheless, most constructivist mathematicians use foundations other than Zermelo Fraenkel set theory or its constructive equivalents. (one issue with set theoretical foundations for constructive mathematics is that they treat sets, functions and quotient sets as extensional, whereas extensional equality of such objects may not even be algorithmically decidable.)
A speculative question is whether constructivist foundations of mathematics can be built which represent computations of bounded computational complexity. The field of descriptive complexity may be of some help here. Alternately, a well-known isomorphism between algorithms and constructive proofs allows us to use formal systems based on linear logic to encode algorithms with polynomial-time complexity; other systems exists which can encode different complexity classes.
23 March, 2010 at 4:15 pm
András Salamon
Bounded computational resources have so far been used for specific applications, such as game theory. I like your vision of applying this perspective to the foundations of mathematics. Starting with bounded computation seems quite distinct from constructivism.
In your comments on Cantor’s Theorem, did you perhaps mean to use the
-th digit of
?
Also, are you sure that in an oracle set theory, the difference between
and 1 is necessarily irrelevant? Identifying the equivalent strings seems to require access to the global overview that oracle set theory can’t take for granted.
24 March, 2010 at 6:19 pm
Terence Tao
Thanks for the correction!
If one assumes that one is able to apply the comparison operators <, > to real numbers in finite time, then one can extract out each digit of a real number in finite time (or, in the case that there are two different representations, both possible digits can be extracted).
24 March, 2010 at 1:32 am
Tibor Földes
For a reliable framework in this directions see:
Alexander Stepanov and Paul McJones (2009).
Elements of Programming. Addison-Wesley. ISBN 978-0-321-63537-2.
24 March, 2010 at 3:35 am
András Salamon
I don’t understand how this relates to Terry’s post? There is a single reference to the concept “set” in the index of the book. Chapter 2 appears to be exactly what I would expect in a programming language text with some of the links to algebra made slightly more explicit than is usual. What am I missing?
24 March, 2010 at 5:30 am
Tibor Földes
IMHO this is not a programming language text with some references to algebra but it is algebra . This is a demonstration of the possibility of a pure algorithmic build up of algebra. Its Templates and Concepts ( like Terry’s algorithms and oracles) make algebraic structures programmable without too much concerns about the infiniteness of the domain on they work. This is maybe an example how could be realized a net of such adapting and non-adapting oracles.
Its ground idea is a net of Concepts with minimal dependencies among each other and with the Domain on they are applicable.
It is worthy to skim through the book.
24 March, 2010 at 6:52 am
anonymous
“At one extreme, one can enforce a “strict finitist” viewpoint where the total computational resources available (time and memory) are bounded by some numerical constant, such as 10^100; roughly speaking, this causes any mathematical construction to break down once its complexity exceeds this number”.
I think frequently about how to obtain such a constant in a natural way so that computational complexity which is now a purely formal tool transforms into a tool which can be applied to natural sciences. I think about a ressource measure so that no matter how much additional ressource your natural system has beyond this it would not achieve much more. For instance the number of units packed in the most compact physical way that can broadcast a message in a given time (i.e. a milisecond second). If your genes in the cell takes more than this time to comunicate and react, you can not survive in any environment. If the cells in your brain takes more than this time, you can not perceive.
With such a natural constant, the P versus NP question for scientific problems will not be relevant, since nobody will worry about asymptotics (Big O) nor if the algorithm is polynomial or exponential. Just substitute the n by the constant and see which one is faster. I´ve always considered weird and unsatisfactory in computational complexity the fact that you can always increase the value of the degree of the polynomial. It would be nice if we could bound it in any natural way.
25 March, 2010 at 7:51 pm
The definition of a function « Annoying Precision
[…] this way, and it won’t always be done this way. For example, consider Terence Tao’s recent blog post where he replaces sets and functions with oracles and algorithms. There’s no reason to expect […]
13 April, 2010 at 8:11 pm
Cosmonut
The above discussion suggests that it is possible to retain much of the essential mathematical content of set theory without the need for explicitly dealing with large sets (such as uncountable sets), but there is a significant price to pay in doing so, namely that one has to deal with sets on a “virtual” or “incomplete” basis, rather than with the “completed infinities” that one is accustomed to in the standard modern framework of mathematics
………………………………
This reminds me of the ancient Greek idea of ‘potential infinities’ versus ‘actual infinities’.
Great post !
13 April, 2010 at 8:16 pm
Is there a countable certificate for connectedness? « What’s new
[…] somehow be “verified in countable time” in a suitable oracle model, as discussed in my previous post, though I have not tried to make this vague specification more […]
16 April, 2010 at 11:30 am
Chad Groft
This article reminds me a bit of the way Paul Cohen described forcing, in some of his last lectures. Consider that a theory T is consistent (in the normal logical sense) iff there is a consistent oracle E (consistent in the article’s sense) which responds “yes” or “no” to inputs, and
* always responds “yes” to sentences in T;
* follows the rules of logic, e.g., if _E_ has already said “yes” to φ and ψ then it must say “yes” to φ ∧ ψ, and vice versa; and in particular
* always responds “no” to contradictions.
Cohen’s idea essentially was that E could be adaptive.
Let’s say you want to pretend that there is a set G of natural numbers which is not constructable. If somebody asks the oracle “Is 5 ∈ G?”, then E must come up with (and remember) the answer. After finitely many such queries, E remembers a finite amount of information.
There are more complicated queries, such as “Is G = X?” (where X is some actual set), or “Is the complement of G finite?”, or “Is the smallest element of G prime?”. The oracle E can handle these by looking “ahead” to larger amounts of information, determining how simpler queries would be answered, and answering the complex query “now” in a way which creates the least amount of trouble. The first two queries above would be answered “no” with no extra memorization, the first because it determines every element of G, the second because it fixes a cofinite (if indeterminate) amount of information. For the third, E would first fix enough information that it could tell what the smallest element n of G is, then answer truthfully whether n is prime or not.
When this is done correctly, considering all the sets which can be named with G and actual sets as parameters and how they would relate, and carefully defining how queries should be decomposed as smaller queries, it happens that E is a consistent oracle which always says “yes” to sentences in ZFC + “there is a nonconstructible subset of ω”.
More complicated math, usually involving the combinatorics of partial orders, is required for other results, but this is the basic idea. I wish I knew of a good written source for this perspective; most introductions to forcing, even Cohen’s own works, talk about countable transitive models instead.
19 November, 2010 at 9:07 am
rgrig
In Section 2, there is a bug in the algorithm that generates all finitely constructible reals.
I’m repeating here how I understood the problem, because it might be my misunderstanding rather than a bug. We are given procedures
,
,
,
. Some of them output a real in finite time, others keep running. The task is to find an algorithm
, such that if
is the output of some
, then
for some
.
Now assume that
and
with
run in
time steps. The output of
is skipped.
The simplest algorithm I could thing of works as follows. Enumerate the points
with non-negative coordinates. If
takes exactly
steps then output its result, otherwise output
.
[Corrected, thanks – T.]
4 July, 2016 at 10:59 am
gninrepoli
An infinite number of non-trivial zeros of the Riemann zeta function is a countable set. They all lie in the open interval (0,1). Can you give more information about these non-trivial zeros than Cantor diagonal method?
5 July, 2016 at 2:42 am
gninrepoli
Can you say something significant about the structure of
? It would be nice if we could prove the existence of a closed interval.![[z,z+j]: z>0, j>0 [z,z+j]: z>0, j>0](https://s0.wp.com/latex.php?latex=%5Bz%2Cz%2Bj%5D%3A+z%3E0%2C+j%3E0&bg=ffffff&fg=545454&s=0)
5 July, 2016 at 10:39 am
gninrepoli
Could you tell me where the error? Thank you.
5 July, 2016 at 7:13 am
gninrepoli
I’m not exactly sure. I think that the Cantor method converts two-dimensional Euclidean space
in a one-dimensional
.Again I am assuming that the open interval
may be represented as a Euclidean space
.Consider the Euclidean space of the third dimension
for example. Now we use the method of Cantor. Instead of lines
, we will hold the plane
.We have transformed the three-dimensional space into a two-dimensional (using an advanced method of Cantor). In fact, we unpacked the three-dimensional space. Now apply the inversion. That is, we will continue to pack up to infinity, but it will be impossible to unpack. To unpack the need to limit any dimension (
– where A is a constant) or a finite number of points in each dimension. Since we have a limited
, we have
.
P.S. I do not know how best to explain. It uses an enhanced version of Cantor’s method for the n-dimensional Euclidean space. But this method is used only in the reverse direction.
5 July, 2016 at 12:22 pm
gninrepoli
Instead of the Euclidean space, I mean a countable infinite set. Excuse me. I cannot express their thoughts.
5 July, 2016 at 1:09 pm
gninrepoli
Again I missed. Here I mean that there is an infinite countable space is n-dimensional. If we let n tend to infinity, then it will be infinitely uncountable. But most importantly, that the latter can be viewed as an open interval (0,1).Probably all.
18 July, 2016 at 11:25 pm
gninrepoli
Good day! From https://www.quora.com/It-is-said-that-infinitely-many-zeroes-of-the-Riemann-zeta-function-lie-on-the-critical-line-What-sort-of-infinity-is-it or http://math.stackexchange.com/questions/1491324/are-the-nontrivial-zeroes-of-the-riemann-zeta-function-countable, you can see that an infinite number of non-trivial zeros of the Riemann zeta function is countable.
. The elements of the set
may be
,
and so forth. Whether it is a subset of
(3/4, 43/100 and so on) dense in
? Yes, if
is the set of rational numbers in the interval
. Let
the number of non-trivial zeros of the Riemann zeta function at 1/2. Suppose
, where
nontrivial zeros in the interval
. Does it follow from this asymptotic equality that
is nowhere dense set in
? As I hope implying zero border
(C – constant).
19 July, 2016 at 2:25 am
gninrepoli
As I think “Natural density 0” (https://en.wikipedia.org/wiki/Natural_density) implies “Nowhere dense set” (https://en.wikipedia.org/wiki/Nowhere_dense_set). But I’m not sure.
22 July, 2016 at 11:08 am
gninrepoli
From
we have:


– countable infinity set



– nowhere dense set in
and
?
If the following conditions are satisfied:
For "natural density 0" I now know that this implies "nowhere dense set".
Now I see what I missed. You can set
represented as
?
P.S. Thank you :)
22 July, 2016 at 10:09 pm
gninrepoli
We must take
instead.
22 July, 2016 at 10:22 pm
gninrepoli
Also:
, then sort them in ascending at each step, and so on to infinity. Then the set
will be 
As I think you can take a cantor method for rational numbers in
23 July, 2016 at 6:30 am
gninrepoli
It turns out that the question is not as easy as I thought. (Can I sorting countably infinite set?) If there is a positive answer to this question then all converge.At the moment, all that I found on this subject https://www.irif.univ-paris-diderot.fr/~carton/Publications/Linear/Kleene/Scattered/jcss.pdf
24 July, 2016 at 12:31 am
gninrepoli
http://mathoverflow.net/questions/143454/sorting-of-countabe-set
Look at the last comment Joel David Hawkins. I think it makes sense.
26 July, 2016 at 6:11 am
gninrepoli
Unfortunately only if we apply the axiom of choice can be said
27 July, 2016 at 1:59 pm
gninrepoli
Applying the axiom of choice
After
At the moment, I have no proof, but there is an approach that I hope will lead to it:
Consider
, where
– finite set. It is easy to see that from it follows existence open interval
. I believe communication
at the end (limit to infinity) for the sets
and
is the same.
also exists an open interval
.
That is, for
27 July, 2016 at 10:05 pm
gninrepoli
I just want to write where it leads. This method can be applied to a large class of holomorphic functions, not only for
. For example for
: if we know that the number of non-trivial zeros
lies on
. We can say that there are no zeros on
,where C-constant.
27 July, 2016 at 10:23 pm
gninrepoli
Of course if the set of non-trivial zeros of the Riemann zeta function
is countable infinite set. 
28 July, 2016 at 1:19 am
gninrepoli
I’m not exactly sure. If we consider the asymptotic behavior, as the probability of occurrence of an element
at every step. Sending to infinity, the probability is zero. Here is an example from the natural numbers, what I mean:
At each step,
the probability of occurrence of the element is reduced.
Here the probability is also reduced, but at a different speed. But at the end for this two expressions we have a
.
exist for
.
I do not know how to rewrite in mathematical language. But it could prove the open interval
4 August, 2016 at 5:36 am
gninrepoli
Instead of the axiom of choice you can use Axiom of determinacy https://en.wikipedia.org/wiki/Axiom_of_determinacy. This means that we can be well ordered countable infinite set, but an uncountable set can not, for example.
4 August, 2016 at 11:05 am
gninrepoli
Now I am considering Axiom of countable choice https://en.wikipedia.org/wiki/Axiom_of_countable_choice as a substitute Axiom of determinacy. Because the half-closed interval
, taking any countable infinite set
in this interval, it
must contain an accumulation point
. But i don’t know.
7 August, 2016 at 12:42 am
gninrepoli
Question came with the axiom of choice. What is the “existence”? I called this “quantum existence”.I do not know how well I can explain, but I’ll try.I believe that this condition has long been used in mathematics. For example point have quantum existence. Of course I believe that the axiom of choice have quantum existence, but not yet fully. Take a countable infinite set
,
. Take one element from this set
, we do not know where he is. Let the probability of occurrence of this element tends to zero.
,
, where (C-constant).But if we try to define this constant we have nothing.This is the essence of quantum existence.That is, we want it to be determined by a Turing machine.But the existence of this interval is of the nature of quantum Turing machine.
it follows that there is an interval
11 August, 2016 at 5:21 am
gninrepoli
I do not know whether this is true. If we construct any Turing machine with the help of classical mechanics.And try to determine the point size for example.As I think we have nothing.Based on Bell’s theorem (https://en.wikipedia.org/wiki/Bell%27s_theorem), we can assume that the definition of the size of the point lies in the quantum mechanics.That’s why I call it a “quantum existence”.
16 August, 2016 at 3:14 am
gninrepoli
This does not apply to the problem.But I believe that the future of mathematics is a quantum mathematics.When we think intuitively we use Turing machine built with the help of classical or quantum mechanics.But somehow, we project it on a Turing machine built with the help of classical mechanics.I think if we start on the project and the Turing machine built with the help of quantum mechanics, most of the paradoxes disappear, and perhaps all.I think so and appear paradoxes.The main question that there is, in my opinion, how can it be proved by logic.