You are currently browsing the tag archive for the ‘set theory’ tag.

This is the third in a series of posts on the “no self-defeating object” argument in mathematics – a powerful and useful argument based on formalising the observation that any object or structure that is so powerful that it can “defeat” even itself, cannot actually exist.   This argument is used to establish many basic impossibility results in mathematics, such as Gödel’s theorem that it is impossible for any sufficiently sophisticated formal axiom system to prove its own consistency, Turing’s theorem that it is impossible for any sufficiently sophisticated programming language to solve its own halting problem, or Cantor’s theorem that it is impossible for any set to enumerate its own power set (and as a corollary, the natural numbers cannot enumerate the real numbers).

As remarked in the previous posts, many people who encounter these theorems can feel uneasy about their conclusions, and their method of proof; this seems to be particularly the case with regard to Cantor’s result that the reals are uncountable.   In the previous post in this series, I focused on one particular aspect of the standard proofs which one might be uncomfortable with, namely their counterfactual nature, and observed that many of these proofs can be largely (though not completely) converted to non-counterfactual form.  However, this does not fully dispel the sense that the conclusions of these theorems – that the reals are not countable, that the class of all sets is not itself a set, that truth cannot be captured by a predicate, that consistency is not provable, etc. – are highly unintuitive, and even objectionable to “common sense” in some cases.

How can intuition lead one to doubt the conclusions of these mathematical results?  I believe that one reason is because these results are sensitive to the amount of vagueness in one’s mental model of mathematics.  In the formal mathematical world, where every statement is either absolutely true or absolutely false with no middle ground, and all concepts require a precise definition (or at least a precise axiomatisation) before they can be used, then one can rigorously state and prove Cantor’s theorem, Gödel’s theorem, and all the other results mentioned in the previous posts without difficulty.  However, in the vague and fuzzy world of mathematical intuition, in which one’s impression of the truth or falsity of a statement may be influenced by recent mental reference points, definitions are malleable and blurry with no sharp dividing lines between what is and what is not covered by such definitions, and key mathematical objects may be incompletely specified and thus “moving targets” subject to interpretation, then one can argue with some degree of justification that the conclusions of the above results are incorrect; in the vague world, it seems quite plausible that one can always enumerate all the real numbers “that one needs to”, one can always justify the consistency of one’s reasoning system, one can reason using truth as if it were a predicate, and so forth.    The impossibility results only kick in once one tries to clear away the fog of vagueness and nail down all the definitions and mathematical statements precisely.  (To put it another way, the no-self-defeating object argument relies very much on the disconnected, definite, and absolute nature of the boolean truth space $\{\hbox{true},\hbox{ false}\}$ in the rigorous mathematical world.)

The standard modern foundation of mathematics is constructed using set theory. With these foundations, the mathematical universe of objects one studies contains not only the “primitive” mathematical objects such as numbers and points, but also sets of these objects, sets of sets of objects, and so forth. (In a pure set theory, the primitive objects would themselves be sets as well; this is useful for studying the foundations of mathematics, but for most mathematical purposes it is more convenient, and less conceptually confusing, to refrain from modeling primitive objects as sets.) One has to carefully impose a suitable collection of axioms on these sets, in order to avoid paradoxes such as Russell’s paradox; but with a standard axiom system such as Zermelo-Fraenkel-Choice (ZFC), all actual paradoxes that we know of are eliminated. Still, one might be somewhat unnerved by the presence in set theory of statements which, while not genuinely paradoxical in a strict sense, are still highly unintuitive; Cantor’s theorem on the uncountability of the reals, and the Banach-Tarski paradox, are perhaps the two most familiar examples of this.

One may suspect that the reason for this unintuitive behaviour is the presence of infinite sets in one’s mathematical universe. After all, if one deals solely with finite sets, then there is no need to distinguish between countable and uncountable infinities, and Banach-Tarski type paradoxes cannot occur.

On the other hand, many statements in infinitary mathematics can be reformulated into equivalent statements in finitary mathematics (involving only finitely many points or numbers, etc.); I have explored this theme in a number of previous blog posts. So, one may ask: what is the finitary analogue of statements such as Cantor’s theorem or the Banach-Tarski paradox?

The finitary analogue of Cantor’s theorem is well-known: it is the assertion that ${2^n > n}$ for every natural number ${n}$, or equivalently that the power set of a finite set ${A}$ of ${n}$ elements cannot be enumerated by ${A}$ itself. Though this is not quite the end of the story; after all, one also has ${n+1 > n}$ for every natural number ${n}$, or equivalently that the union ${A \cup \{a\}}$ of a finite set ${A}$ and an additional element ${a}$ cannot be enumerated by ${A}$ itself, but the former statement extends to the infinite case, while the latter one does not. What causes these two outcomes to be distinct?

On the other hand, it is less obvious what the finitary version of the Banach-Tarski paradox is. Note that this paradox is available only in three and higher dimensions, but not in one or two dimensions; so presumably a finitary analogue of this paradox should also make the same distinction between low and high dimensions.

I therefore set myself the exercise of trying to phrase Cantor’s theorem and the Banach-Tarski paradox in a more “finitary” language. It seems that the easiest way to accomplish this is to avoid the use of set theory, and replace sets by some other concept. Taking inspiration from theoretical computer science, I decided to replace concepts such as functions and sets by the concepts of algorithms and oracles instead, with various constructions in set theory being replaced instead by computer language pseudocode. The point of doing this is that one can now add a new parameter to the universe, namely the amount of computational resources one is willing to allow one’s algorithms to use. At one extreme, one can enforce a “strict finitist” viewpoint where the total computational resources available (time and memory) are bounded by some numerical constant, such as ${10^{100}}$; roughly speaking, this causes any mathematical construction to break down once its complexity exceeds this number. Or one can take the slightly more permissive “finitist” or “constructivist” viewpoint, where any finite amount of computational resource is permitted; or one can then move up to allowing any construction indexed by a countable ordinal, or the storage of any array of countable size. Finally one can allow constructions indexed by arbitrary ordinals (i.e. transfinite induction) and arrays of arbitrary infinite size, at which point the theory becomes more or less indistinguishable from standard set theory.

I describe this viewpoint, and how statements such as Cantor’s theorem and Banach-Tarski are interpreted with this viewpoint, below the fold. I should caution that this is a conceptual exercise rather than a rigorous one; I have not attempted to formalise these notions to the same extent that set theory is formalised. Thus, for instance, I have no explicit system of axioms that algorithms and oracles are supposed to obey. Of course, these formal issues have been explored in great depth by logicians over the past century or so, but I do not wish to focus on these topics in this post.

A second caveat is that the actual semantic content of this post is going to be extremely low. I am not going to provide any genuinely new proof of Cantor’s theorem, or give a new construction of Banach-Tarski type; instead, I will be reformulating the standard proofs and constructions in a different language. Nevertheless I believe this viewpoint is somewhat clarifying as to the nature of these paradoxes, and as to how they are not as fundamentally tied to the nature of sets or the nature of infinity as one might first expect.

Notational convention: As in Notes 2, I will colour a statement red in this post if it assumes the axiom of choice.  We will, of course, rely on every other axiom of Zermelo-Frankel set theory here (and in the rest of the course).  $\diamond$

In this course we will often need to iterate some sort of operation “infinitely many times” (e.g. to create a infinite basis by choosing one basis element at a time).  In order to do this rigorously, we will rely on Zorn’s lemma:

Zorn’s Lemma. Let $(X, \leq)$ be a non-empty partially ordered set, with the property that every chain (i.e. a totally ordered set) in X has an upper bound.  Then X contains a maximal element (i.e. an element with no larger element).

Indeed, we have used this lemma several times already in previous notes.  Given the other standard axioms of set theory, this lemma is logically equivalent to

Axiom of choice. Let X be a set, and let ${\mathcal F}$ be a collection of non-empty subsets of X.  Then there exists a choice function $f: {\mathcal F} \to X$, i.e. a function such that $f(A) \in A$ for all $A \in {\mathcal F}$.

One implication is easy:

Proof of axiom of choice using Zorn’s lemma. Define a partial choice function to be a pair $({\mathcal F}', f')$, where ${\mathcal F}'$ is a subset of ${\mathcal F}$ and $f': {\mathcal F}' \to X$ is a choice function for ${\mathcal F'}$.  We can partially order the collection of partial choice functions by writing $({\mathcal F}', f') \leq ({\mathcal F}'', f'')$ if ${\mathcal F}' \subset {\mathcal F}''$ and f” extends f’.  The collection of partial choice functions is non-empty (since it contains the pair $(\emptyset, ())$ consisting of the empty set and the empty function), and it is easy to see that any chain of partial choice functions has an upper bound (formed by gluing all the partial choices together).  Hence, by Zorn’s lemma, there is a maximal partial choice function $({\mathcal F}_*, f_*)$.  But the domain ${\mathcal F}_*$ of this function must be all of ${\mathcal F}$, since otherwise one could enlarge ${\mathcal F}_*$ by a single set A and extend $f_*$ to A by choosing a single element of A.  (One does not need the axiom of choice to make a single choice, or finitely many choices; it is only when making infinitely many choices that the axiom becomes necessary.)  The claim follows. $\Box$

In the rest of these notes I would like to supply the reverse implication, using the machinery of well-ordered sets.  Instead of giving the shortest or slickest proof of Zorn’s lemma here, I would like to take the opportunity to place the lemma in the context of several related topics, such as ordinals and transfinite induction, noting that much of this material is in fact independent of the axiom of choice.  The material here is standard, but for the purposes of this course one may simply take Zorn’s lemma as a “black box” and not worry about the proof, so this material is optional.