This is the third in a series of posts on the “no self-defeating object” argument in mathematics – a powerful and useful argument based on formalising the observation that any object or structure that is so powerful that it can “defeat” even itself, cannot actually exist.   This argument is used to establish many basic impossibility results in mathematics, such as Gödel’s theorem that it is impossible for any sufficiently sophisticated formal axiom system to prove its own consistency, Turing’s theorem that it is impossible for any sufficiently sophisticated programming language to solve its own halting problem, or Cantor’s theorem that it is impossible for any set to enumerate its own power set (and as a corollary, the natural numbers cannot enumerate the real numbers).

As remarked in the previous posts, many people who encounter these theorems can feel uneasy about their conclusions, and their method of proof; this seems to be particularly the case with regard to Cantor’s result that the reals are uncountable.   In the previous post in this series, I focused on one particular aspect of the standard proofs which one might be uncomfortable with, namely their counterfactual nature, and observed that many of these proofs can be largely (though not completely) converted to non-counterfactual form.  However, this does not fully dispel the sense that the conclusions of these theorems – that the reals are not countable, that the class of all sets is not itself a set, that truth cannot be captured by a predicate, that consistency is not provable, etc. – are highly unintuitive, and even objectionable to “common sense” in some cases.

How can intuition lead one to doubt the conclusions of these mathematical results?  I believe that one reason is because these results are sensitive to the amount of vagueness in one’s mental model of mathematics.  In the formal mathematical world, where every statement is either absolutely true or absolutely false with no middle ground, and all concepts require a precise definition (or at least a precise axiomatisation) before they can be used, then one can rigorously state and prove Cantor’s theorem, Gödel’s theorem, and all the other results mentioned in the previous posts without difficulty.  However, in the vague and fuzzy world of mathematical intuition, in which one’s impression of the truth or falsity of a statement may be influenced by recent mental reference points, definitions are malleable and blurry with no sharp dividing lines between what is and what is not covered by such definitions, and key mathematical objects may be incompletely specified and thus “moving targets” subject to interpretation, then one can argue with some degree of justification that the conclusions of the above results are incorrect; in the vague world, it seems quite plausible that one can always enumerate all the real numbers “that one needs to”, one can always justify the consistency of one’s reasoning system, one can reason using truth as if it were a predicate, and so forth.    The impossibility results only kick in once one tries to clear away the fog of vagueness and nail down all the definitions and mathematical statements precisely.  (To put it another way, the no-self-defeating object argument relies very much on the disconnected, definite, and absolute nature of the boolean truth space $\{\hbox{true},\hbox{ false}\}$ in the rigorous mathematical world.)

One can already see this with one of the most basic conclusions of the “no self-defeating object” argument, namely that the set of natural numbers is infinite.    Let me rephrase this result in the following manner:

Proposition 1. Let $A$ be a set of natural numbers with the following properties:

1. $0$ lies in $A$.
2. Whenever a natural number $n$ lies in $A$, then its successor $n+1$ also lies in $A$.

Then $A$ is infinite.

Indeed, from the principle of mathematical induction, the hypotheses of Proposition 1 force $A$ to be the entire set of natural numbers, and so Proposition 1 is logically equivalent to the assertion that the set of natural numbers is infinite.  (Here, infinite has its usual set theoretic meaning, i.e. the conclusion is that $A$ cannot be placed in bijection with a set of the form $\{1,\ldots,n\}$ for any natural number $n$.)

In the rigorous world of formal mathematics, Proposition 1 is of course uncontroversial, and is easily proven by a simple “no self-defeating object” argument:

Proof. Suppose for contradiction that $A$ was finite.  As $A$ is non-empty (it contains $0$), it must therefore have a largest element $n$ (as can be seen by a routine induction on the cardinality of $A$).  But  then by hypothesis, $n+1$ would also have to lie in $A$, and $n$ would thus need to be at least as large as $n+1$, a contradiction.  $\Box$

But if one allows for vagueness in how one specifies the set $A$, then Proposition 1 can seem to be false, as observed by the ancient Greeks with the sorites paradox.  To use their original example, consider the question of how many grains of sand are required to make a heap of sand.  One or zero grains of sand are certainly not sufficient, but clearly if one places enough grains of sand together, one would make a heap.  If one then “defines” $A$ to be the set of all natural numbers $n$ such that $n$ grains of sand are not sufficient to make a heap,  then it is intuitively plausible that $A$ obeys both Hypothesis 1 and Hypothesis 2 of the above proposition, since it is intuitively clear that adding a single grain of sand to a non-heap cannot convert it to a heap.  On the other hand, it is just as clear that the set $A$ is finite, thus providing a counterexample to Proposition 1.

The problem here is the vagueness inherent in the notion of a “heap”.  Given a pile of sand $P$, the question “Is $P$ a heap?” does not have an absolute truth value; what may seem like a heap to one observer may not be so to another.  Furthermore, what may seem like a heap to one observer when presented in one context, may not look like a heap to the same observer when presented in another context; for instance, a large pile of sand that was slowly accumulated over time may not seem as large as a pile of sand that suddenly appeared in front of the observer, even if both piles of sand were of identical size in absolute terms.  (The well-known (though technically inaccurate) boiling frog metaphor is a particularly graphic way of depicting this phenomenon, which ultimately arises from the fact that people usually do not judge quantities in absolute terms, but instead by using relative measurements that compare that quantity to nearby reference quantities.)

There are many modern variants of the sorites paradox that exploit vagueness of definition to obtain conclusions that apparently contradict rigorous statements such as Proposition 1.  One of them is the interesting number paradox.  Let $A$ be the set of all natural numbers that are “interesting”, in the sense that they have some unusual defining feature (for instance, 561 is the first composite Carmichael number and is thus presumably an interesting number).  Then $A$ is presumably finite, but the first natural number that does not lie in $A$ is presumably also an interesting number, thus contradicting the definition of $A$.  (Here, the variant of Proposition 1 that is apparently being contradicted here is the well-ordering principle.)

Of course, just as the sorites paradox can be diagnosed as arising from the vagueness of the term “heap”, the interesting number paradox can be diagnosed as arising from the vagueness of the term “interesting”.  But one can create a functionally similar paradox that, at first glance, eliminates this vagueness, namely the Berry paradox.  Let $A$ denote the set of all natural numbers that can be defined in fewer than sixteen words in the English language; thus, again 561 can be defined as “The first composite Carmichael number” and thus belongs in $A$.  As there are only finitely many sentences that one can form with fewer than sixteen words of the English language, $A$ is clearly finite.  But the first natural number that does not lie in $A$ can be described as “The first natural number not definable in fewer than sixteen words in the English language”, which is a definition in fewer than sixteen words in the English language, and thus lies in $A$, again providing another seeming contradiction to the well-ordering principle.

Here, the vagueness is a bit harder to spot, and it comes from the use of “the English language”, which is a vague and mutable concept that allows for self-reference and is in fact technically inconsistent if one tries to interpret it naively (as can be seen from any number of linguistic paradoxes, starting with the classic liar paradox).  To make things more precise, one can try to replace the English language here by a formal mathematical language, e.g. the language of true arithmetic.   Let us take one such formal language and call it $L$.  Then one can certainly define the set $A_L$ of all natural numbers that can be definable in fewer than sixteen words in the language $L$, and one can form the natural number $n_L$, defined as “The first natural number not definable in fewer than sixteen words in the language L”.  This is a natural number that is definable in fewer than sixteen words – but not in the language $L$, but rather in a meta-language $L'$ that, among other things, is able to interpret what it means for a sentence in $L$ to define a natural number.  This requires a truth predicate for $L$, and Tarski’s indefinability theorem asserts that such a predicate cannot exist inside $L$ itself.  Indeed, one can interpret Berry’s paradox as providing a proof of Tarski’s theorem. Thus we see that a non-trivial mathematical theorem (in this case, Tarski’s theorem) emerges when one finally clears away all the vagueness from a sorites-type paradox.  (If instead one replaced the word “definable” by “provably definable”, and used a formal axiom system $L$ as one’s language, one could similarly deduce Gödel’s incompleteness theorem from this argument.)

Similar paradoxes come up when trying to enumerate the real numbers, or subsets of real numbers.  Here, the analogue of Proposition 1 is

Proposition 2. Let $A$ be a set of real numbers with the following properties:

1. $A$ is infinite.
2. Given any sequence $x_1, x_2, x_3, \ldots$ of elements of $A$, one can find another element $y$ of $A$ that is not equal to any of the elements of the sequence (i.e. $y \neq x_n$ for every positive integer $n$).

Then $A$ is uncountable.

(Of course, uncountability has the usual set-theoretic definition, namely that the infinite set $A$ is not in one-to-one correspondence with the natural numbers.)

Again, in the rigorous world in which all terms are clearly defined, this proposition is easily proven:

Proof. Suppose for sake of contradiction that $A$ was countable.  Then, being infinite, it could be enumerated by a sequence $x_1, x_2, x_3, \ldots$ of real numbers in $A$; but then by hypothesis #2, there would then be another real number $y$ that was also in $A$, but distinct from all of the elements of the sequence, contradicting the fact that this was an enumeration. $\Box$

Since Cantor’s diagonal argument demonstrates that the set of reals ${\bf R}$ itself obeys hypothesis #2 (and also clearly obeys hypothesis #1), we conclude as a corollary that the reals are uncountable.

However, one can find apparent counterexamples to Proposition 2 if one deploys enough vagueness.  For instance, in the spirit of the Berry paradox, one could try setting $A$ to be the set of all “definable” real numbers – those numbers that can be defined using a finite sentence in the English language.  As there are only countably many such sentences, $A$ would be countable; it is also clearly infinite.  But, on the other hand, one could take any sequence of real numbers $x_1, x_2, x_3, \ldots$ in $A$ and apply the diagonal argument to define another real number that does not lie in that sequence.

Again, this apparent contradiction becomes clarified if one removes the vagueness, by replacing “the English language” with a formal language $L$.  For instance, one could take $L$ to be true arithmetic, and $A_L$ to be the set of all real numbers that are definable by a sentence in $L$.  Then $A_L$ is indeed countable, but any enumeration of $A_L$ requires the truth predicate for $L$ and thus such an enumeration (as well as the Cantor diagonalisation thereof) cannot be defined in $L$, but only in a meta-language $L'$ – thus obtaining yet another proof of Tarski’s indefinability theorem.  Or, one could take $L$ to be a programming language, such as the language of Turing machines; then again the set $A_L$ of real numbers whose decimal expansion can be given as the output of an algorithm in $L$ is countable, but to enumerate it one would have to solve the halting problem for $L$, which requires a meta-language $L'$ that is distinct from $L$ itself; thus in this case the diagonalisation argument gives a proof of Turing’s halting theorem.

It is also instructive to contrast the formal distinction between countable and uncountable sets with the sorites paradox distinction between a non-heap and a heap of sand.  Intuitively, the act of adding one grain of sand to a non-heap cannot directly turn that non-heap into a heap, and yet Proposition 1 forces this to eventually be the case.  Similarly, it is intuitively clear that act of producing a single real number not on one’s countable enumeration of a set $A$ does not directly convert a countable set to become uncountable, and yet Proposition 2 forces this to eventually be the case.    (More formally, Proposition 1 is powered by the principle of mathematical induction, which allows one to iterate the $n \to n+1$ operation indefinitely (or more precisely, up to the first infinite ordinal $\omega$) to explicitly achieve infinite cardinality; in a similar spirit, one can interpret Proposition 2 as being powered by transfinite induction, in the sense that one can iterate the diagonalisation operation indefinitely (or more precisely, up to the first uncountable ordinal $\omega_1$) to explicitly achieve uncountable cardinality.  The transition from countability to uncountability does not occur during the successor ordinal steps of this transfinite induction, but during the (final) limit ordinal step.)

One can perform a similar analysis for the other results discussed previously in this series.  For instance, Russell’s paradox tells us, among other things, that (assuming the standard Zermelo-Frankel axioms of set theory) the class of all sets cannot itself be a set.  Actually we have the following slightly more general statement, analogous to Proposition 1 or Proposition 2:

Proposition 3. Let ${\mathcal C}$ be a class of sets with the following properties:

1. The empty set $\emptyset$ belongs to ${\mathcal C}$.
2. If a set $A$ belongs to ${\mathcal C}$, then the singleton set $\{A\}$ also belongs to ${\mathcal C}$.
3. If any collection $\{ A_\beta: \beta \in B\}$ of sets $A_\beta$, indexed by another set $B$, is such that each $A_\beta$ lies in ${\mathcal C}$, then their union $\bigcup_{\beta \in B} A_\beta$ also lies in ${\mathcal C}$.

Then ${\mathcal C}$ is not a set.

(Hypothesis #1 is actually redundant, being implied by the trivial case of Hypothesis #3 when $B$ is empty, but we keep it in order to emphasise the similarity with Propositions 1 and 2.)  The Zermelo-Frankel axioms (as well as one’s intuition from naive set theory) tell us that the class of all sets obeys Hypotheses #1, #2, #3, and so cannot be a set.  (In fact, the above proposition is essentially equivalent to the slightly stronger assertion that the class of all well-founded sets, also known as the von Neumann universe $V$, is not a set.)  A key subtlety is that the set $\{ A_\beta: \beta \in B\}$ itself is not required to lie in ${\mathcal C}$.  If one imposes such a restriction, then ${\mathcal C}$ can become a set (and by adding a few additional axioms, such sets become the same concept as Grothendieck universes).

Proof. Suppose for contradiction that ${\mathcal C}$ is a set.  Using the axiom (schema) of specification, the set $B := \{ A \in {\mathcal C}: A \not \in A \}$ is then a set.  It is the union of all the singleton sets $\{A\}$ with $A \in B$, so by Hypotheses #2 and #3, $B \in {\mathcal C}$.  But then we see that $B \in B$ if and only if $B \not \in B$, which is absurd.  $\Box$

Again, the hypotheses in this proposition seem innocuous, much like adding a single grain of sand to a non-heap of sand; but if one iterates them indefinitely then one eventually ends up with a class so large that it is no longer a set.  (Here, the transfinite induction is not up to the first infinite ordinal or the first uncountable ordinal, but rather across the class of all ordinals; the point then being that the class of all ordinals is itself not a set.)  Naive set theory intuition seems to be in contradiction to Proposition 3, but this is only due to the vagueness inherent in that concept of set theory.   One can also try analysing Berry-type paradoxes in this setting, for instance working only with “constructible” sets (i.e. elements of Gödel’s universe $L$); the consequence one gets from this is that the class of constructible sets  is not itself constructible (in fact, it is not even a set, as it contains all the ordinals).

Proposition 3 may seem only of interest to set theorists and logicians, but if one makes a tiny modification of it, by replacing a class of sets with a partially ordered class, then one gets a very important result:

Lemma 4. (Zorn’s lemma)  Let ${\mathcal C}$ be a partially ordered class which obeys the following properties:

1. At least one element belongs to ${\mathcal C}$.
2. If $x$ is an element of ${\mathcal C}$, then there is another element $y$ of ${\mathcal C}$ such that $y > x$.
3. If $(x_\beta)_{\beta \in B}$ is a totally ordered set in ${\mathcal C}$, indexed by another set $B$, then there exists an element $y \in {\mathcal C}$ such that $y \geq x_\beta$ for all $\beta \in B$.

Then ${\mathcal C}$ is not a set.

This lemma (which, of course, requires the axiom of choice to prove, and is in fact equivalent to this axiom) is usually phrased in the contrapositive form: any non-empty set ${\mathcal C}$ for which every totally ordered set has an upper bound, has a maximal element.  However when phrased in the above form, we see the close similarity between Zorn’s lemma and Propositions 1, 2, and 3.  In this form, it can be used to demonstrate that many standard classes (e.g. the class of vector spaces, the class of groups, the class of ordinals, etc.) are not sets, despite the fact that each of the hypotheses in the lemma do not directly seem to take one from being a set to not being a set.  This is only an apparent contradiction if one’s notion of sets is vague enough to accommodate sorites-type paradoxes.

More generally, many of the objects demonstrated to be impossible in the previous posts in this series can appear possible as long as there is enough vagueness.  For instance, one can certainly imagine an omnipotent being provided that there is enough vagueness in the concept of what “omnipotence” means; but if one tries to nail this concept down precisely, one gets hit by the omnipotence paradox.  Similarly, one can imagine a foolproof strategy for beating the stock market (or some other zero sum game), as long as the strategy is vague enough that one cannot analyse what happens when that strategy ends up being used against itself.  Or, one can imagine the possibility of time travel as long as it is left vague what would happen if one tried to trigger the grandfather paradox.  And so forth.  The “self-defeating” aspect of these impossibility results relies heavily on precision and definiteness, which is why they can seem so strange from the perspective of vague intuition.