I recently reposted my favourite logic puzzle, namely the blue-eyed islander puzzle. I am fond of this puzzle because in order to properly understand the correct solution (and to properly understand why the alternative solution is incorrect), one has to think very clearly (but unintuitively) about the nature of knowledge.
There is however an additional subtlety to the puzzle that was pointed out in comments, in that the correct solution to the puzzle has two components, a (necessary) upper bound and a (possible) lower bound (I’ll explain this further below the fold, in order to avoid blatantly spoiling the puzzle here). Only the upper bound is correctly explained in the puzzle (and even then, there are some slight inaccuracies, as will be discussed below). The lower bound, however, is substantially more difficult to establish, in part because the bound is merely possible and not necessary. Ultimately, this is because to demonstrate the upper bound, one merely has to show that a certain statement is logically deducible from an islander’s state of knowledge, which can be done by presenting an appropriate chain of logical deductions. But to demonstrate the lower bound, one needs to show that certain statements are not logically deducible from an islander’s state of knowledge, which is much harder, as one has to rule out all possible chains of deductive reasoning from arriving at this particular conclusion. In fact, to rigorously establish such impossiblity statements, one ends up having to leave the “syntactic” side of logic (deductive reasoning), and move instead to the dual “semantic” side of logic (creation of models). As we shall see, semantics requires substantially more mathematical setup than syntax, and the demonstration of the lower bound will therefore be much lengthier than that of the upper bound.
To complicate things further, the particular logic that is used in the blue-eyed islander puzzle is not the same as the logics that are commonly used in mathematics, namely propositional logic and first-order logic. Because the logical reasoning here depends so crucially on the concept of knowledge, one must work instead with an epistemic logic (or more precisely, an epistemic modal logic) which can properly work with, and model, the knowledge of various agents. To add even more complication, the role of time is also important (an islander may not know a certain fact on one day, but learn it on the next day), so one also needs to incorporate the language of temporal logic in order to fully model the situation. This makes both the syntax and semantics of the logic quite intricate; to see this, one only needs to contemplate the task of programming a computer with enough epistemic and temporal deductive reasoning powers that it would be able to solve the islander puzzle (or even a smaller version thereof, say with just three or four islanders) without being deliberately “fed” the solution. (The fact, therefore, that humans can grasp the correct solution without any formal logical training is therefore quite remarkable.)
As difficult as the syntax of temporal epistemic modal logic is, though, the semantics is more intricate still. For instance, it turns out that in order to completely model the epistemic state of a finite number of agents (such as 1000 islanders), one requires an infinite model, due to the existence of arbitrarily long nested chains of knowledge (e.g. “ knows that knows that knows that has blue eyes”), which cannot be automatically reduced to shorter chains of knowledge. Furthermore, because each agent has only an incomplete knowledge of the world, one must take into account multiple hypothetical worlds, which differ from the real world but which are considered to be possible worlds by one or more agents, thus introducing modality into the logic. More subtly, one must also consider worlds which each agent knows to be impossible, but are not commonly known to be impossible, so that (for instance) one agent is willing to admit the possibility that another agent considers that world to be possible; it is the consideration of such worlds which is crucial to the resolution of the blue-eyed islander puzzle. And this is even before one adds the temporal aspect (e.g. “On Tuesday, knows that on Monday, knew that by Wednesday, will know that has blue eyes”).
Despite all this fearsome complexity, it is still possible to set up both the syntax and semantics of temporal epistemic modal logic in such a way that one can formulate the blue-eyed islander problem rigorously, and in such a way that one has both an upper and a lower bound in the solution. The purpose of this post is to construct such a setup and to explain the lower bound in particular. The same logic is also useful for analysing another well-known paradox, the unexpected hanging paradox, and I will do so at the end of the post. Note though that there is more than one way to set up epistemic logics, and they are not all equivalent to each other.
(On the other hand, for puzzles such as the islander puzzle in which there are only a finite number of atomic propositions and no free variables, one at least can avoid the need to admit predicate logic, in which one has to discuss quantifiers such as and . A fully formed predicate temporal epistemic modal logic would indeed be of terrifying complexity.)
Our approach here will be a little different from the approach commonly found in the epistemic logic literature, in which one jumps straight to “arbitrary-order epistemic logic” in which arbitrarily long nested chains of knowledge (“ knows that knows that knows that \ldots”) are allowed. Instead, we will adopt a hierarchical approach, recursively defining for a “-order epistemic logic” in which knowledge chains of depth up to , but no greater, are permitted. The arbitrarily order epistemic logic is then obtained as a limit (a direct limit on the syntactic side, and an inverse limit on the semantic side, which is dual to the syntactic side) of the finite order epistemic logics.
I should warn that this is going to be a rather formal and mathematical post. Readers who simply want to know the answer to the islander puzzle would probably be better off reading the discussion at the puzzle’s own blog post instead.
— 1. Zeroth-order logic —
Before we plunge into the full complexity of epistemic logic (or temporal epistemic logic), let us first discuss formal logic in general, and then focus on a particularly simple example of a logic, namely zeroth order logic (better known as propositional logic). This logic will end up forming the foundation for a hierarchy of epistemic logics, which will be needed to model such logic puzzles as the blue-eyed islander puzzle.
Informally, a logic consists of three inter-related components:
- A language. This describes the type of sentences the logic is able to discuss.
- A syntax. This describes the rules by which the logic can deduce conclusions (from given hypotheses).
- A semantics. This describes the sentences which the logic interprets to be true (in given models).
A little more formally:
- A language is a set of sentences, which are certain strings of symbols from a fixed alphabet, that are generated by some rules of grammar.
- A syntax is a collection of inference rules for generating deductions of the form (which we read as “From , we can deduce ” or “ is a consequence of “), where and are sentences in (or sets of sentences in ).
- A semantics describes what a model (or interpretation, or structure, or world) of the logic is, and defines what it means for a sentence in (or a collection of sentences) to be true in such a model (which we write as , and we read as “ models “, “ obeys “, or “ is true in “).
We will abuse notation a little bit and use the language as a metonym for the entire logic; strictly speaking, the logic should be a tuple consisting of the language, syntax, and semantics, but this leads to very unwieldy notation.
The syntax and semantics are dual to each other in many ways; for instance, the syntax can be used to show that certain statements can be proved, while the semantics can be used to show that certain statements cannot be proved. This distinction will be particularly important in the blue-eyed islander puzzle; in order to show that all blue-eyed islanders commit suicide by the 100th day, one can argue purely on syntactical grounds; but to show that it is possible for the blue-eyed islanders to not commit suicide on the 99th day or any preceding day, one must instead use semantic methods.
To illustrate the interplay between language, syntax, and semantics, we begin with the simple example of propositional logic. To describe this logic, one must first begin with some collection of atomic propositions. For instance, on an island with three islanders , one could consider the propositional logic generated by three atomic propositions , where each is intended to model the statement that has blue eyes.
One can have either a finite or an infinite set of atomic propositions. In this discussion, it will suffice to consider the situation in which there are only finitely many atomic propositions, but one can certainly also study logics with infinitely many such propositions.
The language would then consist of all the sentences that can be formed from the atomic propositions using the usual logical connectives (, , , , , , etc.) and parentheses, according to the usual rules of logical grammar (which consists of rules such as “If and are sentences in , then is also a sentence in “). For instance, if are atomic propositions, then
would be an example of a sentence in . On the other hand,
is not a sentence in , despite being a juxtaposition of atomic propositions, connectives, and parentheses, because it is not built up from rules of grammar.
One could certainly write down a finite list of all the rules of grammar for propositional calculus (as is done for instance at this Wikipedia page), but we will not do so here in order not to disrupt the flow of discussion.
It is customary to abuse notation slightly and omit parentheses when they are redundant (or when there is enough associativity present that the precise placement of parentheses are not relevant). For instance, could be abbreviated as . We will adopt this type of convention in order to keep the exposition as uncluttered as possible.
Now we turn to the syntax of propositional logic. This syntax is generated by basic rules of deductive logic, such as modus ponens
or the law of the excluded middle
and completed by transitivity (if and , then ), monotonicity (), and concatenation (if and then ). (Here we adopt the usual convention of representing a set of sentences without using the usual curly braces, instead relying purely on the comma separator.) Another convenient inference rule to place in this logic is the deduction theorem: if , then one can infer . (In propositional logic (or predicate logic), this rule is redundant (hence the designation of this rule as a theorem), but for the epistemic logics below, it will be convenient to make deduction an explicit inference rule, as it simplifies the other inference rules one will have to add to the system.)
A typical deduction that comes from this syntax is
which using the blue-eyed islander interpretation, is the formalisation of the assertion that given that at least one of the islanders has blue eyes, and that do not have blue eyes, one can deduce that has blue eyes.
As with the laws of grammar, one can certainly write down a finite list of inference rules in propositional calculus; one such list for instance is given at this Wikipedia page. Note though that, much as a given vector space has more than one set of generators, there is more than one possible list of inference rules for propositional calculus, due to some rules being equivalent to, or at least deducible from, other rules; the precise choice of basic inference rules is to some extent a matter of personal taste and will not be terribly relevant for the current discussion.
Finally, we discuss the semantics of propositional logic. For this particular logic, the models are described by truth assignments, that assign a truth value to each atomic statement . Once a truth value to each atomic statement is assigned, the truth value of any other sentence in the propositional logic generated by these atomic statements can then be interpreted using the usual truth tables. For instance, returning to the islander example, consider a model in which is true, but and are false; informally, describes a hypothetical world in which has blue eyes but and do not have blue eyes. Then the sentence is true in ,
but the statement is false in ,
If is a set of sentences, we say that models if models each sentence in . Thus for instance, if we continue the preceding example, then
Note that if there are only finitely many atomic statements , then there are only finitely many distinct models of the resulting propositional logic; in fact, there are exactly such models, one for each truth assignment. We will denote the space of all possible models of a language as .
If one likes, one can designate one of these models to be the “real” world so that all the other models become purely hypothetical worlds. In the setting of propositional logic, the hypothetical worlds then have no direct bearing on the real world; the fact that a sentence is true or false in a hypothetical world does not say anything about what sentences are true or false in . However, when we turn to epistemic logics later in this post, we will see that hypothetical worlds will play an important role in the real world, because such worlds may be considered to be possible worlds by one or more agents (or, an agent may consider it possible that another agent considers the world to be possible, and so forth.).
The syntatical and semantic sides of propositional logic are tied together by two fundamental facts:
Theorem 1 (Soundness and completeness) Let be a propositional logic, and let be a set of sentences in , and let be another sentence in .
- (Soundness) If , then every model which obeys , also obeys (i.e. implies ).
- (Completeness) If every model that obeys , also obeys , then .
Soundness is easy to prove; one merely needs to verify that each of the inference rules in one’s syntax is valid, in that models that obey , automatically obey . This boils down to some tedious inspection of truth tables. (The soundness of the deduction theorem is a little trickier to prove, but one can achieve this by an induction on the number of times this theorem is invoked in a given induction.) Completeness is a bit more difficult to establish; this claim is in fact a special case of the Gödel completeness theorem, and is discussed in this previous blog post; we also sketch a proof of completeness below.
By taking the contrapositive of soundness, we have the following important corollary: if we can find a model which obeys but does not obey , then it must not be possible to deduce as a logical consequence of : . Thus, we can use semantics to demonstrate limitations in syntax.
For instance, consider a truth assignment in which is true but is false. Then , but . This demonstrates that
thus an implication such as does not entail its converse .
A theory (or more precisely, a deductive theory) in a logic , is a set of sentences in which is closed under deductive consequence, thus if for some sentence in , then . Given a theory , one can associate the set
of all possible worlds (or models) in which that theory is true; conversely, given a set of such models, one can form the theory
of sentences which are true in all models in . If the logic is both sound and complete, these operations invert each other: given any theory , we have , and given any set of models, is a theory and . Thus there is a one-to-one correspondence between theories and sets of possible worlds in a sound complete language .
For instance, in our running example, if is the theory generated by the three statements , , and , then consists of precisely two worlds; one in which are true and is false, and one in which is true and are false. Since neither of or are true in both worlds in , neither of or lie in . Thus, it is not possible to deduce either of or from the hypotheses , , and . More informally, if one knows that there is at least one blue-eyed islander, that has blue eyes, and does not have blue eyes, this is not enough information to determine whether has blue eyes or not.
One can use theories to prove the completeness theorem. Roughly speaking, one can argue by taking the contrapositive. Suppose that , then we can find a theory which contains all sentences in , but does not contain . In this finite setting, we can easily pass to a maximal such theory (with respect to set inclusion); one then easily verifies that this theory is complete in the sense that for any given sentence , exactly one of and is true. From this complete theory one can then directly build a model which obeys but does not obey , giving the desired claim.
— 2. First-order epistemic logic —
Having reviewed propositional logic (which we will view as the zeroth-order iteration of epistemic logic), we now turn to the first non-trivial example of epistemic logic, which we shall call first-order epistemic logic (which should not be confused with the more familiar first-order predicate logic). Roughly speaking, first-order epistemic logic is like zeroth-order logic, except that there are now also some knowledge agents that are able to know certain facts in zeroth-order logic (e.g. an islander may know that the islander has blue eyes). However, in this logic one cannot yet express higher-order facts (e.g. we will not yet be able to formulate a sentence to the effect that knows that knows that has blue eyes). This will require a second-order or higher epistemic logic, which we will discuss later in this post.
Let us now formally construct this logic. As with zeroth-order logic, we will need a certain set of atomic propositions, which for simplicity we will assume to be a finite set . This already gives the zeroth order language of sentences that one can form from the by the rules of propositional grammar. For instance,
is a sentence in . The zeroth-order logic also comes with a notion of inference and a notion of modeling , which we now subscript by in order to distinguish it from the first-order notions of inference and modeling which we will define shortly. Thus, for instance
and if is a truth assignment for for which are all true, then
We will also assume the existence of a finite number of knowledge agents , each of which are capable of knowing sentences in the zeroth order language . (In the case of the islander puzzle, and ignoring for now the time aspect of the puzzle, each islander generates one knowledge agent , representing the state of knowledge of at a fixed point in time. Later on, when we add in the temporal aspect to the puzzle, we will need different knowledge agents for a single islander at different points in time, but let us ignore this issue for now.) To formalise this, we define the first-order language to be the language generated from and the rules of propositional grammar by imposing one additional rule:
- If is a sentence in , and is a knowledge agent, then is a sentence in (which can informally be read as “ knows (or believes) to be true”).
Thus, for instance,
is a sentence in ; in the islander interpretation, this sentence denotes the assertion that knows to have blue eyes, and knows that at least one islander has blue eyes, but does not have blue eyes. On the other hand,
is not a sentence in , because is not a sentence in . (However, we will be able to interpret in the second-order epistemic language that we will define later.)
Similarly, if are sentences in such that , then one automatically has .
However, we would like to add some additional inference rules to reflect our understanding of what “knowledge” means. One has some choice in deciding what rules to lay down here, but we will only add one rule, which informally reflects the assertion that “all knowledge agents are highly logical”:
- First-order epistemic inference rule: If are sentences such that
and is a knowledge agent, then
(We will introduce higher order epistemic inference rules when we turn to higher order epistemic logics.)
Informally speaking, the epistemic inference rule asserts that if can be deduced from , and knows to be true, then must also know to be true. For instance, since modus ponens gives us the inference
we therefore have, by the first-order epistemic inference rule,
(note how this is different from (1) – why?).
Another example of more relevance to the islander puzzle, we have
and thus, by the first-order epistemic inference rule,
In the islander interpretation, this asserts that if knows that one of the three islanders has blue eyes, but also knows that and do not have blue eyes, then must also know that he himself (or she herself) has blue eyes.
One particular consequence of the first-order epistemic inference rule is that if a sentence is a tautology in – true in every model of , or equivalently (by completeness) deducible from the inference rules of , and is a knowledge agent, then is a tautology in : implies . Thus, for instance, we have , because is a tautology in (thus ).
It is important to note, however, that if a statement is not a tautology, but merely true in the “real” world , this does not imply that is also true in the real world: as we shall see later, does not imply . (We will define what means presently.) Intuitively, this reflects the obvious fact that knowledge agents need not be omniscient; it is possible for a sentence to be true without a given agent being aware of this truth.
In the converse direction, we also allow for the possibility that is true in the real world, without being true in the real world, thus it is conceivable that is true but is false. This reflects the fact that a knowledge agent may in fact have incorrect knowledge of the real world. (This turns out not to be an important issue in the islander puzzle, but is of relevance for the unexpected hanging puzzle.)
In a related spirit, we also allow for the possibility that and may both be true in the real world; an agent may conceivably be able to know inconsistent facts. However, from the inference of ex falso quodlibet and the first-order epistemic inference rule, this would mean that is true in this world for every in , thus this knowledge agent believes absolutely every statement to be true. Again, such inconsistencies are not of major relevance to the islander puzzle, but as we shall see, their analysis is important for resolving the unexpected hanging puzzle correctly.
Remark 1 It is perhaps worth re-emphasising the previous points. In some interpretations of knowledge, means that has somehow been “justified” to be true, and in particular should entail in such interpretations. However, we are taking a more general (and abstract) point of view, in which we are agnostic as regards to whether represents necessary or justified knowledge. In particular, our analysis also applies to “generalised knowledge” operators, such as “belief”. One can of course specialise this general framework to a more specific knowledge concept by adding more axioms, in which case one can obtain sharper conclusions regarding the resolution of various paradoxes, but we will work here primarily in the general setting.
Having discussed the language and syntax of the first-order epistemic logic , we now turn to the semantics, in which we describe the possible models of . As is an extension of , any model of must contain as a component a model of , which describes the truth assignment of each of the atomic propositions of ; but it must also describe the state of knowledge of each of the agents in this logic. One can describe this state in two equivalent ways; either as a theory (in ) of all the sentences in that knows to be true (which, by the first-order epistemic inference rule, is closed under and is thus indeed a theory in ); or equivalently (by the soundness and completeness of ), as a set
of all the possible models of in which all the statements that knows to be true, are in fact true. We will adopt the latter perspective; thus a model of consists of a tuple
where is a model of , and for each , is a set of models of . To interpret sentences in , we then declare iff for each atomic sentence , and declare iff is true in every model in , for each and . All other sentences in are then interpreted by applying the usual truth tables.
As an example of such a model, consider a world with three islanders , each of which has blue eyes, and can each see that each other has blue eyes, but are each unaware of their own eye colour. In this model, assigns a true value to each of . As for , which describes the knowledge state of , this set consists of two possible -worlds. One is the “true” -world , in which are all true; but there is also an additional hypothetical -world , in which is true but is false. With ‘s current state of knowledge, neither of these two possibilities can be ruled out. Similarly, and will also consist of two -worlds, one of which is the “true” -world , and the other is not.
In this particular case, the true -world is included as a possible world in each of the knowledge agents’ set of possible worlds , but in situations in which the agents knowledge is incorrect or inconsistent, it can be possible for to not be an element of one or more of the .
Remark 2 One can view an model as consisting of the “real world” – the -model – together with clouds , of “hypothetical worlds”, one for each knowledge agent . If one chooses, one can “enter the head” of any one of these knowledge agents to see what he or she is thinking. One can then select any one of the -worlds in as a “possible world” in ‘s worldview, and explore that world further. Later on we will iterate this process, giving a tree-like structure to the higher order epistemic models.
Let be the set of all models of . This is quite a large set; if there are atomic statements and knowledge agents , then there are possibilities for the -world , and each knowledge agent has its own independent set of possible worlds, of which there are different possibilities, leading to distinct models for in all. For instance, with three islanders wondering about eye colours, this leads to possibilities (although, once everyone learns each other’s eye colour, the number of possible models goes down quite significantly).
It can be shown (but is somewhat tedious to do so) that the syntax and semantics of the first-order epistemic logic is still sound and complete, basically by mimicking (and using) the proof of the soundness and completeness of ; we sketch a proof of this below when we discuss higher order logics.
— 3. Higher-order epistemic logic —
We can iterate the above procedure and construct a language, syntax, and semantics for order epistemic logic generated by some atomic propositions and knowledge agents , recursively in terms of the preceding epistemic logic . More precisely, let be a natural number, and suppose that the logic has already been defined. We then define the language of as the extension of generated by the laws of propositional grammar and the following rule:
- If is a sentence in , and is a knowledge agent, then is a sentence in .
Thus, for instance, in the running example of three propositions and three knowledge agents ,
is a sentence in (and hence in , , etc.) but not in .
As for the syntax, we adopt all the inference rules of ordinary propositional logic, together with one new rule:
- -order epistemic inference rule: If are sentences such that
and is a knowledge agent, then
Thus, for instance, starting with
and so forth. Informally, this rule asserts that all agents are highly logical, that they know that all agents are highly logical, and so forth. A typical deduction from these inference rules, which is again of relevance to the islander puzzle, is
Remark 3 This is a very minimal epistemic syntax, and is weaker than some epistemic logics considered in the literature. For instance, we do not have any version of the positive introspection rule
thus we allow the possibility that an agent knows “subconsciously”, in that the agent knows but does not know that he or she knows . Similarly, we do not have any version of the negative introspection rule
so we allow the possibility that an agent is “unaware of his or her own ignorance”. One can of course add these additional rules ex post facto and see how this strengthens the syntax and limits the semantics, but we will not need to do so here.
There is also no reason to expect the knowledge operators to commute:
Now we turn to the semantics. A model of the language consists of a -model , together with sets of possible -models associated to their respective knowledge agents . To describe how models sentences, we declare iff , and for any sentence in and , we declare iff one has for every .
Example 1 We consider an islander model with atomic propositions (with each representing the claim that has blue eyes) and knowledge agents (with representing the knowledge state of at a fixed point in time). There are -models , determined by the truth values they assign to the atomic propositions . For each , we can then recursively associate a -model to each -model , by setting , and then for , setting to be the -model with -model , and with consisting of the pair , where (resp. ) is the -model which is identical to except that the truth value of is set to false (resp. true). Informally, models the -order epistemology of the -world , in which each islander sees each other’s eye colour (and knows that each other islander can see all other islander’s eye colour, and so forth for iterations), but is unsure as to his or her own eye colour (which is why the set of ‘s possible -worlds branches into two possibilities). As one recursively explores the clouds of hypothetical worlds in these models, one can move further and further away from the “real” world. Consider for instance the situation when and (thus in the “real” world, all three islanders have blue eyes), and . From the perspective of , it is possible that one is in the world , in which does not have blue eyes: . In that world, we can then pass to the perspective of , and then one could be in the world , in which neither nor have blue eyes: . Finally, inside this doubly nested hypothetical world, one can consider the perspective of , in which one could be in the world , in which none of have blue eyes: . This is the total opposite of the “real” model , but cannot be ruled out in at this triply nested level. In particular, we have
despite the fact that
for all . (In particular, the statement , which asserts “at least one islander has blue eyes”, is not common knowledge in .
We have the basic soundness and completeness properties:
Proposition 2 For each , is both sound and complete.
Proof: (Sketch) This is done by induction on . For , this is just the soundness and completeness of propositional logic. Now suppose inductively that and the claim has already been proven for . Soundness can be verified as in the propositional logic case (with the validity of the epistemic inference rule being justified by induction). For completeness, one again uses the trick of passing to a maximal -theory that contains one set of sentences in , but not another sentence . This maximal -theory uniquely determines an -model by inspecting whether each or its negation lies in the theory, and also determines -theories for each . By induction hypothesis, each of these theories can be identified with a collection of -models, thus creating a -model that obeys but not , giving (the contrapositive of) completeness.
— 4. Arbitrary order epistemic logic —
An easy induction shows that the order logic extends the previous logic , in the sense that every sentence in is a sentence in , every deduction on is also a deduction in , and every model of projects down (by “forgetting” some aspects of the model) to a model of . We can then form a limiting logic , whose language is the union of all the (thus, is a sentence in iff lies in for some ), whose deductive implications are the union of all the deductive implications (thus, if we have for some ), and whose models are the inverse limits of the models (thus, a model of is an infinite sequence of models of for each , such that each projects down to for . It is not difficult to see that the soundness and completeness of each of the implies the soundness and completeness of the limit (assuming the axiom of choice, of course, in our metamathematics).
The logic now allows one to talk about arbitrarily deeply nested strings of knowledge: if is a sentence in , and is a knowledge agent, then is also a sentence in . This allows for the following definition:
Definition 3 (Common knowledge) If is a sentence in , then is the set of all sentences of the form
where and are knowledge agents (possibly with repetition).
Thus, for instance, using the epistemic inference rules, every tautology in is commonly known as such: if , then .
Let us now work in the islander model in which there are atomic propositions and knowledge agents . To model the statement that “it is commonly known that each islander knows each other islander’s eye colour”, one can use the sets of sentences
for all distinct .
For any , let denote the sentence that there are at least blue-eyed islanders; this can be encoded as a suitable finite combination of the . For instance, can be expressed by any tautology, can be expressed by , can be expressed by , and intermediate can be expressed by more complicated formulae. Let denote the statement that there are exactly blue-eyed islanders; for instance, if , then can be expressed as
The following theorem asserts, roughly speaking, that if there are blue-eyed islanders, and it is commonly known that there are at least blue-eyed islanders, then all blue-eyed islanders can deduce their own eye colour if , but not otherwise.
(informally, asserts that all blue-eyed islanders know their own eye colour).
- If , then
- If , then
Proof: The first part of the theorem can be established informally as follows: if holds, then each blue-eyed islander sees other blue-eyed islanders, but also knows that there are at least blue-eyed islanders. If , this forces each blue-eyed islander to conclude that his or her own eyes are blue (and in fact if , the blue-eyed islander’s knowledge is now inconsistent, but the conclusion is still valid thanks to ex falso quodlibet). It is a routine matter to formalise this argument using the axioms (2), (3) and the epistemic inference rule; we leave the details as an exercise.
To prove the second part, it suffices (by soundness) to construct a -model which satisfies , , and but not . By definition of an -model, it thus suffices to construct, for all sufficiently large natural numbers , an -model which satisfies , , and , but not , and which are consistent with each other in the sense that each is the restriction of to .
We can do this by a modification of the construction in Example 1. For any -model , we can recursively define an -model for any by setting , and then for each , setting to be the -model with -model , and with possible worlds given by
this is the same construction as in Example 1, except that at all levels of the recursive construction, we restrict attention to worlds that obey . A routine induction shows that the determine a limit , which is an model that obeys and . If , then clearly as well. But if , then we see that , because for any index with , we see that if , then contains worlds in which is false, and so for any .
— 5. Temporal epistemic logic —
The epistemic logic discussed above is sufficiently powerful to model the knowledge environment of the islanders in the blue-eyed islander puzzle at a single instant in time, but in order to fully model the islander puzzle, we now must now incorporate the role of time. To avoid confusion, I feel that this is best accomplished by adopting a “spacetime” perspective, in which time is treated as another coordinate rather than having any particularly privileged role in the theory, and the model incorporates all time-slices of the system at once. In particular, if we allow the time parameter to vary along some set of times, then each actor in the model should now generate not just a single knowledge agent , but instead a family of knowledge agents, one for each time . Informally, should then model the assertion that “ knows at time “. This of course leads to many more knowledge agents than before; if for instance one considers an islander puzzle with islanders over distinct points in time, this would lead to distinct knowledge agents . And if the set of times is countably or uncountably infinite, then the number of knowledge agents would similarly be countably or uncountably infinite. Nevertheless, there is no difficulty extending the previous epistemic logics and to cover this situation. In particular we still have a complete and sound logical framework to work in.
Note that if we do so, we allow for the ability to nest knowledge operators at different times in the past or future. For instance, if we have three times , one could form a sentence such as
which informally asserts that at time , knows that already knew to be true by time , or
which informally asserts that at time , knows that will know to be true by time . The ability to know certain statements about the future is not too relevant for the blue-eyed islander puzzle, but is a crucial point in the unexpected hanging paradox.
Of course, with so many knowledge agents present, the models become more complicated; a model of now must contain inside it clouds of possible worlds for each actor and each time .
for any , any , and any . (This axiom is important for the blue-eyed islander puzzle, though it turns out not to be relevant for the unexpected hanging paradox.)
We can also define a notion of common knowledge at a single time : given a sentence , we let denote the set of sentences of the form
where and . This is a subset of , which is the set of all sentences of the form
where can vary arbitrarily in .
— 6. The blue-eyed islander puzzle —
Now we can model the blue-eyed islander puzzle. To simplify things a bit, we will work with a discrete set of times indexed by the integers, with being the day in which the foreigner speaks, and any other time being the time days after (or before, if is negative) the foreigner speaks. (One can also work with a continuous time with only minor changes.) Note the presence of negative time; this is to help resolve the question (which often comes up in discussion of this puzzle) as to whether the islanders would already have committed suicide even before the foreigner speaks.
Also, the way the problem is set up, we have the somewhat notationally annoying difficulty that once an islander commits suicide, it becomes meaningless to ask whether that islander continues to know anything or not. To resolve this problem, we will take the liberty of modifying the problem by replacing “suicide” with a non-lethal public ritual. (This means (thanks to (4)) that once an islander learns his or her own eye colour, he or she will be condemned to repeating this ritual “suicide” every day from that point.) It is possible to create a logic which tracks when different agents are alive or dead and to thus model the concept of suicide, but this is something of a distraction from the key point of the puzzle, so we will simply redefine away this issue.
For similar reasons, we will not concern ourselves with eye colours other than blue, and only consider suicides stemming from blue eyes, rather than from any non-blue colour. (It is intuitively obvious, and can eventually be proven, that the foreigner’s statement about the existence of blue-eyed islanders is insufficient information to allow any islander to distinguish between, say, green eyes and brown eyes, and so this statement cannot trigger the suicide of any non-blue-eyed person.)
As in previous sections, our logic will have the atomic propositions , with each expressing the statement that has blue eyes, as well as knowledge agents for each and . However, we will also need further atomic propositions for and , which denote the proposition that commits suicide (or a ritual equivalent) at time . Thus we now have a countably infinite number of atomic propositions and a countably infinite number of knowledge agents, but there is little difficulty extending the logics and to cover this setting.
We can now set up the various axioms for the puzzle. The “highly logical” axiom has already been subsumed in the epistemological inference rule. We also impose the memory axiom (4). Now we formalise the other assumptions of the puzzle:
- (All islanders see each other’s eye colour) If are distinct and , then
- (Anyone who learns their own eye colour is blue, must commit suicide the next day) If and , then
- (Suicides are public) For any , , and , we have
Similarly, if , then
- (Foreigner announces in public on day 0 that there is at least one blue-eyed islander) We have
Theorem 5 Let .
- (At least one blue-eyed islander commits suicide by day )
- (Nobody needs to commit suicide before day ) For any and ,
The first conclusion is weaker than the “Solution 2″ of the puzzle, which would assert in fact that all blue-eyed islanders will commit suicide on day . While this indeed the “default” outcome of the hypotheses , it turns out that this is not the only possible outcome; for instance, if one blue-eyed person happens to commit suicide on day or day (perhaps for an unrelated reason than learning his or her own eye colour), then it turns out that this “cancels” the effect of the foreigner’s announcement, and prevents further suicides. (So, strictly speaking, Solution 2 is not quite accurate, though it is much closer to the truth than Solution 1 is.)
In fact there is a strengthening of the first conclusion: given the hypotheses , there must exist a time and distinct islanders such that holds for all .
Note that the second conclusion does not prohibit the existence of some models of in which suicides occur before day (consider for instance a situation in which a second foreigner made a similar announcement a few days before the first one, causing the chain of events to start at an earlier point and leading to earlier suicides).
Proof: (Sketch) To illustrate the first part of the theorem, we focus on the simple case ; the general case is similar but requires more notation (and an inductive argument). It suffices to establish that
(i.e. if nobody suicides by day , then both islanders will suicide on day .)
Assume . From (10) we have
and hence by (4)
By (6) we also have
whereas from the epistemic inference axioms we have
From the epistemic inference axioms again, we conclude that
and hence by (7) (and epistemic inference)
On the other hand, from and (9) we have
and hence by epistemic inference
and thus by (7)
A similar argument gives , and the claim follows.
To prove the second part, one has to construct, for each , an -model in which is true and is false for any and . This is remarkably difficult, in large part due to the ability of nested knowledge operators to jump backwards and forwards in time. In particular, one can jump backwards to before Day 0, and so one must first model worlds in which there is no foreigner announcement. We do this as follows. Given an -model , we recursively define a -model for as follows. Firstly, . Next, if and has already been defined, we define to be the -model with -model , and for any and , setting to be the set of all -models of the form , where is an -model obeying the following properties:
- ( sees other islanders’ eyes) If and , then iff .
- ( remembers suicides) If and , then iff .
Now we model worlds in which there is a foreigner announcement. Define an admissible model to be an -model such that there exist for which the following hold:
- (i.e. there are exactly blue-eyed islanders in the world ).
- There exists distinct such that and for all .
- For any and , implies .
We call the blue-eyed count of .
(Such models, incidentally, can already be used to show that no suicides necessarily occur in the absence of the foreigner’s announcement, because the limit of such models always obey all the axioms of except for (10).)
Given an admissible -model of some blue-eyed count , we recursively define an model for by setting , then if and has already been defined, we define to be the -model with -model , and with for and defined by the following rules:
Case 1. If , then we set to be the set of all -models of the form , where obeys the two properties “ sees other islanders’ eyes” and “ remembers suicides” from the preceding construction. ( does not need to be admissible in this case.)
Case 2. If , , and there does not exist distinct such that for all , then we set -models of the form , where is admisssible, obeys the two properties “ sees other islanders’ eyes” and “ remembers suicides” from the preceding construction, and also obeys the additional property . (Informally, this is the case in which “must learn” .)
Case 3. In all other cases, we set to be the set of all -models of the form , where is admissible and obeys the two properties “ sees other islanders’ eyes” and “ remembers suicides” from the preceding construction.
We let be the limit of the (which can easily be verified to exist by induction). A quite tedious verification reveals that for any admissible -model of blue-eyed count , that obeys both and , but one can choose to not admit any suicides before time , which will give the second claim of the theorem.
Remark 4 Under the assumptions used in our analysis, we have shown that it is inevitable that the foreigner’s comment will cause at least one death. However, it is possible to avert all deaths by breaking one or more of the assumptions used. For instance, if it is possible to sow enough doubt in the islanders’ minds about the logical and devout nature of the other islanders, then one can cause a breakdown of the epistemic inference rule or of (7), and this can prevent the chain of deductions from reaching its otherwise deadly conclusion.
— 7. The unexpected hanging paradox —
It turns out that there are several, not quite equivalent, ways to model this “paradox” epistemologically, with the differences hinging on how one interprets what “unexpected” or “surprise” means. In particular, if is a sentence and is a knowledge agent, how would one model the sentence “ does not expect ” or “ is surprised by “?
i.e. as the assertion that does not know to be true. However, this leads to the following situation: if has inconsistent knowledge (in particular, one has , where represents falsity (the negation of a tautology)), then by ex falso quodlibet, would be true for every , and hence would expect everything and be surprised by nothing. An alternative interpretation, then, is to adopt the convention that an agent with inconsistent knowledge is so confused as to not be capable of expecting anything (and thus be surprised by everything). In this case, “ does not expect ” should instead be modeled as
i.e. that either does not know to be true, or is inconsistent.
Let now analyse the unexpected hanging paradox using the former interpretation (11) of surprise. We begin with the simplest (and somewhat degenerate) situation, in which there is only one time (say Monday at noon) in which the hanging is to take place. In this case, there is just one knowledge agent (the knowledge of the prisoner after the judge speaks, but before the executation date of Monday at noon). We introduce an atomic sentence , representing the assertion that the prisoner will be hanged on Monday at noon. In this case (and using the former interpretation (11) of surprise), the judge’s remarks can be modeled by the sentence
The “paradox” in this case stems from the following curious fact:
- There exist -models in which is true.
- There exist -models in which is true.
- However, there does not exist any -model in which both and is true.
Thus, the judge’s statement can be true, but if so, it is not possible for the prisoner to know this! (In this regard, the sentence is analogous to a Godel sentences, which can be true in models of a formal system, but not provable in that system.) More informally: knowing a surprise, ruins that surprise.
Proof: The third statement is easy enough to establish: if is true in some model, then clearly is true in that model; but if is true in the same model, then (by epistemic inference) will be true as well, which is a contradiction.
The first statement is also fairly easy to establish. We have two -models; a model in which is true, and a model in which is false. We can recursively define the -model for any and any by setting , and for , setting to be the -model with -model , and with . One then easily verifies that the have a limit , and that models (but not , of course).
A trivial way to establish the second statement is to make a model in which is inconsistent (thus is empty). One can also take to be , and this will also work. (Of course, in such models, must be false.)
Another peculiarity of the sentence is that
as can be easily verified (by modifying the proof of the second statement of the above theorem). Thus, the sentence has the property that if the prisoner believes , and also knows that he or she believes , then the prisoner’s beliefs automatically become inconsistent – despite the fact that is not actually a self-contradictory statement (unless also combined with ).
Now we move to the case when the execution could take place at two possible times, say Monday at noon and Tuesday at noon. We then have two atomic statements: , the assertion that the execution takes place on Monday at noon, and , the assertion that the execution takes place on Tuesday at noon. There are two knowledge agents; , the state of knowledge just before Monday at noon, and , the state of knowledge just before Tuesday at noon. (There is again the annoying notational issue that if occurs, then presumably the prisoner will have no sensible state of knowledge by Tuesday, and so might not be well defined in that case; to avoid this irrelevant technicality, we replace the execution by some non-lethal punishment (or use an alternate formulation of the puzzle, such as the surprise exam formulation).)
Thus, it is common knowledge that if the execution does not happen on Monday, then by Tuesday, the prisoner will be aware of this fact. This axiom should of course be completely non-controversial.
The judge’s sentence, in this case, is given by
Analogously to Theorem 6, we can find models obeying (14) in which is true, but one cannot find models obeying (14) in which , , , and are all true, as one can soon see that this leads to a contradiction. Indeed, from one has
while from (14) one has
and from one has
wihch shows that leads to a contradiction, which implies and hence by . On the other hand, from one has
while from (14) one has
and from one has
which shows that , and thus , a contradiction. So, as before, is a secret which can only be true as long as it is not too widely known.
A slight variant of the above argument shows that if , , and hold, then and hold – or informally, the prisoner can deduce using knowledge of (and knowledge of knowledge of ) that there will be no execution on either date. This may appear at first glance to be consistent with (which asserts that the prisoner will be surprised when the execution does happen), but this is a confusion between (11) and (13). Indeed, one can show under the assumptions that is inconsistent, and (if holds) then is also inconsistent, and so and do not, in fact, imply .
In this case it is possible for and to be true, and in fact for to be common knowledge, basically by making inconsistent. (A little more precisely: we use the -model where and . Informally: the judge has kept the execution a surprise by driving the prisoner insane with contradictions.
The situation is more interesting in the two-day setting (as first pointed out by Kritchman and Raz), where is now
Here it is possible for to in fact be common knowledge in some -model, but in order for this to happen, at least one of the following three statements must be true in this model:
(We leave this as an exercise for the interested reader.) In other words, in order for the judge’s sentence to be common knowledge, either the prisoner’s knowledge on Monday or Tuesday needs to be inconsistent, or else the prisoner’s knowledge is consistent, but the prisoner is unable (on Monday) to determine that his or her own knowledge (on Tuesday) is consistent. Notice that the third conclusion here is very reminiscent of Gödel’s second incompleteness theorem, and indeed in the above-mentioned paper of Kritchman and Raz, the surprise examination argument is modified to give a rigorous proof of that theorem.
Remark 5 Here is an explicit example of a -world in which is common knowledge, and and are both consistent (but does not know that is consistent). We first define -models for each recursively by setting to be the world in which and , and then define for to be the -model with -model , with , and . (Informally: the execution is on Tuesday, and the prisoner knows this on Monday, but has become insane by Tuesday.) We then define the models for recursively by setting to be the world in which and , then define for to be the -model with -model , , and . (Informally: the execution is on Monday, but the prisoner only finds this out after the fact.) The limit of the then has as common knowledge, with and both false, but is also false.