The famous Gödel completeness theorem in logic (not to be confused with the even more famous Gödel incompleteness theorem) roughly states the following:

Theorem 1 (Gödel completeness theorem, informal statement) Let {\Gamma} be a first-order theory (a formal language {{\mathcal L}}, together with a set of axioms, i.e. sentences assumed to be true), and let {\phi} be a sentence in the formal language. Assume also that the language {{\mathcal L}} has at most countably many symbols. Then the following are equivalent:

  • (i) (Syntactic consequence) {\phi} can be deduced from the axioms in {\Gamma} by a finite number of applications of the laws of deduction in first order logic. (This property is abbreviated as {\Gamma \vdash \phi}.)
  • (ii) (Semantic consequence) Every structure {{\mathfrak U}} which satisfies or models {\Gamma}, also satisfies {\phi}. (This property is abbreviated as {\Gamma \models \phi}.)
  • (iii) (Semantic consequence for at most countable models) Every structure {{\mathfrak U}} which is at most countable, and which models {\Gamma}, also satisfies {\phi}.

One can also formulate versions of the completeness theorem for languages with uncountably many symbols, but I will not do so here. One can also force other cardinalities on the model {{\mathfrak U}} by using the Löwenheim-Skolem theorem.

To state this theorem even more informally, any (first-order) result which is true in all models of a theory, must be logically deducible from that theory, and vice versa. (For instance, any result which is true for all groups, must be deducible from the group axioms; any result which is true for all systems obeying Peano arithmetic, must be deducible from the Peano axioms; and so forth.) In fact, it suffices to check countable and finite models only; for instance, any first-order statement which is true for all finite or countable groups, is in fact true for all groups! Informally, a first-order language with only countably many symbols cannot “detect” whether a given structure is countably or uncountably infinite. Thus for instance even the ZFC axioms of set theory must have some at most countable model, even though one can use ZFC to prove the existence of uncountable sets; this is known as Skolem’s paradox. (To resolve the paradox, one needs to carefully distinguish between an object in a set theory being “externally” countable in the structure that models that theory, and being “internally” countable within that theory.)

Of course, a theory {\Gamma} may contain undecidable statements {\phi} – sentences which are neither provable nor disprovable in the theory. By the completeness theorem, this is equivalent to saying that {\phi} is satisfied by some models of {\Gamma} but not by other models. Thus the completeness theorem is compatible with the incompleteness theorem: recursively enumerable theories such as Peano arithmetic are modeled by the natural numbers {{\mathbb N}}, but are also modeled by other structures also, and there are sentences satisfied by {{\mathbb N}} which are not satisfied by other models of Peano arithmetic, and are thus undecidable within that arithmetic.

An important corollary of the completeness theorem is the compactness theorem:

Corollary 2 (Compactness theorem, informal statement) Let {\Gamma} be a first-order theory whose language has at most countably many symbols. Then the following are equivalent:

  • (i) {\Gamma} is consistent, i.e. it is not possible to logically deduce a contradiction from the axioms in {\Gamma}.
  • (ii) {\Gamma} is satisfiable, i.e. there exists a structure {{\mathfrak U}} that models {\Gamma}.
  • (iii) There exists a structure {{\mathfrak U}} which is at most countable, that models {\Gamma}.
  • (iv) Every finite subset {\Gamma'} of {\Gamma} is consistent.
  • (v) Every finite subset {\Gamma'} of {\Gamma} is satisfiable.
  • (vi) Every finite subset {\Gamma'} of {\Gamma} is satisfiable with an at most countable model.

Indeed, the equivalence of (i)-(iii), or (iv)-(vi), follows directly from the completeness theorem, while the equivalence of (i) and (iv) follows from the fact that any logical deduction has finite length and so can involve at most finitely many of the axioms in {\Gamma}. (Again, the theorem can be generalised to uncountable languages, but the models become uncountable also.)

There is a consequence of the compactness theorem which more closely resembles the sequential concept of compactness. Given a sequence {{\mathfrak U}_1, {\mathfrak U}_2, \ldots} of structures for {{\mathcal L}}, and another structure {{\mathfrak U}} for {{\mathcal L}}, let us say that {{\mathfrak U}_n} converges elementarily to {{\mathfrak U}} if every sentence {\phi} which is satisfied by {{\mathfrak U}}, is also satisfied by {{\mathfrak U}_n} for sufficiently large {n}. (Replacing {\phi} by its negation {\neg \phi}, we also see that every sentence that is not satisfied by {{\mathfrak U}}, is not satisfied by {{\mathfrak U}_n} for sufficiently large {n}.) Note that the limit {{\mathfrak U}} is only unique up to elementary equivalence. Clearly, if each of the {{\mathfrak U}_n} models some theory {\Gamma}, then the limit {{\mathfrak U}} will also; thus for instance the elementary limit of a sequence of groups is still a group, the elementary limit of a sequence of rings is still a ring, etc.

Corollary 3 (Sequential compactness theorem) Let {{\mathcal L}} be a language with at most countably many symbols, and let {{\mathfrak U}_1, {\mathfrak U}_2, \ldots} be a sequence of structures for {{\mathcal L}}. Then there exists a subsequence {{\mathfrak U}_{n_j}} which converges elementarily to a limit {{\mathfrak U}} which is at most countable.

Proof: For each structure {{\mathfrak U}_n}, let {\hbox{Th}({\mathfrak U}_n)} be the theory of that structure, i.e. the set of all sentences that are satisfied by that structure. One can view that theory as a point in {\{0,1\}^{{\mathcal S}}}, where {{\mathcal S}} is the set of all sentences in the language {{\mathcal L}}. Since {{\mathcal L}} has at most countably many symbols, {{\mathcal S}} is at most countable, and so (by the sequential Tychonoff theorem) {\{0,1\}^{{\mathcal S}}} is sequentially compact in the product topology. (This can also be seen directly by the usual Arzelá-Ascoli diagonalisation argument.) Thus we can find a subsequence {\hbox{Th}({\mathfrak U}_{n_j})} which converges in the product topology to a limit theory {\Gamma \in \{0,1\}^{{\mathcal S}}}, thus every sentence in {\Gamma} is satisfied by {{\mathfrak U}_{n_j}} for sufficiently large {j} (and every sentence not in {\Gamma} is not satisfied by {{\mathfrak U}_{n_j}} for sufficiently large {j}). In particular, any finite subset of {\Gamma} is satisfiable, hence consistent; by the compactness theorem, {\Gamma} itself is therefore consistent, and has an at most countable model {{\mathfrak U}}. Also, each of the theories {\hbox{Th}({\mathfrak U}_{n_j})} is clearly complete (given any sentence {\phi}, either {\phi} or {\neg \phi} is in the theory), and so {\Gamma} is complete as well. One concludes that {\Gamma} is the theory of {{\mathfrak U}}, and hence {{\mathfrak U}} is the elementary limit of the {{\mathfrak U}_{n_j}} as claimed. \Box

[It is also possible to state the compactness theorem using the topological notion of compactness, as follows: let {X} be the space of all structures of a given language {{\mathcal L}}, quotiented by elementary equivalence. One can define a topology on {X} by taking the sets {\{ {\mathfrak U} \in X: {\mathfrak U} \models \phi \}} as a sub-base, where {\phi} ranges over all sentences. Then the compactness theorem is equivalent to the assertion that {X} is topologically compact.]

One can use the sequential compactness theorem to build a number of interesting “non-standard” models to various theories. For instance, consider the language {{\mathcal L}} used by Peano arithmetic (which contains the operations {+, \times} and the successor operation {S}, the relation {=}, and the constant {0}), and adjoint a new constant {N} to create an expanded language {{\mathcal L} \cup \{N\}}. For each natural number {n \in {\Bbb N}}, let {{\Bbb N}_n} be a structure for {{\mathcal L} \cup \{N\}} which consists of the natural numbers {{\Bbb N}} (with the usual interpretations of {+}, {\times}, etc.) and interprets the symbol {N} as the natural number {n}. By the compactness theorem, some subsequence of {{\Bbb N}_n} must converge elementarily to a new structure {*{\Bbb N}} of {{\mathcal L} \cup \{N\}}, which still models Peano arithmetic, but now has the additional property that {N>n} for every (standard) natural number {n}; thus we have managed to create a non-standard model of Peano arithmetic which contains a non-standardly large number (one which is larger than every standard natural number).

The sequential compactness theorem also lets us construct infinitary limits of various sequences of finitary objects; for instance, one can construct infinite pseudo-finite fields as the elementary limits of sequences of finite fields. I recently discovered that other several correspondence principles between finitary and infinitary objects, such as the Furstenberg correspondence principle between sets of integers and dynamical systems, or the more recent correspondence principles concerning graph limits, can be viewed as special cases of the sequential compactness theorem; it also seems possible to encode much of the sum-product theory in finite fields in an infinitary setting using this theorem. I hope to discuss these points in more detail in a later post. In this post, I wish to review (partly for my own benefit) the proof of the completeness (and hence compactness) theorem. The material here is quite standard (I basically follow the usual proof of Henkin, and taking advantage of Skolemisation), but perhaps the concept of an elementary limit is not as well-known outside of logic as it might be. (The closely related concept of an ultraproduct is better known, and can be used to prove most of the compactness theorem already, thanks to Los’s theorem, but I do not know how to use ultraproducts to ensure that the limiting model is countable. However, one can think (intuitively, at least), of the limit model {{\mathfrak U}} in the above theorem as being the set of “constructible” elements of an ultraproduct of the {{\mathfrak U}_n}.)

In order to emphasise the main ideas in the proof, I will gloss over some of the more technical details in the proofs, relying instead on informal arguments and examples at various points.

— 1. Completeness and compactness in propositional logic —

The completeness and compactness theorems are results in first-order logic. But to motivate some of the ideas in proving these theorems, let us first consider the simpler case of propositional logic. The language {{\mathcal L}} of a propositional logic consists of the following:

Of course, we assume that the symbols used for atomic formulae are distinct from those used for logical connectives, or for parentheses; we will implicitly make similar assumptions of this type in later sections without further comment.

Using this language, one can form sentences (or formulae) by some standard formal rules which I will not write down here. Typical examples of sentences in propositional logic are {A_1 \implies (A_2 \vee A_3)}, {(A_1 \wedge \neg A_1) \implies A_2}, and {(A_1 \wedge A_2) \vee (A_1 \wedge A_3)}. Each sentence is of finite length, and thus involves at most finitely many of the propositional variables. Observe that if {{\mathcal L}} is at most countable, then there are at most countably many sentences.

The analogue of a structure in propositional logic is a truth assignment. A truth assignment {{\mathfrak U}} for a propositional language {{\mathcal L}} consists of a truth value {A_n^{\mathfrak U} \in \{ \hbox{true}, \hbox{false} \}} assigned to each propositional variable {A_n}. (Thus, for instance, if there are {N} propositional variables in the language, then there are {2^N} possible truth assignments.) Once a truth assignment {{\mathfrak U}} has assigned a truth value {A_n^{\mathfrak U}} to each propositional variable {A_n}, it can then assign a truth value {\phi^{\mathfrak U}} to any other sentence {\phi} in the language {{\mathcal L}} by using the usual truth tables for conjunction, negation, etc.; we write {{\mathfrak U} \models \phi} if {{\mathfrak U}} assigns a true value to {\phi} (and say that {\phi} is satisfied by {{\mathfrak U}}), and {{\mathfrak U} \not \models \phi} otherwise. Thus, for instance, if {A_1^{\mathfrak U} = \hbox{false}} and {A_2^{\mathfrak U} = \hbox{true}}, then {{\mathfrak U} \models A_1 \vee A_2} and {{\mathfrak U} \models A_1 \implies A_2}, but {{\mathfrak U} \not \models A_2 \implies A_1}. Some sentences, e.g. {A_1 \vee \neg A_1}, are true in every truth assignment; these are the (semantic) tautologies. At the other extreme, the negation of a tautology will of course be false in every truth assignment.

A theory {\Gamma} is a language {{\mathcal L}}, together with a (finite or infinite) collection of sentences (also called {\Gamma}) in that language. A truth assignment {{\mathfrak U}} satisfies (or models) the theory {\Gamma}, and we write {{\mathfrak U} \models \Gamma}, if we have {{\mathfrak U} \models \phi} for all {\phi \in \Gamma}. Thus, for instance, if {{\mathfrak U}} is as in the preceding example and {\Gamma := \{ \neg A_1, A_1 \implies A_2 \}}, then {{\mathfrak U} \models \Gamma}.

The analogue of the Gödel completeness theorem is then

Theorem 4 (Completeness theorem for propositional logic) Let {\Gamma} be a theory for a propositional language {{\mathcal L}}, and let {\phi} be a sentence in {{\mathcal L}}. Then the following are equivalent:

  • (i) (Syntactic consequence) {\phi} can be deduced from the axioms in {\Gamma} by a finite number of applications of the laws of propositional logic.
  • (ii) (Semantic consequence) Every truth assignment {{\mathfrak U}} which satisfies (or models) {\Gamma}, also satisfies {\phi}.

One can list a complete set of laws of propositional logic used in (i), but we will not do so here.

To prove the completeness theorem, it suffices to show the following equivalent version.

Theorem 5 (Completeness theorem for propositional logic, again) Let {\Gamma} be a theory for a propositional language {{\mathcal L}}. Then the following are equivalent:

  • (i) {\Gamma} is consistent, i.e. it is not possible to logically deduce a contradiction from the axioms in {\Gamma}.
  • (ii) {\Gamma} is satisfiable, i.e. there exists a truth assignment {{\mathfrak U}} that models {\Gamma}.

Indeed, Theorem 4 follows from Theorem 5 by applying Theorem 5 to the theory {\Gamma \cup \{ \neg \phi \}} and taking contrapositives.

It remains to prove Theorem 5. It is easy to deduce (i) from (ii), because the laws of propositional logic are sound: given any truth assignment, it is easy to verify that these laws can only produce true conclusions given true hypotheses. The more interesting implication is to obtain (ii) from (i) – given a consistent theory {\Gamma}, one needs to produce a truth assignment that models that theory.

Let’s first consider the case when the propositional language {{\mathcal L}} is finite, so that there are only finitely many propositional variables {A_1,\ldots,A_N}. Then we can argue using the following “greedy algorithm”.

  • We begin with a consistent theory {\Gamma}.
  • Observe that at least one of {\Gamma \cup \{ A_1 \}} or {\Gamma \cup \{ \neg A_1 \}} must be consistent. For if both {\Gamma \cup \{ A_1 \}} and {\Gamma \cup \{ \neg A_1 \}} led to a logical contradiction, then by the laws of logic one can show that {\Gamma} must also lead to a logical contradiction.
  • If {\Gamma \cup \{A_1\}} is consistent, we set {A_1^{\mathfrak U} := \hbox{true}} and {\Gamma_1 := \Gamma \cup \{A_1\}}; otherwise, we set {A_1^{\mathfrak U} := \hbox{false}} and {\Gamma_1 := \Gamma \cup \{\neg A_1\}}.
  • {\Gamma_1} is consistent, so arguing as before we know that at least one of {\Gamma_1 \cup \{A_2\}} or {\Gamma_1 \cup \{\neg A_2\}} must be consistent. If the former is consistent, we set {A_2^{\mathfrak U} := \hbox{true}} and {\Gamma_2 := \Gamma_1 \cup \{A_2\}}; otherwise set {A_2^{\mathfrak U} := \hbox{false}} and {\Gamma_2 := \Gamma_1 \cup \{\neg A_2\}}.
  • We continue in this fashion, eventually ending up with a consistent theory {\Gamma_N} containing {\Gamma}, and a complete truth assignment {{\mathfrak U}} such that {A_n \in \Gamma_N} whenever {1 \leq n \leq N} is such that {A_n^{\mathfrak U} = \hbox{true}}, and such that {\neg A_n \in \Gamma_N} whenever {1 \leq n \leq N} is such that {A_n^{\mathfrak U} = \hbox{false}}.
  • From the laws of logic and an induction argument, one then sees that if {\phi} is any sentence with {\phi^{\mathfrak U} = \hbox{true}}, then {\phi} is a logical consequence of {\Gamma_N}, and hence (since {\Gamma_N} is consistent) {\neg \phi} is not a consequence of {\Gamma_N}. Taking contrapositives, we see that {\phi^{\mathfrak U} = \hbox{false}} whenever {\neg \phi} is a consequence of {\Gamma_N}; replacing {\phi} by {\neg \phi} we conclude that {{\mathfrak U}} satisfies every sentence in {\Gamma_N}, and the claim follows.

Remark 1 The above argument shows in particular that any finite theory either has a model or a proof of a contradictory statement (such as {A \wedge \neg A}). Actually producing a model if it exists, though, is essentially the infamous satisfiability problem, which is known to be NP-complete, and thus (if {P \neq NP}) would require super-polynomial time to execute.

The case of an infinite language can be obtained by combining the above argument with Zorn’s lemma (or transfinite induction and the axiom of choice, if the set of propositional variables happens to be well-ordered). Alternatively, one can proceed by establishing

Theorem 6 (Compactness theorem for propositional logic) Let {\Gamma} be a theory for a propositional language {{\mathcal L}}. Then the following are equivalent:

  • (i) {\Gamma} is satisfiable.
  • (ii) Every finite subset {\Gamma'} of {\Gamma} is satisfiable.

It is easy to see that Theorem 6 will allow us to use the finite case of Theorem 5 to deduce the infinite case, so it remains to prove Theorem 6. The implication of (ii) from (i) is trivial; the interesting implication is the converse.

Observe that there is a one-to-one correspondence between truth assignments {{\mathfrak U}} and elements of the product space {\{0,1\}^{\mathcal A}}, where {{\mathcal A}} is the set of propositional variables. For every sentence {\phi}, let {F_\phi \subset \{0,1\}^{\mathcal A}} be the collection of all truth assignments that satisfy {\phi}; observe that this is a closed (and open) subset of {\{0,1\}^{\mathcal A}} in the product topology (basically because {\phi} only involves finitely many of the propositional variables). If every finite subset {\Gamma'} of {\Gamma} is satisfiable, then {\bigcap_{\phi \in \Gamma'} F_\phi} is non-empty; thus the family {(F_\phi)_{\phi \in \Gamma}} of closed sets enjoys the finite intersection property. On the other hand, from Tychonoff’s theorem, {\{0,1\}^{\mathcal A}} is compact. We conclude that {\bigcap_{\phi \in \Gamma} F_\phi} is non-empty, and the claim follows.

Remark 2 While Tychonoff’s theorem in full generality is equivalent to the axiom of choice, it is possible to prove the compactness theorem using a weaker version of this axiom, namely the ultrafilter lemma. In fact, the compactness theorem is logically equivalent to this lemma.

— 2. Zeroth-order logic —

Propositional logic is far too limited a language to do much mathematics. Let’s make the language a bit more expressive, by adding constants, operations, relations, and (optionally) the equals sign; however, we refrain at this stage from adding variables or quantifiers, making this a zeroth-order logic rather than a first-order one.

A zeroth-order language {{\mathcal L}} consists of the following objects:

  • A (finite or infinite) collection {A_1, A_2, A_3, \ldots} of propositional variables;
  • A collection {R_1, R_2, R_3, \ldots} of relations (or predicates), with each {R_i} having an arity (or valence) {a[R_i]} (e.g. unary relation, binary relation, etc.);
  • A collection {c_1, c_2, c_3, \ldots} of constants;
  • A collection {f_1, f_2, f_3, \ldots} of operators (or functions), with each operator {f_i} having an arity {a[f_i]} (e.g. unary operator, binary operator, etc.);
  • Logical connectives;
  • Parentheses;
  • Optionally, the equals sign {=}.

For instance, a zeroth-order language for arithmetic on the natural numbers might include the constants {0, 1, 2, \ldots}, the binary relations {<, \leq , >, \geq}, the binary operations {+, \times}, the unary successor operation {S}, and the equals sign {=}. A zeroth-order language for studying all groups generated by six elements might include six generators {a_1,\ldots,a_6} and the identity element {e} as constants, as well as the binary operation {\cdot} of group multiplication and the unary operation {()^{-1}} of group inversion, together with the equals sign {=}. And so forth.

Note that one could shorten the description of such languages by viewing propositional variables as relations of arity zero, and similarly viewing constants as operators of arity zero, but I find it conceptually clearer to leave these two operations separate, at least initially. As we shall see shortly, one can also essentially eliminate the privileged role of the equals sign {=} by treating it as just another binary relation, which happens to have some standard axioms attached to it.

By combining constants and operators together in the usual fashion one can create terms; for instance, in the zeroth-order language for arithmetic, {3+(4 \times 5)} is a term. By inserting terms into a predicate or relation (or the equals sign {=}), or using a propositional variable, one obtains an atomic formula; thus for instance {3+(4 \times 5) > 25} is an atomic formula. By combining atomic formulae using logical connectives, one obtains a sentence (or formula); thus for instance {((4 \times 5) > 22) \implies (3 + (4\times 5) > 25)} is a sentence.

In order to assign meaning to sentences, we need the notion of a structure {{\mathfrak U}} for a zeroth-order language {{\mathcal L}}. A structure consists of the following objects:

  • A domain of discourse (or universe of discourse) {\hbox{Dom}({\mathfrak U})};
  • An assignment of a value {c_n^{\mathfrak U} \in \hbox{Dom}({\mathfrak U})} to every constant {c_n};
  • An assignment of a function {f_n^{\mathfrak U}: \hbox{Dom}({\mathfrak U})^{a[f_n]} \rightarrow \hbox{Dom}({\mathfrak U})} to every operation {f_n};
  • An assignment of a truth value {A_n^{\mathfrak U} \in \{ \hbox{true}, \hbox{false}\}} to every propositional variable {A_n};
  • An assignment of a function {R_n^{\mathfrak U}: \hbox{Dom}({\mathfrak U})^{a[R_n]} \{ \hbox{true}, \hbox{false}\}} to every relation {R_n}.

For instance, if {{\mathcal L}} is the language of groups with six generators discussed above, then a structure {\mathfrak U} would consist of a set {G = \hbox{Dom}({\mathfrak U})}, seven elements {a_1^{\mathfrak U},\ldots,a_6^{\mathfrak U}, e^{\mathfrak U} \in G} in that set, a binary operation {\cdot^{\mathfrak U}: G \times G \rightarrow G}, and a unary operation {(()^{-1})^{\mathfrak U}: G \rightarrow G}. At present, no group-type properties are assumed on these operations; the structure here is little more than a magma at this point.

Every sentence {\phi} in a zeroth-order language {{\mathcal L}} can be interpreted in a structure {{\mathfrak U}} for that language to give a truth value {\phi^{\mathfrak U} \in \{ \hbox{true}, \hbox{false}\}}, simply by substituting all symbols {\alpha} in the language with their interpreted counterparts {\alpha^{\mathfrak U}} (note that the equals sign {=} does not need any additional data in order to be interpreted). For instance, the sentence {a_1 \cdot a_2 = a_3} is true in {{\mathfrak U}} if {a_1^{\mathfrak U} \cdot^{\mathfrak U} a_2^{\mathfrak U} = a_3^{\mathfrak U}}. Similarly, every term {t} in the language can be interpreted to give a value {t^{\mathfrak U}} in the domain of discourse {\hbox{Dom}({\mathfrak U})}.

As before, a theory is a collection of sentences; we can define satisfiability {{\mathfrak U} \models \phi}, {{\mathfrak U} \models \Gamma} of a sentence {\phi} or a theory {\Gamma} by a structure {{\mathfrak U}} just as in the previous section. For instance, to describe groups with at most six generators in the language {{\mathcal L}}, one might use the theory {\Gamma} which consists of all the group axioms, specialised to terms, e.g. {\Gamma} would contain the associativity axioms {t_1 \cdot (t_2 \cdot t_3) = (t_1 \cdot t_2) \cdot t_3} for all choices of terms {t_1, t_2, t_3}. (Note that this theory is not quite strong enough to capture the concept of a structure {{\mathfrak U}} being a group generated by six elements, because the domain of {{\mathfrak U}} may contain some “inaccessible” elements which are not the interpretation of any term in {{\mathcal L}}, but without the universal quantifier, there is not much we can do in zeroth-order logic to say anything about those elements, and so this is pretty much the best we can do with this limited logic.)

Now we can state the completeness theorem:

Theorem 7 (Completeness theorem for zeroth-order logic) Let {\Gamma} be a theory for a zeroth-order language {{\mathcal L}}, and let {\phi} be a sentence in {{\mathcal L}}. Then the following are equivalent:

  • (i) (Syntactic consequence) {\phi} can be deduced from the axioms in {\Gamma} by a finite number of applications of the laws of zeroth-order logic (i.e. all the laws of first-order logic that do not involve variables or quantifiers).
  • (ii) (Semantic consequence) Every truth assignment {{\mathfrak U}} which satisfies (or models) {\Gamma}, also satisfies {\phi}.

To prove this theorem, it suffices as before to show that every consistent theory {\Gamma} in a zeroth-order logic is satisfiable, and conversely. The converse implication is again straightforward (the laws of zeroth-order logic are easily seen to be sound); the main task is to show the forward direction, i.e.

Proposition 8 Let {\Gamma} be a consistent zeroth-order theory. Then {\Gamma} has at least one model.

Proof: It is convenient to begin by eliminating the equality symbol from the language. Suppose we have already proven Proposition 8 has already been shown for languages without the equality symbol. Then we claim that the proposition also holds for languages with the equality symbol. Indeed, given a consistent theory {\Gamma} in a language {{\mathcal L}} with equality, we can form a companion theory {\Gamma'} in the language {{\mathcal L}'} formed by removing the equality symbol from {{\mathcal L}} and replacing it with a new binary relation {='}, by taking all the sentences in {\Gamma} and replacing {=} by {='}, and then adding in all the axioms of equality (with {=} replaced by {='}) to {\Gamma'}. Thus, for instance, one would add the transitivity axioms {(x =' y) \wedge (y =' z) \implies (x =' z)} to {\Gamma} for each triple of terms {x,y,z}, as well as substitution axioms such as {(x =' y) \implies (B(x,z) =' B(y,z))} for any terms {x,y,z} and binary functions {B}. It is straightforward to verify that if {\Gamma} is consistent, then {\Gamma'} is also consistent, because any contradiction derived in {\Gamma'} can be translated to a contradiction derived in {\Gamma} simply by replacing {='} with {=} throughout and using the axioms of equality. By hypothesis, we conclude that {\Gamma'} has some model {{\mathfrak U}'}. By the axioms of equality, the interpretation {(=')^{{\mathfrak U}'}} of {='} in this model is then an equivalence relation on the domain {\hbox{Dom}({\mathfrak U}')} of {{\mathfrak U}'}. One can also remove from the domain of {{\mathfrak U}'} any element which is not of the form {t^{{\mathfrak U}'}} for some term {t}, as such “inaccessible” elements will not influence the satisfiability of {\Gamma'}. We can then define a structure {{\mathfrak U}} for the original language {{\mathfrak L}} by quotienting the domain of {{\mathfrak U}'} by the equivalence relation {='}, and also quotienting all the interpretations of the relations and operations of {{\mathcal L}}; the axioms of equality ensure that this quotienting is possible, and that the quotiented structure {{\mathfrak U}} satisfies {{\mathcal L}}; we omit the details.

Henceforth we assume that {{\mathcal L}} does not contain the equality sign. We will then choose a “tautological” domain of discourse {\hbox{Dom}({\mathfrak U})}, by setting this domain to be nothing more than the collection of all terms in the language {{\mathcal L}}. For instance, in the language of groups on six generators, the domain {\hbox{Dom}({\mathfrak U})} is basically the free magma (with “inverse”) on six generators plus an “identity”, consisting of terms such as {(a_1 \cdot a_2)^{-1} \cdot a_1}, {(e \cdot a_3) \cdot ((a_4)^{-1})^{-1}}, etc. With this choice of domain, there is an obvious “tautological” interpretation of constants ({c^{\mathfrak U} := c}) and operations (e.g. {B^{\mathfrak U}(t_1, t_2) := B(t_1,t_2)^{\mathfrak U}} for binary operations {B} and terms {t_1,t_2}), which leads to every term {t} being interpreted as itself: {t^{\mathfrak U} = t}.

It remains to figure out how to interpret the propositional variables {A_1, A_2, \ldots} and relations {R_1, R_2, \ldots}. Actually, one can replace each relation with an equivalent collection of new propositional variables by substituting in all possible terms in the relation. For instance, if one has a binary relation {R(,)}, one can replace this single relation symbol in the language by a (possibly infinite) collection of propositional variables {R(t_1,t_2)}, one for each pair of terms {t_1,t_2}, leading to a new (and probably much larger) language {\tilde L} without any relation symbols. It is not hard to see that if theory {\Gamma} is consistent in {{\mathcal L}}, then the theory {\tilde \Gamma} in {\tilde L} formed by interpreting all atomic formulae such as {R(t_1,t_2)} as propositional variables is also consistent. If {\tilde \Gamma} has a model {\tilde {\mathfrak U}} with the tautological domain of discourse, it is not hard to see that this can be converted to a model {{\mathfrak U}} of {\Gamma} with the same domain by defining the interpretation {R^{\mathfrak U}} of relations {R} in the obvious manner.

So now we may assume that there are no relation symbols, so that {\Gamma} now consists entirely of propositional sentences involving the propositional variables. But the claim now follows from the completeness theorem in propositional logic. \Box

Remark 3 The above proof can be viewed as a combination of the completeness theorem in propositional logic and the familiar procedure in algebra of constructing an algebraic object (e.g. a group) that obeys various relations, by starting with the free version of that object (e.g. a free group) and then quotienting out by the equivalence relation generated by those relations.

Remark 4 Observe that if {{\mathfrak L}} is at most countable, then the structures {{\mathfrak U}} constructed by the above procedure are at most countable (because the set of terms is at most countable, and quotienting by an equivalence relation cannot increase the cardinality). Thus we see (as in Theorem 1 or Corollary 2) that if a zeroth-order theory in an at most countable language is satisfiable, then it is in fact satisfiable with an at most countable model.

From the completeness theorem for zeroth-order logic and the above remark we obtain the compactness theorem for zeroth-order logic, which is formulated exactly as in Corollary 2.

— 3. First-order logic —

We are now ready to study languages which are expressive enough to do some serious mathematics, namely the languages of first-order logic, which are formed from zeroth-order logics by adding variables and quantifiers. (There are higher-order logics as well, but unfortunately the completeness and compactness theorems typically fail for these, and they will not be discussed here.)

A language {{\mathcal L}} for a first-order logic consists of the following:

  • A (finite or infinite) collection {A_1, A_2, A_3, \ldots} of propositional variables;
  • A collection {R_1, R_2, R_3, \ldots} of relations, with each {R_i} having an arity {a[R_i]};
  • A collection {c_1, c_2, c_3, \ldots} of constants;
  • A collection {f_1, f_2, f_3, \ldots} of operators, with each {f_i} having an arity {a[f_i]};
  • A collection {x_1, x_2, x_3, \ldots} of variables;
  • Logical connectives;
  • The quantifiers {\forall, \exists};
  • Parentheses;
  • Optionally, the equals sign {=}.

For instance, the language for Peano arithmetic includes a constant {0}, a unary operator {S}, binary operators {+, \times}, the equals sign {=}, and a countably infinite number of variables {x_1, x_2, \ldots}.

By combining constants, variables and operators together one creates terms; by inserting terms into predicates or relations, or using propositional variables, one obtains atomic formulae. These atomic formulae can contain a number of free variables. Using logical connectives as well as quantifiers to bind any or all of these variables, one obtains well-formed formulae; a formula with no free variables is a sentence. Thus, for instance, {\forall x_2: x_1+x_2=x_2+x_1} is a well-formed formula, and {\forall x_1 \forall x_2: x_1+x_2=x_2+x_1} is a sentence.

A structure {{\mathfrak U}} for a first-order language {{\mathfrak L}} is exactly the same concept as for a zeroth-order language: a domain of discourse, together with an interpretation of all the constants, operators, propositional variables, and relations in the language. Given a structure {{\mathfrak U}}, one can interpret terms {t} with no free variables as elements {t^{\mathfrak U}} of {\hbox{Dom}({\mathfrak U})}, and interpret sentences {\phi} as truth values {\phi^{\mathfrak U} \in \{ \hbox{true}, \hbox{false}\}}, in the standard fashion.

A theory is, once again, a collection of sentences in the first-order language {{\mathcal L}}; one can define what it means for a structure to satisfy a sentence or a theory just as before.

Remark 5 In most fields of mathematics, one wishes to discuss several types of objects (e.g. numbers, sets, points, group elements, etc.) at once. For this, one would prefer to use a typed language, in which variables, constants, and functions take values in one type of object, and relations and functions take only certain types of objects as input. However, one can easily model a typed theory using a typeless theory by the trick of adding some additional unary predicates to capture type (e.g. {N(x)} to indicate the assertion “{x} is a natural number”, {S(x)} to indicate the assertion “{x} is a set”, etc.) and modifying the axioms of the theory being considered accordingly. (For instance, in a language involving both natural numbers and other mathematical objects, one might impose a new closure axiom {\forall x \forall y: N(x) \wedge N(y) \implies N(x+y)}, and axioms such as the commutativity axiom {\forall x \forall y: x+y=y+x} would need to be modified to {\forall x \forall y: N(x) \wedge N(y) \implies x+y=y+x}.) It is a tedious but routine matter to show that the completeness and compactness theorems for typeless first-order logic imply analogous results for typed first-order logic; we omit the details.

To prove the completeness (and hence compactness) theorem, it suffices as before to show that

Proposition 9 Let {\Gamma} be a consistent first-order theory, with an at most countable language {{\mathcal L}}. Then {\Gamma} has at least one model {{\mathcal U}}, which is also at most countable.

We shall prove this result using a series of reductions. Firstly, we can mimic the arguments in the zeroth-order case and reduce to the case when {{\mathcal L}} does not contain the equality symbol. (We no longer need to restrict the domain of discourse to those elements which can be interpreted by terms, because the universal quantifier {\forall} is now available for use when stating the axioms of equality.) Henceforth we shall assume that the equality symbol is not present in the language.

Next, by using the laws of first-order logic to push all quantifiers in the sentences in {\Gamma} to the beginning (e.g. replacing {(\forall x: P(x)) \wedge (\forall y: Q(y))} with {\forall x \forall y: P(x) \wedge Q(y)}) one may assume that all sentences in {\Gamma} are in prenex normal form, i.e. they consist of a “matrix” of quantifiers, followed by an quantifier-free formula – a well-formed formula with no quantifiers. For instance, {\forall x \exists y \forall z \exists w: P(x,y,z,w)} is in prenex normal form, where {P(x,y,z,w)} is an quantifier-free formula with four free variables {x,y,z,w}.

Now we will start removing the existential quantifiers {\exists} from the sentences in {\Gamma}. Let’s begin with a simple case, when {\Gamma} contains a sentence of the form {\exists x: P(x)} for some quantifier-free formula of one free variable {x}. Then one can eliminate the existential quantifier by introducing a witness, or more precisely adjoining a new constant {c} to the language {{\mathcal L}} and replacing the statement {\exists x: P(x)} with the statement {P(c)}, giving rise to a new theory {\Gamma'} in a new language {\Lambda'}. The consistency of {\Gamma} easily implies the consistency of {\Gamma'}, while any at most countable model for {\Gamma'} can be easily converted to an at most countable model for {\Gamma} (by “forgetting” the symbol {c}). (In fact, {\Gamma'} is a conservative extension of {\Gamma}.) We can apply this reduction simultaneously to all sentences of the form {\exists x: P(x)} in {\Gamma} (thus potentially expanding the collection of constants in the language by a countable amount).

The same argument works for any sentence in prenex normal form in which all the existential quantifiers are to the left of the universal quantifiers, e.g. {\exists x \exists y \forall z \forall w: P(x,y,z,w)}; this statement requires two constants to separately witness {x} and {y}, but otherwise one proceeds much as in the previous paragraph. But what about if one or more of the existential quantifiers is buried behind a universal quantifier? The trick is then to use Skolemisation. We illustrate this with the simplest case of this type, namely that of a sentence {\forall x \exists y: P(x,y)}. Here, one cannot use a constant witness for {y}. But this is no problem: one simply introduces a witness that depends on {x}. More precisely, one adjoins a new unary operator {c} to the language {{\mathcal L}} and replaces the statement {\forall x \exists y: P(x,y)} by {\forall x: P(x,c(x))}, creating a new theory {\Gamma'} in a new language {\Lambda'}. One can again show (though this is not entirely trivial) that the consistency of {\Gamma} implies the consistency of {\Gamma'}, and that every countable model for {\Gamma'} can be converted to a countable model for {\Gamma} (again by “forgetting” {c}). So one can eliminate the existential quantifier from this sentence also. Similar methods work for any other prenex normal form; for instance with the sentence

\displaystyle \forall x \exists y \forall z \exists w: P(x,y,z,w)

one can obtain a conservative extension of that theory by introducing a unary operator {c} and a binary operator {d} and replacing the above sentence with

\displaystyle \forall x \forall z: P( x, c(x), z, d( x, z) ).

One can show that one can perform Skolemisation on all the sentences in {\Gamma} simultaneously, which has the effect of eliminating all existential quantifiers from {\Gamma} while still keeping the language {{\mathcal L}} at most countable (since {\Gamma} is at most countable). (Intuitively, what is going on here is that we are interpreting all existential axioms in the theory as implicitly defining functions, which we then explicitly formalise as a new symbol in the language. For instance, if we had some theory of sets which contained the axiom of choice (every family of non-empty sets {(X_\alpha)_{\alpha \in A}} admits a choice function {f: A \rightarrow \bigcup_{\alpha \in A} X_\alpha}), then we can Skolemise this by introducing a “choice function function” {{\mathcal F}: (X_\alpha)_{\alpha \in A} \mapsto {\mathcal F}( (X_\alpha)_{\alpha \in A} )} that witnessed this axiom to the language. Note that we do not need uniqueness in the existential claim in order to be able to perform Skolemisation.)

After performing Skolemisation and adding all the witnesses to the language, we are reduced to the case in which all the sentences in {\Gamma} are in fact universal statements, i.e. of the form {\forall x_1 \ldots \forall x_k: P(x_1,\ldots,x_k)}, where {P(x_1,\ldots,x_k)} is an quantifier-free formula of {k} free variables. In this case one can repeat the zeroth-order arguments, selecting a structure {{\mathfrak U}} whose domain of discourse is the tautological one, indexed by all the terms with no free variables (in particular, this structure will be countable). One can then replace each first-order statement {\forall x_1 \ldots \forall x_k: P(x_1,\ldots,x_k)} in {\Gamma} by the family of zeroth-order statements {P(t_1,\ldots,t_k)}, where {t_1,\ldots,t_k} ranges over all terms with no free variables, thus creating a zeroth-order theory {\Gamma_0}. As {\Gamma} is consistent, {\Gamma_0} is also, so by the zeroth-order theory, we can find a model {{\mathfrak U}} for {\Gamma_0} with the tautological domain of discourse, and it is clear that this structure will also be a model for the original theory {\Gamma}. The proof of the completeness theorem (and thus the compactness theorem) is now complete.

In summary: to create a countable model from a consistent first-order theory, one first replaces the equality sign {=} (if any) by a binary relation {='}, then uses Skolemisation to make all implied functions and operations explicit elements of the language. Next, one makes the zeroth-order terms of the new language the domain of discourse, applies a greedy algorithm to decide the truth assignment of all zeroth-order sentences, and then finally quotients out by the equivalence relation given by {='} to recover the countable model.

Remark 6 I find the use of Skolemisation to greatly clarify, at a conceptual level, the proof of the completeness theorem. However, at a technical level it does make things more complicated: in particular, showing that the Skolemisation of a consistent theory is still consistent does require some non-trivial effort (one has to take all arguments involving the Skolem function {c()}, and replace every occurence of {c()} by a “virtual” function, defined implicitly using existential quantifiers). On the other hand, this fact is easy to prove once one already has the completeness theorem, though we of course cannot formally take advantage of this while trying to prove that theorem!

The more traditional Henkin approach is based instead on adding a large number of constant witnesses, one for every existential statement: roughly speaking, for each existential sentence {\exists x: P(x)} in the language, one adds a new constant {c} to the language and inserts an axiom {(\exists x: P(x)) \implies P(c)} to the theory; it is easier to show that this preserves consistency than it is with a more general Skolemisation. Unfortunately, every time one adds a constant to the language, one increases the number of existential sentences for which one needs to perform this type of witnessing, but it turns out that after applying this procedure a countable number of times, one can get to the point where every existential sentence is automatically witnessed by some constant. This has the same ultimate effect as Skolemisation, namely one can convert sentences containing existential quantifiers to ones which are purely universal, and so the rest of the proof is much the same as the proof described above. On the other hand, the Henkin approach avoids the axiom of choice (though one still must use the ultrafilter lemma, of course).