Printer-friendly CSS, and nonfirstorderisability

27 August, 2007 in admin, expository, math.LO | Tags: CSS, first-order logic, quantifiers | by Terence Tao

I recently discovered a CSS hack which automatically makes wordpress pages friendlier to print (stripping out the sidebar, header, footer, and comments), and installed it on this blog (in response to an emailed suggestion). There should be no visible changes unless one “print previews” the page.

In order to prevent this post from being totally devoid of mathematical content, I’ll mention that I recently came across the phenomenon of nonfirstorderisability in mathematical logic: there are perfectly meaningful and useful statements in mathematics which cannot be phrased within the confines of first order logic (combined with the language of set theory, or any other standard mathematical theory); one must use a more powerful language such as second order logic instead. This phenomenon is very well known among logicians, but I hadn’t learned about it until very recently, and had naively assumed that first order logic sufficed for “everyday” usage of mathematics.

Let’s begin with some simple examples of statements which can be expressed in first-order logic. If B(x,y) is a binary relation on two objects x, y, then we can express the statement

For every x, there exists a y depending on x such that B(x,y) is true

in first order logic as

$\forall x \exists y: B(x,y)$

whereas the statement

For every x, there exists a y independent of x such that B(x,y) is true

can be expressed as

$\exists y \forall x: B(x,y).$

Moving on to a more complicated example, if Q(x,x’,y,y’) is a quaternary relation on four objects x,x’,y,y’, then we can express the statement

For every x and x’, there exists a y depending only on x and a y’ depending on x and x’ such that Q(x,x’,y,y’) is true

$\forall x \exists y \forall x' \exists y': Q(x,x',y,y')$

(note that this allows y’ to depend on y also, but this turns out to be moot, because y depends only on x), and one can similarly express

For every x and x’, there exists a y depending on x and x’ and a y’ depending only on x’ such that Q(x,x’,y,y’) is true

$\forall x' \exists y' \forall x \exists y: Q(x,x',y,y')$

but it seems that one cannot express

For every x and x’, there exists a y depending only on x and a y’ depending only on x’ such that Q(x,x’,y,y’) is true (*)

in first order logic! For instance, the statement

To every finitely generated real vector space V one can associate a unique non-negative integer $\dim(V)$ such that

V, W are isomorphic if and only if $\dim(V) = \dim(W)$ ;

an injection from V to W exists if and only if $\dim(V) \leq \dim(W)$ ;

a surjection from V to W exists if and only if $\dim(V) \geq \dim(W)$ ;

$\dim({\mathbb R}) = 1$ ; and

$\dim(V \oplus W) = \dim(V) + \dim(W)$ for all V, W,

which is part of the fundamental theorem of linear algebra, does not seem to be expressible as stated in first order set theory (though of course the concept of dimension can be explicitly constructed within this language), even if we drop the uniqueness and restrict ourselves to just the assertion that $\dim()$ obey, say, property 1, so that we get an assertion of the form (*). Note that the category of all finite-dimensional vector spaces is not a set (for reasons relating to Russell’s paradox) and so we cannot view $\dim$ as a function. More generally, many statements in category theory dealing with large categories seem to not be expressible in first order logic.

I can’t quite show that (*) is not expressible in first-order logic, but I can come very close, using non-standard analysis. The statement

For every real numbers x and x’ there exists real numbers st(x) and st(x’) depending only on x and x’ respectively, such that st(x+x’) = st(x)+st(x’), st(xx’) = st(x) st(x’), st(1)=1, and st(x) is non-negative whenever x is non-negative, and also such that st(x) is not always equal to x

is true in the non-standard model of the real numbers, but false in the standard model (this is the classic algebra homework problem that the only order-preserving field homomorphism on the reals is the identity). Since the transfer principle ensures that all first-order statements that are true in the standard reals are also true in the non-standard reals, this means that the above statement cannot be expressed in first-order logic. If it weren’t for the “st(x) is not always equal to x” part, this would basically be of the form (*).

It seems to me that first order logic is limited by the linear (and thus totally ordered) nature of its sentences; every new variable that is introduced must be allowed to depend on all the previous variables introduced to the left of that variable. This does not fully capture all of the dependency trees of variables which one deals with in mathematics. In analysis, we tend to get around this by using English phrasings such as

… assuming N is chosen sufficiently large depending on $\epsilon$ , and $\delta$ chosen sufficiently small depending on N …

and

… where C can depend on k and d, but is uniform with respect to n and f …,

or by using the tremendously convenient O() and o() notation of Landau. One then takes for granted that one can eventually unwind all these phrasings to get back to a sentence in formal, first-order logic. As far as analysis is concerned, this is a fairly safe assumption, since one usually deals with objects in very concrete sets such as the real numbers, and one can easily model all of these dependencies using functions from concrete sets to other concrete sets if necessary. (Also, the hierarchy of magnitudes in analysis does often tend to be rather linearly ordered.) But some subtleties may appear when one deals with large categories, such as the category of sets, groups, or vector spaces (though in most applications, one can cap the cardinality of these objects and then one can represent these categories up to equivalence by an actual set). It may be that a more diagrammatic language (perhaps analogous to the commutative diagrams in category theory, or one based on trees or partially ordered sets rather than linearly ordered ones) may be a closer fit to expressing the way one actually thinks about how variables interact with each other. (Second-order logic is, of course, an obvious candidate for such a language, but it may be overqualified for the task. And, in practice, there’s nothing wrong with just using plain old mathematical English.)

[Update, Aug 27: bad link fixed.]

53 comments

Comments feed for this article

27 August, 2007 at 8:36 am

Suresh Venkat

Hi,
It’s worth mentioning here that first and second order logic are closely related to complexity classes, via an area of complexity theory called descriptive complexity. A famous result proved by Ron Fagin showed that NP is precisely the class of languages expressible by predicates in existential second order logic, a result that is very intriguing in how it connects “syntax” and “algorithmic complexity”. Other results include the characterization of P by first order logic + a fixed point operator + an ordering operator, and results for lower complexity classes, many of these results shown by Neil Immerman.

http://en.wikipedia.org/wiki/Descriptive_complexity

27 August, 2007 at 10:08 am

Oleg Izhvanov

Hi,
A technical note: the last link in the post is broken.

“And, in practice, there’s nothing wrong with
just using plain old mathematical English”.

27 August, 2007 at 1:14 pm

Terence Tao

Dear Suresh and Oleg,

Thanks for the references and corrections!

28 August, 2007 at 12:26 am

Ori Gurel-Gurevich

“Note that the category of all finite-dimensional vector spaces is not a set (for reasons relating to Russell’s paradox) and so we cannot view \dim as a function. More generally, many statements in category theory dealing with large categories seem to not be expressible in first order logic.”

That is true if we restrict ourselves to ZFC. However, an extended set theory like NBG is more suitable for such things. This would allow one to talk about such class functions as $\dim$ without resorting to second order theories. There’s also the additional benefit of NBG being finitely axiomatizable.

28 August, 2007 at 9:26 am

Andy D

I believe I can show that the statement (*) Terry was discussing is not FO-definable for a particular FO-definable quaternary relation Q(x, x’, y, y’).

Namely, let Q(x, x’, y, y’) be

(y != x) and [(x = x’) implies (y = y’)] and [ (x != x’) implies (y != y’)].

Then (somebody please check this) statement (*) is true iff there exists a perfect matching over the universe’s elements.

Any infinite universe has such a matching; so the negation of (*) is true iff the universe is finite and has an odd number of elements (and the class of FO-definable properties is closed under negation). But universe parity is known not to be FO-definable.

FO-inexpressibility results are to my knowledge generally shown by one of two approaches:

i) analysis of the associated ‘Ehrenfeucht-Fraisse-Game’, an approach which can show inexpressibility of parity;

ii) reduction to a known FO-inexpressible property, as above. The most common reduction targets seem to be parity/counting properties, and graph connectivity.

28 August, 2007 at 10:34 am

Kenny

I’d heard of branching quantifiers before, but didn’t know what they were good for (except that some linguists, like Hintikka, have suggested that they might be good for analyzing natural language). It’s good to have some examples now!

28 August, 2007 at 12:47 pm

David Corfield

As Kenny’s link mentions, branching quantifiers have been treated by Hintikka in his ‘independence-friendly’ logic. The formula with four variable mentioned corresponds to a 2 vs 2 player game with incomplete information. Samson Abramsky, the theoretical computer scientist, has done some interesting work looking into the syntax and semantics of logics corresponding to multi-player games, with one ulterior aim being the modelling of concurrency.

28 August, 2007 at 1:52 pm

Andy D

OK, looks like what I wrote is subsumed by the results mentioned on Q_H in Kenny’s reference. (*) seems like a surprisingly powerful schema.

28 August, 2007 at 10:43 pm

Kevin O'Bryant

Here’s another example dear to your heart. For each positive integer k, the statement, “the set A contains an arithmetic progression of length k” is transparently FO, while the statement “for each positive integer k, the set A contains an arithmetic progression of length k” is not (at least not in any straightforward manner).

A famously non-FO statement is “Every set with an upper bound has a least upper bound”, which is true for the reals, but not true for the non-standard reals, and so cannot be FO. This is, it seems to me, part of the undertapped mojo of nonstandard analysis (beyond epsilon management). You get access to different second order theorems (you lose a few but also gain a few) to help you prove first order theorems.

29 August, 2007 at 6:03 am

Terence Tao

Dear Kevin,

I’m not sure I understand your examples, at least if I am interpreting “set” in an internal sense. Wouldn’t

$\forall k > 0: \exists a,r: ((r \neq 0) \wedge (\forall 0 \leq j \leq r-1: a+jr \in A))$

be a first-order sentence that asserts that A contains arbitrarily long progressions? Similarly, the statement that every set with an upper bound has a least upper bound is still true for the non-standard reals if one restricts attention to non-standard sets of reals (as opposed to sets of non-standard reals). If one interprets the concept of “set” internally then

$\forall A \subset {\Bbb R}: ((A \neq \emptyset) \wedge \exists x: \forall y \in A: y \leq x) \implies (\exists z: (\forall y \in A: y \leq z) \wedge (\forall x: (\forall y \in A: y \leq x) \implies (x \geq z)))$

seems to be a first-order sentence (albeit a rather ugly one) that asserts that every (non-empty) set with an upper bound has a least upper bound.

29 August, 2007 at 11:32 am

Kenny

I think in your first example you mean $\forall 0\leq j\leq k-1$ rather than $\forall 0\leq j\leq r-1$ .

As for the second example, it’s always sort of sneaky to quantify over sets in a first-order way – when you do that, you get non-standard models where the objects are just fine, but you’re missing some of the sets or something. I know people working on reverse mathematics (like Harvey Friedman and Steven Simpson at various points) do that with models of “second-order” arithmetic (really just first-order arithmetic where they use first-order quantification over sets as well) to prove that certain standard theorems are independent of weakened axiom systems.

29 August, 2007 at 2:30 pm

Scott McKuen

Terry’s first example (mod Kenny) looks good if we’re willing to accept infinitely-long arithmetric progressions in the nonstandard models (since the variable k then runs over nonstandard positive integers as well). If we just want arbitrarily-long-but-finite progressions, which I think is what Kevin intended, then it fails.

If we’re working outside the theory of arithmetic, say in fields, and we don’t have a predicate to designate the integers as a definable set, then it doesn’t work.

29 August, 2007 at 11:12 pm

Emmanuel Kowalski

Restricting the “domain” of quantification (I’m not sure which is the
technically correct word in logic/model theory) may make a huge
difference in what sets first order theory can define. One very cute
example, which does not seem to be so well-known, considers definable
sets in the first order theory of rings (+, -, *, 0, 1, …) applied
to finite fields: if we have a formula $\varphi(x)$ with
$x=(x_1,\ldots, x_n)$ , it has been shown by Chatzidakis, van den Dries
and Macintyre (J. Reine angew. Math. 427 (1992), 107–135) that the
number of “solutions” in a finite field $\mathbf{F}_q$ (i.e., the
number of $x$ in $\mathbf{F}_q^n$ for which the formula is “true”,
with the standard interpretation of the language, and all quantifiers
implicitly running over $\mathbf{F}_q$ only) is always (either 0 or)
of the type

$\delta q^{d}+O(q^{d-1/2})$

where $\delta$ is a rational number (which may depend on $q$ , but can only
take finitely many values for a given $\varphi$ ), and $d\geq 0$ is an
integer with the same property. This is of course reminiscent (and
ultimately derived from) the Lang-Weil estimate for the number of points
on an algebraic variety over a finite field.

So in particular the subfield $\mathbf{F}_p$ of $\mathbf{F}_{p^2}$ is
not definable in the first order theory of rings, i.e., there is no
formula $\varphi(x)$ , for which the set of “solutions” in
$\mathbf{F}_{p^2}$ is $\mathbf{F}_p$ for infinitely many $p$ . (Come to
think of it, I’m not sure how using second order theory of rings could
help here; of course, adding something like the Frobenius automorphism
in the language would solve the problem).

(Note: one can even add
exponential sums into the mixture and show cancellation statements
in exponential sums over such definable sets, similar to the results
of Weil and Deligne on exponential sums over finite fields — at least
the simplest statements extend –; in particular, it follows that an
“interval” (or an arithmetic progression) in $\mathbf{Z}/p\mathbf{Z}$
is also not definable in this language (unless it’s a fixed number of
elements, or the complement of such a thing). This is not surprising,
but for intervals with positive density one needs more than the result
of C-vdD-M — namely some additive characters do not cancel
appreciably over a long interval/arithmetic progression).

30 August, 2007 at 7:29 pm

Anonymous

I’m confused about how the 2×2 branching quantifier

$(Q_Hx_1,x_2,y_1,y_2) \equiv\begin{pmatrix}\forall x_1 \exists y_1\\ \forall x_2 \exists y_2\end{pmatrix}$

is supposed to support the “there are infinitely many” quantifier. Here’ s the quantifier-free clause in question for a generic formula (phi):

$\psi(a,x_1,x_2,y_1,y_2)\equiv[\phi a\land (x_1=x_2 \leftrightarrow y_1=y_2) \land (\phi (x_1)\rightarrow (\phi (y_1)\land y_1\neq a))]$

The three parts appear to say

1) something satisfies the formula;

2) if you have one satisfier, you can get another that isn’t just the starting point (no loops that start at “a”);

3) the map from one satisfier to the next is a bijection (no loops can start later on, either).

Why does quantifying this way:

$(Q_{\geq\mathbb{N}}x)\phi (x)\equiv\exists a\begin{pmatrix}\forall x_1 \exists y_1\\ \forall x_2 \exists y_2\end{pmatrix} \psi(a,x_1,x_2,y_1,y_2)$

force “there are infinitely many” but quantifying, say, like this:

$\exists a \forall x_1 \exists y_1\\ \forall x_2 \exists y_2 \psi (a,x_1,x_2,y_1,y_2)$

does not? Can we exhibit a finite model of the first-order version?

30 August, 2007 at 10:05 pm

Terence Tao

Dear anonymous,

I’m assuming that your final display was of a standard first-order sentence rather than a branching quantifier one, despite the line break. In that case, the problem with that sentence is that $y_2$ is allowed to depend on $x_1, y_1$ as well as $a, x_2$ ; once you allow that, the clause $\psi(a,x_1,x_2,y_1,y_2)$ is very easy to satisfy even for a model in which $\phi$ is only satisfied at two values, a and b; one just sets $y_1$ to always equal b, and $y_2$ to equal $b$ when $x_1=x_2$ , and a otherwise.

It seems that the branching quantifier offers a way to define the concept of a class function $x \mapsto y$ in a manner which can’t be replicated in first order logic, because one needs to assert that equal inputs force equal outputs, and you can’t do this if one of the outputs is allowed to depend on the other input.

30 August, 2007 at 11:17 pm

Scott McKuen

Hi, Terry,

That was my anonymous bad LaTeX above – yes, no line break intended. Thanks for the example – now it’s clear why the bijection isn’t enforced. Rewriting the thing with Skolem functions helped, too.

31 August, 2007 at 6:53 am

Walt

While I find non-firstorderability very interesting, I think it’s the norm rather than the exception, which is why we use set theory rather than first-order models for everything. For example, first-order logic can’t express the Archimedean axiom, or the Riemann hypothesis. Or am I misunderstanding?

31 August, 2007 at 8:05 am

Terence Tao

Dear Walt,

If you combine first-order logic with a standard set theory such as ZFC, then I agree that you can describe, say, Riemann hypothesis (though I won’t try to write down the actual sentences here, they would be atrociously long), because one can use set theory to model objects such as the zeta function. (The Archimedean axiom seems to me to be expressible in the first-order language of the integers and reals without any need for set theory.) What was surprising, to me at least, was that even with ZFC, statements such as the existence of an integer dimension for each finitely generated vector space do not seem to be easily firstorderisable; set theory doesn’t directly help here because it cannot describe class functions such as $\hbox{dim}$ unless one performs some trickery to reduce the size of the category of vector spaces to the point where it can be modeled by a set rather than a class.

As Ori pointed out though, if one uses von Neumann-Bernays-Godel theory instead of ZFC, one can deal with classes and class functions more easily and many of these issues seem to disappear. Since NBG and ZFC are equivalent when restricted to sentences that refer only to sets, the nonfirstorderisability issue is mostly moot, but it still is more subtle than one might naively first imagine.

31 August, 2007 at 10:34 am

Walt

I see what you’re getting at. I think the ZFC-ists answer would be to translate it into a collection of statements, parametrized by ordinals, so that for each ordinal a you can state and prove that dim exists for the set V_a in the von Neumann heirarchy. (That’s probably what you mean by “some trickery”.)

31 August, 2007 at 1:48 pm

Walt

Actually, I think you can basically formalize this in ZFC. You can’t formalize your statement exactly, of course, since it involves branching quantifiers, but you can formalize a statement in ZFC that implies yours. I don’t know much about branching quantifiers, though, so I could be getting this totally wrong.

Let P(V,n) be the propositional formula that specifies that V is a finite-dimensional vector space, and n is its unique dimension. Then you should be able to prove a statement like: (all V)(exists n)P(V,n) and ( (all V)(exists n)P(V,n) implies Q(V,V’,n,n’) ), where Q is one of your theorems about vector spaces.

I think you can show this will imply your branching quantifier statement by Skolemization. P(V,n) turns into P(V, n(V)), Q(V,V’,n,n’) would turn into Q(V, V’, n(V, V’), n'(V, V’)), but you can prove by the uniqueness of dimension that n(V,V’) = n(V) and n'(V, V’) = n'(V’). (Though as I said, I don’t know much about branching quantifiers, so this may be more of a parade of my ignorance than anything else.)

1 September, 2007 at 2:10 pm

Kevin O'Bryant

Regarding the first-orderizability of containing arbitrarily long arithmetic progressions…

Terry’s sentence
$\forall k>0 : \exists a,r : ((r \not=0) \wedge (\forall 0 \leq j \leq k-1 : a+jr \in A))$
only captures what we mean by arithmetic progressions in some settings. The first quantifier $\forall k>0$ is inherently referring to the natural numbers, while the other quantifiers $\exists a,r$ are referring to elements of the model. If the only models we’re interested in contain the ring of integers (so that $a+jr$ makes the usual sense), then it’s fine.

The problem with this arises because we talk about arithmetic progressions in general abelian groups. In a group the statement $a+jr$ does not make sense. For $j=4$ , for example, we can replace $a+jr$ with $a+r+r+r+r$ , and so for each particular $j$ we can make it make sense. But to make it make sense for all $j$ , we need to quantify over the natural numbers. By definition “first order” means that we quantify only over elements of the model.

For example, we all concur precisely on the meaning of “arithmetic progression” in the group of rational points on an elliptic curve. The set $A$ contains $k$ terms in arithmetic progression (for a natural number k) if $\exists a,r : a\in A \wedge a+r \in A \wedge \dots \wedge a+r+r+\dots +r \in A$ . For any particular natural number $k$ , those dots represent finite strings. But without the natural numbers intrinisically available to us, we cannot quantify over $k$ .

1 September, 2007 at 7:58 pm

Anonymous

Do we ever talk about axiom systems that are modelled by elliptic curves?! I thought an elliptic curve was just another set sitting in ZFC or NBG or NF or whatever. Therefore, normally when we speak of elliptic curves, they are inside our set theory, and the integers are available to us.

2 September, 2007 at 1:45 am

Emmanuel Kowalski

Working with axioms systems like “first order theory of [something]s”,
where [something] could be groups, rings, elliptic curves, etc, is
definitely something important. I’m not sure about elliptic curves,
but the first-order theory of rings (and fields, which can be defined
in this language) has many applications in algebra and number theory.
The fact that some sets can be defined in such a way can have
important consequences for applications. This can be thought-of (maybe) as an
analogue of “smoothness” for functions: not all functions are
smooth, but if you know that one is, it can be very useful! E.g., the
unit group of a ring can be defined by $\exists y, xy=1$ , but not the
torsion subgroup of the unit group, because there is no a priori upper
bound on the order of a torsion unit — note of course if we can
invoke integers, the torsion subgroup is easy to define…

As an example of application, using the results of Chatzidakis, van
den Dries and Macintyre I mentioned in my previous comment on
definable sets in finite fields (and other logical things I don’t
understand, including ultrafilters), Hrushovsky and Pillay managed to
prove the following theorem (Crelle 462, 1995, p. 69–91, Prop. 7.3):

“Let $\Gamma$ be any Zariski-dense finitely generated subgroup of a
(almost simple, simply-connected) algebraic group $G$ defined over
$\mathbf{Z}$ , for instance $G=GL(n)$ , or $G=SL(n)$ or $G=Sp(2g)$ ; then
for almost all primes $p$ , the reduction map $\Gamma\rightarrow G(\mathbf{Z}/p\mathbf{Z})$ is surjective.”

Notice the statement is purely algebraic and very natural (and
beautiful). In fact it was proved earlier by Matthews, Vaserstein and
Weisfeiler, who used the classification of finite simple groups! Even
for $SL(n)$ , this is not so obvious to prove by hand, and for a
Zariski-dense subgroup, which can be a very thin subset, this is
really deep.

4 September, 2007 at 5:56 am

Logic games « Forking, Forcing and Back-and-Forthing

[…] a recent post, Terence Tao evokes statements that cannot be formalized in first-order logic. David Corfield has […]

4 September, 2007 at 12:33 pm

Terence Tao

Dear Walt,

Your statement concerning dimension is a first order one as long as you specify what your predicate P(V,n) is – i.e. you give an explicit definition of the concept of dimension. If you don’t actually want to write down this definition, and instead want to assert the abstract existence of a concept of dimension which satisfies various properties, then you need to pass to second-order logic (as you will need a second-order quantifier such as $\exists P$ ).

Dear Kevin,

Thanks for the clarification. I guess the moral is that statements involving integers can require the integers to be part of the language in order to express them in first-order logic; similarly statements involving sets or functions may require set theory to be firstorderisable; and statements involving class functions may require a language that can express the concept of a class, such as NBG.

6 September, 2007 at 7:47 am

Walt

The definition of firstorderizability, as far as I understand it, is that a sentence in a theory is firstorderizable if its truth value (as you range over models of the theory) is the same as a first-order sentence. While your sentence requires second-order logic to express, its truth value over models of ZFC is the same as the sentence using the explicit definition. In that sense, your sentence is firstorderizable.

6 September, 2007 at 8:30 am

Terence Tao

Dear Walt,

Technically you are correct, but by this definition any statement that one can actually prove in ZFC becomes firstorderizable for trivial reasons, because it has the same truth value as a tautology such as 0=0. But one can easily tweak the sentence to avoid this, by replacing the question “do vector spaces have a unique notion of dimension?” to “does a [insert vector-space like object here] have a unique notion of dimension?”, where we view the concept of a vector-space like object as a primitive. For instance, one could imagine that one has a class of vector-space-like objects which form an abelian category with a distinguished “one-dimensional” object (the analogue of ${\Bbb R}$ in the actual vector space example), and one can ask whether this class has a notion of dimension which obeys all the properties listed above (replacing surjections by epimorphisms, replacing ${\Bbb R}$ by the distinguished one-dimensional object, and so forth). If we model this class of objects by a unary predicate (e.g. Vec(V) is the statement that V is a vector-space-like object), then the question “Does the class Vec have a unique notion of dimension?” becomes a statement which I believe is not firstorderisable in ZFC, though I might be wrong on this if there is a sufficiently canonical way to define dimension which works for all abelian categories that admit such a notion.

6 September, 2007 at 11:59 am

Walt

I’m only trying to be technically correct. :-)

1 June, 2010 at 1:08 pm

Timothy Chow

It may be worth mentioning that the inability of ZFC to talk about proper classes directly is not as serious a handicap as it might seem at first glance. The standard dodge, as explained for example in Kunen’s book on set theory, is to use formulas to represent (definable) classes. Thus we can’t talk about the class of all vector spaces directly, but we can write down a formula that expresses “x is a vector space” and for almost all purposes this formula serves as a perfectly fine surrogate for the class of all vector spaces. For example, the dimension “function” is most intuitively thought of as a map from the class of all vector spaces to the natural numbers, or in other words as a class of ordered pairs (V,d) where V is a vector space and d is a natural numbers. Though this is a proper class and can’t be “talked about directly” in the first-order language of set theory, we can write down a formula that expresses “d is the cardinality of a basis for V” and use this formula whenever we want to say something about the dimension “function.”

If using formulas as surrogates for classes gets too cumbersome, then another trick is to use a set $V_\kappa$ , where $\kappa$ is a strongly inaccessible cardinal, as a surrogate for the class of all sets. $V_\kappa$ is a model for ZFC, so for any fact about the set-theoretic universe that you might want to prove using the ZFC axioms, there is a corresponding fact about $V_\kappa$ . The point is that $V_\kappa$ is something you can talk about directly since it’s a set. So by pretending that $V_\kappa$ is the class of all sets, you can take all the things you want to say about classes and formalize them as statements about sets in $V_\kappa$ . This can all be done in ZFC + “there exists a strongly inaccessible cardinal” (or maybe a slightly stronger large cardinal axiom if you are doing some really fancy reasoning with classes). Strictly speaking you’re not really talking about proper classes but about all sets in $V_\kappa$ , but since $V_\kappa$ is a model of ZFC, this is typically good enough.

These kinds of tricks demonstrate why a system like Mizar, which is based on ZFC + a large cardinal axiom, is expected to be powerful enough to formalize any mathematical proof we might care to encode in it. If Mizar couldn’t even formalize the dimension of a vector space then clearly it wouldn’t be sufficient to formalize all of mathematics!

4 February, 2011 at 2:15 pm

Vadim Tropashko

Why [formal] conjunction of

“For every x and x’, there exists a y depending only on x and a y’ depending on x and x’ such that Q(x,x’,y,y’) is true”

with

“For every x and x’, there exists a y depending on x and x’ and a y’ depending only on x’ such that Q(x,x’,y,y’) is true”

isn’t equivalent to

“For every x and x’, there exists a y depending only on x and a y’ depending only on x’ such that Q(x,x’,y,y’) is true (*)”

4 February, 2011 at 2:44 pm

Terence Tao

Because the y and y’ in the first sentence need not match up with the y and y’ in the second sentence.

An explicit counterexample is given by the predicate

Q(x,x’,y,y’) = ” $y = y' \in \{x,x'\}$ “.

For this choice of predicate, the first and second of your sentences are true, but the third sentence is false.

4 February, 2011 at 3:50 pm

Vadim Tropashko

All three statements are propositions which are closed formulas involving quaternary predicate Q(_,_,_,_), so my intuition fail to see why variables should match.

In your counter example can you please add parenthesis: is it set membership relation equal to y, or binary identity relation y=y’ being member of a set with two elements? (In second interpretation it is symmetric with respect to y and y’)

Perhaps you have meant ” y = y’ \wedge y’ \in \{x,x’\} “? Still this predicate is symmetric with regards to y and y’.

5 February, 2011 at 12:30 am

Terence Tao

Yes, in my counterexample, Q(x,x’,y,y’) is the assertion that $y=y'$ and that $y' \in \{x,x'\}$ , and is thus symmetric in y and y’. Equivalently, Q(x,x’,y,y’) is the assertion that either $y=y'=x$ or $y=y'=x'$ .

Perhaps it will be easier to understand the distinction between the three sentences by adopting a game-theoretic approach. Consider the following (cooperative) game involving two players. The first player is of age x, and the second player is of age x’. In the game, the first player writes down a number y, and the second player writes down a number y’, and then both numbers are revealed. The players win the game if

(a) the two numbers are the same; AND
(b) this number is either equal to the age x of the first player, or the age x’ of the second player.

Initially, both players know their own age, but not the age of the other player.
The players are allowed to discuss strategy beforehand, but there are three variants of the game, depending on what information can be shared:

Variant 1: the second player may learn the age x of the first player, but the first player may not learn the age x’ of the second player (i.e. the first player can only use x).
Variant 2: the first player may learn the age x’ of the second player, but the second player may not learn the age x of the first player (i.e. the second player can only use x’).
Variant 3: neither player may learn the age of the other (i.e. the first player can only use x, and the second player can only use x’).

It is easy to always win Variant 1: both players write down the number x (which they both know).

It is easy to always win Variant 2: both players write down the number x’ (which they both know).

However, it is not possible in general to win Variant 3, because they do not both know x and they do not both know x’, and so cannot find a strategy that obeys both (a) and (b).

7 February, 2011 at 10:17 am

Vadim Tropashko

OK, let [grudgingly] admit branching quantifier exists:-) Is the expression

all x exists y all xx exists yy R(x,xx,y,yy)
&
all xx exists yy all x exists y R(x,xx,y,yy)

stronger or weaker than

Q_H(x,xx,y,yy) R(x,xx,y,yy)

? Next, your counter example actually satisfies stronger assertions

all x exists y exists yy all xx Q(x,xx,y,yy)

and

all xx exists y exists yy all x Q(x,xx,y,yy)

Is this expressible via FOL + Henkin’s quantifier?

20 December, 2011 at 10:48 am

Allen Mann

Dear Professor Tao,

In your second reply to Vadim (on 5 February 2011) you explain the semantics of a branching quantifier sentence by interpreting the individual quantifiers ( $\forall x$ , $\exists y$ ) as moves in a two-player game. This is the essential insight that led Hintikka and Sandu to develop game-theoretic semantics.

The semantic game for a first-order sentence is a contest between two players. The existential player (Eloise) tries to verify the sentence by choosing the values of existentially quantified variables, while the universal player (Abelard) tries to falsify it by picking the values of universally quantified variables. Disjunctions prompt the verifier to choose which disjunct she wishes to verify; conjunctions prompt the falsifier to pick which conjunct he wishes to falsify. Negation tells the players to switch roles. (To verify $\neg\varphi$ , one must falsify $\varphi$ .)

Notice that, in your example, first two variants of the game are essentially games with perfect information, while the third variant is a game with imperfect information.

First-order logic with imperfect information is an extension of first-order logic that explicitly allows semantic games with imperfect information. There many syntactic devices one can use to indicate what information is available to the players. In independence-friendly (IF) logic, one adds a “slash set” to each quantifier that contains all of the previously quantified variables from which the quantifier is independent. For example, the fact that the universe has infinitely many elements can be expressed by the IF sentence

$\exists w \forall x \bigl( y / \{w\} \bigr) \bigl( \exists z/ \{w,x\} \bigr) [ z = x \land y \not= w ] \qquad (*)$

the Skolemization of which is

$\forall x \Bigl[ g \bigl( f(x) \bigr) = x \land f(x) \not= c \Bigr] where$ latex c$ is a constant. The Skolemization is satisfiable if and only if the universe admits a non-surjective injection (i.e., the universe is Dedekind infinite).

Gabriel Sandu, Merlijn Sevenster, and I have recently published a book, Independence-Friendly Logic, that emphasizes the game-theoretic approach to logic, including first-order logic with imperfect information.

Dependence logic is another approach to logic with imperfect information. Instead of “slashing” each quantifier, dependencies between variables are indicated using new atomic formulas called dependence atoms. For example, the dependence atom $=(x_{1}, \dots, x_{n}, y)$ asserts that the value of the variable $y$ depends on (only) the values of $\latex x_{1}, \ldots, x_{n}$. The sentence $(*)$ can be expressed in dependence logic as

$\exists w \forall x \exists y \exists z \bigl[ z = x \land y \not= w \land =(x,y) \land =(y,z) \bigr]$ .

20 December, 2011 at 11:04 am

Allen Mann

CORRECTION. The variable $y$ in the sentence $(*)$ above should be existentially quantified:

$\exists w \forall x \bigl(\exists y / \{w\} \bigr) \bigl( \exists z/ \{w,x\} \bigr) [ z = x \land y \not= w ] \qquad (*)$

11 January, 2012 at 9:59 am

ivvrrzc

I guess, the original (*) sentence can be 1st order deciphered as follows: \forall x \exists y \forall x_0 \exists y_0 \forall x’ \exists y’ [O(x,x’,y,y’) \and Q(x_0,x’,y_0,y’)]. Of course, I agree with the aouthor that “plain old mathematical English,” i.e. expanding the language by the exactly intended two functions, is the best.

11 January, 2012 at 10:25 am

Terence Tao

Unfortunately, this sentence is not equivalent to (*). If, for instance, all variables are quantified over a set with at least three elements, and $Q(x,x',y,y')$ is the assertion that $y' \neq x$ , then (*) is false, but the statement you write is true.

11 January, 2012 at 11:03 am

Anonymous

I think, “\forall x \exists y \forall x_0 \exists y_0 \forall x’ \exists y’ [O(x,x’,y,y’) is true and is equivalent to Q(x_0,x’,y_0,y’)]” gives the correct semantics of (*).

11 January, 2012 at 11:32 am

Terence Tao

Again, the counterexample I mentioned previously (in which $Q(x,x',y,y')$ is the assertion that $y' \neq x$ ) demonstrates that this is not the case. It is clear that (*) is not true for this choice of Q (since y’, which depends only on x’, cannot avoid every value of x), but your statement is true. (For each x, choose y=x; for each x_0, choose y_0=x_0; and for each x’, choose y’ to be an element that is not equal to either x or x_0. Then Q(x,x’,y,y’) and Q(x_0,x’,y_0,y’) are both true and thus also equivalent to each other.)

12 January, 2012 at 12:07 am

Anonymous

Dear Professor Tao, first of all, thank you for your
patience and teaching me more sophistication.
Now, is writing “\forall x
\exists y \forall x’ \exists y’ \forall x_0 [O(x,x’,y,y’)
\and (O(x,x’,y,y’) \implies Q(x_0,x’,y,y’))] ”
satisfactory?

12 January, 2012 at 1:21 am

Anonymous

Sorry, or rather: “\forall x \exists y \forall x’ \exists y’ \forall x_0 \exists y_0 [O(x,x’,y,y’) \and (O(x,x’,y,y’) \implies Q(x_0,x’,y_0,y’))]. “

11 January, 2012 at 12:56 pm

Sean Eberhard

Just a little note: you parenthetically mentioned that “the only order-preserving field automorphism of the reals is the identity”. In fact you don’t need to assume order-preservation: it’s a consequence!

3 August, 2012 at 11:42 am

Nieludzka logika… « FIKSACJE

[…] Nonfirstorderizability – zjawisko polegające na tym, że pewne wyrażenia logiczne nie dają się zapisać jako zdania w logice pierwszego rzędu. Nic w tym dziwnego, takich stwierdzeń jest bardzo dużo, niektóre całkiem “proste” jak np. “przestrzeń wektorowa skończonego wymiaru”, lub “zbiór o skończone liczbie elementów”. Okazuje się, że istnieje jednak podejście nazywane Branching które ogólnie polega na porzuceniu liniowego następstwa kwantyfikatorów ogólnych ( “dla każdego” …) i wprowadzeniu dodatkowej konwencji że mogą być one używane “jednocześnie”, lub “atomowo w grupie”. Po wprowadzeniu takiej modyfikacji uzyskujemy logikę która nadal jest słabsza niz logika drugiego rzędu ( a więc nie ma kwantyfikacji po zdaniach) jednak jest mocniejsza niż logika pierwszego rzędu. W szczególności pozwala na wyrażanie stwierdzeń o prawdziwej niezależności dwu i więcej zmiennych ( proszę porównać: https://terrytao.wordpress.com/2007/08/27/printer-friendly-css-and-nonfirstorderizability/) […]

3 October, 2016 at 6:51 am

Anonymous

Dear Terry! I am certainly a latecomer to this discussion, but thank you for sharing the idea of nonfirstorderisability, it’s very interesting! :-)

Just a little question: You write:

“Moving on to a more complicated example, if Q(x,x’,y,y’) is a quaternary relation on four objects x,x’,y,y’, then we can express the statement

For every x and x’, there exists a y depending only on x and a y’ depending on x and x’ such that Q(x,x’,y,y’) is true

$\forall x \exists y \forall x' \exists y': Q(x,x',y,y')$

(note that this allows y’ to depend on y also, but this turns out to be moot, because y depends only on x)”

I don’t understand your last comment within the parantheses. Is it even possible for a variable to depend on a variable that is bound by an *existential* quantifier? I always thought that existential variables are the only type of variable that can depend on another variable and that these existential variables can only depend on universal variables. (“Existential variable” means “variable bound by an existential quantfier”, similarly “universal variable”)

3 October, 2016 at 8:14 am

Terence Tao

One can Skolemize a first-order logic sentence either by replacing the existential variables with functions of all preceding universal variables, or by replacing them with all preceding variables including the existential ones, but the latter replacement adds no additional generality since, as noted in the text, the existential variables included would in turn be dependent on prior universal variables.

12 October, 2016 at 11:56 am

Anonymous

Dear Prof. Tao,
1. I’m not familiar with the terminology “a variable depends on another variable”. Where can one read the definition of this concept? (Since you said that one uses them in Analysis: can one find them in an introduction to analysis?)

2. Also, I’m a little confused reading the thread
http://mathoverflow.net/questions/118254/usage-of-set-theory-in-undergraduate-studies/118261
where Andrej Bauer points out:
“We should never say that one variable depends on another.”
Whom should I believe?

3. When you for example say
“For every x and x’, there exists a y depending only on x and a y’ depending only on x’ such that Q(x,x’,y,y’) is true”,
do you mean the second-order sentence that says
“There is are functions f and f’ such that for any x and x’, Q(x, x’, f(x), f(x’))”
?

4. How can one prove that if a variable y depends only on a variable x, and the variable z only depends on x, that then z depends only on x?

12 October, 2016 at 12:45 pm

Terence Tao

In the strict sense, Andrej is correct: variables in a first-order sentence are not functions of other variables. However, when one converts such a first-order sentence to an equivalent (or more precisely, equisatisfiable) Skolem normal form, the existential variables become functions of preceding variables (and depending on how one carries out this form, the function either only involves preceding universal variables, or can be a function of both the preceding universal and existential variables, with the latter in turn being functions of further variables).

12 October, 2016 at 1:18 pm

Anonymous

Thanks!

What about questions 3 and 4?

28 October, 2016 at 1:54 pm

Anonymous

Dear Terry,

I think that the problem with first-order logic you describe is really Russell’s paradox in disguise. That the statement about dimensions and vector spaces is not formalizable in FOL + ZFC is due to the fact that in ZFC, one cannot speak about proper classes (and thus also not about proper class functions). But also a class theory like Morse-Kelley can’t express every mathematically meaningful statement, like statements about metaclasses (that is, arbitrary collections of classes). As Gödel pointed out in his famous paper “Über formal unentscheidbare Sätze der Principia Mathematica und verwandter Systeme 1”:

“The true source of the incompleteness attaching to all formal systems of mathematics, is to be found — as will be shown in Part II of this essay — in the fact that the formation of ever higher types can be continued into the transfinite (cf. D. Hilbert ‘Über das Unendliche’, Math. Ann. 95, p. 184), whereas in every formal system at most denumerably many types occur. It can be shown, that is, that the undecidable propositions here presented always become decidable by the adjunction of suitable higher types (e.g. of type ω for the system P). A similar result also holds for the axiom system of set theory.”

So one can always add another stage to the ontology of a formal system.

Also, a logic that can express dependence relations between variables has been developed by Hintikka. See https://en.wikipedia.org/wiki/Independence-friendly_logic

“Hintikka’s proposal that IF logic and its extended version be used as foundations of mathematics has been met with skepticism by other mathematicians, including Väänänen and Solomon Feferman.”

26 November, 2016 at 3:58 am

Anonymous

You say “I can’t quite show that (*) is not expressible in first-order logic”. And (*) is this statement:

For every x and x’, there exists a y depending only on x and a y’ depending only on x’ such that Q(x,x’,y,y’) is true (*)

How can one formally define/express things like “(*) is not expressible in first-order logic”?

2 January, 2021 at 9:48 am

Anonymous

This phenomenon is very well known among logicians, but I hadn’t learned about it until very recently, and had naively assumed that first order logic sufficed for “everyday” usage of mathematics.

Can one now say that second-order logic is sufficient for “everyday” usage of mathematics, particularly analysis?

Are there examples in analysis that one actually needs the power of a higher order logic (than second order)?

2 January, 2021 at 11:47 am

Terence Tao

The order of logic used in a formal argument is not actually all that good of a proxy for the “power” of that argument. Most instances of second-order (or higher-order) logic involving basic mathematical objects such as numbers, for instance, can be replicated quite faithfully as first-order logic in a set theory such as ZFC, which can encode not only numbers but sets of numbers, functions from numbers to other numbers, and so forth. And then there are arguments that go outside of a single formal logical system (first-order, second-order, or whatever) and invoke metamathematical arguments; for instance in nonstandard analysis one often appeals to the metamathematical transfer principle (Los’s theorem) to convert a first-order theorem in standard analysis to its nonstandard counterpart. Again, one can usually formalise this sort of maneuver within first-order ZFC if desired. So I would say that for analysis first-order ZFC is usually more than sufficient as a formal structure to model one’s arguments, although one is not necessarily confined to this structure if one prefers to work in some other formal structure (or to reason less formally).

Outside of analysis, one sometimes has to manipulate very large categories, in which case it is sometimes convenient to work with Grothendieck universes or similar objects which can’t quite be constructed purely within first-order ZFC. My understanding though is that in practice these universes are more convenient technicalities than essential components to the arguments.

	AI颠覆数学研究！陶哲轩借AI破解数学猜… on Formalizing the proof of PFR i…
	Anonymous on Course announcement: Math 246A…
	Anonymous on Erratum for “An inverse…
	Anonymous on 245C, Notes 3: Distributi…
	Anonymous on Analysis I
	Anonymous on Erratum for “An inverse…
	Anonymous on Erratum for “An inverse…
	Resumen de lecturas… on AI Mathematical Olympiad…
	Resumen de lecturas… on Two announcements: AI for Math…
	Anonymous on Erratum for “An inverse…
	Anonymous on Erratum for “An inverse…
	Anonymous on Erratum for “An inverse…
	Anonymous on Erratum for “An inverse…
	Anonymous on The uncertainty principle
	Anonymous on The uncertainty principle

Printer-friendly CSS, and nonfirstorderisability

Recent Comments

Articles by others

Diversions

Mathematics

Selected articles

Software

The sciences

Top Posts

Archives

Categories

The Polymath Blog

53 comments

Leave a comment Cancel reply

For commenters

Printer-friendly CSS, and nonfirstorderisability

Share this:

Recent Comments

Articles by others

Diversions

Mathematics

Selected articles

Software

The sciences

Top Posts

Archives

Categories

The Polymath Blog

53 comments

Leave a comment Cancel reply

For commenters