I recently discovered a CSS hack which automatically makes wordpress pages friendlier to print (stripping out the sidebar, header, footer, and comments), and installed it on this blog (in response to an emailed suggestion). There should be no visible changes unless one “print previews” the page.
In order to prevent this post from being totally devoid of mathematical content, I’ll mention that I recently came across the phenomenon of nonfirstorderisability in mathematical logic: there are perfectly meaningful and useful statements in mathematics which cannot be phrased within the confines of first order logic (combined with the language of set theory, or any other standard mathematical theory); one must use a more powerful language such as second order logic instead. This phenomenon is very well known among logicians, but I hadn’t learned about it until very recently, and had naively assumed that first order logic sufficed for “everyday” usage of mathematics.
Let’s begin with some simple examples of statements which can be expressed in first-order logic. If B(x,y) is a binary relation on two objects x, y, then we can express the statement
For every x, there exists a y depending on x such that B(x,y) is true
in first order logic as
whereas the statement
For every x, there exists a y independent of x such that B(x,y) is true
can be expressed as
Moving on to a more complicated example, if Q(x,x’,y,y’) is a quaternary relation on four objects x,x’,y,y’, then we can express the statement
For every x and x’, there exists a y depending only on x and a y’ depending on x and x’ such that Q(x,x’,y,y’) is true
as
(note that this allows y’ to depend on y also, but this turns out to be moot, because y depends only on x), and one can similarly express
For every x and x’, there exists a y depending on x and x’ and a y’ depending only on x’ such that Q(x,x’,y,y’) is true
as
but it seems that one cannot express
For every x and x’, there exists a y depending only on x and a y’ depending only on x’ such that Q(x,x’,y,y’) is true (*)
in first order logic! For instance, the statement
To every finitely generated real vector space V one can associate a unique non-negative integer
such that
- V, W are isomorphic if and only if
;
- an injection from V to W exists if and only if
;
- a surjection from V to W exists if and only if
;
; and
for all V, W,
which is part of the fundamental theorem of linear algebra, does not seem to be expressible as stated in first order set theory (though of course the concept of dimension can be explicitly constructed within this language), even if we drop the uniqueness and restrict ourselves to just the assertion that obey, say, property 1, so that we get an assertion of the form (*). Note that the category of all finite-dimensional vector spaces is not a set (for reasons relating to Russell’s paradox) and so we cannot view
as a function. More generally, many statements in category theory dealing with large categories seem to not be expressible in first order logic.
I can’t quite show that (*) is not expressible in first-order logic, but I can come very close, using non-standard analysis. The statement
For every real numbers x and x’ there exists real numbers st(x) and st(x’) depending only on x and x’ respectively, such that st(x+x’) = st(x)+st(x’), st(xx’) = st(x) st(x’), st(1)=1, and st(x) is non-negative whenever x is non-negative, and also such that st(x) is not always equal to x
is true in the non-standard model of the real numbers, but false in the standard model (this is the classic algebra homework problem that the only order-preserving field homomorphism on the reals is the identity). Since the transfer principle ensures that all first-order statements that are true in the standard reals are also true in the non-standard reals, this means that the above statement cannot be expressed in first-order logic. If it weren’t for the “st(x) is not always equal to x” part, this would basically be of the form (*).
It seems to me that first order logic is limited by the linear (and thus totally ordered) nature of its sentences; every new variable that is introduced must be allowed to depend on all the previous variables introduced to the left of that variable. This does not fully capture all of the dependency trees of variables which one deals with in mathematics. In analysis, we tend to get around this by using English phrasings such as
… assuming N is chosen sufficiently large depending on
, and
chosen sufficiently small depending on N …
and
… where C can depend on k and d, but is uniform with respect to n and f …,
or by using the tremendously convenient O() and o() notation of Landau. One then takes for granted that one can eventually unwind all these phrasings to get back to a sentence in formal, first-order logic. As far as analysis is concerned, this is a fairly safe assumption, since one usually deals with objects in very concrete sets such as the real numbers, and one can easily model all of these dependencies using functions from concrete sets to other concrete sets if necessary. (Also, the hierarchy of magnitudes in analysis does often tend to be rather linearly ordered.) But some subtleties may appear when one deals with large categories, such as the category of sets, groups, or vector spaces (though in most applications, one can cap the cardinality of these objects and then one can represent these categories up to equivalence by an actual set). It may be that a more diagrammatic language (perhaps analogous to the commutative diagrams in category theory, or one based on trees or partially ordered sets rather than linearly ordered ones) may be a closer fit to expressing the way one actually thinks about how variables interact with each other. (Second-order logic is, of course, an obvious candidate for such a language, but it may be overqualified for the task. And, in practice, there’s nothing wrong with just using plain old mathematical English.)
[Update, Aug 27: bad link fixed.]

28 comments
Comments feed for this article
27 August, 2007 at 8:36 am
Suresh Venkat
Hi,
It’s worth mentioning here that first and second order logic are closely related to complexity classes, via an area of complexity theory called descriptive complexity. A famous result proved by Ron Fagin showed that NP is precisely the class of languages expressible by predicates in existential second order logic, a result that is very intriguing in how it connects “syntax” and “algorithmic complexity”. Other results include the characterization of P by first order logic + a fixed point operator + an ordering operator, and results for lower complexity classes, many of these results shown by Neil Immerman.
http://en.wikipedia.org/wiki/Descriptive_complexity
27 August, 2007 at 10:08 am
Oleg Izhvanov
Hi,
A technical note: the last link in the post is broken.
“And, in practice, there’s nothing wrong with
just using plain old mathematical English”.
27 August, 2007 at 1:14 pm
Terence Tao
Dear Suresh and Oleg,
Thanks for the references and corrections!
28 August, 2007 at 12:26 am
Ori Gurel-Gurevich
“Note that the category of all finite-dimensional vector spaces is not a set (for reasons relating to Russell’s paradox) and so we cannot view \dim as a function. More generally, many statements in category theory dealing with large categories seem to not be expressible in first order logic.”
That is true if we restrict ourselves to ZFC. However, an extended set theory like NBG is more suitable for such things. This would allow one to talk about such class functions as
without resorting to second order theories. There’s also the additional benefit of NBG being finitely axiomatizable.
28 August, 2007 at 9:26 am
Andy D
I believe I can show that the statement (*) Terry was discussing is not FO-definable for a particular FO-definable quaternary relation Q(x, x’, y, y’).
Namely, let Q(x, x’, y, y’) be
(y != x) and [(x = x') implies (y = y')] and [ (x != x') implies (y != y')].
Then (somebody please check this) statement (*) is true iff there exists a perfect matching over the universe’s elements.
Any infinite universe has such a matching; so the negation of (*) is true iff the universe is finite and has an odd number of elements (and the class of FO-definable properties is closed under negation). But universe parity is known not to be FO-definable.
FO-inexpressibility results are to my knowledge generally shown by one of two approaches:
i) analysis of the associated ‘Ehrenfeucht-Fraisse-Game’, an approach which can show inexpressibility of parity;
ii) reduction to a known FO-inexpressible property, as above. The most common reduction targets seem to be parity/counting properties, and graph connectivity.
28 August, 2007 at 10:34 am
Kenny
I’d heard of branching quantifiers before, but didn’t know what they were good for (except that some linguists, like Hintikka, have suggested that they might be good for analyzing natural language). It’s good to have some examples now!
28 August, 2007 at 12:47 pm
David Corfield
As Kenny’s link mentions, branching quantifiers have been treated by Hintikka in his ‘independence-friendly’ logic. The formula with four variable mentioned corresponds to a 2 vs 2 player game with incomplete information. Samson Abramsky, the theoretical computer scientist, has done some interesting work looking into the syntax and semantics of logics corresponding to multi-player games, with one ulterior aim being the modelling of concurrency.
28 August, 2007 at 1:52 pm
Andy D
OK, looks like what I wrote is subsumed by the results mentioned on Q_H in Kenny’s reference. (*) seems like a surprisingly powerful schema.
28 August, 2007 at 10:43 pm
Kevin O'Bryant
Here’s another example dear to your heart. For each positive integer k, the statement, “the set A contains an arithmetic progression of length k” is transparently FO, while the statement “for each positive integer k, the set A contains an arithmetic progression of length k” is not (at least not in any straightforward manner).
A famously non-FO statement is “Every set with an upper bound has a least upper bound”, which is true for the reals, but not true for the non-standard reals, and so cannot be FO. This is, it seems to me, part of the undertapped mojo of nonstandard analysis (beyond epsilon management). You get access to different second order theorems (you lose a few but also gain a few) to help you prove first order theorems.
29 August, 2007 at 6:03 am
Terence Tao
Dear Kevin,
I’m not sure I understand your examples, at least if I am interpreting “set” in an internal sense. Wouldn’t
be a first-order sentence that asserts that A contains arbitrarily long progressions? Similarly, the statement that every set with an upper bound has a least upper bound is still true for the non-standard reals if one restricts attention to non-standard sets of reals (as opposed to sets of non-standard reals). If one interprets the concept of “set” internally then
seems to be a first-order sentence (albeit a rather ugly one) that asserts that every (non-empty) set with an upper bound has a least upper bound.
29 August, 2007 at 11:32 am
Kenny
I think in your first example you mean
rather than
.
As for the second example, it’s always sort of sneaky to quantify over sets in a first-order way – when you do that, you get non-standard models where the objects are just fine, but you’re missing some of the sets or something. I know people working on reverse mathematics (like Harvey Friedman and Steven Simpson at various points) do that with models of “second-order” arithmetic (really just first-order arithmetic where they use first-order quantification over sets as well) to prove that certain standard theorems are independent of weakened axiom systems.
29 August, 2007 at 2:30 pm
Scott McKuen
Terry’s first example (mod Kenny) looks good if we’re willing to accept infinitely-long arithmetric progressions in the nonstandard models (since the variable k then runs over nonstandard positive integers as well). If we just want arbitrarily-long-but-finite progressions, which I think is what Kevin intended, then it fails.
If we’re working outside the theory of arithmetic, say in fields, and we don’t have a predicate to designate the integers as a definable set, then it doesn’t work.
29 August, 2007 at 11:12 pm
Emmanuel Kowalski
Restricting the “domain” of quantification (I’m not sure which is the
with
, it has been shown by Chatzidakis, van den Dries
(i.e., the
in
for which the formula is “true”,
only) is always (either 0 or)
technically correct word in logic/model theory) may make a huge
difference in what sets first order theory can define. One very cute
example, which does not seem to be so well-known, considers definable
sets in the first order theory of rings (+, -, *, 0, 1, …) applied
to finite fields: if we have a formula
and Macintyre (J. Reine angew. Math. 427 (1992), 107–135) that the
number of “solutions” in a finite field
number of
with the standard interpretation of the language, and all quantifiers
implicitly running over
of the type
where
is a rational number (which may depend on
, but can only
), and
is an
take finitely many values for a given
integer with the same property. This is of course reminiscent (and
ultimately derived from) the Lang-Weil estimate for the number of points
on an algebraic variety over a finite field.
So in particular the subfield
of
is
, for which the set of “solutions” in
is
for infinitely many
. (Come to
not definable in the first order theory of rings, i.e., there is no
formula
think of it, I’m not sure how using second order theory of rings could
help here; of course, adding something like the Frobenius automorphism
in the language would solve the problem).
(Note: one can even add
exponential sums into the mixture and show cancellation statements
in exponential sums over such definable sets, similar to the results
of Weil and Deligne on exponential sums over finite fields — at least
the simplest statements extend –; in particular, it follows that an
“interval” (or an arithmetic progression) in
is also not definable in this language (unless it’s a fixed number of
elements, or the complement of such a thing). This is not surprising,
but for intervals with positive density one needs more than the result
of C-vdD-M — namely some additive characters do not cancel
appreciably over a long interval/arithmetic progression).
30 August, 2007 at 7:29 pm
Anonymous
I’m confused about how the 2×2 branching quantifier
is supposed to support the “there are infinitely many” quantifier. Here’ s the quantifier-free clause in question for a generic formula (phi):
The three parts appear to say
1) something satisfies the formula;
2) if you have one satisfier, you can get another that isn’t just the starting point (no loops that start at “a”);
3) the map from one satisfier to the next is a bijection (no loops can start later on, either).
Why does quantifying this way:
force “there are infinitely many” but quantifying, say, like this:
does not? Can we exhibit a finite model of the first-order version?
30 August, 2007 at 10:05 pm
Terence Tao
Dear anonymous,
I’m assuming that your final display was of a standard first-order sentence rather than a branching quantifier one, despite the line break. In that case, the problem with that sentence is that
is allowed to depend on
as well as
; once you allow that, the clause
is very easy to satisfy even for a model in which
is only satisfied at two values, a and b; one just sets
to always equal b, and
to equal
when
, and a otherwise.
It seems that the branching quantifier offers a way to define the concept of a class function
in a manner which can’t be replicated in first order logic, because one needs to assert that equal inputs force equal outputs, and you can’t do this if one of the outputs is allowed to depend on the other input.
30 August, 2007 at 11:17 pm
Scott McKuen
Hi, Terry,
That was my anonymous bad LaTeX above – yes, no line break intended. Thanks for the example – now it’s clear why the bijection isn’t enforced. Rewriting the thing with Skolem functions helped, too.
31 August, 2007 at 6:53 am
Walt
While I find non-firstorderability very interesting, I think it’s the norm rather than the exception, which is why we use set theory rather than first-order models for everything. For example, first-order logic can’t express the Archimedean axiom, or the Riemann hypothesis. Or am I misunderstanding?
31 August, 2007 at 8:05 am
Terence Tao
Dear Walt,
If you combine first-order logic with a standard set theory such as ZFC, then I agree that you can describe, say, Riemann hypothesis (though I won’t try to write down the actual sentences here, they would be atrociously long), because one can use set theory to model objects such as the zeta function. (The Archimedean axiom seems to me to be expressible in the first-order language of the integers and reals without any need for set theory.) What was surprising, to me at least, was that even with ZFC, statements such as the existence of an integer dimension for each finitely generated vector space do not seem to be easily firstorderisable; set theory doesn’t directly help here because it cannot describe class functions such as
unless one performs some trickery to reduce the size of the category of vector spaces to the point where it can be modeled by a set rather than a class.
As Ori pointed out though, if one uses von Neumann-Bernays-Godel theory instead of ZFC, one can deal with classes and class functions more easily and many of these issues seem to disappear. Since NBG and ZFC are equivalent when restricted to sentences that refer only to sets, the nonfirstorderisability issue is mostly moot, but it still is more subtle than one might naively first imagine.
31 August, 2007 at 10:34 am
Walt
I see what you’re getting at. I think the ZFC-ists answer would be to translate it into a collection of statements, parametrized by ordinals, so that for each ordinal a you can state and prove that dim exists for the set V_a in the von Neumann heirarchy. (That’s probably what you mean by “some trickery”.)
31 August, 2007 at 1:48 pm
Walt
Actually, I think you can basically formalize this in ZFC. You can’t formalize your statement exactly, of course, since it involves branching quantifiers, but you can formalize a statement in ZFC that implies yours. I don’t know much about branching quantifiers, though, so I could be getting this totally wrong.
Let P(V,n) be the propositional formula that specifies that V is a finite-dimensional vector space, and n is its unique dimension. Then you should be able to prove a statement like: (all V)(exists n)P(V,n) and ( (all V)(exists n)P(V,n) implies Q(V,V’,n,n’) ), where Q is one of your theorems about vector spaces.
I think you can show this will imply your branching quantifier statement by Skolemization. P(V,n) turns into P(V, n(V)), Q(V,V’,n,n’) would turn into Q(V, V’, n(V, V’), n’(V, V’)), but you can prove by the uniqueness of dimension that n(V,V’) = n(V) and n’(V, V’) = n’(V’). (Though as I said, I don’t know much about branching quantifiers, so this may be more of a parade of my ignorance than anything else.)
1 September, 2007 at 2:10 pm
Kevin O'Bryant
Regarding the first-orderizability of containing arbitrarily long arithmetic progressions…
Terry’s sentence

is inherently referring to the natural numbers, while the other quantifiers
are referring to elements of the model. If the only models we’re interested in contain the ring of integers (so that
makes the usual sense), then it’s fine.
only captures what we mean by arithmetic progressions in some settings. The first quantifier
The problem with this arises because we talk about arithmetic progressions in general abelian groups. In a group the statement
does not make sense. For
, for example, we can replace
with
, and so for each particular
we can make it make sense. But to make it make sense for all
, we need to quantify over the natural numbers. By definition “first order” means that we quantify only over elements of the model.
For example, we all concur precisely on the meaning of “arithmetic progression” in the group of rational points on an elliptic curve. The set
contains
terms in arithmetic progression (for a natural number k) if
. For any particular natural number
, those dots represent finite strings. But without the natural numbers intrinisically available to us, we cannot quantify over
.
1 September, 2007 at 7:58 pm
Anonymous
Do we ever talk about axiom systems that are modelled by elliptic curves?! I thought an elliptic curve was just another set sitting in ZFC or NBG or NF or whatever. Therefore, normally when we speak of elliptic curves, they are inside our set theory, and the integers are available to us.
2 September, 2007 at 1:45 am
Emmanuel Kowalski
Working with axioms systems like “first order theory of [something]s”,
, but not the
where [something] could be groups, rings, elliptic curves, etc, is
definitely something important. I’m not sure about elliptic curves,
but the first-order theory of rings (and fields, which can be defined
in this language) has many applications in algebra and number theory.
The fact that some sets can be defined in such a way can have
important consequences for applications. This can be thought-of (maybe) as an
analogue of “smoothness” for functions: not all functions are
smooth, but if you know that one is, it can be very useful! E.g., the
unit group of a ring can be defined by
torsion subgroup of the unit group, because there is no a priori upper
bound on the order of a torsion unit — note of course if we can
invoke integers, the torsion subgroup is easy to define…
As an example of application, using the results of Chatzidakis, van
den Dries and Macintyre I mentioned in my previous comment on
definable sets in finite fields (and other logical things I don’t
understand, including ultrafilters), Hrushovsky and Pillay managed to
prove the following theorem (Crelle 462, 1995, p. 69–91, Prop. 7.3):
“Let
be any Zariski-dense finitely generated subgroup of a
defined over
, for instance
, or
or
; then
, the reduction map
is surjective.”
(almost simple, simply-connected) algebraic group
for almost all primes
Notice the statement is purely algebraic and very natural (and
, this is not so obvious to prove by hand, and for a
beautiful). In fact it was proved earlier by Matthews, Vaserstein and
Weisfeiler, who used the classification of finite simple groups! Even
for
Zariski-dense subgroup, which can be a very thin subset, this is
really deep.
4 September, 2007 at 5:56 am
Logic games « Forking, Forcing and Back-and-Forthing
[...] a recent post, Terence Tao evokes statements that cannot be formalized in first-order logic. David Corfield has [...]
4 September, 2007 at 12:33 pm
Terence Tao
Dear Walt,
Your statement concerning dimension is a first order one as long as you specify what your predicate P(V,n) is – i.e. you give an explicit definition of the concept of dimension. If you don’t actually want to write down this definition, and instead want to assert the abstract existence of a concept of dimension which satisfies various properties, then you need to pass to second-order logic (as you will need a second-order quantifier such as
).
Dear Kevin,
Thanks for the clarification. I guess the moral is that statements involving integers can require the integers to be part of the language in order to express them in first-order logic; similarly statements involving sets or functions may require set theory to be firstorderisable; and statements involving class functions may require a language that can express the concept of a class, such as NBG.
6 September, 2007 at 7:47 am
Walt
The definition of firstorderizability, as far as I understand it, is that a sentence in a theory is firstorderizable if its truth value (as you range over models of the theory) is the same as a first-order sentence. While your sentence requires second-order logic to express, its truth value over models of ZFC is the same as the sentence using the explicit definition. In that sense, your sentence is firstorderizable.
6 September, 2007 at 8:30 am
Terence Tao
Dear Walt,
Technically you are correct, but by this definition any statement that one can actually prove in ZFC becomes firstorderizable for trivial reasons, because it has the same truth value as a tautology such as 0=0. But one can easily tweak the sentence to avoid this, by replacing the question “do vector spaces have a unique notion of dimension?” to “does a [insert vector-space like object here] have a unique notion of dimension?”, where we view the concept of a vector-space like object as a primitive. For instance, one could imagine that one has a class of vector-space-like objects which form an abelian category with a distinguished “one-dimensional” object (the analogue of
in the actual vector space example), and one can ask whether this class has a notion of dimension which obeys all the properties listed above (replacing surjections by epimorphisms, replacing
by the distinguished one-dimensional object, and so forth). If we model this class of objects by a unary predicate (e.g. Vec(V) is the statement that V is a vector-space-like object), then the question “Does the class Vec have a unique notion of dimension?” becomes a statement which I believe is not firstorderisable in ZFC, though I might be wrong on this if there is a sufficiently canonical way to define dimension which works for all abelian categories that admit such a notion.
6 September, 2007 at 11:59 am
Walt
I’m only trying to be technically correct. :-)