You are currently browsing the monthly archive for July 2018.
About six years ago on this blog, I started thinking about trying to make a web-based game based around high-school algebra, and ended up using Scratch to write a short but playable puzzle game in which one solves linear equations for an unknown using a restricted set of moves. (At almost the same time, there were a number of more professionally made games released along similar lines, most notably Dragonbox.)
Since then, I have thought a couple times about whether there were other parts of mathematics which could be gamified in a similar fashion. Shortly after my first blog posts on this topic, I experimented with a similar gamification of Lewis Carroll’s classic list of logic puzzles, but the results were quite clunky, and I was never satisfied with the results.
Over the last few weeks I returned to this topic though, thinking in particular about how to gamify the rules of inference of propositional logic, in a manner that at least vaguely resembles how mathematicians actually go about making logical arguments (e.g., splitting into cases, arguing by contradiction, using previous result as lemmas to help with subsequent ones, and so forth). The rules of inference are a list of a dozen or so deductive rules concerning propositional sentences (things like “( AND
) OR (NOT
)”, where
are some formulas). A typical such rule is Modus Ponens: if the sentence
is known to be true, and the implication “
IMPLIES
” is also known to be true, then one can deduce that
is also true. Furthermore, in this deductive calculus it is possible to temporarily introduce some unproven statements as an assumption, only to discharge them later. In particular, we have the deduction theorem: if, after making an assumption
, one is able to derive the statement
, then one can conclude that the implication “
IMPLIES
” is true without any further assumption.
It took a while for me to come up with a workable game-like graphical interface for all of this, but I finally managed to set one up, now using Javascript instead of Scratch (which would be hopelessly inadequate for this task); indeed, part of the motivation of this project was to finally learn how to program in Javascript, which turned out to be not as formidable as I had feared (certainly having experience with other C-like languages like C++, Java, or lua, as well as some prior knowledge of HTML, was very helpful). The main code for this project is available here. Using this code, I have created an interactive textbook in the style of a computer game, which I have titled “QED”. This text contains thirty-odd exercises arranged in twelve sections that function as game “levels”, in which one has to use a given set of rules of inference, together with a given set of hypotheses, to reach a desired conclusion. The set of available rules increases as one advances through the text; in particular, each new section gives one or more rules, and additionally each exercise one solves automatically becomes a new deduction rule one can exploit in later levels, much as lemmas and propositions are used in actual mathematics to prove more difficult theorems. The text automatically tries to match available deduction rules to the sentences one clicks on or drags, to try to minimise the amount of manual input one needs to actually make a deduction.
Most of one’s proof activity takes place in a “root environment” of statements that are known to be true (under the given hypothesis), but for more advanced exercises one has to also work in sub-environments in which additional assumptions are made. I found the graphical metaphor of nested boxes to be useful to depict this tree of sub-environments, and it seems to combine well with the drag-and-drop interface.
The text also logs one’s moves in a more traditional proof format, which shows how the mechanics of the game correspond to a traditional mathematical argument. My hope is that this will give students a way to understand the underlying concept of forming a proof in a manner that is more difficult to achieve using traditional, non-interactive textbooks.
I have tried to organise the exercises in a game-like progression in which one first works with easy levels that train the player on a small number of moves, and then introduce more advanced moves one at a time. As such, the order in which the rules of inference are introduced is a little idiosyncratic. The most powerful rule (the law of the excluded middle, which is what separates classical logic from intuitionistic logic) is saved for the final section of the text.
Anyway, I am now satisfied enough with the state of the code and the interactive text that I am willing to make both available (and open source; I selected a CC-BY licence for both), and would be happy to receive feedback on any aspect of the either. In principle one could extend the game mechanics to other mathematical topics than the propositional calculus – the rules of inference for first-order logic being an obvious next candidate – but it seems to make sense to focus just on propositional logic for now.
Let be a measure-preserving system – a probability space
equipped with a measure-preserving translation
(which for simplicity of discussion we shall assume to be invertible). We will informally think of two points
in this space as being “close” if
for some
that is not too large; this allows one to distinguish between “local” structure at a point
(in which one only looks at nearby points
for moderately large
) and “global” structure (in which one looks at the entire space
). The local/global distinction is also known as the time-averaged/space-averaged distinction in ergodic theory.
A measure-preserving system is said to be ergodic if all the invariant sets are either zero measure or full measure. An equivalent form of this statement is that any measurable function which is locally essentially constant in the sense that
for
-almost every
, is necessarily globally essentially constant in the sense that there is a constant
such that
for
-almost every
. A basic consequence of ergodicity is the mean ergodic theorem: if
, then the averages
converge in
norm to the mean
. (The mean ergodic theorem also applies to other
spaces with
, though it is usually proven first in the Hilbert space
.) Informally: in ergodic systems, time averages are asymptotically equal to space averages. Specialising to the case of indicator functions, this implies in particular that
converges to
for any measurable set
.
In this short note I would like to use the mean ergodic theorem to show that ergodic systems also have the property that “somewhat locally constant” functions are necessarily “somewhat globally constant”; this is not a deep observation, and probably already in the literature, but I found it a cute statement that I had not previously seen. More precisely:
Corollary 1 Let
be an ergodic measure-preserving system, and let
be measurable. Suppose that
for some
. Then there exists a constant
such that
for
in a set of measure at least
.
Informally: if is locally constant on pairs
at least
of the time, then
is globally constant at least
of the time. Of course the claim fails if the ergodicity hypothesis is dropped, as one can simply take
to be an invariant function that is not essentially constant, such as the indicator function of an invariant set of intermediate measure. This corollary can be viewed as a manifestation of the general principle that ergodic systems have the same “global” (or “space-averaged”) behaviour as “local” (or “time-averaged”) behaviour, in contrast to non-ergodic systems in which local properties do not automatically transfer over to their global counterparts.
Proof: By composing with (say) the arctangent function, we may assume without loss of generality that
is bounded. Let
, and partition
as
, where
is the level set
For each , only finitely many of the
are non-empty. By (1), one has
Using the ergodic theorem, we conclude that
On the other hand, . Thus there exists
such that
, thus
By the Bolzano-Weierstrass theorem, we may pass to a subsequence where converges to a limit
, then we have
for infinitely many , and hence
The claim follows.
Let ,
be additive groups (i.e., groups with an abelian addition group law). A map
is a homomorphism if one has
for all . A map
is an affine homomorphism if one has
for all additive quadruples in
, by which we mean that
and
. The two notions are closely related; it is easy to verify that
is an affine homomorphism if and only if
is the sum of a homomorphism and a constant.
Now suppose that also has a translation-invariant metric
. A map
is said to be a quasimorphism if one has
for all , where
denotes a quantity at a bounded distance from the origin. Similarly,
is an affine quasimorphism if
for all additive quadruples in
. Again, one can check that
is an affine quasimorphism if and only if it is the sum of a quasimorphism and a constant (with the implied constant of the quasimorphism controlled by the implied constant of the affine quasimorphism). (Since every constant is itself a quasimorphism, it is in fact the case that affine quasimorphisms are quasimorphisms, but now the implied constant in the latter is not controlled by the implied constant of the former.)
“Trivial” examples of quasimorphisms include the sum of a homomorphism and a bounded function. Are there others? In some cases, the answer is no. For instance, suppose we have a quasimorphism . Iterating (2), we see that
for any integer
and natural number
, which we can rewrite as
for non-zero
. Also,
is Lipschitz. Sending
, we can verify that
is a Cauchy sequence as
and thus tends to some limit
; we have
for
, hence
for positive
, and then one can use (2) one last time to obtain
for all
. Thus
is the sum of the homomorphism
and a bounded sequence.
In general, one can phrase this problem in the language of group cohomology (discussed in this previous post). Call a map a
-cocycle. A
-cocycle is a map
obeying the identity
for all . Given a
-cocycle
, one can form its derivative
by the formula
Such functions are called -coboundaries. It is easy to see that the abelian group of
-coboundaries is a subgroup of the abelian group of
-cocycles. The quotient of these two groups is the first group cohomology of
with coefficients in
, and is denoted
.
If a -cocycle is bounded then its derivative is a bounded
-coboundary. The quotient of the group of bounded
-cocycles by the derivatives of bounded
-cocycles is called the bounded first group cohomology of
with coefficients in
, and is denoted
. There is an obvious homomorphism
from
to
, formed by taking a coset of the space of derivatives of bounded
-cocycles, and enlarging it to a coset of the space of
-coboundaries. By chasing all the definitions, we see that all quasimorphism from
to
are the sum of a homomorphism and a bounded function if and only if this homomorphism
is injective; in fact the quotient of the space of quasimorphisms by the sum of homomorphisms and bounded functions is isomorphic to the kernel of
.
In additive combinatorics, one is often working with functions which only have additive structure a fraction of the time, thus for instance (1) or (3) might only hold “ of the time”. This makes it somewhat difficult to directly interpret the situation in terms of group cohomology. However, thanks to tools such as the Balog-Szemerédi-Gowers lemma, one can upgrade this sort of
-structure to
-structure – at the cost of restricting the domain to a smaller set. Here I record one such instance of this phenomenon, thus giving a tentative link between additive combinatorics and group cohomology. (I thank Yuval Wigderson for suggesting the problem of locating such a link.)
Theorem 1 Let
,
be additive groups with
, let
be a subset of
, let
, and let
be a function such that
for
additive quadruples
in
. Then there exists a subset
of
containing
with
, a subset
of
with
, and a function
such that
for all
(thus, the derivative
takes values in
on
), and such that for each
, one has
for
values of
.
Presumably the constants and
can be improved further, but we have not attempted to optimise these constants. We chose
as the domain on which one has a bounded derivative, as one can use the Bogulybov lemma (see e.g, Proposition 4.39 of my book with Van Vu) to find a large Bohr set inside
. In applications, the set
need not have bounded size, or even bounded doubling; for instance, in the inverse
theory over a small finite fields
, one would be interested in the situation where
is the group of
matrices with coefficients in
(for some large
, and
being the subset consisting of those matrices of rank bounded by some bound
.
Proof: By hypothesis, there are triples
such that
and
Thus, there is a set with
such that for all
, one has (6) for
pairs
with
; in particular, there exists
such that (6) holds for
values of
. Setting
, we conclude that for each
, one has
for values of
.
Consider the bipartite graph whose vertex sets are two copies of , and
and
connected by a (directed) edge if
and (7) holds. Then this graph has
edges. Applying (a slight modification of) the Balog-Szemerédi-Gowers theorem (for instance by modifying the proof of Corollary 5.19 of my book with Van Vu), we can then find a subset
of
with
with the property that for any
, there exist
triples
such that the edges
all lie in this bipartite graph. This implies that, for all
, there exist
septuples
obeying the constraints
and for
. These constraints imply in particular that
Also observe that
Thus, if and
are such that
, we see that
for octuples
in the hyperplane
By the pigeonhole principle, this implies that for any fixed , there can be at most
sets of the form
with
,
that are pairwise disjoint. Using a greedy algorithm, we conclude that there is a set
of cardinality
, such that each set
with
,
intersects
for some
, or in other words that
whenever . In particular,
This implies that there exists a subset of
with
, and an element
for each
, such that
for all . Note we may assume without loss of generality that
and
.
By construction of , and permuting labels, we can find
16-tuples
such that
and
for . We sum this to obtain
and hence by (8)
where . Since
we see that there are only possible values of
. By the pigeonhole principle, we conclude that at most
of the sets
can be disjoint. Arguing as before, we conclude that there exists a set
of cardinality
such that
whenever (10) holds.
For any , write
arbitrarily as
for some
(with
if
, and
if
) and then set
Then from (11) we have (4). For we have
, and (5) then follows from (9).
Recent Comments