You are currently browsing Terence Tao's articles.
Let be a finite subset of a non-commutative group
. As mentioned previously on this blog (as well as in the current logic reading seminar), there is some interest in classifying those
which obey small doubling conditions such as
or
. A full classification here has still not been established. However, I wanted to record here an elementary argument (based on Exercise 2.6.5 of my book with Van Vu, which in turn is based on this paper of Izabella Laba) that handles the case when
is very close to
:
Proposition 1 If
, then
and
are both finite groups, which are conjugate to each other. In particular,
is contained in the right-coset (or left-coset) of a group of order less than
.
Remark 1 The constant
is completely sharp; consider the case when
where
is the identity and
is an element of order larger than
. This is a small example, but one can make it as large as one pleases by taking the direct product of
and
with any finite group. In the converse direction, we see that whenever
is contained in the right-coset
(resp. left-coset
) of a group of order less than
, then
(resp.
) is necessarily equal to all of
, by the inclusion-exclusion principle (see the proof below for a related argument).
Proof: We begin by showing that is a group. As
is symmetric and contains the identity, it suffices to show that this set is closed under addition.
Let . Then we can write
and
for
. If
were equal to
, then
and we would be done. Of course, there is no reason why
should equal
; but we can use the hypothesis
to boost this as follows. Observe that
and
both have cardinality
and lie inside
, which has cardinality strictly less than
. By the inclusion-exclusion principle, this forces
to have cardinality greater than
. In other words, there exist more than
pairs
such that
, which implies that
. Thus there are more than
elements
such that
for some
(since
is uniquely determined by
); similarly, there exists more than
elements
such that
for some
. Again by inclusion-exclusion, we can thus find
in
for which one has simultaneous representations
and
, and so
, and the claim follows.
In the course of the above argument we showed that every element of the group has more than
representations of the form
for
. But there are only
pairs
available, and thus
.
Now let be any element of
. Since
, we have
, and so
. Conversely, every element of
has exactly
representations of the form
where
. Since
occupies more than half of
, we thus see from the inclusion-exclusion principle, there is thus at least one representation
for which
both lie in
. In other words,
, and the claim follows.
To relate this to the classical doubling constants , we first make an easy observation:
Again, this is sharp; consider equal to
where
generate a free group.
Proof: Suppose that is an element of
for some
. Then the sets
and
have cardinality
and lie in
, so by the inclusion-exclusion principle, the two sets intersect. Thus there exist
such that
, thus
. This shows that
is contained in
. The converse inclusion is proven similarly.
Proposition 3 If
, then
is a finite group of order
, and
for some
in the normaliser of
.
The factor is sharp, by the same example used to show sharpness of Proposition 1. However, there seems to be some room for further improvement if one weakens the conclusion a bit; see below the fold.
Proof: Let (the two sets being equal by Lemma 2). By the argument used to prove Lemma 2, every element of
has more than
representations of the form
for
. By the argument used to prove Proposition 1, this shows that
is a group; also, since there are only
pairs
, we also see that
.
Pick any ; then
, and so
. Because every element of
has
representations of the form
with
,
, and
occupies more than half of
and of
, we conclude that each element of
lies in
, and so
and
.
The intersection of the groups and
contains
, which is more than half the size of
, and so we must have
, i.e.
normalises
, and the proposition follows.
Because the arguments here are so elementary, they extend easily to the infinitary setting in which is now an infinite set, but has finite measure with respect to some translation-invariant Kiesler measure
. We omit the details. (I am hoping that this observation may help simplify some of the theory in that setting.)
This week, Henry Towsner concluded his portion of reading seminar of the Hrushovski paper, by discussing (a weaker, simplified version of) main model-theoretic theorem (Theorem 3.4 of Hrushovski), and described how this theorem implied the combinatorial application in Corollary 1.2 of Hrushovski. The presentation here differs slightly from that in Hrushovski’s paper, for instance by avoiding mention of the more general notions of S1 ideals and forking.
Here is a collection of resources so far on the Hrushovski paper:
- Henry Towsner’s notes (which most of Notes 2-4 have been based on);
- Alex Usvyatsov’s notes on the derivation of Corollary 1.2 (broadly parallel to the notes here);
- Lou van den Dries’ notes (covering most of what we have done so far, and also material on stable theories); and
- Anand Pillay’s sketch of a simplified proof of Theorem 1.1.
A fundamental tool in any mathematician’s toolkit is that of reductio ad absurdum: showing that a statement is false by assuming first that
is true, and showing that this leads to a logical contradiction. A particulary pure example of reductio ad absurdum occurs when establishing the non-existence of a hypothetically overpowered object or structure
, by showing that
’s powers are “self-defeating”: the very existence of
and its powers can be used (by some clever trick) to construct a counterexample to that power. Perhaps the most well-known example of a self-defeating object comes from the omnipotence paradox in philosophy (“Can an omnipotent being create a rock so heavy that He cannot lift it?”); more generally, a large number of other paradoxes in logic or philosophy can be reinterpreted as a proof that a certain overpowered object or structure does not exist.
In mathematics, perhaps the first example of a self-defeating object one encounters is that of a largest natural number:
Proposition 1 (No largest natural number) There does not exist a natural number
which is larger than all other natural numbers.
Proof: Suppose for contradiction that there was such a largest natural number . Then
is also a natural number which is strictly larger than
, contradicting the hypothesis that
is the largest natural number.
Note the argument does not apply to the extended natural number system in which one adjoins an additional object beyond the natural numbers, because
is defined equal to
. However, the above argument does show that the existence of a largest number is not compatible with the Peano axioms.
This argument, by the way, is perhaps the only mathematical argument I know of which is routinely taught to primary school children by other primary school children, thanks to the schoolyard game of naming the largest number. It is arguably one’s first exposure to a mathematical non-existence result, which seems innocuous at first but can be surprisingly deep, as such results preclude in advance all future attempts to establish existence of that object, no matter how much effort or ingenuity is poured into this task. One sees this in a typical instance of the above schoolyard game; one player tries furiously to cleverly construct some impressively huge number , but no matter how much effort is expended in doing so, the player is defeated by the simple response “… plus one!” (unless, of course,
is infinite, ill-defined, or otherwise not a natural number).
It is not only individual objects (such as natural numbers) which can be self-defeating; structures (such as orderings or enumerations) can also be self-defeating. (In modern set theory, one considers structures to themselves be a kind of object, and so the distinction between the two concepts is often blurred.) Here is one example (related to, but subtly different from, the previous one):
Proposition 2 (The natural numbers cannot be finitely enumerated) The natural numbers
cannot be written as
for any finite collection
of natural numbers.
Proof: Suppose for contradiction that such an enumeration existed. Then consider the number
; this is a natural number, but is larger than (and hence not equal to) any of the natural numbers
, contradicting the hypothesis that
is enumerated by
.
Here it is the enumeration which is self-defeating, rather than any individual natural number such as or
. (For this post, we allow enumerations to contain repetitions.)
The above argument may seem trivial, but a slight modification of it already gives an important result, namely Euclid’s theorem:
Proposition 3 (The primes cannot be finitely enumerated) The prime numbers
cannot be written as
for any finite collection of prime numbers.
Proof: Suppose for contradiction that such an enumeration existed. Then consider the natural number
; this is a natural number larger than
which is not divisible by any of the primes
. But, by the fundamental theorem of arithmetic (or by the method of Infinite descent, which is another classic application of reductio ad absurdum), every natural number larger than
must be divisible by some prime, contradicting the hypothesis that
is enumerated by
.
Remark 1 Continuing the number-theoretic theme, the “dueling conspiracies” arguments in a previous blog post can also be viewed as an instance of this type of “no-self-defeating-object”; for instance, a zero of the Riemann zeta function at
implies that the primes anti-correlate almost completely with the multiplicative function
, which is self-defeating because it also implies complete anti-correlation with
, and the two are incompatible. Thus we see that the prime number theorem (a much stronger version of Proposition 3) also emerges from this type of argument.
In this post I would like to collect several other well-known examples of this type of “no self-defeating object” argument. Each of these is well studied, and probably quite familiar to many of you, but I feel that by collecting them all in one place, the commonality of theme between these arguments becomes more apparent. (For instance, as we shall see, many well-known “paradoxes” in logic and philosophy can be interpreted mathematically as a rigorous “no self-defeating object” argument.)
As the previous discussion on displaying mathematics on the web has become quite lengthy, I am opening a fresh post to continue the topic. I’m leaving the previous thread open for those who wish to respond directly to some specific comments in that thread, but otherwise it would be preferable to start afresh on this thread to make it easier to follow the discussion.
It’s not easy to summarise the discussion so far, but the comments have identified several existing formats for displaying (and marking up) mathematics on the web (mathML, jsMath, MathJax, OpenMath), as well as a surprisingly large number of tools for converting mathematics into web friendly formats (e.g. LaTeX2HTML, LaTeXMathML, LaTeX2WP, Windows 7 Math Input, itex2MML, Ritex, Gellmu, mathTeX, WP-LaTeX, TeX4ht, blahtex, plastex, TtH, WebEQ, techexplorer, etc.). Some of the formats are not widely supported by current software, and by current browsers in particular, but it seems that the situation will improve with the next generation of these browsers.
It seems that the tools that already exist are enough to improvise a passable way of displaying mathematics in various formats online, though there are still significant issues with accessibility, browser support, and ease of use. Even if all these issues are resolved, though, I still feel that something is still missing. Currently, if I want to transfer some mathematical content from one location to another (e.g. from a LaTeX file to a blog, or from a wiki to a PDF, or from email to an online document, or whatever), or to input some new piece of mathematics, I have to think about exactly what format I need for the task at hand, and what conversion tool may be needed. In contrast, if one looks at non-mathematical content such as text, links, fonts, non-Latin alphabets, colours, tables, images, or even video, the formats here have been standardised, and one can manipulate this type of content in both online and offline formats more or less seamlessly (in principle, at least – there is still room for improvement), without the need for any particularly advanced technical expertise. It doesn’t look like we’re anywhere near that level currently with regards to mathematical content, though presumably things will improve when a single mathematics presentation standard, such as mathML, becomes universally adopted and supported in browsers, in operating systems, and in other various pieces of auxiliary software.
Anyway, it has been a very interesting and educational discussion for me, and hopefully for others also; I look forward to any further thoughts that readers have on these topics. (Also, feel free to recapitulate some of the points from the previous thread; the discussion has been far too multifaceted for me to attempt a coherent summary by myself.)
This week, Henry Towsner continued some model-theoretic preliminaries for the reading seminar of the Hrushovski paper, particularly regarding the behaviour of wide types, leading up to the main model-theoretic theorem (Theorem 3.4 of Hrushovski) which in turn implies the various combinatorial applications (such as Corollary 1.2 of Hrushovski). Henry’s notes can be found here.
A key theme here is the phenomenon that any pair of large sets contained inside a definable set of finite measure (such as ) must intersect if they are sufficiently “generic”; the notion of a wide type is designed, in part, to capture this notion of genericity.
The various languages and formats that make up modern web pages (HTML, XHTML, CSS, etc.) work wonderfully for most purposes, but there is one place where they are still somewhat clunky, namely in the presentation of mathematical equations and diagrams on web pages. While web formats do support very simple mathematical typesetting (such as the usage of basic symbols such as π, or superscripts such as x2), it is difficult to create more sophisticated (and non-ugly) mathematical displays, such as
without some additional layer of software (in this case, WordPress’s LaTeX renderer). These type of ad hoc fixes work, up to a point, but several difficulties still remain. For instance:
- There is no standardisation with regard to mathematics displays. For instance, WordPress uses $latex and $ to indicate a mathematics display, Wikipedia uses <math> and </math>, the current experimental Google Wave plugins use $$ and $$, and so forth.
- Mathematical formulae need to be compiled from a plain text language (much as with LaTeX), rather than edited directly on a visual editor. This is in contrast to other HTML elements, such as links, boldface, colors, etc.
- One cannot easily cut and paste a portion of a web page containing maths displays into another page or file (although with WordPress’s format, things are not so bad as the raw LaTeX code will be captured as plain text). Again, this is in contrast to other HTML elements, which can be cut and pasted quite easily.
- Currently, mathematical displays are usually rendered as static images and thus cannot be easily edited without recompiling the source code for that display. A related issue is that the images do not automatically resize when the browser scale changes; also, in some cases they do not blend well with the background colour scheme for the page.
- It is difficult to take an extended portion of LaTeX and convert it into a web page or vice versa, although tools such as Luca Trevisan’s LaTeX to WordPress converter achieve a heroic (and very useful) level of partial success in this regard.
There are a number of extensions to the existing web languages that have been proposed to address some of these difficulties, the most well known of which is probably MathML, which is used for instance in the n-Category Café. So far, though, adoption of the MathML standard (and development of editors and other tools to take advantage of this standard) seems to not be too widespread at present.
I’d like to open a discussion, then, about what kinds of changes to the current web standards could help facilitate the easier use of mathematical displays on web pages. (I’m indirectly in contact with some people involved in these standards, so if some interesting discussions arise here, I can try to pass them on.)
A handy inequality in additive combinatorics is the Plünnecke-Ruzsa inequality:
Theorem 1 (Plünnecke-Ruzsa inequality) Let
be finite non-empty subsets of an additive group
, such that
for all
and some scalars
. Then there exists a subset
of
such that
.
The proof uses graph-theoretic techniques. Setting , we obtain a useful corollary: if
has small doubling in the sense that
, then we have
for all
, where
is the sum of
copies of
.
In a recent paper, I adapted a number of sum set estimates to the entropy setting, in which finite sets such as in
are replaced with discrete random variables
taking values in
, and (the logarithm of) cardinality
of a set
is replaced by Shannon entropy
of a random variable
. (Throughout this note I assume all entropies to be finite.) However, at the time, I was unable to find an entropy analogue of the Plünnecke-Ruzsa inequality, because I did not know how to adapt the graph theory argument to the entropy setting.
I recently discovered, however, that buried in a classic paper of Kaimonovich and Vershik (implicitly in Proposition 1.3, to be precise) there was the following analogue of Theorem 1:
Theorem 2 (Entropy Plünnecke-Ruzsa inequality) Let
be independent random variables of finite entropy taking values in an additive group
, such that
for all
and some scalars
. Then
.
In fact Theorem 2 is a bit “better” than Theorem 1 in the sense that Theorem 1 needed to refine the original set to a subset
, but no such refinement is needed in Theorem 2. One corollary of Theorem 2 is that if
, then
for all
, where
are independent copies of
; this improves slightly over the analogous combinatorial inequality. Indeed, the function
is concave (this can be seen by using the
version of Theorem 2 (or (2) below) to show that the quantity
is decreasing in
).
Theorem 2 is actually a quick consequence of the submodularity inequality
are discrete random variables such that
and
each determine
(i.e.
is a function of
, and also a function of
), and
and
jointly determine
(i.e
is a function of
and
). To apply this, let
be independent discrete random variables taking values in
. Observe that the pairs
and
each determine
, and jointly determine
. Applying (1) we conclude that
which after using the independence of simplifies to the sumset submodularity inequality
case of Theorem 2). As a corollary of this inequality, we see that if
, then
and Theorem 2 follows by telescoping series.
The proof of Theorem 2 seems to be genuinely different from the graph-theoretic proof of Theorem 1. It would be interesting to see if the above argument can be somehow adapted to give a stronger version of Theorem 1. Note also that both Theorem 1 and Theorem 2 have extensions to more general combinations of than
; see this paper and this paper respectively.
Here is a nice version of the periodic table (produced jointly by the Association for the British Pharmaceutical Industry, British Petroleum, the Chemical Industry Education Centre, and the Royal Society for Chemistry) that focuses on the applications of each of the elements, rather than their chemical properties. A simple idea, but remarkably effective in bringing the table to life.
It might be amusing to attempt something similar for mathematics, for instance creating a poster that takes each of the top-level categories in the AMS 2010 Mathematics Subject Classification scheme (or perhaps the arXiv math subject classification), and listing four or five applications of each, one of which would be illustrated by some simple artwork. (Except, of course, for those subfields that are “seldom found in nature”. :-) )
A project like this, which would need expertise both in mathematics and in graphic design, and which could be decomposed into several loosely interacting subprojects, seems amenable to a polymath-type approach; it seems to me that popularisation of mathematics is as valid an application of this paradigm as research mathematics. (Admittedly, there is a danger of “design by committee“, but a polymath project is not quite the same thing as a committee, and it would be an interesting experiment to see the relative strengths and weaknesses of this design method.) I’d be curious to see what readers would think of such an experiment.
[Update, Oct 25: A Math Overflow thread to collect applications of each of the major branches of mathematics has now been formed here, and is already rather active. Please feel free to contribute!]
[Via this post from the Red Ferret, which was suggested to me automatically via Google Reader's recommendation algorithm.]
Yehuda Shalom and I have just uploaded to the arXiv our paper “A finitary version of Gromov’s polynomial growth theorem“, to be submitted to Geom. Func. Anal.. The purpose of this paper is to establish a quantitative version of Gromov’s polynomial growth theorem which, among other things, is meaningful for finite groups. Here is a statement of Gromov’s theorem:
Gromov’s theorem. Let
be a group generated by a finite (symmetric) set
, and suppose that one has the polynomial growth condition
(1)
for all sufficiently large
and some fixed
, where
is the ball of radius
generated by
(i.e. the set of all words in
of length at most
, evaluated in
). Then
is virtually nilpotent, i.e. it has a finite index subgroup
which is nilpotent of some finite step
.
As currently stated, Gromov’s theorem is qualitative rather than quantitative; it does not specify any relationship between the input data (the growth exponent and the range of scales
for which one has (1)), and the output parameters (in particular, the index
of the nilpotent subgroup
of
, and the step
of that subgroup). However, a compactness argument (sketched in this previous blog post) shows that some such relationship must exist; indeed, if one has (1) for all
for some sufficiently large
, then one can ensure
has index at most
and step at most
for some quantities
,
; thus Gromov’s theorem is inherently a “local” result which only requires one to multiply the generator set
a finite number
of times before one sees the virtual nilpotency of the group. However, the compactness argument does not give an explicit value to the quantities
, and the nature of Gromov’s proof (using, in particular, the deep Montgomery-Zippin-Yamabe theory on Hilbert’s fifth problem) does not easily allow such an explicit value to be extracted.
Another point is that the original formulation of Gromov’s theorem required the polynomial bound (1) at all sufficiently large scales . A later proof of this theorem by van den Dries and Wilkie relaxed this hypothesis to requiring (1) just for infinitely many scales
; the later proof by Kleiner (which I blogged about here) also has this relaxed hypothesis.
Our main result reduces the hypothesis (1) to a single large scale, and makes most of the qualitative dependencies in the theorem quantitative:
Theorem 1. If (1) holds for some
for some sufficiently large absolute constant
, then
contains a finite index subgroup
which is nilpotent of step at most
.
The argument does in principle provide a bound on the index of in
, but it is very poor (of Ackermann type). If instead one is willing to relax “nilpotent” to “polycyclic“, the bounds on the index are somewhat better (of tower exponential type), though still far from ideal.
There is a related finitary analogue of Gromov’s theorem by Makarychev and Lee, which asserts that any finite group of uniformly polynomial growth has a subgroup with a large abelianisation. The quantitative bounds in that result are quite strong, but on the other hand the hypothesis is also strong (it requires upper and lower bounds of the form (1) at all scales) and the conclusion is a bit weaker than virtual nilpotency. The argument is based on a modification of Kleiner’s proof.
Our argument also proceeds by modifying Kleiner’s proof of Gromov’s theorem (a significant fraction of which was already quantitative), and carefully removing all of the steps which require one to take an asymptotic limit. To ease this task, we look for the most elementary arguments available for each step of the proof (thus consciously avoiding powerful tools such as the Tits alternative). A key technical issue is that because there is only a single scale for which one has polynomial growth, one has to work at scales significantly less than
in order to have any chance of keeping control of the various groups and other objects being generated.
Below the fold, I discuss a stripped down version of Kleiner’s argument, and then how we convert it to a fully finitary argument.
At UCLA we just concluded our third seminar in our reading of “Stable group theory and approximate subgroups” by Ehud Hrushovski. In this seminar, Isaac Goldbring made some more general remarks about universal saturated models (extending the discussion from the previous seminar), and then Henry Towsner gave some preliminaries on Kiesler measures, in preparation for connecting the main model-theoretic theorem (Theorem 3.4 of Hrushovski) to one of the combinatorial applications (Corollary 1.2 of Hrushovski).
As with the previous post, commentary on any topic related to Hrushovski’s paper is welcome, even if it is not directly related to what is under discussion by the UCLA group. Also, we have a number of questions below which perhaps some of the readers here may be able to help answer.
Note: the notes here are quite rough; corrections are very welcome. Henry’s notes on his part of the seminar can be found here.
(Thanks to Issac Goldbring for comments.)


Recent Comments