This week, Henry Towsner concluded his portion of reading seminar of the Hrushovski paper, by discussing (a weaker, simplified version of) main model-theoretic theorem (Theorem 3.4 of Hrushovski), and described how this theorem implied the combinatorial application in Corollary 1.2 of Hrushovski. The presentation here differs slightly from that in Hrushovski’s paper, for instance by avoiding mention of the more general notions of S1 ideals and forking.

Here is a collection of resources so far on the Hrushovski paper:

Read the rest of this entry »

A fundamental tool in any mathematician’s toolkit is that of reductio ad absurdum: showing that a statement {X} is false by assuming first that {X} is true, and showing that this leads to a logical contradiction. A particulary pure example of reductio ad absurdum occurs when establishing the non-existence of a hypothetically overpowered object or structure {X}, by showing that {X}’s powers are “self-defeating”: the very existence of {X} and its powers can be used (by some clever trick) to construct a counterexample to that power. Perhaps the most well-known example of a self-defeating object comes from the omnipotence paradox in philosophy (“Can an omnipotent being create a rock so heavy that He cannot lift it?”); more generally, a large number of other paradoxes in logic or philosophy can be reinterpreted as a proof that a certain overpowered object or structure does not exist.

In mathematics, perhaps the first example of a self-defeating object one encounters is that of a largest natural number:

Proposition 1 (No largest natural number) There does not exist a natural number {N} which is larger than all other natural numbers.

Proof: Suppose for contradiction that there was such a largest natural number {N}. Then {N+1} is also a natural number which is strictly larger than {N}, contradicting the hypothesis that {N} is the largest natural number. \Box

Note the argument does not apply to the extended natural number system in which one adjoins an additional object {\infty} beyond the natural numbers, because {\infty+1} is defined equal to {\infty}. However, the above argument does show that the existence of a largest number is not compatible with the Peano axioms.

This argument, by the way, is perhaps the only mathematical argument I know of which is routinely taught to primary school children by other primary school children, thanks to the schoolyard game of naming the largest number. It is arguably one’s first exposure to a mathematical non-existence result, which seems innocuous at first but can be surprisingly deep, as such results preclude in advance all future attempts to establish existence of that object, no matter how much effort or ingenuity is poured into this task. One sees this in a typical instance of the above schoolyard game; one player tries furiously to cleverly construct some impressively huge number {N}, but no matter how much effort is expended in doing so, the player is defeated by the simple response “… plus one!” (unless, of course, {N} is infinite, ill-defined, or otherwise not a natural number).

It is not only individual objects (such as natural numbers) which can be self-defeating; structures (such as orderings or enumerations) can also be self-defeating. (In modern set theory, one considers structures to themselves be a kind of object, and so the distinction between the two concepts is often blurred.) Here is one example (related to, but subtly different from, the previous one):

Proposition 2 (The natural numbers cannot be finitely enumerated) The natural numbers {{\Bbb N} = \{0,1,2,3,\ldots\}} cannot be written as {\{ a_1,\ldots,a_n\}} for any finite collection {a_1,\ldots,a_n} of natural numbers.

Proof: Suppose for contradiction that such an enumeration {{\Bbb N} = \{a_1,\ldots,a_n\}} existed. Then consider the number {a_1+\ldots+a_n+1}; this is a natural number, but is larger than (and hence not equal to) any of the natural numbers {a_1,\ldots,a_n}, contradicting the hypothesis that {{\Bbb N}} is enumerated by {a_1,\ldots,a_n}. \Box

Here it is the enumeration which is self-defeating, rather than any individual natural number such as {a_1} or {a_n}. (For this post, we allow enumerations to contain repetitions.)

The above argument may seem trivial, but a slight modification of it already gives an important result, namely Euclid’s theorem:

Proposition 3 (The primes cannot be finitely enumerated) The prime numbers {{\mathcal P} = \{2,3,5,7,\ldots\}} cannot be written as {\{p_1,\ldots,p_n\}} for any finite collection of prime numbers.

Proof: Suppose for contradiction that such an enumeration {{\mathcal P} = \{p_1,\ldots,p_n\}} existed. Then consider the natural number {p_1 \times \ldots \times p_n + 1}; this is a natural number larger than {1} which is not divisible by any of the primes {p_1,\ldots,p_n}. But, by the fundamental theorem of arithmetic (or by the method of Infinite descent, which is another classic application of reductio ad absurdum), every natural number larger than {1} must be divisible by some prime, contradicting the hypothesis that {{\mathcal P}} is enumerated by {p_1,\ldots,p_n}. \Box

Remark 1 Continuing the number-theoretic theme, the “dueling conspiracies” arguments in a previous blog post can also be viewed as an instance of this type of “no-self-defeating-object”; for instance, a zero of the Riemann zeta function at {1+it} implies that the primes anti-correlate almost completely with the multiplicative function {n^{it}}, which is self-defeating because it also implies complete anti-correlation with {n^{-it}}, and the two are incompatible. Thus we see that the prime number theorem (a much stronger version of Proposition 3) also emerges from this type of argument.

In this post I would like to collect several other well-known examples of this type of “no self-defeating object” argument. Each of these is well studied, and probably quite familiar to many of you, but I feel that by collecting them all in one place, the commonality of theme between these arguments becomes more apparent. (For instance, as we shall see, many well-known “paradoxes” in logic and philosophy can be interpreted mathematically as a rigorous “no self-defeating object” argument.)

Read the rest of this entry »

As the previous discussion on displaying mathematics on the web has become quite lengthy, I am opening a fresh post to continue the topic.  I’m leaving the previous thread open for those who wish to respond directly to some specific comments in that thread, but otherwise it would be preferable to start afresh on this thread to make it easier to follow the discussion.

It’s not easy to summarise the discussion so far, but the comments have identified several existing formats for displaying (and marking up) mathematics on the web (mathMLjsMath, MathJaxOpenMath), as well as a surprisingly large number of tools for converting mathematics into web friendly formats (e.g.  LaTeX2HTMLLaTeXMathML, LaTeX2WPWindows 7 Math Inputitex2MMLRitexGellmumathTeXWP-LaTeXTeX4htblahtexplastexTtHWebEQtechexplorer, etc.).  Some of the formats are not widely supported by current software, and by current browsers in particular, but it seems that the situation will improve with the next generation of these browsers.

It seems that the tools that already exist are enough to improvise a passable way of displaying mathematics in various formats online, though there are still significant issues with accessibility, browser support, and ease of use.  Even if all these issues are resolved, though, I still feel that something is still missing.    Currently, if I want to transfer some mathematical content from one location to another (e.g. from a LaTeX file to a blog, or from a wiki to a PDF, or from email to an online document, or whatever), or to input some new piece of mathematics, I have to think about exactly what format I need for the task at hand, and what conversion tool may be needed.  In contrast, if one looks at non-mathematical content such as text, links, fonts, non-Latin alphabets, colours, tables, images, or even video, the formats here have been standardised, and one can manipulate this type of content in both online and offline formats more or less seamlessly (in principle, at least – there is still room for improvement), without the need for any particularly advanced technical expertise.  It doesn’t look like we’re anywhere near that level currently with regards to mathematical content, though presumably things will improve when a single mathematics presentation standard, such as mathML, becomes universally adopted and supported in browsers, in operating systems, and in other various pieces of auxiliary software.

Anyway, it has been a very interesting and educational discussion for me, and hopefully for others also; I look forward to any further thoughts that readers have on these topics.  (Also, feel free to recapitulate some of the points from the previous thread; the discussion has been far too multifaceted for me to attempt a coherent summary by myself.)

This week, Henry Towsner continued some model-theoretic preliminaries for the reading seminar of the Hrushovski paper, particularly regarding the behaviour of wide types, leading up to the main model-theoretic theorem (Theorem 3.4 of Hrushovski) which in turn implies the various combinatorial applications (such as Corollary 1.2 of Hrushovski). Henry’s notes can be found here.

A key theme here is the phenomenon that any pair of large sets contained inside a definable set of finite measure (such as {X \cdot X^{-1}}) must intersect if they are sufficiently “generic”; the notion of a wide type is designed, in part, to capture this notion of genericity.

Read the rest of this entry »

The various languages and formats that make up modern web pages (HTML, XHTML, CSS, etc.) work wonderfully for most purposes, but there is one place where they are still somewhat clunky, namely in the presentation of mathematical equations and diagrams on web pages. While web formats do support very simple mathematical typesetting (such as the usage of basic symbols such as π, or superscripts such as x2), it is difficult to create more sophisticated (and non-ugly) mathematical displays, such as

\displaystyle \hbox{det} \begin{pmatrix} 1 & x_1 & \ldots & x_1^{n-1} \\ 1 & x_2 & \ldots & x_2^{n-1} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & x_n & \ldots & x_n^{n-1} \end{pmatrix} = \prod_{1 \leq i < j \leq n} (x_j - x_i)

without some additional layer of software (in this case, WordPress’s LaTeX renderer). These type of ad hoc fixes work, up to a point, but several difficulties still remain. For instance:

  1. There is no standardisation with regard to mathematics displays. For instance, WordPress uses $latex and $ to indicate a mathematics display, Wikipedia uses <math> and </math>, the current experimental Google Wave plugins use $$ and $$, and so forth.
  2. Mathematical formulae need to be compiled from a plain text language (much as with LaTeX), rather than edited directly on a visual editor. This is in contrast to other HTML elements, such as links, boldface, colors, etc.
  3. One cannot easily cut and paste a portion of a web page containing maths displays into another page or file (although with WordPress’s format, things are not so bad as the raw LaTeX code will be captured as plain text). Again, this is in contrast to other HTML elements, which can be cut and pasted quite easily.
  4. Currently, mathematical displays are usually rendered as static images and thus cannot be easily edited without recompiling the source code for that display. A related issue is that the images do not automatically resize when the browser scale changes; also, in some cases they do not blend well with the background colour scheme for the page.
  5. It is difficult to take an extended portion of LaTeX and convert it into a web page or vice versa, although tools such as Luca Trevisan’s LaTeX to WordPress converter achieve a heroic (and very useful) level of partial success in this regard.

There are a number of extensions to the existing web languages that have been proposed to address some of these difficulties, the most well known of which is probably MathML, which is used for instance in the n-Category Café. So far, though, adoption of the MathML standard (and development of editors and other tools to take advantage of this standard) seems to not be too widespread at present.

I’d like to open a discussion, then, about what kinds of changes to the current web standards could help facilitate the easier use of mathematical displays on web pages. (I’m indirectly in contact with some people involved in these standards, so if some interesting discussions arise here, I can try to pass them on.)

A handy inequality in additive combinatorics is the Plünnecke-Ruzsa inequality:

Theorem 1 (Plünnecke-Ruzsa inequality) Let {A, B_1, \ldots, B_m} be finite non-empty subsets of an additive group {G}, such that {|A+B_i| \leq K_i |A|} for all {1 \leq i \leq m} and some scalars {K_1,\ldots,K_m \geq 1}. Then there exists a subset {A'} of {A} such that {|A' + B_1 + \ldots + B_m| \leq K_1 \ldots K_m |A'|}.

The proof uses graph-theoretic techniques. Setting {A=B_1=\ldots=B_m}, we obtain a useful corollary: if {A} has small doubling in the sense that {|A+A| \leq K|A|}, then we have {|mA| \leq K^m |A|} for all {m \geq 1}, where {mA = A + \ldots + A} is the sum of {m} copies of {A}.

In a recent paper, I adapted a number of sum set estimates to the entropy setting, in which finite sets such as {A} in {G} are replaced with discrete random variables {X} taking values in {G}, and (the logarithm of) cardinality {|A|} of a set {A} is replaced by Shannon entropy {{\Bbb H}(X)} of a random variable {X}. (Throughout this note I assume all entropies to be finite.) However, at the time, I was unable to find an entropy analogue of the Plünnecke-Ruzsa inequality, because I did not know how to adapt the graph theory argument to the entropy setting.

I recently discovered, however, that buried in a classic paper of Kaimonovich and Vershik (implicitly in Proposition 1.3, to be precise) there was the following analogue of Theorem 1:

Theorem 2 (Entropy Plünnecke-Ruzsa inequality) Let {X, Y_1, \ldots, Y_m} be independent random variables of finite entropy taking values in an additive group {G}, such that {{\Bbb H}(X+Y_i) \leq {\Bbb H}(X) + \log K_i} for all {1 \leq i \leq m} and some scalars {K_1,\ldots,K_m \geq 1}. Then {{\Bbb H}(X+Y_1+\ldots+Y_m) \leq {\Bbb H}(X) + \log K_1 \ldots K_m}.

In fact Theorem 2 is a bit “better” than Theorem 1 in the sense that Theorem 1 needed to refine the original set {A} to a subset {A'}, but no such refinement is needed in Theorem 2. One corollary of Theorem 2 is that if {{\Bbb H}(X_1+X_2) \leq {\Bbb H}(X) + \log K}, then {{\Bbb H}(X_1+\ldots+X_m) \leq {\Bbb H}(X) + (m-1) \log K} for all {m \geq 1}, where {X_1,\ldots,X_m} are independent copies of {X}; this improves slightly over the analogous combinatorial inequality. Indeed, the function {m \mapsto {\Bbb H}(X_1+\ldots+X_m)} is concave (this can be seen by using the {m=2} version of Theorem 2 (or (2) below) to show that the quantity {{\Bbb H}(X_1+\ldots+X_{m+1})-{\Bbb H}(X_1+\ldots+X_m)} is decreasing in {m}).

Theorem 2 is actually a quick consequence of the submodularity inequality

\displaystyle  {\Bbb H}(W) + {\Bbb H}(X) \leq {\Bbb H}(Y) + {\Bbb H}(Z) \ \ \ \ \ (1)

in information theory, which is valid whenever {X,Y,Z,W} are discrete random variables such that {Y} and {Z} each determine {X} (i.e. {X} is a function of {Y}, and also a function of {Z}), and {Y} and {Z} jointly determine {W} (i.e {W} is a function of {Y} and {Z}). To apply this, let {X, Y, Z} be independent discrete random variables taking values in {G}. Observe that the pairs {(X,Y+Z)} and {(X+Y,Z)} each determine {X+Y+Z}, and jointly determine {(X,Y,Z)}. Applying (1) we conclude that

\displaystyle  {\Bbb H}(X,Y,Z) + {\Bbb H}(X+Y+Z) \leq {\Bbb H}(X,Y+Z) + {\Bbb H}(X+Y,Z)

which after using the independence of {X,Y,Z} simplifies to the sumset submodularity inequality

\displaystyle  {\Bbb H}(X+Y+Z) + {\Bbb H}(Y) \leq {\Bbb H}(X+Y) + {\Bbb H}(Y+Z) \ \ \ \ \ (2)

(this inequality was also recently observed by Madiman; it is the {m=2} case of Theorem 2). As a corollary of this inequality, we see that if {{\Bbb H}(X+Y_i) \leq {\Bbb H}(X) + \log K_i}, then

\displaystyle  {\Bbb H}(X+Y_1+\ldots+Y_i) \leq {\Bbb H}(X+Y_1+\ldots+Y_{i-1}) + \log K_i,

and Theorem 2 follows by telescoping series.

The proof of Theorem 2 seems to be genuinely different from the graph-theoretic proof of Theorem 1. It would be interesting to see if the above argument can be somehow adapted to give a stronger version of Theorem 1. Note also that both Theorem 1 and Theorem 2 have extensions to more general combinations of {X,Y_1,\ldots,Y_m} than {X+Y_i}; see this paper and this paper respectively.

Read the rest of this entry »

Here is a nice version of the periodic table (produced jointly by the Association for the British Pharmaceutical Industry, British Petroleum, the Chemical Industry Education Centre, and the Royal Society for Chemistry) that focuses on the applications of each of the elements, rather than their chemical properties.  A simple idea, but remarkably effective in bringing the table to life.

elements

It might be amusing to attempt something similar for mathematics, for instance creating a poster that takes each of the top-level categories in the AMS 2010 Mathematics Subject Classification scheme (or perhaps the arXiv math subject classification), and listing four or five applications of each, one of which would be illustrated by some simple artwork.  (Except, of course, for those subfields that are “seldom found in nature”. :-) )

A project like this, which would need expertise both in mathematics and in graphic design, and which could be decomposed into several loosely interacting subprojects, seems amenable to a polymath-type approach; it seems to me that popularisation of mathematics is as valid an application of this paradigm as research mathematics.   (Admittedly, there is a danger of “design by committee“, but a polymath project is not quite the same thing as a committee, and it would be an interesting experiment to see the relative strengths and weaknesses of this design method.)   I’d be curious to see what readers would think of such an experiment.

[Update, Oct 25: A Math Overflow thread to collect applications of each of the major branches of mathematics has now been formed here, and is already rather active.  Please feel free to contribute!]

[Via this post from the Red Ferret, which was suggested to me automatically via Google Reader's recommendation algorithm.]

Yehuda Shalom and I have just uploaded to the arXiv our paper “A finitary version of Gromov’s polynomial growth theorem“, to be submitted to Geom. Func. Anal..  The purpose of this paper is to establish a quantitative version of Gromov’s polynomial growth theorem which, among other things, is meaningful for finite groups.   Here is a statement of Gromov’s theorem:

Gromov’s theorem. Let G be a group generated by a finite (symmetric) set S, and suppose that one has the polynomial growth condition

|B_S(R)| \leq R^d (1)

for all sufficiently large R and some fixed d, where B_S(R) is the ball of radius R generated by S (i.e. the set of all words in S of length at most d, evaluated in G).  Then G is virtually nilpotent, i.e. it has a finite index subgroup H which is nilpotent of some finite step s.

As currently stated, Gromov’s theorem is qualitative rather than quantitative; it does not specify any relationship between the input data (the growth exponent d and the range of scales R for which one has (1)), and the output parameters (in particular, the index |G/H| of the nilpotent subgroup H of G, and the step s of that subgroup).  However, a compactness argument (sketched in this previous blog post) shows that some such relationship must exist; indeed, if one has (1) for all R_0 \leq R \leq C( R_0, d ) for some sufficiently large C(R_0,d), then one can ensure H has index at most C'(R_0,d) and step at most C''(R_0,d) for some quantities C'(R_0,d), C''(R_0,d); thus Gromov’s theorem is inherently a “local” result which only requires one to multiply the generator set S a finite number C(R_0,d) of times before one sees the virtual nilpotency of the group.  However, the compactness argument does not give an explicit value to the quantities C(R_0,d), C'(R_0,d), C''(R_0,d), and the nature of Gromov’s proof (using, in particular, the deep Montgomery-Zippin-Yamabe theory on Hilbert’s fifth problem) does not easily allow such an explicit value to be extracted.

Another point is that the original formulation of Gromov’s theorem required the polynomial bound (1) at all sufficiently large scales R.  A later proof of this theorem by van den Dries and Wilkie relaxed this hypothesis to requiring (1) just for infinitely many scales R; the later proof by Kleiner (which I blogged about here) also has this relaxed hypothesis.

Our main result reduces the hypothesis (1) to a single large scale, and makes most of the qualitative dependencies in the theorem quantitative:

Theorem 1. If (1) holds for some R > \exp(\exp(C d^C)) for some sufficiently large absolute constant C, then G contains a finite index subgroup H which is nilpotent of step at most C^d.

The argument does in principle provide a bound on the index of H in G, but it is very poor (of Ackermann type).  If instead one is willing to relax “nilpotent” to “polycyclic“, the bounds on the index are somewhat better (of tower exponential type), though still far from ideal.

There is a related finitary analogue of Gromov’s theorem by Makarychev and Lee, which asserts that any finite group of uniformly polynomial growth has a subgroup with a large abelianisation.  The quantitative bounds in that result are quite strong, but on the other hand the hypothesis is also strong (it requires upper and lower bounds of the form (1) at all scales) and the conclusion is a bit weaker than virtual nilpotency.  The argument is based on a modification of Kleiner’s proof.

Our argument also proceeds by modifying Kleiner’s proof of Gromov’s theorem (a significant fraction of which was already quantitative), and carefully removing all of the steps which require one to take an asymptotic limit.  To ease this task, we look for the most elementary arguments available for each step of the proof (thus consciously avoiding powerful tools such as the Tits alternative).  A key technical issue is that because there is only a single scale R for which one has polynomial growth, one has to work at scales significantly less than R in order to have any chance of keeping control of the various groups and other objects being generated.

Below the fold, I discuss a stripped down version of Kleiner’s argument, and then how we convert it to a fully finitary argument.

Read the rest of this entry »

At UCLA we just concluded our third seminar in our reading of “Stable group theory and approximate subgroups” by Ehud Hrushovski. In this seminar, Isaac Goldbring made some more general remarks about universal saturated models (extending the discussion from the previous seminar), and then Henry Towsner gave some preliminaries on Kiesler measures, in preparation for connecting the main model-theoretic theorem (Theorem 3.4 of Hrushovski) to one of the combinatorial applications (Corollary 1.2 of Hrushovski).

As with the previous post, commentary on any topic related to Hrushovski’s paper is welcome, even if it is not directly related to what is under discussion by the UCLA group. Also, we have a number of questions below which perhaps some of the readers here may be able to help answer.

Note: the notes here are quite rough; corrections are very welcome. Henry’s notes on his part of the seminar can be found here.

(Thanks to Issac Goldbring for comments.)

Read the rest of this entry »

The polymath1 project has just uploaded to the arXiv the paper “A new proof of the density Hales-Jewett theorem“, to be submitted shortly.  Special thanks here go to Ryan O’Donnell for performing the lion’s share of the writing up of the results, and to Tim Gowers for running a highly successful online mathematical experiment.

I’ll state the main result in the first non-trivial case k=3 for simplicity, though the methods extend surprisingly easily to higher k (but with significantly worse bounds).  Let c_{n,3} be the size of the largest subset of the cube [3]^n = \{1,2,3\}^n that does not contain any combinatorial line.    The density Hales-Jewett theorem of Furstenberg and Katznelson shows that c_{n,3} = o(3^n).  In the course of the Polymath1 project, the explicit values

c_{0,3} = 1; c_{1,3} = 2; c_{2,3} = 6; c_{3,3} = 18; c_{4,3} = 52; c_{5,3} =150; c_{6,3} = 450

were established, as well as the asymptotic lower bound

c_{n,3} \geq 3^n \exp( - O( \sqrt{ \log n } ) )

(actually we have a slightly more precise bound than this).  The main result of this paper is then

Theorem. (k=3 version) c_{n,3} \ll 3^n / \log^{1/2}_* n.

Here \log_* n is the inverse tower exponential function; it is the number of times one has to take (natural) logarithms until one drops below 1.  So it does go to infinity, but extremely slowly.  Nevertheless, this is the first explicitly quantitative version of the density Hales-Jewett theorem.

The argument is based on the density increment argument as pioneered by Roth, and also used in later papers of Ajtai-Szemerédi and Shkredov on the corners problem, which was also influential in our current work (though, perhaps paradoxically, the generality of our setting makes our argument simpler than the above arguments, in particular allowing one to avoid use of the Fourier transform, regularity lemma, or Szemerédi’s theorem).   I discuss the argument in the first part of this previous blog post.

I’ll end this post with an open problem.  In our paper, we cite the work of P. L. Varnavides, who was the first to observe the elementary averaging argument that showed that Roth’s theorem (which showed that dense sets of integers contained at least one progression of length three) could be amplified (to show that there was in some sense a “dense” set of arithmetic progressions of length three).  However, despite much effort, we were not able to expand “P.” into the first name.  As one final task of the Polymath1 project, perhaps some readers with skills in detective work could lend a hand in finding out what Varnavides’ first name was? Update, Oct 22: Mystery now solved; see comments.

Categories