You are currently browsing the monthly archive for October 2009.

This week, Henry Towsner continued some model-theoretic preliminaries for the reading seminar of the Hrushovski paper, particularly regarding the behaviour of wide types, leading up to the main model-theoretic theorem (Theorem 3.4 of Hrushovski) which in turn implies the various combinatorial applications (such as Corollary 1.2 of Hrushovski). Henry’s notes can be found here.

A key theme here is the phenomenon that any pair of large sets contained inside a definable set of finite measure (such as ) must intersect if they are sufficiently “generic”; the notion of a wide type is designed, in part, to capture this notion of genericity.

The various languages and formats that make up modern web pages (HTML, XHTML, CSS, etc.) work wonderfully for most purposes, but there is one place where they are still somewhat clunky, namely in the presentation of mathematical equations and diagrams on web pages. While web formats do support very simple mathematical typesetting (such as the usage of basic symbols such as π, or superscripts such as *x ^{2}*), it is difficult to create more sophisticated (and non-ugly) mathematical displays, such as

without some additional layer of software (in this case, WordPress’s LaTeX renderer). These type of *ad hoc* fixes work, up to a point, but several difficulties still remain. For instance:

- There is no standardisation with regard to mathematics displays. For instance, WordPress uses $latex and $ to indicate a mathematics display, Wikipedia uses <math> and </math>, the current experimental Google Wave plugins use $$ and $$, and so forth.
- Mathematical formulae need to be compiled from a plain text language (much as with LaTeX), rather than edited directly on a visual editor. This is in contrast to other HTML elements, such as links,
**boldface**, colors, etc. - One cannot easily cut and paste a portion of a web page containing maths displays into another page or file (although with WordPress’s format, things are not so bad as the raw LaTeX code will be captured as plain text). Again, this is in contrast to other HTML elements, which can be cut and pasted quite easily.
- Currently, mathematical displays are usually rendered as static images and thus cannot be easily edited without recompiling the source code for that display. A related issue is that the images do not automatically resize when the browser scale changes; also, in some cases they do not blend well with the background colour scheme for the page.
- It is difficult to take an extended portion of LaTeX and convert it into a web page or vice versa, although tools such as Luca Trevisan’s LaTeX to WordPress converter achieve a heroic (and very useful) level of partial success in this regard.

There are a number of extensions to the existing web languages that have been proposed to address some of these difficulties, the most well known of which is probably MathML, which is used for instance in the n-Category Café. So far, though, adoption of the MathML standard (and development of editors and other tools to take advantage of this standard) seems to not be too widespread at present.

I’d like to open a discussion, then, about what kinds of changes to the current web standards could help facilitate the easier use of mathematical displays on web pages. (I’m indirectly in contact with some people involved in these standards, so if some interesting discussions arise here, I can try to pass them on.)

A handy inequality in additive combinatorics is the Plünnecke-Ruzsa inequality:

Theorem 1 (Plünnecke-Ruzsa inequality)Let be finite non-empty subsets of an additive group , such that for all and some scalars . Then there exists a subset of such that .

The proof uses graph-theoretic techniques. Setting , we obtain a useful corollary: if has small doubling in the sense that , then we have for all , where is the sum of copies of .

In a recent paper, I adapted a number of sum set estimates to the entropy setting, in which finite sets such as in are replaced with discrete random variables taking values in , and (the logarithm of) cardinality of a set is replaced by Shannon entropy of a random variable . (Throughout this note I assume all entropies to be finite.) However, at the time, I was unable to find an entropy analogue of the Plünnecke-Ruzsa inequality, because I did not know how to adapt the graph theory argument to the entropy setting.

I recently discovered, however, that buried in a classic paper of Kaimonovich and Vershik (implicitly in Proposition 1.3, to be precise) there was the following analogue of Theorem 1:

Theorem 2 (Entropy Plünnecke-Ruzsa inequality)Let be independent random variables of finite entropy taking values in an additive group , such that for all and some scalars . Then .

In fact Theorem 2 is a bit “better” than Theorem 1 in the sense that Theorem 1 needed to refine the original set to a subset , but no such refinement is needed in Theorem 2. One corollary of Theorem 2 is that if , then for all , where are independent copies of ; this improves slightly over the analogous combinatorial inequality. Indeed, the function is concave (this can be seen by using the version of Theorem 2 (or (2) below) to show that the quantity is decreasing in ).

Theorem 2 is actually a quick consequence of the *submodularity inequality*

in information theory, which is valid whenever are discrete random variables such that and each determine (i.e. is a function of , and also a function of ), and and jointly determine (i.e is a function of and ). To apply this, let be independent discrete random variables taking values in . Observe that the pairs and each determine , and jointly determine . Applying (1) we conclude that

which after using the independence of simplifies to the *sumset submodularity inequality*

(this inequality was also recently observed by Madiman; it is the case of Theorem 2). As a corollary of this inequality, we see that if , then

and Theorem 2 follows by telescoping series.

The proof of Theorem 2 seems to be genuinely different from the graph-theoretic proof of Theorem 1. It would be interesting to see if the above argument can be somehow adapted to give a stronger version of Theorem 1. Note also that both Theorem 1 and Theorem 2 have extensions to more general combinations of than ; see this paper and this paper respectively.

Here is a nice version of the periodic table (produced jointly by the Association for the British Pharmaceutical Industry, British Petroleum, the Chemical Industry Education Centre, and the Royal Society for Chemistry) that focuses on the applications of each of the elements, rather than their chemical properties. A simple idea, but remarkably effective in bringing the table to life.

It might be amusing to attempt something similar for mathematics, for instance creating a poster that takes each of the top-level categories in the AMS 2010 Mathematics Subject Classification scheme (or perhaps the arXiv math subject classification), and listing four or five applications of each, one of which would be illustrated by some simple artwork. (Except, of course, for those subfields that are “seldom found in nature”. :-) )

A project like this, which would need expertise both in mathematics and in graphic design, and which could be decomposed into several loosely interacting subprojects, seems amenable to a polymath-type approach; it seems to me that popularisation of mathematics is as valid an application of this paradigm as research mathematics. (Admittedly, there is a danger of “design by committee“, but a polymath project is not quite the same thing as a committee, and it would be an interesting experiment to see the relative strengths and weaknesses of this design method.) I’d be curious to see what readers would think of such an experiment.

[*Update*, Oct 25: A Math Overflow thread to collect applications of each of the major branches of mathematics has now been formed here, and is already rather active. Please feel free to contribute!]

[Via this post from the Red Ferret, which was suggested to me automatically via Google Reader's recommendation algorithm.]

Yehuda Shalom and I have just uploaded to the arXiv our paper “A finitary version of Gromov’s polynomial growth theorem“, to be submitted to Geom. Func. Anal.. The purpose of this paper is to establish a quantitative version of Gromov’s polynomial growth theorem which, among other things, is meaningful for finite groups. Here is a statement of Gromov’s theorem:

Gromov’s theorem.Let be a group generated by a finite (symmetric) set , and suppose that one has the polynomial growth condition(1)

for all sufficiently large and some fixed , where is the ball of radius generated by (i.e. the set of all words in of length at most , evaluated in ). Then is virtually nilpotent, i.e. it has a finite index subgroup which is nilpotent of some finite step .

As currently stated, Gromov’s theorem is qualitative rather than quantitative; it does not specify any relationship between the input data (the growth exponent and the range of scales for which one has (1)), and the output parameters (in particular, the index of the nilpotent subgroup of , and the step of that subgroup). However, a compactness argument (sketched in this previous blog post) shows that some such relationship must exist; indeed, if one has (1) for all for some sufficiently large , then one can ensure has index at most and step at most for some quantities , ; thus Gromov’s theorem is inherently a “local” result which only requires one to multiply the generator set a finite number of times before one sees the virtual nilpotency of the group. However, the compactness argument does not give an explicit value to the quantities , and the nature of Gromov’s proof (using, in particular, the deep Montgomery-Zippin-Yamabe theory on Hilbert’s fifth problem) does not easily allow such an explicit value to be extracted.

Another point is that the original formulation of Gromov’s theorem required the polynomial bound (1) at *all* sufficiently large scales . A later proof of this theorem by van den Dries and Wilkie relaxed this hypothesis to requiring (1) just for infinitely many scales ; the later proof by Kleiner (which I blogged about here) also has this relaxed hypothesis.

Our main result reduces the hypothesis (1) to a single large scale, and makes most of the qualitative dependencies in the theorem quantitative:

Theorem 1.If (1) holds for some for some sufficiently large absolute constant , then contains a finite index subgroup which is nilpotent of step at most .

The argument does in principle provide a bound on the index of in , but it is very poor (of Ackermann type). If instead one is willing to relax “nilpotent” to “polycyclic“, the bounds on the index are somewhat better (of tower exponential type), though still far from ideal.

There is a related finitary analogue of Gromov’s theorem by Makarychev and Lee, which asserts that any finite group of uniformly polynomial growth has a subgroup with a large abelianisation. The quantitative bounds in that result are quite strong, but on the other hand the hypothesis is also strong (it requires upper and lower bounds of the form (1) at all scales) and the conclusion is a bit weaker than virtual nilpotency. The argument is based on a modification of Kleiner’s proof.

Our argument also proceeds by modifying Kleiner’s proof of Gromov’s theorem (a significant fraction of which was already quantitative), and carefully removing all of the steps which require one to take an asymptotic limit. To ease this task, we look for the most elementary arguments available for each step of the proof (thus consciously avoiding powerful tools such as the Tits alternative). A key technical issue is that because there is only a single scale for which one has polynomial growth, one has to work at scales significantly less than in order to have any chance of keeping control of the various groups and other objects being generated.

Below the fold, I discuss a stripped down version of Kleiner’s argument, and then how we convert it to a fully finitary argument.

At UCLA we just concluded our third seminar in our reading of “Stable group theory and approximate subgroups” by Ehud Hrushovski. In this seminar, Isaac Goldbring made some more general remarks about universal saturated models (extending the discussion from the previous seminar), and then Henry Towsner gave some preliminaries on Kiesler measures, in preparation for connecting the main model-theoretic theorem (Theorem 3.4 of Hrushovski) to one of the combinatorial applications (Corollary 1.2 of Hrushovski).

As with the previous post, commentary on any topic related to Hrushovski’s paper is welcome, even if it is not directly related to what is under discussion by the UCLA group. Also, we have a number of questions below which perhaps some of the readers here may be able to help answer.

Note: the notes here are quite rough; corrections are very welcome. Henry’s notes on his part of the seminar can be found here.

(Thanks to Issac Goldbring for comments.)

The polymath1 project has just uploaded to the arXiv the paper “A new proof of the density Hales-Jewett theorem“, to be submitted shortly. Special thanks here go to Ryan O’Donnell for performing the lion’s share of the writing up of the results, and to Tim Gowers for running a highly successful online mathematical experiment.

I’ll state the main result in the first non-trivial case for simplicity, though the methods extend surprisingly easily to higher (but with significantly worse bounds). Let be the size of the largest subset of the cube that does not contain any combinatorial line. The density Hales-Jewett theorem of Furstenberg and Katznelson shows that . In the course of the Polymath1 project, the explicit values

were established, as well as the asymptotic lower bound

(actually we have a slightly more precise bound than this). The main result of this paper is then

Theorem.( version) .

Here is the inverse tower exponential function; it is the number of times one has to take (natural) logarithms until one drops below 1. So it does go to infinity, but extremely slowly. Nevertheless, this is the first explicitly quantitative version of the density Hales-Jewett theorem.

The argument is based on the density increment argument as pioneered by Roth, and also used in later papers of Ajtai-Szemerédi and Shkredov on the corners problem, which was also influential in our current work (though, perhaps paradoxically, the generality of our setting makes our argument *simpler* than the above arguments, in particular allowing one to avoid use of the Fourier transform, regularity lemma, or Szemerédi’s theorem). I discuss the argument in the first part of this previous blog post.

I’ll end this post with an open problem. In our paper, we cite the work of P. L. Varnavides, who was the first to observe the elementary averaging argument that showed that Roth’s theorem (which showed that dense sets of integers contained at least one progression of length three) could be amplified (to show that there was in some sense a “dense” set of arithmetic progressions of length three). However, despite much effort, we were not able to expand “P.” into the first name. As one final task of the Polymath1 project, perhaps some readers with skills in detective work could lend a hand in finding out what Varnavides’ first name was? *Update, Oct 22:* Mystery now solved; see comments.

I’m lagging behind the rest of the maths blog community in reporting this, but there is an interesting (and remarkably active) new online maths experiment that has just been set up, called Math Overflow, in which participants can ask and answer research maths questions (though homework questions are discouraged). It reminds me to some extent of the venerable newsgroup sci.math, but with more modern, “Web 2.0″ features (for instance, participants can earn “points” for answering questions or rating comments, which then give administrative privileges, which seems to encourage participation). The activity and turnover rate is quite remarkable: perhaps an order of magnitude higher than a typical maths blog, and two orders higher than a typical maths wiki. It’s not clear that the model is transferable to these two settings, though.

There is an active discussion of Math Overflow over at the Secret Blogging Seminar. I don’t have much to add to that discussion, except to say that I am happy to see continued experimentation in various online mathematics formats; we still don’t fully understand what makes an online experiment succeed or fail (or get stuck halfway between the two extremes), and more data points like this are very valuable.

In his wonderful article “On proof and progress in mathematics“, Bill Thurston describes (among many other topics) how one’s understanding of given concept in mathematics (such as that of the derivative) can be vastly enriched by viewing it simultaneously from many subtly different perspectives; in the case of the derivative, he gives seven standard such perspectives (infinitesimal, symbolic, logical, geometric, rate, approximation, microscopic) and then mentions a much later perspective in the sequence (as describing a flat connection for a graph).

One can of course do something similar for many other fundamental notions in mathematics. For instance, the notion of a group can be thought of in a number of (closely related) ways, such as the following:

- (0) Motivating examples: A group is an abstraction of the operations of addition/subtraction or multiplication/division in arithmetic or linear algebra, or of composition/inversion of transformations.
- (1) Universal algebraic: A group is a set with an identity element , a unary inverse operation , and a binary multiplication operation obeying the relations (or axioms) , , for all .
- (2) Symmetric: A group is all the ways in which one can transform a space to itself while preserving some object or structure on this space.
- (3) Representation theoretic: A group is identifiable with a collection of transformations on a space which is closed under composition and inverse, and contains the identity transformation.
- (4) Presentation theoretic: A group can be generated by a collection of generators subject to some number of relations.
- (5) Topological: A group is the fundamental group of a connected topological space .
- (6) Dynamic: A group represents the passage of time (or of some other variable(s) of motion or action) on a (reversible) dynamical system.
- (7) Category theoretic: A group is a category with one object, in which all morphisms have inverses.
- (8) Quantum: A group is the classical limit of a quantum group.
- etc.

One can view a large part of group theory (and related subjects, such as representation theory) as exploring the interconnections between various of these perspectives. As one’s understanding of the subject matures, many of these formerly distinct perspectives slowly merge into a single unified perspective.

From a recent talk by Ezra Getzler, I learned a more sophisticated perspective on a group, somewhat analogous to Thurston’s example of a sophisticated perspective on a derivative (and coincidentally, flat connections play a central role in both):

- (37) Sheaf theoretic: A group is identifiable with a (set-valued) sheaf on the category of simplicial complexes such that the morphisms associated to collapses of -simplices are bijective for (and merely surjective for ).

This interpretation of the group concept is apparently due to Grothendieck, though it is motivated also by homotopy theory. One of the key advantages of this interpretation is that it generalises easily to the notion of an -group (simply by replacing with in (37)), whereas the other interpretations listed earlier require a certain amount of subtlety in order to generalise correctly (in particular, they usually themselves require higher-order notions, such as -categories).

The connection of (37) with any of the other perspectives of a group is elementary, but not immediately obvious; I enjoyed working out exactly what the connection was, and thought it might be of interest to some readers here, so I reproduce it below the fold.

[Note: my reconstruction of Grothendieck's perspective, and of the appropriate terminology, is likely to be somewhat inaccurate in places: corrections are of course very welcome.]

One of my favorite open problems, which I have blogged about in the past, is that of establishing (or even correctly formulating) a non-commutative analogue of Freiman’s theorem. Roughly speaking, the question is this: given a finite set in a non-commutative group which is of *small doubling* in the sense that the product set is not much larger than (e.g. for some ), what does this say about the structure of ? (For various technical reasons one may wish to replace small doubling by, say, small tripling (i.e. ), and one may also wish to assume that contains the identity and is symmetric, , but these are relatively minor details.)

Sets of small doubling (or tripling), etc. can be thought of as “approximate groups”, since groups themselves have a doubling constant equal to one. Another obvious example of an approximate group is that of an arithmetic progression in an additive group, and more generally of a ball (in the word metric) in a nilpotent group of bounded rank and step. It is tentatively conjectured that in fact all examples can somehow be “generated” out of these basic examples, although it is not fully clear at present what “generated” should mean.

A weaker conjecture along the same lines is that if is a set of small doubling, then there should be some sort of “pseudo-metric” on which is left-invariant, and for which is controlled (in some suitable sense) by the unit ball in this metric. (For instance, if was a subgroup of , one would take the metric which identified all the left cosets of to a point, but was otherwise a discrete metric; if were a ball in a nilpotent group, one would use some rescaled version of the word metric, and so forth.) Actually for technical reasons one would like to work with a slightly weaker notion than a pseudo-metric, namely a *Bourgain system*, but let us again ignore this technicality here.

Recently, using some powerful tools from model theory combined with the theory of topological groups, Ehud Hrushovski has apparently achieved some breakthroughs on this problem, obtaining new structural control on sets of small doubling in arbitrary groups that was not previously accessible to the known combinatorial methods. The precise results are technical to state, but here are informal versions of two typical theorems. The first applies to sets of small tripling in an arbitrary group:

Theorem 1(Rough version of Hrushovski Theorem 1.1) Let be a set of small tripling, then one can find a long sequence of nested symmetric sets , all of size comparable to and contained in , which are somewhat closed under multiplication in the sense that for all , and which are fairly well closed under commutation in the sense that . (There are also some additional statements to the effect that the efficiently cover each other, and also cover , but I will omit those here.)

This nested sequence is somewhat analogous to a Bourgain system, though it is not quite the same notion.

If one assumes that is “perfect” in a certain sense, which roughly means that there is no non-trivial abelian quotient, then one can do significantly better:

Theorem 2(Rough version of Hrushovski Corollary 1.2) Let be a set of small tripling, let , and suppose that for almost all -tuples (where ), the conjugacy classes generate most of in the sense that . Then a large part of is contained in a subgroup of size comparable to .

Note that if one quotiented out by the commutator , then all of the conjugacy classes would collapse to points. So the hypothesis here is basically a strong quantitative assertion to the effect that the commutator is extremely large, and rapidly fills out most of itself.

Here at UCLA, a group of logicians and I (consisting of Matthias Aschenbrenner, Isaac Goldbring, Greg Hjorth, Henry Towsner, Anush Tserunyan, and possibly others) have just started a weekly reading seminar to come to grips with the various combinatorial, logical, and group-theoretic notions in Hrushovski’s paper, of which we only have a partial understanding at present. The seminar is a physical one, rather than an online one, but I am going to try to put some notes on the seminar on this blog as it progresses, as I know that there are a couple of other mathematicians who are interested in these developments.

So far there have been two meetings of the seminar. In the first, I surveyed the state of knowledge of the noncommutative Freiman theorem, covering broadly the material in my previous blog post. In the second meeting, Isaac reviewed some key notions of model theory used in Hrushovski’s paper, in particular the notions of definability and type, which I will review below. It is not yet clear how these are going to be connected with the combinatorial side of things, but this is something which we will hopefully develop in future seminars. The near-term objective is to understand the statement of the main theorem on the model-theoretic side (Theorem 3.4 of Hrushovski), and then understand some of its easier combinatorial consequences, before going back and trying to understand the *proof* of that theorem.

[*Update*, Oct 19: Given the level of interest in this paper, readers are encouraged to discuss any aspect of that paper in the comments below, even if they are not currently being covered by the UCLA seminar.]

## Recent Comments