A handy inequality in additive combinatorics is the Plünnecke-Ruzsa inequality:

Theorem 1 (Plünnecke-Ruzsa inequality) Let $A, B_1, \dots, B_m$ be finite non-empty subsets of an additive group $G$, such that $|A + B_i| \leq K_i |A|$ for all $1 \leq i \leq m$ and some scalars $K_1, \dots, K_m \geq 1$. Then there exists a subset $A'$ of $A$ such that $|A' + B_1 + \dots + B_m| \leq K_1 \cdots K_m |A'|$.

The proof uses graph-theoretic techniques. Setting $A = B_1 = \dots = B_m = B$, we obtain a useful corollary: if $B$ has small doubling in the sense that $|B + B| \leq K |B|$, then we have $|mB| \leq K^m |B|$ for all $m \geq 1$, where $mB = B + \dots + B$ is the sum of $m$ copies of $B$.
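As a quick numerical sanity check of this corollary, here is a minimal Python sketch (illustrative only; the code and the choice of example are not from the original post) that computes iterated sumsets of an arithmetic progression, the extremal example of small doubling, and verifies the bound $|mB| \leq K^m |B|$:

```python
from itertools import product

def sumset(A, B):
    """Return the sumset A + B = {a + b : a in A, b in B}."""
    return {a + b for a, b in product(A, B)}

B = set(range(10))              # an arithmetic progression: small doubling
K = len(sumset(B, B)) / len(B)  # doubling constant K = |B+B|/|B| = 1.9
mB = B
for m in range(2, 7):
    mB = sumset(mB, B)          # mB is now B + ... + B (m copies)
    assert len(mB) <= K**m * len(B)
    print(f"m={m}: |mB| = {len(mB)}, K^m |B| = {K**m * len(B):.1f}")
```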

In a recent paper, I adapted a number of sum set estimates to the entropy setting, in which finite sets such as $A$ in an additive group $G$ are replaced with discrete random variables $X$ taking values in $G$, and the (logarithm of the) cardinality $|A|$ of a set $A$ is replaced by the Shannon entropy $\mathbf{H}(X) := \sum_{x \in G} \mathbf{P}(X = x) \log \frac{1}{\mathbf{P}(X = x)}$ of a random variable $X$. (Throughout this note I assume all entropies to be finite.) However, at the time, I was unable to find an entropy analogue of the Plünnecke-Ruzsa inequality, because I did not know how to adapt the graph theory argument to the entropy setting.
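To make the dictionary concrete, here is a minimal Python sketch (illustrative only, not from the paper) of the Shannon entropy, confirming that the uniform distribution on a finite set $A$ has entropy exactly $\log |A|$, which is what makes $\mathbf{H}(X)$ the analogue of $\log |A|$:

```python
import math

def shannon_entropy(dist):
    """H(X) = sum_x P(X=x) log(1/P(X=x)) in nats; dist maps value -> probability."""
    return sum(p * math.log(1.0 / p) for p in dist.values() if p > 0)

A = range(8)
uniform = {a: 1 / len(A) for a in A}
print(shannon_entropy(uniform), math.log(len(A)))  # both equal log 8

# Any non-uniform distribution on the same set has strictly smaller entropy:
skewed = {0: 0.5, **{a: 0.5 / 7 for a in range(1, 8)}}
print(shannon_entropy(skewed) < math.log(len(A)))  # True
```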

I recently discovered, however, that buried in a classic paper of Kaimanovich and Vershik (implicitly in Proposition 1.3, to be precise) there was the following analogue of Theorem 1:

Theorem 2 (Entropy Plünnecke-Ruzsa inequality) Let $X, Y_1, \dots, Y_m$ be independent random variables of finite entropy taking values in an additive group $G$, such that $\mathbf{H}(X + Y_i) \leq \mathbf{H}(X) + \log K_i$ for all $1 \leq i \leq m$ and some scalars $K_1, \dots, K_m \geq 1$. Then $\mathbf{H}(X + Y_1 + \dots + Y_m) \leq \mathbf{H}(X) + \log K_1 \cdots K_m$.

In fact Theorem 2 is a bit “better” than Theorem 1 in the sense that Theorem 1 needed to refine the original set $A$ to a subset $A'$, but no such refinement is needed in Theorem 2. One corollary of Theorem 2 is that if $\mathbf{H}(X_1 + X_2) \leq \mathbf{H}(X_1) + \log K$, then $\mathbf{H}(X_1 + \dots + X_m) \leq \mathbf{H}(X_1) + (m-1) \log K$ for all $m \geq 1$, where $X_1, \dots, X_m$ are independent copies of $X_1$; this improves slightly over the analogous combinatorial inequality $|mB| \leq K^m |B|$. Indeed, the function $m \mapsto \mathbf{H}(X_1 + \dots + X_m)$ is concave (this can be seen by using the $m=2$ version of Theorem 2 (or (2) below) to show that the successive differences $\mathbf{H}(X_1 + \dots + X_{m+1}) - \mathbf{H}(X_1 + \dots + X_m)$ are decreasing in $m$).
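Here is a small numerical illustration of this concavity (again illustrative only; the biased coin example is an arbitrary choice): it repeatedly convolves a distribution on $\{0, 1\}$ with itself and checks that the successive entropy differences decrease:

```python
import math
from collections import defaultdict

def entropy(dist):
    """Shannon entropy in nats of a distribution given as {value: probability}."""
    return sum(p * math.log(1.0 / p) for p in dist.values() if p > 0)

def convolve(p, q):
    """Distribution of X + Y for independent X ~ p and Y ~ q."""
    r = defaultdict(float)
    for x, px in p.items():
        for y, qy in q.items():
            r[x + y] += px * qy
    return dict(r)

X = {0: 0.7, 1: 0.3}                    # a biased coin, as a random variable in Z
dist, entropies = X, [entropy(X)]
for m in range(2, 8):
    dist = convolve(dist, X)            # distribution of X_1 + ... + X_m
    entropies.append(entropy(dist))
diffs = [b - a for a, b in zip(entropies, entropies[1:])]
print(diffs)                            # successive differences
assert all(d1 >= d2 for d1, d2 in zip(diffs, diffs[1:]))  # decreasing: concavity
```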

Theorem 2 is actually a quick consequence of the *submodularity inequality*

$$\mathbf{H}(W) + \mathbf{H}(X) \leq \mathbf{H}(Y) + \mathbf{H}(Z) \qquad (1)$$

in information theory, which is valid whenever $X, Y, Z, W$ are discrete random variables such that $Y$ and $Z$ each determine $X$ (i.e. $X$ is a function of $Y$, and also a function of $Z$), and $Y$ and $Z$ jointly determine $W$ (i.e. $W$ is a function of $Y$ and $Z$). To apply this, let $X, Y, Z$ be independent discrete random variables taking values in $G$. Observe that the pairs $(X, Y+Z)$ and $(X+Y, Z)$ each determine $X+Y+Z$, and jointly determine $(X, Y, Z)$. Applying (1) we conclude that

$$\mathbf{H}(X, Y, Z) + \mathbf{H}(X+Y+Z) \leq \mathbf{H}(X, Y+Z) + \mathbf{H}(X+Y, Z),$$

which, after using the independence of $X, Y, Z$ (so that $\mathbf{H}(X, Y, Z) = \mathbf{H}(X) + \mathbf{H}(Y) + \mathbf{H}(Z)$, $\mathbf{H}(X, Y+Z) = \mathbf{H}(X) + \mathbf{H}(Y+Z)$, and $\mathbf{H}(X+Y, Z) = \mathbf{H}(X+Y) + \mathbf{H}(Z)$, allowing the $\mathbf{H}(X)$ and $\mathbf{H}(Z)$ terms to cancel), simplifies to the *sumset submodularity inequality*

$$\mathbf{H}(X+Y+Z) + \mathbf{H}(Y) \leq \mathbf{H}(X+Y) + \mathbf{H}(Y+Z) \qquad (2)$$

(this inequality was also recently observed by Madiman; it is the $m=2$ case of Theorem 2). As a corollary of this inequality, we see that if $\mathbf{H}(X + Y_i) \leq \mathbf{H}(X) + \log K_i$, then

$$\mathbf{H}(X + Y_1 + \dots + Y_i) \leq \mathbf{H}(X + Y_1 + \dots + Y_{i-1}) + \log K_i,$$

and Theorem 2 follows by telescoping series.
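To spell out the telescoping step in the post's notation (a routine unpacking, included for completeness):

```latex
% Apply the corollary of (2) for i = 1, ..., m and sum; the intermediate
% terms H(X + Y_1 + ... + Y_{i-1}) cancel in pairs.
\begin{align*}
\mathbf{H}(X + Y_1 + \cdots + Y_m)
  &\leq \mathbf{H}(X + Y_1 + \cdots + Y_{m-1}) + \log K_m \\
  &\leq \cdots \leq \mathbf{H}(X) + \log K_1 + \cdots + \log K_m
   = \mathbf{H}(X) + \log K_1 \cdots K_m.
\end{align*}
```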

The proof of Theorem 2 seems to be genuinely different from the graph-theoretic proof of Theorem 1. It would be interesting to see if the above argument can be somehow adapted to give a stronger version of Theorem 1. Note also that both Theorem 1 and Theorem 2 have extensions to more general combinations of $X, Y_1, \dots, Y_m$ than the sum $X + Y_1 + \dots + Y_m$; see this paper and this paper respectively.

It is also worth remarking that the above inequalities largely carry over to the non-abelian setting. For instance, if $X_1, X_2, X_3, \dots$ are iid copies of a discrete random variable $X$ in a multiplicative group $G$, the above arguments show that the function $m \mapsto \mathbf{H}(X_1 \cdots X_m)$ is concave (see the numerical sketch after the proof of Proposition 3 below). In particular, the expression $\frac{1}{m} \mathbf{H}(X_1 \cdots X_m)$ decreases monotonically to a limit, the *asymptotic entropy* $\mathbf{H}(G, X)$. This quantity plays an important role in the theory of bounded harmonic functions on $G$, as observed by Kaimanovich and Vershik:

Proposition 3 Let $G$ be a discrete group, and let $X$ be a discrete random variable in $G$ with finite entropy, whose support generates $G$. Then there exists a non-constant bounded function $f: G \to \mathbf{R}$ which is harmonic with respect to $X$ (which means that $\mathbf{E} f(gX) = f(g)$ for all $g \in G$) if and only if $\mathbf{H}(G, X) \neq 0$.

*Proof:* (Sketch) Suppose first that $\mathbf{H}(G, X) = 0$; then we see from concavity that the successive differences $\mathbf{H}(X_1 \cdots X_{m+1}) - \mathbf{H}(X_1 \cdots X_m)$ converge to zero. From this it is not hard to see that the mutual information

$$\mathbf{I}(X_1 : X_1 \cdots X_m) := \mathbf{H}(X_1) + \mathbf{H}(X_1 \cdots X_m) - \mathbf{H}(X_1,\ X_1 \cdots X_m)$$

goes to zero as $m \to \infty$. Informally, knowing the value of $X_1$ reveals very little about the value of $X_1 \cdots X_m$ when $m$ is large.
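To see why the mutual information vanishes (a routine unpacking, using independence and the fact that the pairs $(X_1, X_1 \cdots X_m)$ and $(X_1, X_2 \cdots X_m)$ determine each other):

```latex
\begin{align*}
\mathbf{I}(X_1 : X_1 \cdots X_m)
  &= \mathbf{H}(X_1) + \mathbf{H}(X_1 \cdots X_m)
     - \mathbf{H}(X_1,\ X_2 \cdots X_m) \\
  &= \mathbf{H}(X_1 \cdots X_m) - \mathbf{H}(X_2 \cdots X_m)
   = \mathbf{H}(X_1 \cdots X_m) - \mathbf{H}(X_1 \cdots X_{m-1}),
\end{align*}
```

which is exactly a successive difference of the concave sequence, and hence tends to zero.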

Now let $f$ be a bounded harmonic function, and let $m$ be large. For any $g \in G$ and any value $x$ in the support of $X_1$, we observe from harmonicity that

$$f(gx) = \mathbf{E}\left( f(g X_1 \cdots X_m) \mid X_1 = x \right).$$

But from the asymptotic vanishing of mutual information and the boundedness of $f$, one can show that the right-hand side will converge to $\mathbf{E} f(g X_1 \cdots X_m)$, which by harmonicity is equal to $f(g)$. Thus $f$ is invariant with respect to the support of $X_1$, and is thus constant since this support generates $G$.

Conversely, if $\mathbf{H}(G, X)$ is non-zero, then the above arguments show that the mutual information $\mathbf{I}(X_1 : X_1 \cdots X_m)$ stays bounded away from zero as $m \to \infty$, thus $X_1$ reveals a non-trivial amount of information about $X_1 \cdots X_m$. This turns out to be true even if $m$ is not deterministic, but is itself random, varying over some medium-sized range. From this, one can find a bounded function $F$ such that the conditional expectation $\mathbf{E}( F(X_1 \cdots X_m) \mid X_1 = x )$ varies non-trivially with $x$. On the other hand, the bounded function $x \mapsto \mathbf{E} F(x X_2 \cdots X_m)$ is approximately harmonic (because we are varying $m$), and has some non-trivial fluctuation near the identity (by the preceding sentence). Taking a limit as $m \to \infty$ (using Arzelà-Ascoli) we obtain a non-constant bounded harmonic function as desired. $\Box$
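As a numerical illustration of the non-abelian discussion above (the choice of the free group $F_2$ and of the simple random walk is my own illustrative example, not from the original post), the following Python sketch computes $\frac{1}{m}\mathbf{H}(X_1 \cdots X_m)$ exactly for small $m$ and exhibits the monotone decrease towards the asymptotic entropy:

```python
import math
from collections import defaultdict

def entropy(dist):
    return sum(p * math.log(1.0 / p) for p in dist.values() if p > 0)

# Generators a, a^{-1}, b, b^{-1} of the free group F_2, encoded as (letter, sign).
GENS = [(1, 1), (1, -1), (2, 1), (2, -1)]

def mult(word, g):
    """Right-multiply a reduced word (a tuple of generators) by g, then reduce."""
    if word and word[-1] == (g[0], -g[1]):
        return word[:-1]                # g cancels the last letter
    return word + (g,)

dist = {(): 1.0}                        # the empty product has probability 1
for m in range(1, 9):
    step = defaultdict(float)
    for w, pw in dist.items():
        for g in GENS:
            step[mult(w, g)] += pw / 4  # uniform step: each generator w.p. 1/4
    dist = dict(step)
    print(m, entropy(dist) / m)         # (1/m) H(X_1 ... X_m), monotone decreasing
```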

## 10 comments

28 October, 2009 at 6:47 am

Anonymous: Dear Prof. Tao,

What is the definition of $\mathbf{H}(X)$?

thanks

28 October, 2009 at 9:10 am

Terence Tao: This is the Shannon entropy of X; I’ve updated the text slightly to emphasise this.

28 October, 2009 at 12:38 pm

Seva: Dear Terry,

1) It is my understanding that nobody presently knows how to deduce Theorem 1 from Theorem 2 or vice versa; but do you believe that such deductions exist somewhere in The Book?

2) Could you explain why $m \mapsto \mathbf{H}(X_1 + \dots + X_m)$ is a concave function and how this fact is related to Theorem 2?

3) Two typos: Kaimonovich is misspelled (both times), and the parenthetical note three lines below equation (1) should read “$W$ is a function of $Y$ and $Z$”.

Thanks for the post!

28 October, 2009 at 12:55 pm

Terence Tao: Thanks for the corrections! The concavity comes from the m=2 case of Theorem 2 (i.e. the inequality (2)); I’ve updated the post to sketch the proof.

Regarding the connection between combinatorial and entropy sum set estimates, the key difference is that one can perturb random variables a little bit without affecting the entropy of those variables or of their sum, but if one adds or removes a few elements from a set, the sumset can change dramatically. Roughly speaking, it seems that entropy sumset estimates correspond to combinatorial estimates for the “essential” component of the sumsets A+B, which is a loosely defined term but can be thought of as the sums in A+B that have _many_ representations of the form a+b, rather than just at least one. There does seem to be some sort of correspondence between entropy sum set estimates and combinatorial essential sum set estimates by using the tensor power trick, noting that when one takes n iid samples of a discrete random variable X, for n large, then one approximately gets a uniform distribution on roughly $e^{n \mathbf{H}(X)}$ points (this is the “microstate” formulation of entropy). I haven’t tried to make this correspondence rigorous though.
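(A small Python illustration of this microstate heuristic, assuming only the standard asymptotic equipartition property; the distribution chosen here is arbitrary:)

```python
import math, random

# For n iid samples of X, the probability of the observed string is about
# exp(-n H(X)); i.e. the samples look uniform on roughly exp(n H(X)) strings.
p = {"a": 0.5, "b": 0.25, "c": 0.25}
H = sum(q * math.log(1.0 / q) for q in p.values())   # H(X) in nats
random.seed(1)
n = 50_000
sample = random.choices(list(p), weights=list(p.values()), k=n)
empirical = -sum(math.log(p[s]) for s in sample) / n
print(empirical, H)   # both close to 1.0397...
```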

24 November, 2009 at 4:19 am

Jonathan Fine: Sort of off-topic, and definitely non-mathematical. The formula for Shannon entropy is, from time to time, not parsing.

Here’s the formula (if not mangled in the HTML):

http://l.wordpress.com/latex.php?latex=Bbb+H%28X%29&bg=ffffff&fg=000000&s=0

I suspect that it may be because WordPress has set up several image servers, and they are not all running the same software.

10 February, 2010 at 5:20 pm

Yiannis: hi,

though elementary, the inequality in Theorem 2 is really neat. do you have any applications in mind — any way it can be used, e.g., to prove entropy inequalities analogous to sumset (or inverse sumset) bounds that use the classical Plünnecke inequality?

thanks!

yiannis

10 February, 2010 at 9:25 pm

Terence Tao: Well, if I had known about this inequality in my previous paper on entropy sumset inequalities (op. cit.), then that would have shortened the proofs and tightened the constants a little bit in some places, though the gain is not too dramatic because there were other proofs of Plunnecke-type inequalities in the literature which already extended to the entropy setting.

Every so often I get tempted to do some sort of entropy gradient flow or something to try to prove Freiman’s theorem by entropy methods, but no luck thus far – it’s one thing to wish for a useful monotonicity formula for a gradient flow, but it’s another to actually find one…

18 April, 2013 at 4:36 pm

máté wierdl: I think there is a small typo in the beginning: in the third line below the first framed result (Ruzsa-Plünnecke), you write $|mA|\le K^m|A|$ when it should be $|mA|\le K^{m-1}|A|$

18 April, 2013 at 5:18 pm

Terence Tao: Unfortunately, the more natural bound $|mA| \le K^{m-1}|A|$, while possibly true, is not easily obtained from the Plunnecke-Ruzsa inequality, which instead gives that $|A' + mA| \le K^m |A'|$ for some non-empty subset $A'$ of $A$, and then one has to use the crude inequality $|mA| \le |A' + mA|$ to conclude. I think the bound $|mA| \le K^{m-1}|A|$ remains open (even with the simplified new proof of Plunnecke-Ruzsa due to Petridis).

17 June, 2018 at 4:44 am

Covering and Plünnecke’s inequality | George Shakan: […] should be compared to this post of Tao, where he talks about an entropy version of Plünnecke’s […]