A handy inequality in additive combinatorics is the Plünnecke-Ruzsa inequality:

Theorem 1 (Plünnecke-Ruzsa inequality) Let $A, B_1, \dots, B_m$ be finite non-empty subsets of an additive group $G$, such that $|A + B_i| \leq K_i |A|$ for all $1 \leq i \leq m$ and some scalars $K_1, \dots, K_m \geq 1$. Then there exists a subset $A'$ of $A$ such that $|A' + B_1 + \dots + B_m| \leq K_1 \cdots K_m |A'|$.

The proof uses graph-theoretic techniques. Setting $A = B_1 = \dots = B_m = B$, we obtain a useful corollary: if $B$ has small doubling in the sense that $|B + B| \leq K |B|$, then we have $|mB| \leq K^m |B|$ for all $m \geq 1$, where $mB = B + \dots + B$ is the sum of $m$ copies of $B$.
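As a quick numerical sanity check of this corollary, here is a minimal Python sketch (illustrative only; the code and the choice of example are not from the original post) that computes iterated sumsets of an arithmetic progression, the extremal example of small doubling, and verifies the bound $|mB| \leq K^m |B|$:

```python
from itertools import product

def sumset(A, B):
    """Return the sumset A + B = {a + b : a in A, b in B}."""
    return {a + b for a, b in product(A, B)}

B = set(range(10))              # an arithmetic progression: small doubling
K = len(sumset(B, B)) / len(B)  # doubling constant K = |B+B|/|B| = 1.9
mB = B
for m in range(2, 7):
    mB = sumset(mB, B)          # mB is now B + ... + B (m copies)
    assert len(mB) <= K**m * len(B)
    print(f"m={m}: |mB| = {len(mB)}, K^m |B| = {K**m * len(B):.1f}")
```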

In a recent paper, I adapted a number of sum set estimates to the entropy setting, in which finite sets such as $A$ in an additive group $G$ are replaced with discrete random variables $X$ taking values in $G$, and the (logarithm of the) cardinality $|A|$ of a set $A$ is replaced by the Shannon entropy $\mathbf{H}(X) := \sum_{x \in G} \mathbf{P}(X = x) \log \frac{1}{\mathbf{P}(X = x)}$ of a random variable $X$. (Throughout this note I assume all entropies to be finite.) However, at the time, I was unable to find an entropy analogue of the Plünnecke-Ruzsa inequality, because I did not know how to adapt the graph theory argument to the entropy setting.
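To make the dictionary concrete, here is a minimal Python sketch (illustrative only, not from the paper) of the Shannon entropy, confirming that the uniform distribution on a finite set $A$ has entropy exactly $\log |A|$, which is what makes $\mathbf{H}(X)$ the analogue of $\log |A|$:

```python
import math

def shannon_entropy(dist):
    """H(X) = sum_x P(X=x) log(1/P(X=x)) in nats; dist maps value -> probability."""
    return sum(p * math.log(1.0 / p) for p in dist.values() if p > 0)

A = range(8)
uniform = {a: 1 / len(A) for a in A}
print(shannon_entropy(uniform), math.log(len(A)))  # both equal log 8

# Any non-uniform distribution on the same set has strictly smaller entropy:
skewed = {0: 0.5, **{a: 0.5 / 7 for a in range(1, 8)}}
print(shannon_entropy(skewed) < math.log(len(A)))  # True
```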

I recently discovered, however, that buried in a classic paper of Kaimanovich and Vershik (implicitly in Proposition 1.3, to be precise) there was the following analogue of Theorem 1:

Theorem 2 (Entropy Plünnecke-Ruzsa inequality) Let $X, Y_1, \dots, Y_m$ be independent random variables of finite entropy taking values in an additive group $G$, such that $\mathbf{H}(X + Y_i) \leq \mathbf{H}(X) + \log K_i$ for all $1 \leq i \leq m$ and some scalars $K_1, \dots, K_m \geq 1$. Then $\mathbf{H}(X + Y_1 + \dots + Y_m) \leq \mathbf{H}(X) + \log K_1 \cdots K_m$.

In fact Theorem 2 is a bit “better” than Theorem 1 in the sense that Theorem 1 needed to refine the original set $A$ to a subset $A'$, but no such refinement is needed in Theorem 2. One corollary of Theorem 2 is that if $\mathbf{H}(X_1 + X_2) \leq \mathbf{H}(X_1) + \log K$, then $\mathbf{H}(X_1 + \dots + X_m) \leq \mathbf{H}(X_1) + (m-1) \log K$ for all $m \geq 1$, where $X_1, \dots, X_m$ are independent copies of $X_1$; this improves slightly over the analogous combinatorial inequality $|mB| \leq K^m |B|$. Indeed, the function $m \mapsto \mathbf{H}(X_1 + \dots + X_m)$ is concave (this can be seen by using the $m=2$ version of Theorem 2 (or (2) below) to show that the successive differences $\mathbf{H}(X_1 + \dots + X_{m+1}) - \mathbf{H}(X_1 + \dots + X_m)$ are decreasing in $m$).
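Here is a small numerical illustration of this concavity (again illustrative only; the biased coin example is an arbitrary choice): it repeatedly convolves a distribution on $\{0, 1\}$ with itself and checks that the successive entropy differences decrease:

```python
import math
from collections import defaultdict

def entropy(dist):
    """Shannon entropy in nats of a distribution given as {value: probability}."""
    return sum(p * math.log(1.0 / p) for p in dist.values() if p > 0)

def convolve(p, q):
    """Distribution of X + Y for independent X ~ p and Y ~ q."""
    r = defaultdict(float)
    for x, px in p.items():
        for y, qy in q.items():
            r[x + y] += px * qy
    return dict(r)

X = {0: 0.7, 1: 0.3}                    # a biased coin, as a random variable in Z
dist, entropies = X, [entropy(X)]
for m in range(2, 8):
    dist = convolve(dist, X)            # distribution of X_1 + ... + X_m
    entropies.append(entropy(dist))
diffs = [b - a for a, b in zip(entropies, entropies[1:])]
print(diffs)                            # successive differences
assert all(d1 >= d2 for d1, d2 in zip(diffs, diffs[1:]))  # decreasing: concavity
```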

Theorem 2 is actually a quick consequence of the *submodularity inequality*

$$\mathbf{H}(W) + \mathbf{H}(X) \leq \mathbf{H}(Y) + \mathbf{H}(Z) \qquad (1)$$

in information theory, which is valid whenever $X, Y, Z, W$ are discrete random variables such that $Y$ and $Z$ each determine $X$ (i.e. $X$ is a function of $Y$, and also a function of $Z$), and $Y$ and $Z$ jointly determine $W$ (i.e. $W$ is a function of $Y$ and $Z$). To apply this, let $X, Y, Z$ be independent discrete random variables taking values in $G$. Observe that the pairs $(X, Y+Z)$ and $(X+Y, Z)$ each determine $X+Y+Z$, and jointly determine $(X, Y, Z)$. Applying (1) we conclude that

$$\mathbf{H}(X, Y, Z) + \mathbf{H}(X+Y+Z) \leq \mathbf{H}(X, Y+Z) + \mathbf{H}(X+Y, Z),$$

which, after using the independence of $X, Y, Z$ (so that $\mathbf{H}(X, Y, Z) = \mathbf{H}(X) + \mathbf{H}(Y) + \mathbf{H}(Z)$, $\mathbf{H}(X, Y+Z) = \mathbf{H}(X) + \mathbf{H}(Y+Z)$, and $\mathbf{H}(X+Y, Z) = \mathbf{H}(X+Y) + \mathbf{H}(Z)$, allowing the $\mathbf{H}(X)$ and $\mathbf{H}(Z)$ terms to cancel), simplifies to the *sumset submodularity inequality*

$$\mathbf{H}(X+Y+Z) + \mathbf{H}(Y) \leq \mathbf{H}(X+Y) + \mathbf{H}(Y+Z) \qquad (2)$$

(this inequality was also recently observed by Madiman; it is the $m=2$ case of Theorem 2). As a corollary of this inequality, we see that if $\mathbf{H}(X + Y_i) \leq \mathbf{H}(X) + \log K_i$, then

$$\mathbf{H}(X + Y_1 + \dots + Y_i) \leq \mathbf{H}(X + Y_1 + \dots + Y_{i-1}) + \log K_i,$$

and Theorem 2 follows by telescoping series.
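To spell out the telescoping step in the post's notation (a routine unpacking, included for completeness):

```latex
% Apply the corollary of (2) for i = 1, ..., m and sum; the intermediate
% terms H(X + Y_1 + ... + Y_{i-1}) cancel in pairs.
\begin{align*}
\mathbf{H}(X + Y_1 + \cdots + Y_m)
  &\leq \mathbf{H}(X + Y_1 + \cdots + Y_{m-1}) + \log K_m \\
  &\leq \cdots \leq \mathbf{H}(X) + \log K_1 + \cdots + \log K_m
   = \mathbf{H}(X) + \log K_1 \cdots K_m.
\end{align*}
```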

The proof of Theorem 2 seems to be genuinely different from the graph-theoretic proof of Theorem 1. It would be interesting to see if the above argument can be somehow adapted to give a stronger version of Theorem 1. Note also that both Theorem 1 and Theorem 2 have extensions to more general combinations of $X, Y_1, \dots, Y_m$ than the sum $X + Y_1 + \dots + Y_m$; see this paper and this paper respectively.

It is also worth remarking that the above inequalities largely carry over to the non-abelian setting. For instance, if $X_1, X_2, X_3, \dots$ are iid copies of a discrete random variable $X$ in a multiplicative group $G$, the above arguments show that the function $m \mapsto \mathbf{H}(X_1 \cdots X_m)$ is concave (see the numerical sketch after the proof of Proposition 3 below). In particular, the expression $\frac{1}{m} \mathbf{H}(X_1 \cdots X_m)$ decreases monotonically to a limit, the *asymptotic entropy* $\mathbf{H}(G, X)$. This quantity plays an important role in the theory of bounded harmonic functions on $G$, as observed by Kaimanovich and Vershik:

Proposition 3 Let $G$ be a discrete group, and let $X$ be a discrete random variable in $G$ with finite entropy, whose support generates $G$. Then there exists a non-constant bounded function $f: G \to \mathbf{R}$ which is harmonic with respect to $X$ (which means that $\mathbf{E} f(gX) = f(g)$ for all $g \in G$) if and only if $\mathbf{H}(G, X) \neq 0$.

*Proof:* (Sketch) Suppose first that $\mathbf{H}(G, X) = 0$; then we see from concavity that the successive differences $\mathbf{H}(X_1 \cdots X_{m+1}) - \mathbf{H}(X_1 \cdots X_m)$ converge to zero. From this it is not hard to see that the mutual information

$$\mathbf{I}(X_1 : X_1 \cdots X_m) := \mathbf{H}(X_1) + \mathbf{H}(X_1 \cdots X_m) - \mathbf{H}(X_1,\ X_1 \cdots X_m)$$

goes to zero as $m \to \infty$. Informally, knowing the value of $X_1$ reveals very little about the value of $X_1 \cdots X_m$ when $m$ is large.
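To see why the mutual information vanishes (a routine unpacking, using independence and the fact that the pairs $(X_1, X_1 \cdots X_m)$ and $(X_1, X_2 \cdots X_m)$ determine each other):

```latex
\begin{align*}
\mathbf{I}(X_1 : X_1 \cdots X_m)
  &= \mathbf{H}(X_1) + \mathbf{H}(X_1 \cdots X_m)
     - \mathbf{H}(X_1,\ X_2 \cdots X_m) \\
  &= \mathbf{H}(X_1 \cdots X_m) - \mathbf{H}(X_2 \cdots X_m)
   = \mathbf{H}(X_1 \cdots X_m) - \mathbf{H}(X_1 \cdots X_{m-1}),
\end{align*}
```

which is exactly a successive difference of the concave sequence, and hence tends to zero.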

Now let $f$ be a bounded harmonic function, and let $m$ be large. For any $g \in G$ and any value $x$ in the support of $X_1$, we observe from harmonicity that

$$f(gx) = \mathbf{E}\left( f(g X_1 \cdots X_m) \mid X_1 = x \right).$$

But from the asymptotic vanishing of mutual information and the boundedness of $f$, one can show that the right-hand side will converge to $\mathbf{E} f(g X_1 \cdots X_m)$, which by harmonicity is equal to $f(g)$. Thus $f$ is invariant with respect to the support of $X_1$, and is thus constant since this support generates $G$.

Conversely, if $\mathbf{H}(G, X)$ is non-zero, then the above arguments show that the mutual information $\mathbf{I}(X_1 : X_1 \cdots X_m)$ stays bounded away from zero as $m \to \infty$, thus $X_1$ reveals a non-trivial amount of information about $X_1 \cdots X_m$. This turns out to be true even if $m$ is not deterministic, but is itself random, varying over some medium-sized range. From this, one can find a bounded function $F$ such that the conditional expectation $\mathbf{E}( F(X_1 \cdots X_m) \mid X_1 = x )$ varies non-trivially with $x$. On the other hand, the bounded function $x \mapsto \mathbf{E} F(x X_2 \cdots X_m)$ is approximately harmonic (because we are varying $m$), and has some non-trivial fluctuation near the identity (by the preceding sentence). Taking a limit as $m \to \infty$ (using Arzelà-Ascoli) we obtain a non-constant bounded harmonic function as desired. $\Box$
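As a numerical illustration of the non-abelian discussion above (the choice of the free group $F_2$ and of the simple random walk is my own illustrative example, not from the original post), the following Python sketch computes $\frac{1}{m}\mathbf{H}(X_1 \cdots X_m)$ exactly for small $m$ and exhibits the monotone decrease towards the asymptotic entropy:

```python
import math
from collections import defaultdict

def entropy(dist):
    return sum(p * math.log(1.0 / p) for p in dist.values() if p > 0)

# Generators a, a^{-1}, b, b^{-1} of the free group F_2, encoded as (letter, sign).
GENS = [(1, 1), (1, -1), (2, 1), (2, -1)]

def mult(word, g):
    """Right-multiply a reduced word (a tuple of generators) by g, then reduce."""
    if word and word[-1] == (g[0], -g[1]):
        return word[:-1]                # g cancels the last letter
    return word + (g,)

dist = {(): 1.0}                        # the empty product has probability 1
for m in range(1, 9):
    step = defaultdict(float)
    for w, pw in dist.items():
        for g in GENS:
            step[mult(w, g)] += pw / 4  # uniform step: each generator w.p. 1/4
    dist = dict(step)
    print(m, entropy(dist) / m)         # (1/m) H(X_1 ... X_m), monotone decreasing
```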

## 10 comments

28 October, 2009 at 6:47 am

Anonymous: Dear Prof. Tao,

What is the definition of $\mathbf{H}(X)$?

thanks

28 October, 2009 at 9:10 am

Terence Tao: This is the Shannon entropy of X; I’ve updated the text slightly to emphasise this.

28 October, 2009 at 12:38 pm

Seva: Dear Terry,

1) It is my understanding that nobody presently knows how to deduce Theorem 1 from Theorem 2 or vice versa; but do you believe that such deductions exist somewhere in The Book?

2) Could you explain why $m \mapsto \mathbf{H}(X_1 + \dots + X_m)$ is a concave function and how this fact is related to Theorem 2?

3) Two typos: Kaimonovich is misspelled (both times), and the parenthetical note three lines below equation (1) should read “$W$ is a function of $Y$ and $Z$”.

Thanks for the post!

28 October, 2009 at 12:55 pm

Terence Tao: Thanks for the corrections! The concavity comes from the m=2 case of Theorem 2 (i.e. the inequality (2)); I’ve updated the post to sketch the proof.

Regarding the connection between combinatorial and entropy sum set estimates, the key difference is that one can perturb random variables a little bit without affecting the entropy of those variables or of their sum, but if one adds or removes a few elements from a set, the sumset can change dramatically. Roughly speaking, it seems that entropy sumset estimates correspond to combinatorial estimates for the “essential” component of the sumsets A+B, which is a loosely defined term but can be thought of as the sums in A+B that have _many_ representations of the form a+b, rather than just at least one. There does seem to be some sort of correspondence between entropy sum set estimates and combinatorial essential sum set estimates by using the tensor power trick, noting that when one takes n iid samples of a discrete random variable X, for n large, then one approximately gets a uniform distribution on roughly $e^{n \mathbf{H}(X)}$ points (this is the “microstate” formulation of entropy). I haven’t tried to make this correspondence rigorous though.
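(A small Python illustration of this microstate heuristic, assuming only the standard asymptotic equipartition property; the distribution chosen here is arbitrary:)

```python
import math, random

# For n iid samples of X, the probability of the observed string is about
# exp(-n H(X)); i.e. the samples look uniform on roughly exp(n H(X)) strings.
p = {"a": 0.5, "b": 0.25, "c": 0.25}
H = sum(q * math.log(1.0 / q) for q in p.values())   # H(X) in nats
random.seed(1)
n = 50_000
sample = random.choices(list(p), weights=list(p.values()), k=n)
empirical = -sum(math.log(p[s]) for s in sample) / n
print(empirical, H)   # both close to 1.0397...
```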

24 November, 2009 at 4:19 am

Jonathan Fine: Sort of off-topic, and definitely non-mathematical. The formula for Shannon entropy is, from time to time, not parsing.

Here’s the formula (if not mangled in the HTML):

http://l.wordpress.com/latex.php?latex=Bbb+H%28X%29&bg=ffffff&fg=000000&s=0

I suspect that it may be because WordPress has set up several image servers, and they are not all running the same software.

10 February, 2010 at 5:20 pm

Yiannis: hi,

though elementary, the inequality in Theorem 2 is really neat. do you have any applications in mind — any way it can be used, e.g., to prove entropy inequalities analogous to sumset (or inverse sumset) bounds that use the classical Plünnecke inequality?

thanks!

yiannis

10 February, 2010 at 9:25 pm

Terence Tao: Well, if I had known about this inequality in my previous paper on entropy sumset inequalities (op. cit.), then that would have shortened the proofs and tightened the constants a little bit in some places, though the gain is not too dramatic because there were other proofs of Plunnecke-type inequalities in the literature which already extended to the entropy setting.

Every so often I get tempted to do some sort of entropy gradient flow or something to try to prove Freiman’s theorem by entropy methods, but no luck thus far – it’s one thing to wish for a useful monotonicity formula for a gradient flow, but it’s another to actually find one…

18 April, 2013 at 4:36 pm

máté wierdl: I think there is a small typo in the beginning: in the third line below the first framed result (Ruzsa-Plünnecke), you write $|mA|\le K^m|A|$ when it should be $|mA|\le K^{m-1}|A|$

18 April, 2013 at 5:18 pm

Terence Tao: Unfortunately, the more natural bound $|mA| \le K^{m-1}|A|$, while possibly true, is not easily obtained from the Plunnecke-Ruzsa inequality, which instead gives that $|A' + mA| \le K^m |A'|$ for some non-empty subset $A'$ of $A$, and then one has to use the crude inequality $|mA| \le |A' + mA|$ to conclude. I think the bound $|mA| \le K^{m-1}|A|$ remains open (even with the simplified new proof of Plunnecke-Ruzsa due to Petridis).

17 June, 2018 at 4:44 am

Covering and Plünnecke’s inequality | George Shakan: […] should be compared to this post of Tao, where he talks about an entropy version of Plünnecke’s […]