A handy inequality in additive combinatorics is the Plünnecke-Ruzsa inequality:

Theorem 1 (Plünnecke-Ruzsa inequality) Let $A, B_1, \ldots, B_m$ be finite non-empty subsets of an additive group $G$, such that $|A + B_i| \leq K_i |A|$ for all $1 \leq i \leq m$ and some scalars $K_1, \ldots, K_m \geq 1$. Then there exists a subset $A'$ of $A$ such that $|A' + B_1 + \ldots + B_m| \leq K_1 \ldots K_m |A'|$.
The proof uses graph-theoretic techniques. Setting $A = B_1 = \ldots = B_m = B$, we obtain a useful corollary: if $B$ has small doubling in the sense that $|B + B| \leq K |B|$, then we have $|mB| \leq K^m |B|$ for all $m \geq 1$, where $mB = B + \ldots + B$ is the sum of $m$ copies of $B$.
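As a quick numerical sanity check (my own illustration, not from the original argument), one can verify the corollary by brute force for a concrete set with small doubling, here an arithmetic progression in the integers:

```python
from itertools import product

def sumset(*sets):
    """All sums a_1 + ... + a_k with each a_i drawn from the i-th set."""
    return {sum(t) for t in product(*sets)}

# An arithmetic progression B = {0, ..., 9} has small doubling:
# |B + B| = 19 <= K |B| with K = 1.9.
B = set(range(10))
K = len(sumset(B, B)) / len(B)

# The corollary predicts |mB| <= K^m |B| for every m >= 1.
for m in range(1, 6):
    mB = sumset(*([B] * m))
    assert len(mB) <= K ** m * len(B)
    print(f"m={m}: |mB|={len(mB)}, bound={K ** m * len(B):.1f}")
```

Of course the bound is very lossy here: $|mB|$ grows linearly in $m$ while $K^m |B|$ grows exponentially.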
In a recent paper, I adapted a number of sum set estimates to the entropy setting, in which finite sets such as $A$ in $G$ are replaced with discrete random variables $X$ taking values in $G$, and (the logarithm of) the cardinality $|A|$ of a set $A$ is replaced by the Shannon entropy $H(X)$ of a random variable $X$. (Throughout this note I assume all entropies to be finite.) However, at the time, I was unable to find an entropy analogue of the Plünnecke-Ruzsa inequality, because I did not know how to adapt the graph theory argument to the entropy setting. The analogue turns out to be the following:

Theorem 2 (Entropy Plünnecke-Ruzsa inequality) Let $X, Y_1, \ldots, Y_m$ be independent discrete random variables taking values in an additive group $G$, such that $H(X + Y_i) \leq H(X) + \log K_i$ for all $1 \leq i \leq m$ and some $K_1, \ldots, K_m \geq 1$. Then $H(X + Y_1 + \ldots + Y_m) \leq H(X) + \log K_1 \ldots K_m$.
In fact Theorem 2 is a bit “better” than Theorem 1 in the sense that Theorem 1 needed to refine the original set $A$ to a subset $A'$, but no such refinement is needed in Theorem 2. One corollary of Theorem 2 is that if $H(X_1 + X_2) \leq H(X_1) + \log K$, then $H(X_1 + \ldots + X_m) \leq H(X_1) + (m-1) \log K$ for all $m \geq 1$, where $X_1, \ldots, X_m$ are independent copies of $X_1$; this improves slightly over the analogous combinatorial inequality $|mB| \leq K^m |B|$. Indeed, the function $m \mapsto H(X_1 + \ldots + X_m)$ is concave (this can be seen by using the $m = 2$ version of Theorem 2 (or (2) below) to show that the quantity $H(X_1 + \ldots + X_m) - H(X_1 + \ldots + X_{m-1})$ is decreasing in $m$).
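Both the corollary and the concavity claim are easy to test numerically. The following sketch (my own illustration, with $X_1$ uniform on $\{0, 1, 2\}$) computes the law of $X_1 + \ldots + X_m$ by repeated convolution:

```python
import numpy as np

def entropy(p):
    """Shannon entropy (in bits) of a probability vector."""
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

# X_1, X_2, ... iid uniform on {0, 1, 2}; H[m-1] holds H(X_1 + ... + X_m).
p = np.ones(3) / 3
dist, H = p.copy(), [entropy(p)]
for m in range(2, 8):
    dist = np.convolve(dist, p)   # law of the m-fold sum
    H.append(entropy(dist))

# Concavity: the increments H(X_1+...+X_m) - H(X_1+...+X_{m-1}) decrease.
diffs = [b - a for a, b in zip(H, H[1:])]
assert all(d1 >= d2 - 1e-9 for d1, d2 in zip(diffs, diffs[1:]))

# Corollary: H(X_1+...+X_m) <= H(X_1) + (m-1) log K,
# where log K = H(X_1+X_2) - H(X_1).
logK = H[1] - H[0]
assert all(H[m] <= H[0] + m * logK + 1e-9 for m in range(len(H)))
```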
Theorem 2 is actually a quick consequence of the submodularity inequality

$$H(W) + H(X) \leq H(Y) + H(Z) \qquad (1)$$

in information theory, which is valid whenever $X, Y, Z, W$ are discrete random variables such that $Y$ and $Z$ each determine $X$ (i.e. $X$ is a function of $Y$, and also a function of $Z$), and $Y$ and $Z$ jointly determine $W$ (i.e. $W$ is a function of $Y$ and $Z$). To apply this, let $X, Y, Z$ be independent discrete random variables taking values in $G$. Observe that the pairs $(X, Y+Z)$ and $(X+Y, Z)$ each determine $X+Y+Z$, and jointly determine $(X, Y, Z)$. Applying (1) we conclude that

$$H(X, Y, Z) + H(X+Y+Z) \leq H(X, Y+Z) + H(X+Y, Z),$$

which by the independence of $X, Y, Z$ simplifies to

$$H(X+Y+Z) + H(Y) \leq H(X+Y) + H(Y+Z), \qquad (2)$$
and Theorem 2 follows by telescoping: applying this inequality with $Y_i$, $X$, $Y_1 + \ldots + Y_{i-1}$ in the roles of $X$, $Y$, $Z$ gives $H(X + Y_1 + \ldots + Y_i) \leq H(X + Y_1 + \ldots + Y_{i-1}) + \log K_i$, and summing over $i$ yields the claim.
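The inequality $H(X+Y+Z) + H(Y) \leq H(X+Y) + H(Y+Z)$ for independent $X, Y, Z$ is also easy to test numerically; here is a minimal sketch (my own, with random distributions on a small range of integers, so that the law of a sum is a convolution):

```python
import numpy as np

rng = np.random.default_rng(0)

def entropy(p):
    """Shannon entropy (in bits) of a probability vector."""
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

# Check H(X+Y+Z) + H(Y) <= H(X+Y) + H(Y+Z) for independent integer-valued
# X, Y, Z with random distributions on {0, ..., 4}.
for trial in range(200):
    px, py, pz = (rng.dirichlet(np.ones(5)) for _ in range(3))
    lhs = entropy(np.convolve(np.convolve(px, py), pz)) + entropy(py)
    rhs = entropy(np.convolve(px, py)) + entropy(np.convolve(py, pz))
    assert lhs <= rhs + 1e-9
```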
The proof of Theorem 2 seems to be genuinely different from the graph-theoretic proof of Theorem 1. It would be interesting to see if the above argument can be somehow adapted to give a stronger version of Theorem 1. Note also that both Theorem 1 and Theorem 2 have extensions to more general combinations of $X, Y_1, \ldots, Y_m$ than $X + Y_1 + \ldots + Y_m$; see this paper and this paper respectively.
It is also worth remarking that the above inequalities largely carry over to the non-abelian setting. For instance, if $X_1, X_2, \ldots$ are iid copies of a discrete random variable $X$ in a multiplicative group $G$, the above arguments show that the function $m \mapsto H(X_1 \ldots X_m)$ is concave. In particular, the expression $H(X_1 \ldots X_m)/m$ decreases monotonically to a limit, the asymptotic entropy of $X$. This quantity plays an important role in the theory of bounded harmonic functions on $G$, as observed by Kaimanovich and Vershik:
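As a toy non-abelian illustration (my own example, not from the original discussion), one can track $H(X_1 \ldots X_m)/m$ for a random walk on the symmetric group $S_3$; since $S_3$ is finite, $H(X_1 \ldots X_m)$ is bounded by $\log |S_3|$ and the asymptotic entropy is zero:

```python
import numpy as np

def compose(s, t):
    """Composition of permutations given as tuples: (s*t)(i) = s[t[i]]."""
    return tuple(s[i] for i in t)

def entropy(dist):
    """Shannon entropy (in bits) of an {element: probability} distribution."""
    return float(-sum(p * np.log2(p) for p in dist.values() if p > 0))

# Step distribution: uniform on a transposition and a 3-cycle,
# which together generate S_3.
step = {(1, 0, 2): 0.5, (1, 2, 0): 0.5}

dist, H = dict(step), [entropy(step)]
for m in range(2, 13):
    new = {}
    for g, pg in dist.items():
        for s, ps in step.items():
            gs = compose(g, s)
            new[gs] = new.get(gs, 0.0) + pg * ps
    dist, H = new, H + [entropy(new)]

# H(X_1...X_m)/m decreases monotonically; for a finite group the limit
# (the asymptotic entropy) is zero, as H stays below log2 |S_3|.
rates = [h / m for m, h in enumerate(H, start=1)]
assert all(a >= b - 1e-9 for a, b in zip(rates, rates[1:]))
assert H[-1] <= np.log2(6) + 1e-9
```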
Proposition 3 Let $G$ be a discrete group, and let $X$ be a discrete random variable in $G$ with finite entropy, whose support generates $G$. Then there exists a non-constant bounded function $f: G \to {\bf R}$ which is harmonic with respect to $X$ (which means that ${\bf E} f(gX) = f(g)$ for all $g \in G$) if and only if the asymptotic entropy of $X$ is non-zero.
Proof: (Sketch) Suppose first that the asymptotic entropy vanishes; then we see from concavity that the successive differences $H(X_1 \ldots X_m) - H(X_1 \ldots X_{m-1})$ converge to zero. From this it is not hard to see that the mutual information

$$I(X_1 : X_1 \ldots X_m) := H(X_1) + H(X_1 \ldots X_m) - H(X_1, X_1 \ldots X_m)$$
goes to zero as $m \to \infty$. Informally, knowing the value of $X_1$ reveals very little about the value of $X_1 \ldots X_m$ when $m$ is large.
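This decay can be seen concretely for the simple random walk on the integers (my own illustration; here $I(X_1 : S_m) = H(S_m) - H(S_{m-1})$ because $S_m - X_1$ is independent of $X_1$ and has the same law as $S_{m-1}$):

```python
import numpy as np

def entropy(p):
    """Shannon entropy (in bits) of a probability vector."""
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

# X_i iid uniform on {0, 1}, so S_m = X_1 + ... + X_m is Binomial(m, 1/2).
p = np.array([0.5, 0.5])
dist, H = p.copy(), [entropy(p)]
for m in range(2, 201):
    dist = np.convolve(dist, p)
    H.append(entropy(dist))

# I(X_1 : S_m) = H(S_m) - H(S_{m-1}) for m = 2, ..., 200.
mi = [b - a for a, b in zip(H, H[1:])]

# The mutual information decreases and tends to zero, consistent with the
# vanishing of the entropy increments for this (abelian) walk.
assert all(a >= b - 1e-9 for a, b in zip(mi, mi[1:]))
assert mi[-1] < 0.01
```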
Now let $f$ be a bounded harmonic function, and let $m$ be large. For any $g \in G$ and any value $x$ in the support of $X_1$, we observe from harmonicity that

$$f(gx) = {\bf E}( f( g X_1 \ldots X_m ) | X_1 = x ).$$
But from the asymptotic vanishing of mutual information and the boundedness of $f$, one can show that the right-hand side will converge to ${\bf E} f( g X_1 \ldots X_m )$, which by harmonicity is equal to $f(g)$. Thus $f$ is invariant with respect to the support of $X_1$, and is thus constant since this support generates $G$.
Conversely, if the asymptotic entropy is non-zero, then the above arguments show that $I(X_1 : X_1 \ldots X_m)$ stays bounded away from zero as $m \to \infty$; thus $X_1$ reveals a non-trivial amount of information about $X_1 \ldots X_m$. This turns out to be true even if $m$ is not deterministic, but is itself random, varying over some medium-sized range. From this, one can find a bounded function $F$ such that the conditional expectation ${\bf E}( F( X_1 \ldots X_m ) | X_1 = x )$ varies non-trivially with $x$. On the other hand, the bounded function $g \mapsto {\bf E} F( g X_1 \ldots X_m )$ is approximately harmonic (because we are varying $m$), and has some non-trivial fluctuation near the identity (by the preceding sentence). Taking a limit as $m \to \infty$ (using Arzelà-Ascoli) we obtain a non-constant bounded harmonic function as desired. $\Box$