You are currently browsing the tag archive for the ‘Plunnecke-Ruzsa inequality’ tag.

A handy inequality in additive combinatorics is the Plünnecke-Ruzsa inequality:

Theorem 1 (Plünnecke-Ruzsa inequality) Let {A, B_1, \ldots, B_m} be finite non-empty subsets of an additive group {G}, such that {|A+B_i| \leq K_i |A|} for all {1 \leq i \leq m} and some scalars {K_1,\ldots,K_m \geq 1}. Then there exists a subset {A'} of {A} such that {|A' + B_1 + \ldots + B_m| \leq K_1 \ldots K_m |A'|}.

The proof uses graph-theoretic techniques. Setting {A=B_1=\ldots=B_m}, we obtain a useful corollary: if {A} has small doubling in the sense that {|A+A| \leq K|A|}, then we have {|mA| \leq K^m |A|} for all {m \geq 1}, where {mA = A + \ldots + A} is the sum of {m} copies of {A}.

In a recent paper, I adapted a number of sum set estimates to the entropy setting, in which finite sets such as {A} in {G} are replaced with discrete random variables {X} taking values in {G}, and (the logarithm of) cardinality {|A|} of a set {A} is replaced by Shannon entropy {{\Bbb H}(X)} of a random variable {X}. (Throughout this note I assume all entropies to be finite.) However, at the time, I was unable to find an entropy analogue of the Plünnecke-Ruzsa inequality, because I did not know how to adapt the graph theory argument to the entropy setting.

I recently discovered, however, that buried in a classic paper of Kaimonovich and Vershik (implicitly in Proposition 1.3, to be precise) there was the following analogue of Theorem 1:

Theorem 2 (Entropy Plünnecke-Ruzsa inequality) Let {X, Y_1, \ldots, Y_m} be independent random variables of finite entropy taking values in an additive group {G}, such that {{\Bbb H}(X+Y_i) \leq {\Bbb H}(X) + \log K_i} for all {1 \leq i \leq m} and some scalars {K_1,\ldots,K_m \geq 1}. Then {{\Bbb H}(X+Y_1+\ldots+Y_m) \leq {\Bbb H}(X) + \log K_1 \ldots K_m}.

In fact Theorem 2 is a bit “better” than Theorem 1 in the sense that Theorem 1 needed to refine the original set {A} to a subset {A'}, but no such refinement is needed in Theorem 2. One corollary of Theorem 2 is that if {{\Bbb H}(X_1+X_2) \leq {\Bbb H}(X) + \log K}, then {{\Bbb H}(X_1+\ldots+X_m) \leq {\Bbb H}(X) + (m-1) \log K} for all {m \geq 1}, where {X_1,\ldots,X_m} are independent copies of {X}; this improves slightly over the analogous combinatorial inequality. Indeed, the function {m \mapsto {\Bbb H}(X_1+\ldots+X_m)} is concave (this can be seen by using the {m=2} version of Theorem 2 (or (2) below) to show that the quantity {{\Bbb H}(X_1+\ldots+X_{m+1})-{\Bbb H}(X_1+\ldots+X_m)} is decreasing in {m}).

Theorem 2 is actually a quick consequence of the submodularity inequality

\displaystyle  {\Bbb H}(W) + {\Bbb H}(X) \leq {\Bbb H}(Y) + {\Bbb H}(Z) \ \ \ \ \ (1)

in information theory, which is valid whenever {X,Y,Z,W} are discrete random variables such that {Y} and {Z} each determine {X} (i.e. {X} is a function of {Y}, and also a function of {Z}), and {Y} and {Z} jointly determine {W} (i.e {W} is a function of {Y} and {Z}). To apply this, let {X, Y, Z} be independent discrete random variables taking values in {G}. Observe that the pairs {(X,Y+Z)} and {(X+Y,Z)} each determine {X+Y+Z}, and jointly determine {(X,Y,Z)}. Applying (1) we conclude that

\displaystyle  {\Bbb H}(X,Y,Z) + {\Bbb H}(X+Y+Z) \leq {\Bbb H}(X,Y+Z) + {\Bbb H}(X+Y,Z)

which after using the independence of {X,Y,Z} simplifies to the sumset submodularity inequality

\displaystyle  {\Bbb H}(X+Y+Z) + {\Bbb H}(Y) \leq {\Bbb H}(X+Y) + {\Bbb H}(Y+Z) \ \ \ \ \ (2)

(this inequality was also recently observed by Madiman; it is the {m=2} case of Theorem 2). As a corollary of this inequality, we see that if {{\Bbb H}(X+Y_i) \leq {\Bbb H}(X) + \log K_i}, then

\displaystyle  {\Bbb H}(X+Y_1+\ldots+Y_i) \leq {\Bbb H}(X+Y_1+\ldots+Y_{i-1}) + \log K_i,

and Theorem 2 follows by telescoping series.

The proof of Theorem 2 seems to be genuinely different from the graph-theoretic proof of Theorem 1. It would be interesting to see if the above argument can be somehow adapted to give a stronger version of Theorem 1. Note also that both Theorem 1 and Theorem 2 have extensions to more general combinations of {X,Y_1,\ldots,Y_m} than {X+Y_i}; see this paper and this paper respectively.

Read the rest of this entry »

Archives