You are currently browsing the tag archive for the ‘Plunnecke-Ruzsa inequality’ tag.

A handy inequality in additive combinatorics is the Plünnecke-Ruzsa inequality:

Theorem 1 (Plünnecke-Ruzsa inequality) Let ${A, B_1, \ldots, B_m}$ be finite non-empty subsets of an additive group ${G}$, such that ${|A+B_i| \leq K_i |A|}$ for all ${1 \leq i \leq m}$ and some scalars ${K_1,\ldots,K_m \geq 1}$. Then there exists a subset ${A'}$ of ${A}$ such that ${|A' + B_1 + \ldots + B_m| \leq K_1 \ldots K_m |A'|}$.

The proof uses graph-theoretic techniques. Setting ${A=B_1=\ldots=B_m}$, we obtain a useful corollary: if ${A}$ has small doubling in the sense that ${|A+A| \leq K|A|}$, then we have ${|mA| \leq K^m |A|}$ for all ${m \geq 1}$, where ${mA = A + \ldots + A}$ is the sum of ${m}$ copies of ${A}$.

In a recent paper, I adapted a number of sum set estimates to the entropy setting, in which finite sets such as ${A}$ in ${G}$ are replaced with discrete random variables ${X}$ taking values in ${G}$, and (the logarithm of) cardinality ${|A|}$ of a set ${A}$ is replaced by Shannon entropy ${{\Bbb H}(X)}$ of a random variable ${X}$. (Throughout this note I assume all entropies to be finite.) However, at the time, I was unable to find an entropy analogue of the Plünnecke-Ruzsa inequality, because I did not know how to adapt the graph theory argument to the entropy setting.

I recently discovered, however, that buried in a classic paper of Kaimonovich and Vershik (implicitly in Proposition 1.3, to be precise) there was the following analogue of Theorem 1:

Theorem 2 (Entropy Plünnecke-Ruzsa inequality) Let ${X, Y_1, \ldots, Y_m}$ be independent random variables of finite entropy taking values in an additive group ${G}$, such that ${{\Bbb H}(X+Y_i) \leq {\Bbb H}(X) + \log K_i}$ for all ${1 \leq i \leq m}$ and some scalars ${K_1,\ldots,K_m \geq 1}$. Then ${{\Bbb H}(X+Y_1+\ldots+Y_m) \leq {\Bbb H}(X) + \log K_1 \ldots K_m}$.

In fact Theorem 2 is a bit “better” than Theorem 1 in the sense that Theorem 1 needed to refine the original set ${A}$ to a subset ${A'}$, but no such refinement is needed in Theorem 2. One corollary of Theorem 2 is that if ${{\Bbb H}(X_1+X_2) \leq {\Bbb H}(X) + \log K}$, then ${{\Bbb H}(X_1+\ldots+X_m) \leq {\Bbb H}(X) + (m-1) \log K}$ for all ${m \geq 1}$, where ${X_1,\ldots,X_m}$ are independent copies of ${X}$; this improves slightly over the analogous combinatorial inequality. Indeed, the function ${m \mapsto {\Bbb H}(X_1+\ldots+X_m)}$ is concave (this can be seen by using the ${m=2}$ version of Theorem 2 (or (2) below) to show that the quantity ${{\Bbb H}(X_1+\ldots+X_{m+1})-{\Bbb H}(X_1+\ldots+X_m)}$ is decreasing in ${m}$).

Theorem 2 is actually a quick consequence of the submodularity inequality $\displaystyle {\Bbb H}(W) + {\Bbb H}(X) \leq {\Bbb H}(Y) + {\Bbb H}(Z) \ \ \ \ \ (1)$

in information theory, which is valid whenever ${X,Y,Z,W}$ are discrete random variables such that ${Y}$ and ${Z}$ each determine ${X}$ (i.e. ${X}$ is a function of ${Y}$, and also a function of ${Z}$), and ${Y}$ and ${Z}$ jointly determine ${W}$ (i.e ${W}$ is a function of ${Y}$ and ${Z}$). To apply this, let ${X, Y, Z}$ be independent discrete random variables taking values in ${G}$. Observe that the pairs ${(X,Y+Z)}$ and ${(X+Y,Z)}$ each determine ${X+Y+Z}$, and jointly determine ${(X,Y,Z)}$. Applying (1) we conclude that $\displaystyle {\Bbb H}(X,Y,Z) + {\Bbb H}(X+Y+Z) \leq {\Bbb H}(X,Y+Z) + {\Bbb H}(X+Y,Z)$

which after using the independence of ${X,Y,Z}$ simplifies to the sumset submodularity inequality $\displaystyle {\Bbb H}(X+Y+Z) + {\Bbb H}(Y) \leq {\Bbb H}(X+Y) + {\Bbb H}(Y+Z) \ \ \ \ \ (2)$

(this inequality was also recently observed by Madiman; it is the ${m=2}$ case of Theorem 2). As a corollary of this inequality, we see that if ${{\Bbb H}(X+Y_i) \leq {\Bbb H}(X) + \log K_i}$, then $\displaystyle {\Bbb H}(X+Y_1+\ldots+Y_i) \leq {\Bbb H}(X+Y_1+\ldots+Y_{i-1}) + \log K_i,$

and Theorem 2 follows by telescoping series.

The proof of Theorem 2 seems to be genuinely different from the graph-theoretic proof of Theorem 1. It would be interesting to see if the above argument can be somehow adapted to give a stronger version of Theorem 1. Note also that both Theorem 1 and Theorem 2 have extensions to more general combinations of ${X,Y_1,\ldots,Y_m}$ than ${X+Y_i}$; see this paper and this paper respectively. Dingjun Bian on Analysis II Barry Hirschfeld on Almost all Collatz orbits atta… Islem Ghaffor on Some slides from Basel Alberto Ibañez on Almost all Collatz orbits atta… Brian on Almost all Collatz orbits atta… mallesham kummari on Gaps between primes mallesham kummari on Szemeredi’s proof of Sze… James Moriarty on Heuristic computation of corre… James Moriarty on Some slides from Basel Terence Tao on Decoupling and the Bourgain-De… Mallesham on A cheap version of Halasz… Cole Green on Eigenvectors from Eigenvalues:… Anonymous on Analysis I Anonymous on Decoupling and the Bourgain-De… Andrew Howe on Some slides from Basel