A core foundation of the subject now known as arithmetic combinatorics (and particularly the subfield of additive combinatorics) is provided by the elementary sum set estimates (sometimes known as “Ruzsa calculus”) that relate the cardinality of various sum sets
$$A+B := \{ a+b : a \in A, b \in B \}$$
and difference sets
$$A-B := \{ a-b : a \in A, b \in B \},$$
as well as iterated sumsets such as $3A = A+A+A$, $2A-2A = A+A-A-A$, and so forth. Here, $A, B$ are finite non-empty subsets of some additive group $G = (G,+)$ (classically one took $G = {\bf Z}$ or $G = {\bf R}$, but nowadays one usually considers more general additive groups). Some basic estimates in this vein are the following:
Lemma 1 (Ruzsa covering lemma) Let $A, B$ be finite non-empty subsets of $G$. Then $A$ may be covered by at most $\frac{|A+B|}{|B|}$ translates of $B-B$.
Proof: Consider a maximal set of disjoint translates $a+B$ of $B$ by elements $a \in A$. These translates have cardinality $|B|$, are disjoint, and lie in $A+B$, so there are at most $\frac{|A+B|}{|B|}$ of them. By maximality, for any $a' \in A$, $a'+B$ must intersect at least one of the selected $a+B$, thus $a' \in a+B-B$, and the claim follows. $\Box$
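The proof of the covering lemma is constructive, and the greedy selection it describes can be sketched in a few lines of Python. This is only an illustration for finite sets of integers. The helper names `sumset`, `diffset` and `ruzsa_cover`, and the sample sets, are assumptions made for this sketch rather than anything from the original argument.

```python
def sumset(A, B):
    """Return A + B = {a + b : a in A, b in B} for finite collections of integers."""
    return {a + b for a in A for b in B}


def diffset(A, B):
    """Return A - B = {a - b : a in A, b in B}."""
    return {a - b for a in A for b in B}


def ruzsa_cover(A, B):
    """Greedily select a maximal family of pairwise disjoint translates a + B, a in A.

    Returns the list of selected centres a. Any a' in A that was skipped has
    a' + B meeting an already selected translate, so a' lies in a + (B - B)
    for some selected centre a.
    """
    centres, used = [], set()
    for a in sorted(A):
        translate = {a + b for b in B}
        if used.isdisjoint(translate):
            centres.append(a)
            used |= translate
    return centres


# Hypothetical sample data.
A = {0, 1, 2, 3, 10, 11, 12, 20}
B = {0, 1, 5}
centres = ruzsa_cover(A, B)
covered = sumset(centres, diffset(B, B))
print("number of translates:", len(centres))
print("bound |A+B|/|B|:", len(sumset(A, B)) / len(B))
print("A is covered:", A <= covered)
```

The greedy pass produces a maximal (not necessarily maximum) disjoint family, which is all that the proof requires.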
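Lemma 2 (Ruzsa triangle inequality) Let $A, B, C$ be finite non-empty subsets of $G$. Then $|A-C| \leq \frac{|A-B| |B-C|}{|B|}$.
Proof: Consider the addition map $+: (x,y) \mapsto x+y$ from $(A-B) \times (B-C)$ to $G$. Every element $a-c$ of $A-C$ has a preimage $\{ (x,y) \in (A-B) \times (B-C) : x+y = a-c \}$ of this map of cardinality at least $|B|$, thanks to the obvious identity $a-c = (a-b) + (b-c)$ for each $b \in B$. Since $(A-B) \times (B-C)$ has cardinality $|A-B| |B-C|$, the claim follows. $\Box$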
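As a sanity check, the triangle inequality $|A-C| \, |B| \leq |A-B| \, |B-C|$ and the preimage count used in its proof can be tested on random finite sets of integers. The sketch below is illustrative only: the function names, the random seed and the sampling ranges are assumptions of this example.

```python
import random


def diffset(A, B):
    """Return A - B = {a - b : a in A, b in B}."""
    return {a - b for a in A for b in B}


def triangle_holds(A, B, C):
    """Check |A - C| * |B| <= |A - B| * |B - C| (cross-multiplied to stay in integers)."""
    return len(diffset(A, C)) * len(B) <= len(diffset(A, B)) * len(diffset(B, C))


def preimage_size(x, A, B, C):
    """Count pairs (u, v) in (A - B) x (B - C) with u + v == x."""
    AB, BC = diffset(A, B), diffset(B, C)
    return sum(1 for u in AB if x - u in BC)


rng = random.Random(0)  # fixed seed so the run is reproducible
trials = [
    tuple(set(rng.sample(range(30), rng.randint(1, 8))) for _ in range(3))
    for _ in range(200)
]
all_ok = all(triangle_holds(A, B, C) for A, B, C in trials)
print("triangle inequality held in every trial:", all_ok)
```

The function `preimage_size` mirrors the key step of the proof: each element of $A-C$ should have at least $|B|$ preimages under the addition map.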
Such estimates (which are covered, incidentally, in Chapter 2 of my book with Van Vu) are particularly useful for controlling finite sets $A$ of small doubling, in the sense that $|A+A| \leq K|A|$ for some bounded $K$. (There are deeper theorems, most notably Freiman’s theorem, which give more control than elementary Ruzsa calculus does; however, the known bounds in Freiman’s theorem are worse than polynomial in $K$ (although it is conjectured otherwise), whereas the elementary estimates are almost all polynomial in $K$.)
However, there are some settings in which the standard sum set estimates are not quite applicable. One such setting is the continuous setting, where one is dealing with bounded open sets in an additive Lie group (e.g. ${\bf R}^n$ or a torus $({\bf R}/{\bf Z})^n$) rather than a finite setting. Here, one can largely replicate the discrete sum set estimates by working with a Haar measure in place of cardinality; this is the approach taken for instance in this paper of mine. However, there is another setting, which one might dub the “discretised” setting (as opposed to the “discrete” setting or “continuous” setting), in which the sets $A$ remain finite (or at least discretisable to be finite), but for which there is a certain amount of “roundoff error” coming from the discretisation. As a typical example (working now in a non-commutative multiplicative setting rather than an additive one), consider the orthogonal group $O_n({\bf R})$ of orthogonal $n \times n$ matrices, and let $A$ be the set of matrices obtained by starting with all of the orthogonal matrices in $O_n({\bf R})$ and rounding each coefficient of each matrix in this set to the nearest multiple of $\epsilon$, for some small $\epsilon > 0$. This forms a finite set (whose cardinality grows as $\epsilon \to 0$ like a certain negative power of $\epsilon$). In the limit $\epsilon \to 0$, the set $A$ is not a set of small doubling in the discrete sense. However, $A \cdot A$ is still close to $A$ in a metric sense, being contained in the $O(\epsilon)$-neighbourhood of $A$ (with the implied constant depending on $n$). Another key example comes from graphs $\Gamma := \{ (x, f(x)) : x \in A \}$ of maps $f: A \to H$ from a subset $A$ of one additive group $G = (G,+)$ to another $H = (H,+)$. If $f$ is “approximately additive” in the sense that for all $x, y \in A$ with $x+y \in A$, $f(x+y)$ is close to $f(x)+f(y)$ in some metric, then $\Gamma$ might not have small doubling in the discrete sense (because $f(x+y)-f(x)-f(y)$ could take a large number of values), but could be considered a set of small doubling in a discretised sense.
One would like to have a sum set (or product set) theory that can handle these cases, particularly in “high-dimensional” settings in which the standard methods of passing back and forth between continuous, discrete, or discretised settings behave poorly from a quantitative point of view due to the exponentially large doubling constant of balls. One way to do this is to impose a translation invariant metric $d$ on the underlying group $G = (G,+)$ (reverting back to additive notation), and replace the notion of cardinality by that of metric entropy. There are a number of almost equivalent ways to define this concept:
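Definition 3 Let $(X,d)$ be a metric space, let $E$ be a subset of $X$, and let $r > 0$ be a radius.
- The packing number $N^{pack}_r(E)$ is the largest number of points $x_1,\dots,x_n$ one can pack inside $E$ such that the balls $B(x_1,r),\dots,B(x_n,r)$ are disjoint.
- The internal covering number $N^{int}_r(E)$ is the fewest number of points $x_1,\dots,x_n \in E$ such that the balls $B(x_1,r),\dots,B(x_n,r)$ cover $E$.
- The external covering number $N^{ext}_r(E)$ is the fewest number of points $x_1,\dots,x_n \in X$ such that the balls $B(x_1,r),\dots,B(x_n,r)$ cover $E$.
- The metric entropy $N^{ent}_r(E)$ is the largest number of points $x_1,\dots,x_n$ one can find in $E$ that are $r$-separated, thus $d(x_i,x_j) \geq r$ for all $i \neq j$.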
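It is an easy exercise to verify the inequalities
$$N^{ent}_{2r}(E) \leq N^{pack}_r(E) \leq N^{ext}_r(E) \leq N^{int}_r(E) \leq N^{ent}_r(E)$$
for any $r > 0$, and that $N^*_r(E)$ is non-increasing in $r$ and non-decreasing in $E$ for the three choices $* = pack, ext, ent$ (but monotonicity in $E$ can fail for $* = int$!). It turns out that the external covering number $N^{ext}_r(E)$ is slightly more convenient than the other notions of metric entropy, so we will abbreviate $N_r(E) = N^{ext}_r(E)$. The cardinality $|E|$ can be viewed as the limit of the entropies $N^*_r(E)$ as $r \to 0$.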
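To make the four notions concrete, here is a brute-force Python sketch that computes each of them for a small set $E$ inside a finite ambient metric space $X$ and checks the chain $N^{ent}_{2r} \leq N^{pack}_r \leq N^{ext}_r \leq N^{int}_r \leq N^{ent}_r$. It assumes open balls $B(c,r) = \{x : d(c,x) < r\}$, and the ambient space (the integers $0,\dots,11$ with the usual distance), the set $E$ and all function names are choices made for this illustration only. The search is exponential, so it is only suitable for tiny examples.

```python
from itertools import combinations


def dist(a, b):
    """Usual distance on the integers (assumed metric for this sketch)."""
    return abs(a - b)


def ball(X, d, c, r):
    """Open ball B(c, r) inside the finite ambient space X."""
    return frozenset(x for x in X if d(c, x) < r)


def largest(points, ok):
    """Largest k such that some k-element subset of `points` satisfies `ok`."""
    pts = sorted(points)
    for k in range(len(pts), 0, -1):
        if any(ok(S) for S in combinations(pts, k)):
            return k
    return 0


def fewest(centres, covers):
    """Smallest k such that some k-element subset of `centres` satisfies `covers`."""
    pts = sorted(centres)
    for k in range(1, len(pts) + 1):
        if any(covers(S) for S in combinations(pts, k)):
            return k
    raise ValueError("no cover exists")


def n_pack(X, d, E, r):
    """Packing number: most points of E whose r-balls are pairwise disjoint."""
    def disjoint(S):
        return all(ball(X, d, a, r).isdisjoint(ball(X, d, b, r))
                   for a, b in combinations(S, 2))
    return largest(E, disjoint)


def n_ent(X, d, E, r):
    """Metric entropy: most points of E that are pairwise at distance >= r."""
    return largest(E, lambda S: all(d(a, b) >= r for a, b in combinations(S, 2)))


def n_cover(X, d, E, r, centres):
    """Fewest r-balls with centres drawn from `centres` that cover E."""
    target = set(E)
    return fewest(centres,
                  lambda S: target <= set().union(*(ball(X, d, c, r) for c in S)))


def entropy_chain(X, d, E, r):
    """Return (N^ent_{2r}, N^pack_r, N^ext_r, N^int_r, N^ent_r)."""
    return (n_ent(X, d, E, 2 * r), n_pack(X, d, E, r),
            n_cover(X, d, E, r, X),   # external: centres anywhere in X
            n_cover(X, d, E, r, E),   # internal: centres restricted to E
            n_ent(X, d, E, r))


X = range(12)
E = {0, 1, 2, 5, 6, 10}
for r in (1, 1.5, 2, 3):
    print(r, entropy_chain(X, dist, E, r))
```

Each printed tuple should be non-decreasing from left to right, matching the chain of inequalities.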
If we have the bounded doubling property that $B(0,2r)$ is covered by $O(1)$ translates of $B(0,r)$ for each $r > 0$, and one has a Haar measure $m$ on $G$ which assigns a positive finite mass to each ball, then any of the above entropies $N^*_r(E)$ is comparable to $m(E + B(0,r)) / m(B(0,r))$, as can be seen by simple volume packing arguments. Thus in the bounded doubling setting one can usually use the measure-theoretic sum set theory to derive entropy-theoretic sumset bounds (see e.g. this paper of mine for an example of this). However, it turns out that even in the absence of bounded doubling, one still has an entropy analogue of most of the elementary sum set theory, except that one has to accept some degradation in the radius parameter $r$ by some absolute constant. Such losses can be acceptable in applications in which the underlying sets $A$ are largely “transverse” to the balls $B(0,r)$, so that the $N_r$-entropy of $A$ is largely independent of $r$; this is a situation which arises in particular in the case of graphs $\Gamma$ discussed above, if one works with “vertical” metrics whose balls extend primarily in the vertical direction. (I hope to present a specific application of this type here in the near future.)
Henceforth we work in an additive group $G$ equipped with a translation-invariant metric $d$. (One can also generalise things slightly by allowing the metric to attain the values $0$ or $+\infty$, without changing much of the analysis below.) By the Heine-Borel theorem, any precompact set $E$ will have finite entropy $N_r(E)$ for any $r > 0$. We now have analogues of the two basic Ruzsa lemmas above:
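Lemma 4 (Ruzsa covering lemma) Let $A, B$ be precompact non-empty subsets of $G$, and let $r > 0$. Then $A$ may be covered by at most $\frac{N_r(A+B)}{N_r(B)}$ translates of $B - B + B(0,2r)$.
Proof: Let $a_1,\dots,a_n \in A$ be a maximal set of points such that the sets $a_i + B + B(0,r)$ are all disjoint. Then the sets $a_i + B$ are disjoint in $A+B$ and have entropy $N_r(a_i+B) = N_r(B)$, and furthermore any ball of radius $r$ can intersect at most one of the $a_i + B$. We conclude that $N_r(A+B) \geq n N_r(B)$, so $n \leq \frac{N_r(A+B)}{N_r(B)}$. If $a \in A$, then $a + B + B(0,r)$ must intersect one of the $a_i + B + B(0,r)$, so $a \in a_i + B - B + B(0,2r)$, and the claim follows. $\Box$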
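Lemma 5 (Ruzsa triangle inequality) Let $A, B, C$ be precompact non-empty subsets of $G$, and let $r > 0$. Then $N_{4r}(A-C) \leq \frac{N_r(A-B) N_r(B-C)}{N_r(B)}$.
Proof: Consider the addition map $+: (x,y) \mapsto x+y$ from $(A-B) \times (B-C)$ to $G$. The domain $(A-B) \times (B-C)$ may be covered by $N_r(A-B) N_r(B-C)$ product balls $B(x,r) \times B(y,r)$. Every element $a-c$ of $A-C$ has a preimage $\{ (x,y) \in (A-B) \times (B-C) : x+y = a-c \}$ of this map which projects to a translate of $-B$, and thus must meet at least $N_r(B)$ of these product balls. However, if two elements of $A-C$ are separated by a distance of at least $4r$, then no product ball can intersect both preimages. We thus see that $N^{ent}_{4r}(A-C) \leq \frac{N_r(A-B) N_r(B-C)}{N_r(B)}$, and the claim follows since $N_{4r}(A-C) \leq N^{ent}_{4r}(A-C)$. $\Box$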
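The entropy form of the triangle inequality, read here as $N_{4r}(A-C) \, N_r(B) \leq N_r(A-B) \, N_r(B-C)$ with $N_r$ the external covering number, can be tested numerically in the simplest case $G = {\bf Z}$ with the usual metric. The sketch below assumes open balls and integer radii, so that a ball of radius $r$ is a window of $2r-1$ consecutive integers and a left-to-right sweep gives the minimal cover. All names, the seed and the sampling ranges are assumptions of this example.

```python
import random


def diffset(A, B):
    """Return A - B = {a - b : a in A, b in B}."""
    return {a - b for a in A for b in B}


def cover_number(E, r):
    """External covering number N_r(E) for a finite set E of integers, integer r >= 1.

    The open ball B(c, r) in Z is the window {c-r+1, ..., c+r-1}. Sweeping from
    the left and opening a new window at each uncovered point is optimal in
    one dimension.
    """
    count, reach = 0, None
    for x in sorted(E):
        if reach is None or x > reach:
            count += 1
            reach = x + 2 * r - 2  # last integer covered by the window starting at x
    return count


def entropy_triangle_holds(A, B, C, r):
    """Check N_{4r}(A-C) * N_r(B) <= N_r(A-B) * N_r(B-C)."""
    lhs = cover_number(diffset(A, C), 4 * r) * cover_number(B, r)
    rhs = cover_number(diffset(A, B), r) * cover_number(diffset(B, C), r)
    return lhs <= rhs


rng = random.Random(1)  # fixed seed for reproducibility
samples = [
    tuple(set(rng.sample(range(60), rng.randint(1, 10))) for _ in range(3))
    for _ in range(300)
]
ok = all(entropy_triangle_holds(A, B, C, r) for A, B, C in samples for r in (1, 2, 3))
print("entropy triangle inequality held in every trial:", ok)
```

For $r = 1$ each ball is a single point, so $N_1(E) = |E|$ and the check reduces to a weakened form of the discrete triangle inequality.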
Below the fold we will record some further metric entropy analogues of sum set estimates (basically redoing much of Chapter 2 of my book with Van Vu). Unfortunately there does not seem to be a direct way to abstractly deduce metric entropy results from their sum set analogues (basically due to the failure of a certain strong version of Freiman’s theorem, as discussed in this previous post); nevertheless, the proofs of the discrete arguments are elementary enough that they can be modified with a small amount of effort to handle the entropy case. (In fact, there should be a very general model-theoretic framework in which both the discrete and entropy arguments can be processed in a unified manner; see this paper of Hrushovski for one such framework.)
It is also likely that many of the arguments here extend to the non-commutative setting, but for simplicity we will not pursue such generalisations here.
It turns out to be a favourable week or two for me to finally finish a number of papers that had been at a nearly completed stage for a while. I have just uploaded to the arXiv my article “Sumset and inverse sumset theorems for Shannon entropy”, submitted to Combinatorics, Probability, and Computing. This paper evolved from a “deleted scene” in my book with Van Vu entitled “Entropy sumset estimates”. In those notes, we developed analogues of the standard Plünnecke-Ruzsa sumset estimates (which relate quantities such as the cardinalities $|A+B|$, $|A-B|$ of the sum and difference sets of two finite sets $A, B$ in an additive group $G$ to each other), to the entropy setting, in which the finite sets $A \subset G$ are replaced instead with discrete random variables X taking values in that group G, and the (logarithm of the) cardinality |A| is replaced with the Shannon entropy
$$H(X) := \sum_{x \in G} {\bf P}(X = x) \log \frac{1}{{\bf P}(X = x)}.$$
This quantity measures the information content of X; for instance, if $H(X) = k \log 2$, then it will take k bits on the average to store the value of X (thus a string of n independent copies of X will require about nk bits of storage in the asymptotic limit $n \to \infty$). The relationship between entropy and cardinality is that if X is the uniform distribution on a finite non-empty set A, then $H(X) = \log |A|$. If instead X is non-uniformly distributed on A, one has $H(X) < \log |A|$, thanks to Jensen’s inequality.
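The relationship between entropy and cardinality is easy to see numerically. The following minimal sketch computes the Shannon entropy $H(X) = \sum_x {\bf P}(X=x) \log \frac{1}{{\bf P}(X=x)}$ in nats for a probability mass function stored as a dictionary. The function name and the two sample distributions are assumptions made for illustration.

```python
import math


def shannon_entropy(pmf):
    """Shannon entropy in nats of a pmf given as {value: probability}."""
    return sum(p * math.log(1.0 / p) for p in pmf.values() if p > 0)


# Uniform distribution on a set A with |A| = 8.
uniform = {x: 1 / 8 for x in range(8)}
# A non-uniform distribution on a set A with |A| = 4.
skewed = {0: 0.5, 1: 0.25, 2: 0.125, 3: 0.125}

print("uniform:", shannon_entropy(uniform), "vs log|A| =", math.log(8))
print("skewed: ", shannon_entropy(skewed), "vs log|A| =", math.log(4))
print("skewed, in bits:", shannon_entropy(skewed) / math.log(2))
```

Dividing by $\log 2$ converts nats to bits, which matches the storage interpretation in the text.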
It turns out that many estimates on sumsets have entropy analogues, which resemble the “logarithm” of the sumset estimates. For instance, the trivial bounds
$$|A|, |B| \leq |A+B| \leq |A| |B|$$
have the entropy analogue
$$H(X), H(Y) \leq H(X+Y) \leq H(X) + H(Y)$$
whenever X, Y are independent discrete random variables in an additive group; this is not difficult to deduce from standard entropy inequalities. Slightly more non-trivially, the sum set estimate
$$|A+B| \leq \frac{|A-B|^3}{|A| |B|}$$
established by Ruzsa, has an entropy analogue
$$H(X+Y) \leq 3 H(X-Y) - H(X) - H(Y),$$
and similarly for a number of other standard sumset inequalities in the literature (e.g. the Ruzsa triangle inequality, the Plünnecke-Ruzsa inequality, and the Balog-Szemerédi-Gowers theorem, though the entropy analogue of the latter requires a little bit of care to state). These inequalities can actually be deduced fairly easily from elementary arithmetic identities, together with standard entropy inequalities, most notably the submodularity inequality
$$H(Z) + H(W) \leq H(X) + H(Y)$$
whenever X,Y,Z,W are discrete random variables such that X and Y each determine W separately (thus $W = f(X) = g(Y)$ for some deterministic functions f, g) and X and Y determine Z jointly (thus $Z = h(X,Y)$ for some deterministic function h). For instance, if X,Y,Z are independent discrete random variables in an additive group G, then $(X-Y, Y-Z)$ and $(X,Z)$ each determine $X-Z$ separately, and determine $(X,Y,Z)$ jointly, leading to the inequality
$$H(X,Y,Z) + H(X-Z) \leq H(X-Y, Y-Z) + H(X,Z),$$
which (using independence to expand $H(X,Y,Z) = H(X)+H(Y)+H(Z)$ and $H(X,Z) = H(X)+H(Z)$, and subadditivity to bound $H(X-Y,Y-Z) \leq H(X-Y)+H(Y-Z)$) soon leads to the entropy Ruzsa triangle inequality
$$H(X-Z) \leq H(X-Y) + H(Y-Z) - H(Y),$$
which is an analogue of the combinatorial Ruzsa triangle inequality
$$|A-C| \leq \frac{|A-B| |B-C|}{|B|}.$$
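These entropy inequalities are easy to test numerically. The sketch below draws random independent integer-valued $X, Y, Z$ and checks the trivial bounds $H(X), H(Y) \leq H(X+Y) \leq H(X)+H(Y)$ together with the entropy Ruzsa triangle inequality $H(X-Z) \leq H(X-Y) + H(Y-Z) - H(Y)$. The helper names, the support size, the seed and the floating-point tolerance are all assumptions of this illustration.

```python
import math
import random
from collections import defaultdict


def entropy(pmf):
    """Shannon entropy in nats of a pmf given as {value: probability}."""
    return sum(p * math.log(1.0 / p) for p in pmf.values() if p > 0)


def random_pmf(rng, n):
    """Random probability mass function supported on {0, ..., n-1}."""
    weights = [rng.random() for _ in range(n)]
    total = sum(weights)
    return {x: w / total for x, w in enumerate(weights)}


def combine(p, q, op):
    """Law of op(X, Y) when X ~ p and Y ~ q are independent."""
    out = defaultdict(float)
    for x, px in p.items():
        for y, qy in q.items():
            out[op(x, y)] += px * qy
    return dict(out)


def check_inequalities(rng, n=6, tol=1e-9):
    """Return True if the trivial bounds and the triangle inequality both hold."""
    pX, pY, pZ = (random_pmf(rng, n) for _ in range(3))
    add = lambda a, b: a + b
    sub = lambda a, b: a - b
    hX, hY = entropy(pX), entropy(pY)
    h_sum = entropy(combine(pX, pY, add))
    trivial = max(hX, hY) <= h_sum + tol and h_sum <= hX + hY + tol
    triangle = (entropy(combine(pX, pZ, sub))
                <= entropy(combine(pX, pY, sub)) + entropy(combine(pY, pZ, sub)) - hY + tol)
    return trivial and triangle


rng = random.Random(2)  # fixed seed for reproducibility
results = [check_inequalities(rng) for _ in range(200)]
print("all inequalities held:", all(results))
```

A random search of this kind cannot prove the inequalities, of course; it only guards against mis-stated signs or terms.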
All of this was already in the unpublished notes with Van, though I include it in this paper in order to place it in the literature. The main novelty of the paper, though, is to consider the entropy analogue of Freiman’s theorem, which classifies those sets A for which $|A+A| = O(|A|)$. Here, the analogous problem is to classify the random variables $X$ such that $H(X_1+X_2) = H(X) + O(1)$, where $X_1, X_2$ are independent copies of X. Let us say that X has small doubling if this is the case.
For instance, the uniform distribution U on a finite subgroup H of G has small doubling (in fact $H(U_1+U_2) = H(U) = \log |H|$ in this case). In a similar spirit, the uniform distribution on a (generalised) arithmetic progression P also has small doubling, as does the uniform distribution on a coset progression H+P. Also, if X has small doubling, and Y has bounded entropy, then X+Y also has small doubling, even if Y and X are not independent. The main theorem is that these are the only cases:
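The first two examples can be checked directly. The sketch below computes the doubling gap $H(X_1+X_2) - H(X)$ for the uniform distribution on the subgroup $\{0,3,6,9\}$ of ${\bf Z}/12{\bf Z}$ and for uniform distributions on arithmetic progressions of integers. The specific group, progressions and function names are assumptions chosen for illustration.

```python
import math
from collections import defaultdict


def entropy(pmf):
    """Shannon entropy in nats of a pmf given as {value: probability}."""
    return sum(p * math.log(1.0 / p) for p in pmf.values() if p > 0)


def sum_law(p, q, modulus=None):
    """Law of X + Y for independent X ~ p, Y ~ q (reduced mod `modulus` if given)."""
    out = defaultdict(float)
    for x, px in p.items():
        for y, qy in q.items():
            s = x + y if modulus is None else (x + y) % modulus
            out[s] += px * qy
    return dict(out)


def doubling_gap(p, modulus=None):
    """H(X_1 + X_2) - H(X) for independent copies X_1, X_2 of X ~ p."""
    return entropy(sum_law(p, p, modulus)) - entropy(p)


# Uniform distribution on the subgroup {0, 3, 6, 9} of Z/12Z.
subgroup = {h: 0.25 for h in (0, 3, 6, 9)}
gap_subgroup = doubling_gap(subgroup, modulus=12)

# Uniform distributions on progressions {0, d, 2d, ..., (N-1)d} in Z.
gaps_progression = {
    (N, d): doubling_gap({d * k: 1 / N for k in range(N)})
    for N, d in ((10, 1), (100, 7), (300, 3))
}

print("subgroup gap:", gap_subgroup)
for key, gap in gaps_progression.items():
    print("progression", key, "gap:", gap)
```

The subgroup gap vanishes, while the progression gaps stay below $\log 2$, since the sum is supported on a progression of fewer than twice as many elements.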
Theorem 1. (Informal statement) X has small doubling if and only if $X = U + Y$ for some uniform distribution U on a coset progression (of bounded rank), and Y has bounded entropy.
For instance, suppose that X was the uniform distribution on a dense subset A of a finite group G. Then Theorem 1 asserts that X is close in a “transport metric” sense to the uniform distribution U on G, in the sense that it is possible to rearrange or transport the probability distribution of X to the probability distribution of U (or vice versa) by shifting each component of the mass of X by an amount Y which has bounded entropy (which basically means that it primarily ranges inside a set of bounded cardinality). The way one shows this is by randomly translating the mass of X around by a few random shifts to approximately uniformise the distribution, and then dealing with the residual fluctuation in the distribution by hand. Theorem 1 as a whole is established by using the Freiman theorem in the combinatorial setting combined with various elementary convexity and entropy inequality arguments to reduce matters to the above model case when X is supported inside a finite group G and has near-maximal entropy.
I also show a variant of the above statement: if X, Y are independent and $H(X+Y) = H(X) + O(1) = H(Y) + O(1)$, then we have $X \equiv Y + Z$ (i.e. X has the same distribution as Y+Z) for some Z of bounded entropy (not necessarily independent of X or Y). Thus if two random variables are additively related to each other, then they can be additively transported to each other by using a bounded amount of entropy.
In the last part of the paper I relate these discrete entropies to their continuous counterparts
$$H_{\bf R}(X) := \int_{\bf R} p(x) \log \frac{1}{p(x)}\ dx,$$
where X is now a continuous random variable on the real line with density function $p(x)\ dx$. There are a number of sum set inequalities known in this setting, for instance
$$H_{\bf R}(X_1 + X_2) \geq H_{\bf R}(X) + \frac{1}{2} \log 2$$
for independent copies $X_1, X_2$ of a finite entropy random variable X, with equality if and only if X is a Gaussian. Using this inequality and Theorem 1, I show a discrete version, namely that
$$H(X_1 + X_2) \geq H(X) + \frac{1}{2} \log 2 - \varepsilon$$
whenever $\varepsilon > 0$ and $X_1, X_2$ are independent copies of a random variable in ${\bf Z}$ (or any other torsion-free abelian group) whose entropy is sufficiently large depending on $\varepsilon$. This is somewhat analogous to the classical sumset inequality
$$|A+A| \geq 2|A| - 1,$$
though notice that we have a gain of just $\frac{1}{2} \log 2$ rather than $\log 2$ here, the point being that there is a Gaussian counterexample in the entropy setting which does not have a combinatorial analogue (except perhaps in the high-dimensional limit). The main idea is to use Theorem 1 to trap most of X inside a coset progression, at which point one can use Fourier-analytic additive combinatorial tools to show that the distribution is “smooth” in some non-trivial direction r, which can then be used to approximate the discrete distribution by a continuous one.
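The role of the Gaussian as the near-extremal case can be seen numerically. The sketch below computes the gain $H(X_1+X_2) - H(X)$ for a discretised Gaussian on ${\bf Z}$ (weights proportional to $e^{-k^2/2\sigma^2}$, truncated far out in the tails) and for a uniform distribution on an interval, and compares both with $\frac{1}{2}\log 2$. The value of $\sigma$, the truncation point, the interval length and the function names are assumptions of this illustration, not parameters from the paper.

```python
import math


def entropy(pmf):
    """Shannon entropy in nats of a pmf given as {value: probability}."""
    return sum(p * math.log(1.0 / p) for p in pmf.values() if p > 0)


def discrete_gaussian(sigma, cutoff=8):
    """Pmf on the integers proportional to exp(-k^2 / (2 sigma^2)), |k| <= cutoff * sigma."""
    R = int(cutoff * sigma)
    weights = {k: math.exp(-k * k / (2 * sigma * sigma)) for k in range(-R, R + 1)}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


def self_sum(p):
    """Law of X_1 + X_2 for independent copies X_1, X_2 of X ~ p."""
    out = {}
    for x, px in p.items():
        for y, py in p.items():
            out[x + y] = out.get(x + y, 0.0) + px * py
    return out


def gain(p):
    """H(X_1 + X_2) - H(X)."""
    return entropy(self_sum(p)) - entropy(p)


half_log_two = 0.5 * math.log(2)
gaussian_gain = gain(discrete_gaussian(sigma=10.0))
uniform_gain = gain({k: 1 / 100 for k in range(100)})

print("(1/2) log 2  :", half_log_two)
print("Gaussian gain:", gaussian_gain)
print("uniform gain :", uniform_gain)
```

The discretised Gaussian comes very close to the bound $\frac{1}{2}\log 2$, while the uniform distribution on an interval gains noticeably more, in line with the remark that the Gaussian example has no combinatorial analogue.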
I also conjecture more generally that the entropy monotonicity inequalities established by Artstein, Barthe, Ball, and Naor in the continuous case also hold in the above sense in the discrete case, though my method of proof breaks down because I no longer can assume small doubling.