You are currently browsing the tag archive for the ‘metric entropy’ tag.

I have uploaded to the arXiv my paper “Exploring the toolkit of Jean Bourgain“. This is one of a collection of papers to be published in the Bulletin of the American Mathematical Society describing aspects of the work of Jean Bourgain; other contributors to this collection include Keith Ball, Ciprian Demeter, and Carlos Kenig. Because the other contributors will be covering specific areas of Jean’s work in some detail, I decided to take a non-overlapping tack, and focus instead on some basic tools of Jean that he frequently used across many of the fields he contributed to. Jean had a surprising number of these “basic tools” that he wielded with great dexterity, and in this paper I focus on just a few of them:

  • Reducing qualitative analysis results (e.g., convergence theorems or dimension bounds) to quantitative analysis estimates (e.g., variational inequalities or maximal function estimates).
  • Using dyadic pigeonholing to locate good scales to work in or to apply truncations.
  • Using random translations to amplify small sets (low density) into large sets (positive density).
  • Combining large deviation inequalities with metric entropy bounds to control suprema of various random processes.

Each of these techniques is individually not too difficult to explain, and were certainly employed on occasion by various mathematicians prior to Bourgain’s work; but Jean had internalized them to the point where he would instinctively use them as soon as they became relevant to a given problem at hand. I illustrate this at the end of the paper with an exposition of one particular result of Jean, on the Erdős similarity problem, in which his main result (that any sum {S = S_1+S_2+S_3} of three infinite sets of reals has the property that there exists a positive measure set {E} that does not contain any homothetic copy {x+tS} of {S}) is basically proven by a sequential application of these tools (except for dyadic pigeonholing, which turns out not to be needed here).

I had initially intended to also cover some other basic tools in Jean’s toolkit, such as the uncertainty principle and the use of probabilistic decoupling, but was having trouble keeping the paper coherent with such a broad focus (certainly I could not identify a single paper of Jean’s that employed all of these tools at once). I hope though that the examples given in the paper gives some reasonable impression of Jean’s research style.

A core foundation of the subject now known as arithmetic combinatorics (and particularly the subfield of additive combinatorics) are the elementary sum set estimates (sometimes known as “Ruzsa calculus”) that relate the cardinality of various sum sets

\displaystyle  A+B := \{ a+b: a \in A, b \in B \}

and difference sets

\displaystyle  A-B := \{ a-b: a \in A, b \in B \},

as well as iterated sumsets such as {3A=A+A+A}, {2A-2A=A+A-A-A}, and so forth. Here, {A, B} are finite non-empty subsets of some additive group {G = (G,+)} (classically one took {G={\bf Z}} or {G={\bf R}}, but nowadays one usually considers more general additive groups). Some basic estimates in this vein are the following:

Lemma 1 (Ruzsa covering lemma) Let {A, B} be finite non-empty subsets of {G}. Then {A} may be covered by at most {\frac{|A+B|}{|B|}} translates of {B-B}.

Proof: Consider a maximal set of disjoint translates {a+B} of {B} by elements {a \in A}. These translates have cardinality {|B|}, are disjoint, and lie in {A+B}, so there are at most {\frac{|A+B|}{|B|}} of them. By maximality, for any {a' \in A}, {a'+B} must intersect at least one of the selected {a+B}, thus {a' \in a+B-B}, and the claim follows. \Box

Lemma 2 (Ruzsa triangle inequality) Let {A,B,C} be finite non-empty subsets of {G}. Then {|A-C| \leq \frac{|A-B| |B-C|}{|B|}}.

Proof: Consider the addition map {+: (x,y) \mapsto x+y} from {(A-B) \times (B-C)} to {G}. Every element {a-c} of {A - C} has a preimage {\{ (x,y) \in (A-B) \times (B-C)\}} of this map of cardinality at least {|B|}, thanks to the obvious identity {a-c = (a-b) + (b-c)} for each {b \in B}. Since {(A-B) \times (B-C)} has cardinality {|A-B| |B-C|}, the claim follows. \Box

Such estimates (which are covered, incidentally, in Section 2 of my book with Van Vu) are particularly useful for controlling finite sets {A} of small doubling, in the sense that {|A+A| \leq K|A|} for some bounded {K}. (There are deeper theorems, most notably Freiman’s theorem, which give more control than what elementary Ruzsa calculus does, however the known bounds in the latter theorem are worse than polynomial in {K} (although it is conjectured otherwise), whereas the elementary estimates are almost all polynomial in {K}.)

However, there are some settings in which the standard sum set estimates are not quite applicable. One such setting is the continuous setting, where one is dealing with bounded open sets in an additive Lie group (e.g. {{\bf R}^n} or a torus {({\bf R}/{\bf Z})^n}) rather than a finite setting. Here, one can largely replicate the discrete sum set estimates by working with a Haar measure in place of cardinality; this is the approach taken for instance in this paper of mine. However, there is another setting, which one might dub the “discretised” setting (as opposed to the “discrete” setting or “continuous” setting), in which the sets {A} remain finite (or at least discretisable to be finite), but for which there is a certain amount of “roundoff error” coming from the discretisation. As a typical example (working now in a non-commutative multiplicative setting rather than an additive one), consider the orthogonal group {O_n({\bf R})} of orthogonal {n \times n} matrices, and let {A} be the matrices obtained by starting with all of the orthogonal matrice in {O_n({\bf R})} and rounding each coefficient of each matrix in this set to the nearest multiple of {\epsilon}, for some small {\epsilon>0}. This forms a finite set (whose cardinality grows as {\epsilon\rightarrow 0} like a certain negative power of {\epsilon}). In the limit {\epsilon \rightarrow 0}, the set {A} is not a set of small doubling in the discrete sense. However, {A \cdot A} is still close to {A} in a metric sense, being contained in the {O_n(\epsilon)}-neighbourhood of {A}. Another key example comes from graphs {\Gamma := \{ (x, f(x)): x \in G \}} of maps {f: A \rightarrow H} from a subset {A} of one additive group {G = (G,+)} to another {H = (H,+)}. If {f} is “approximately additive” in the sense that for all {x,y \in G}, {f(x+y)} is close to {f(x)+f(y)} in some metric, then {\Gamma} might not have small doubling in the discrete sense (because {f(x+y)-f(x)-f(y)} could take a large number of values), but could be considered a set of small doubling in a discretised sense.

One would like to have a sum set (or product set) theory that can handle these cases, particularly in “high-dimensional” settings in which the standard methods of passing back and forth between continuous, discrete, or discretised settings behave poorly from a quantitative point of view due to the exponentially large doubling constant of balls. One way to do this is to impose a translation invariant metric {d} on the underlying group {G = (G,+)} (reverting back to additive notation), and replace the notion of cardinality by that of metric entropy. There are a number of almost equivalent ways to define this concept:

Definition 3 Let {(X,d)} be a metric space, let {E} be a subset of {X}, and let {r>0} be a radius.

  • The packing number {N^{pack}_r(E)} is the largest number of points {x_1,\dots,x_n} one can pack inside {E} such that the balls {B(x_1,r),\dots,B(x_n,r)} are disjoint.
  • The internal covering number {N^{int}_r(E)} is the fewest number of points {x_1,\dots,x_n \in E} such that the balls {B(x_1,r),\dots,B(x_n,r)} cover {E}.
  • The external covering number {N^{ext}_r(E)} is the fewest number of points {x_1,\dots,x_n \in X} such that the balls {B(x_1,r),\dots,B(x_n,r)} cover {E}.
  • The metric entropy {N^{ent}_r(E)} is the largest number of points {x_1,\dots,x_n} one can find in {E} that are {r}-separated, thus {d(x_i,x_j) \geq r} for all {i \neq j}.

It is an easy exercise to verify the inequalities

\displaystyle  N^{ent}_{2r}(E) \leq N^{pack}_r(E) \leq N^{ext}_r(E) \leq N^{int}_r(E) \leq N^{ent}_r(E)

for any {r>0}, and that {N^*_r(E)} is non-increasing in {r} and non-decreasing in {E} for the three choices {* = pack,ext,ent} (but monotonicity in {E} can fail for {*=int}!). It turns out that the external covering number {N^{ent}_r(E)} is slightly more convenient than the other notions of metric entropy, so we will abbreviate {N_r(E) = N^{ent}_r(E)}. The cardinality {|E|} can be viewed as the limit of the entropies {N^*_r(E)} as {r \rightarrow 0}.

If we have the bounded doubling property that {B(0,2r)} is covered by {O(1)} translates of {B(0,r)} for each {r>0}, and one has a Haar measure {m} on {G} which assigns a positive finite mass to each ball, then any of the above entropies {N^*_r(E)} is comparable to {m( E + B(0,r) ) / m(B(0,r))}, as can be seen by simple volume packing arguments. Thus in the bounded doubling setting one can usually use the measure-theoretic sum set theory to derive entropy-theoretic sumset bounds (see e.g. this paper of mine for an example of this). However, it turns out that even in the absence of bounded doubling, one still has an entropy analogue of most of the elementary sum set theory, except that one has to accept some degradation in the radius parameter {r} by some absolute constant. Such losses can be acceptable in applications in which the underlying sets {A} are largely “transverse” to the balls {B(0,r)}, so that the {N_r}-entropy of {A} is largely independent of {A}; this is a situation which arises in particular in the case of graphs {\Gamma = \{ (x,f(x)): x \in G \}} discussed above, if one works with “vertical” metrics whose balls extend primarily in the vertical direction. (I hope to present a specific application of this type here in the near future.)

Henceforth we work in an additive group {G} equipped with a translation-invariant metric {d}. (One can also generalise things slightly by allowing the metric to attain the values {0} or {+\infty}, without changing much of the analysis below.) By the Heine-Borel theorem, any precompact set {E} will have finite entropy {N_r(E)} for any {r>0}. We now have analogues of the two basic Ruzsa lemmas above:

Lemma 4 (Ruzsa covering lemma) Let {A, B} be precompact non-empty subsets of {G}, and let {r>0}. Then {A} may be covered by at most {\frac{N_r(A+B)}{N_r(B)}} translates of {B-B+B(0,2r)}.

Proof: Let {a_1,\dots,a_n \in A} be a maximal set of points such that the sets {a_i + B + B(0,r)} are all disjoint. Then the sets {a_i+B} are disjoint in {A+B} and have entropy {N_r(a_i+B)=N_r(B)}, and furthermore any ball of radius {r} can intersect at most one of the {a_i+B}. We conclude that {N_r(A+B) \geq n N_r(B)}, so {n \leq \frac{N_r(A+B)}{N_r(B)}}. If {a \in A}, then {a+B+B(0,r)} must intersect one of the {a_i + B + B(0,r)}, so {a \in a_i + B-B + B(0,2r)}, and the claim follows. \Box

Lemma 5 (Ruzsa triangle inequality) Let {A,B,C} be precompact non-empty subsets of {G}, and let {r>0}. Then {N_{4r}(A-C) \leq \frac{N_r(A-B) N_r(B-C)}{N_r(B)}}.

Proof: Consider the addition map {+: (x,y) \mapsto x+y} from {(A-B) \times (B-C)} to {G}. The domain {(A-B) \times (B-C)} may be covered by {N_r(A-B) N_r(B-C)} product balls {B(x,r) \times B(y,r)}. Every element {a-c} of {A - C} has a preimage {\{ (x,y) \in (A-B) \times (B-C)\}} of this map which projects to a translate of {B}, and thus must meet at least {N_r(B)} of these product balls. However, if two elements of {A-C} are separated by a distance of at least {4r}, then no product ball can intersect both preimages. We thus see that {N_{4r}^{ent}(A-C) \leq \frac{N_r(A-B) N_r(B-C)}{N_r(A-C)}}, and the claim follows. \Box

Below the fold we will record some further metric entropy analogues of sum set estimates (basically redoing much of Chapter 2 of my book with Van Vu). Unfortunately there does not seem to be a direct way to abstractly deduce metric entropy results from their sum set analogues (basically due to the failure of a certain strong version of Freiman’s theorem, as discussed in this previous post); nevertheless, the proofs of the discrete arguments are elementary enough that they can be modified with a small amount of effort to handle the entropy case. (In fact, there should be a very general model-theoretic framework in which both the discrete and entropy arguments can be processed in a unified manner; see this paper of Hrushovski for one such framework.)

It is also likely that many of the arguments here extend to the non-commutative setting, but for simplicity we will not pursue such generalisations here.

Read the rest of this entry »

Archives