Assaf Naor and I have just uploaded to the arXiv our joint paper “Scale-oblivious metric fragmentation and the nonlinear Dvoretzky theorem“.

Consider a finite metric space {(X,d)}, with {X} being a set of {n} points. It is not always the case that this space can be isometrically embedded into a Hilbert space such as {\ell^2}; for instance, in a Hilbert space every pair of points has a unique midpoint, but uniqueness can certainly fail for finite metric spaces (consider for instance the graph metric on a four-point diamond). The situation improves, however, if one allows (a) the embedding to have some distortion, and (b) one is willing to embed just a subset of {X}, rather than all of {X}. More precisely, for any integer {n} and any distortion {D \geq 1}, let {k(n,D)} be the largest integer with the property that given every {n}-point metric space {(X,d)}, there exists a {k}-point subset {Y} of {X}, and a map {f: Y \rightarrow \ell^2}, which has distortion at most {D} in the sense that

\displaystyle  d(y,y') \leq \| f(y) - f(y') \|_{\ell^2} \leq D d(y,y')

for all {y,y' \in Y}.

Bourgain, Figiel, and Milman established that for any fixed {D>1}, one has the lower bound {k(n,D) \gg_D \log n}, and also showed for {D} sufficiently close to {1} there was a matching upper bound {k(n,D) \ll_D \log n}. This type of result was called a nonlinear Dvoretsky theorem, in analogy with the linear Dvoretsky theorem that asserts that given any {n}-dimensional normed vector space and {D > 1}, there existed a {k}-dimensional subspace which embedded with distortion {D} into {\ell^2}, with {k \sim_D \log n}. In other words, one can ensure that about {\log n} of the {n} points in an arbitrary metric space behave in an essentially Euclidean manner, and that this is best possible if one wants the distortion to be small.

Bartel, Linial, Mendel, and Naor observed that there was a threshold phenomenon at {D=2}. Namely, for {D < 2}, the Bourgain-Figiel-Milman bounds were sharp with {k(n,D) \sim_D \log n}; but for {D>2} one instead had a power law

\displaystyle  n^{1-B(D)} \ll_D k(n,D) \ll n^{1-b(D)}

for some {0 < b(D) \leq B(D) < 1}. In other words, once one allows distortion by factors greater than {2}, one can now embed a polynomial portion of the points, rather than just a logarithmic portion, into a Hilbert space. This has some applications in theoretical computer science to constructing “approximate distance oracles” for high-dimensional sets of data. (The situation at the critical value {D=2} is still unknown.)

In the special case that the metric {d} is an ultrametric, so that the triangle inequality {d(x,z) \leq d(x,y) + d(y,z)} is upgraded to the ultra-triangle inequality {d(x,z) \leq \max(d(x,y), d(y,z))}, then it is an easy exercise to show that {X} can be embedded isometrically into a Hilbert space, and in fact into a sphere of radius {\hbox{diam}(X)/\sqrt{2}}. Indeed, this can be established by an induction on the cardinality of {X}, using the ultrametric inequality to partition any finite ultrametric space {X} of two or more points into sets of strictly smaller diameter that are all separated by {\hbox{diam}(X)}, and using the inductive hypothesis and Pythagoras’ theorem.

One can then replace the concept of embedding into a Hilbert space, with the apparently stronger concept of embedding into an ultrametric space; this is useful for the computer science applications as ultrametrics have a tree structure which allows for some efficient algorithms for computing distances in such spaces. As it turns out, all the preceding constructions carry over without difficulty to this setting; thus, for {D<2}, one can embed a logarithmic number of points with distortion {D} into an ultrametric space, and for {D>2} one can embed a polynomial number of points.

One can view the task of locating a subset of a metric space that is equivalent (up to bounded distortion) to an ultrametric as that of fragmenting a metric space into a tree-like structure. For instance, the standard metric on the arithmetic progression {\{1,\ldots,3^m\}} of length {n=3^m} is not an ultrametric (and in fact needs a huge distortion factor of {3^m} in order to embed into an ultrametric space), but if one restricts to the Cantor-like subset of integers in {\{1,\ldots,3^m\}} whose base {3} expansion consists solely of {0}s and {1}s, then one easily verifies that the resulting set is fragmented enough that the standard metric is equivalent (up to a distortion factor of {3}) to an ultrametric, namely the metric {d(x,y) := 3^{j}/2} where {j} is the largest integer for which {3^j} divides {y-x} and {x \neq y}.

The above fragmentation constructions were somewhat complicated and deterministic in nature. Mendel and Naor introduced a simpler probabilistic method, based on random partition trees of {X}, which reproved the above results in the high-distortion case {D \gg 1} (with {b(D), B(D) \sim 1/D} in the limit {D \rightarrow \infty}). However, this method had inherent limitations in the low-distortion case, in particular failing for {D < 3} (basically because the method would establish a stronger embedding result which fails in that regime).

In this paper, we introduce a variant of the random partition tree method, which involves a random fragmentation at a randomly chosen set of scales, that works all the way down to {D>2} and gives a clean value for the exponent {b(D)}, namely {2e/D}; in fact we have the more precise result that we can take {b(D)} to be {1-\theta}, where {0 < \theta < 1} is the unique solution to the equation

\displaystyle  \frac{2}{D} = (1-\theta) \theta^{\theta/(1-\theta)},

and that this is the limit of the method if one chooses a “scale-oblivious” approach that applies uniformly to all metric spaces, ignoring the internal structure of each individual metric space.

The construction ends up to be relatively simple (taking about three pages). The basic idea is as follows. Pick two scales {R > r > 0}, and let {x_1, x_2, x_3, \ldots} be a random sequence of points in {X} (of course, being finite, every point in {X} will almost surely be visited infinitely often by this sequence). We can then partition {X} into pieces {Y_1, Y_2, \ldots}, by defining {Y_m} to be the set of points {x} for which {x_m} is the first point in the sequence to lie in the ball {B(x,R)}. We then let {S_m} be the subset of {Y_m} in which {x_m} actually falls in the smaller ball {B(x,r)}. As a consequence, we see that each {S_m} is contained in a ball of radius {r} (namely {B(x_m,r)}), but that any two {S_m, S_{m'}} are separated by a distance of at least {R-r} (from the triangle inequality). This gives one layer of a quasi-ultrametric tree structure on {\bigcup_m S_m}; if one iterates this over many different pairs of scales {R, r}, one gets a full quasi-ultrametric tree structure, which one can then adjust with bounded distortion to a genuine ultrametric structure. The game is then to optimise the choice of {R, r} so as to maximise the portion of {X} that remains in the tree; it turns out that a suitably random choice of such scales is the optimal one.