[This post is authored by Gil Kalai, who has kindly “guest blogged” this week’s “open problem of the week”. – T.]
This is a problem in discrete and convex geometry. It seeks to quantify the intuitively obvious fact that large convex bodies are so “fat” that they cannot avoid “detection” by a small number of observation points. More precisely, we fix a dimension d and make the following definition (introduced by Haussler and Welzl):
- Definition: Let be a finite set of points, and let . We say that a finite set is a weak -net for X (with respect to convex bodies) if, whenever B is a convex body which is large in the sense that , then B contains at least one point of Y. (If Y is contained in X, we say that Y is a strong -net for X with respect to convex bodies.)
For example, in one dimension, if , and where k is the integer part of , then Y is a weak -net for X with respect to convex bodies. Thus we see that even when the original set X is very large, one can create a -net of size as small as . Strong -nets are of importance in computational learning theory, and are fairly well understood via Vapnik-Chervonenkis (or VC) theory; however, the theory of weak -nets is still not completely satisfactory.
One can ask what happens in higher dimensions, for instance when X is a discrete cube . It is not too hard to cook up -nets of size (by using tools such as Minkowski’s theorem), but in fact one can create -nets of size as small as simply by taking a random subset of X of this cardinality and observing that “up to errors of “, the total number of essentially different ways a convex body can meet X grows at most polynomially in . (This is a very typical application of the probabilistic method.) On the other hand, since X can contain roughly disjoint convex bodies, each of which contains at least of the points in X, we see that no -net can have size much smaller than .
Now consider the situation in which X is now an arbitrary finite set, rather than a discrete cube. More precisely, let be the least number such that every finite set X possesses at least one weak -net for X with respect to convex bodies of cardinality at most . (One can also replace the finite set X with an arbitrary probability measure; the two formulations are equivalent.) Informally, f is the least number of “guards” one needs to place to prevent a convex body from covering more than of any given territory.
- Problem 1: For fixed d, what is the correct rate of growth of f as ?
It is already non-trivial (and somewhat surprising) that is even finite. This fact was first shown by Alon, Bárány, Füredi, and Kleitman (the planar case was achieved earlier by Bárány, Füredi, and Lovász), who established a bound of the form ; this was later improved to by Chazelle et al. In the other direction, the best lower bound is for some positive c(d); it was shown by Matousek that c(d) grows at least as fast as for some absolute constant c. Furthermore, many of the upper bound results come with actual (probabilistic) algorithms to locate these nets. Some better upper bounds are known when d=2 (where hyperbolic geometry can be used in a beautiful way) or when special assumptions are made on the set X (here, the bounds can be rather subtle, for instance inverse Ackermann functions show up in some cases!).
The finiteness of was an important ingredient in a paper of Alon and Kleitman which proved a Helly-type conjecture of Debrunner and Hadwiger. Recall that Helly’s theorem asserts that if a finite collection of convex bodies in has the property that any d+1 of them have non-empty intersection, then the whole family has non-empty intersection. Helly’s theorem already easily implies that for . Alon and Kleitman’s theorem asserts that for every integers and , , there is a function such that the following is true. Every finite family of convex sets in with the property that among every sets in the family there are with non empty intersection can be divided to at most intersecting subfamilies. (Helly’s theorem thus asserts that .)
If we replaced weak -nets by strong -nets then the situation is well understood; the analogue of would be finite if and only if the collection of convex bodies had finite Vapnik-Chervonenkis (or VC) dimension. However, it can be shown that the VC dimension of convex bodies is infinite, and so strong -nets can be required to be arbitrarily large for fixed . One can however use VC theory to show that, for instance, one can find a strong -net for any set with respect to, say, convex pentagons in the plane, whose size depends only on .)
One can talk about -nets in the more abstract setting of hypergraphs (a.k.a. set systems). Indeed, if is a hypergraph (thus A is a set of vertices, and is a collection of subsets (or “edges”) of A), and is any probability measure on A, we define a weak -net of to be a set which meets every edge of the hypergraph which has -measure greater than . We then define to be the least number such that every probability measure has a weak -net of cardinality at most .
- Problem 2: Find conditions on the hypergraph H which ensure that is finite. [It is already known that if H has a finite VC dimension D, then is finite, and in fact is bounded by ; the more interesting case is when H has infinite VC dimension.]
- Problem 3: Is it possible for to be finite but to be significantly larger than ? More precisely, does there exist a hypergraph H and such that for all sufficiently small ? It seems very unlikely that this is not the case, but no example of growth faster than has been constructed.
An improved understanding of both strong and weak -nets will help shed light on the integrality gap between vertex covering numbers (a.k.a. transversal numbers) and fractional vertex covering numbers (a.ka. fractional transversal numbers) of hypergraphs. Roughly speaking, the gap is the phenomenon that it can be more expensive to cover some territory if the amount of resources you can put into any one location is quantized to be an integer. For instance, suppose you want to cover the pentagon graph (i.e. the cycle of length 5) by vertices so that each edge is covered by at least one vertex; such a set of vertices is known as a vertex cover or transversal. Then one clearly needs at least three vertices to accomplish this. If however one is allowed to assign fractional weights to each vertex, then it becomes more efficient to give each vertex a weight of , then again each edge is covered by a total weight of 1 vertex, but now one has only used vertices instead of 3. These problems are of course special cases of integer programming, which is a much harder problem in general than linear programming (indeed, it can be NP-hard). For further discussion of these issues see this paper of Alon et al.
A good reference to much of the above work is Matousek’s book.
19 comments
Comments feed for this article
25 April, 2007 at 1:26 am
David Corfield
Have weak -nets found any use in computational learning theory?
25 April, 2007 at 5:57 am
Gil Kalai
Dear David, I do not know of such a use and will be happy to learn.
25 April, 2007 at 8:47 am
Gil Kalai
Dear David,
If the VC dimension of a family is finite, so strong -nets exist, it is possible to PAC-learn for every distribution on the elements of the ground set.
Let me demonstrate it with the set of planar pentagons: For every probability distribution for points in the plane, and a pentagon . Suppose you see sufficiently many examples ( where a point is drawn according to the probability distribution then with high probability you will be able to guess the answers for further questions with high level of success.
If you replace pentagons by arbitrary convex sets there is no hope for something so strong. If your distribution is concentrated on unit circle, every sequence of questions and answers represents some convex set.
We are left with a hope that maybe PAC learnability is possible when we restrict the class of probability distributions, or perhaps (more realistically) let the notion of error depends on how well the set is “supported” by . It would be nice to explore it. Good question!
Apropos learnability. I am not aware of a good criterion for the weaker notion of “PAC-testability”. Namely after a long list of questions as above to be able to tell with low chance of failure only if the queries represent some pentagon. (Or, in the general case some edge in your hypergraph.) PAC-learnability implies PAC testability. (So with pentagons we are OK.) But I am not aware of a more general condition that guarantee PAC-testability.
25 April, 2007 at 11:07 am
David Corfield
Dear Gil,
As you probably know, the direction Vapnik moved in was to Support Vector Machines, maximum-margin hyperplane classifiers. Although a VC-dimension could be found for the class of classifiers exceeding a certain margin, see Theorem 1 here, a PAC result couldn’t be given since the result was data-dependent.
However, Shawe-Taylor et al. employed the fat-shattering dimension rather than the VC-dimension to yield a PAC result for SVMs.
It would be interesting to know whther -nets had anything to say to this result.
25 April, 2007 at 10:12 pm
Gil Kalai
Dear David, I think the first interesting special case regarding the connection of weak -net and learnability is this:
In what sense (if any) can we statistically learn (or statistically test) planar convex sets.
26 April, 2007 at 11:23 am
osias
Why I am seeing “formula does not parse” everywhere?
25 June, 2007 at 6:00 am
Gil Kalai
Here, (divided in two parts,) is how the Hadwiger-Debrunner conjecture is proved and the role of the weak -net theorem in the proof. The original paper of Alon and Kleitman is a very good source, (and looking at original papers may qualify as a good general advice,) but the proof is so nice that I cannot resist the temptation trying to tell you about it. I will also mention some results of similar nature including the recent startling result by J. Fox, J. Pach, and Cs. D. Toth .
1. The -theorem.
Theorem (Alon and Kleitman): For every integers and , , there is a function such that the following is true:
Every finite family of convex sets in with the property that among every sets in the family there are with non empty intersection can be divided to at most intersecting subfamilies.
Helly’s theorem thus asserts that .
Let us look a little bit at this theorem. First consider the case of dimension one. Here, the theorem is valid for . This is a theorem about graphs. Suppose that . (In every dimension
the cases where are the “hardest,” since $M$ is monotone non increasing in the variable .)
2. Interval graphs and perfect graphs.
Given a collection of intervals in the line their intersection graph is a graph whose vertices correspond to the intervals, and where two vertices are adjacent if the corresponding intervals intersect. (Such a graph is called an interval graph.) The condition of the theorem asserts that the graph contains no independent set of vertices. The conclusion is that the graph can be covered by complete graphs. If we consider the complement graph , the condition asserts that the graph contains no complete subgraph of size and the conclusion is that it can be colored by colors.
A complete subgraph of size is an obstacle for coloring a graph in colors. But is it the only obstacle? Heavens, no!! There are graphs without triangles which require arbitrary large number of colors to be colored. What make the miracle here is that interval graph are perfect graphs. (Perfect graphs are defined precisely by the property that for every induced subgraph the chromatic number is precisely the maximum size of a complete subgraph.)
Graphs and hypergraphs arising in geometry behaves well! (in the same sense that the behavior of perfect graph is perfect.)
3. Proof of the Alon Kleitman -theorem.
We will consider the case: , and . Essentially this is the proof of the general case and talking about the (4,3) case in the plane is just in order to make the presentation as concrete as possible.
We remark that Hadwiger and Debrunner already proved (as a special case of a general result) that . (A sort of “perfect behavior” in the plane for .).
Here is the proof in five easy steps:
Step a: We have convex sets and out of every 4 there are 3 with non empty intersection. Therefore, at least a fraction of of all 3-tuples of sets have non empty intersection. This is easy. (But we can wonder if we are not giving up too much information.)
Step b: If a fraction of of 3-tuples of sets have non empty intersection then a fraction of of the sets has a point in common!
This is a result by Katchalski and Liu and it is called a “fractional Helly’s theorem”. The sharp version reads .
In the case of interest of us the fractional Helly theorem tells us that starting from our family of sets we can find a subfamily consisting of about 1/10 of the original sets with non empty intersection. This is truly a reason for hope and you can imagine that now we will be able to split the original family to 10 or maybe 20 intersecting families.
But when you see what happens when you delete this large intersecting family and continue you realize that without more ideas you get only a weaker result that the original family can be covered by $\latex O( \log n)$ intersecting families. This is not so good because we wanted the number of subfamilies not be a function of the size of the original family.
Step c: A weighted version of step b.
We saw that our family contains a large intersecting subfamily. Suppose you give every set a nonnegative weight and you want to find a point in the plane belonging to sets carrying 0.1 or 0.01 of the total weight. (Or, in other words, we want to find anintersecting subfamily which carries 0.1 or 0.01 of the total weight of all sets? )
Note that this weighted version is a consequence of the theorem we want to prove. (If we can find, say, 100 points in the plane meeting all the sets in the family, one of these points must meet sets carrying 1/100 of the total weight. So it can be a good idea to try to prove this weighted version along the way. (If all weights are the same we are back to step a) )
Alon and Kleitman observed that such a weighted version can be verified by “multiplying” some of the sets in the family and using again the Katchalski-Liu fractional version of Helly’s theorem.
We are almost ready to use weak -nets!
21 July, 2007 at 10:46 am
gilkalai
So, where were we? We are describing the proof of Alon and Kleitman’s -theorem. After steps a, b and c, we are almost ready to apply the weak -net result itself!
step d: Linear programming duality.
Let be a hypergraph with vertices. Suppose we want to cover the vertices by edges, namely to find a set of edges so that every vertex belongs to one or more of them.
In our case, we have a family of convex sets in the plane with the (4,3) property. We consider the following hypergraph:
The vertices in the hypergraph correspond to the convex sets in the family, and the edges correspond to intersecting subfamilies of convex sets.
Consider the following two linear programs:
(OPT1)
Assign nonnegative weights to vertices in the hypergraph so that
(1) the sum of weights of vertices in every edge is at most one.
(2) the sum of all weights is maximum.
(OPT2) (FRACTIONAL COVERING THE VERTICES BY EDGES):
Assign nonnegative weights for edges in the hypergraph so that:
(1) The sum of weights of edges containing any vertex is at least one.
(2) The sum of all weights is minimum.
Linear programming duality asserts that !
Moving to the dual linear programming problem, or juggling between the primal and dual problems is an extremely useful thing to do in combinatorics and I do not really understand why. It is also quite a bit confusing (for me — but I can always consult my colleague Nati Linial). It is probably useful to become fluent in moving to a dual linear programming problem and at the same time not being too blasé about it and not forgetting that this can be very helpful.
The problem we really want to solve is an integer programming problem:
(OPT2) (COVERING THE VERTICES BY EDGES):
Assign 0-1 weights for edges in the hypergraph so that:
(1) The sum of weights of edges containing any vertex is at least one.
(2) The sum of all weights is minimum.
The quantity we would like to find is . We need to pass the “integrality gap “from to ?
Step e: Use the weak net theorem.
The weak -net theorem is precisely what allows us go from the fractional covering problem to the covering problem.
Another formulation of the weak -net theorem (in the plane) is: There is a function , so for every hypergraph arising from a family of convex sets in the plane: .
We are done with the description of the proof and continue with little further remarks.
4. Other theorem of similar nature showing how graphs and hypergraphs arising in geometry are very special.
There are several cases where the Alon-Kleitman method can be followed to obtain other theorems. There were quite a few cases where theorems of a similar spirit were obtained by various other methods.
I will mention three theorems:
Theorem (D.Larman, J. Matousek, J. Pach, J. Torocsik): For a family of planar convex sets either there is a subfamily of sets which are pairwise intersecting or there is a subfamily of sets which are pairwise disjoint.
(D. Larman, J. Matousek, J. Pach, J. Torocsik, A Ramsey-type result for convex sets. Bull. London Math. Soc. 26 (1994), no. 2, 132–136.)
A “-theorem” which also directly follows from this paper is:
For a family of planar convex sets either
there is a family of which are pairwise disjoint or
there is a family of which are pairwise intersecting.
(It is not known if it is possible to replace “pairwise disjoint” with “pairwise intersecting” in this last theorem. Fox and Pach have some results in this direction.)
And (finally) the following beautiful theorem:
Theorem (J. Fox, J. Pach and Cs. D. Toth):
Every family of plane convex sets contains two subfamilies of size such that:
either each element of the first intersects every element in the second, or
no element in the first intersects any element of the other.
5. An example from a different area where weak -nets (or, equivalently, control of the integrability gap between covering numbers and fractional covering numbers) can be useful.
J. Bourgain and L. Tzafriri considered the following scenario: Let be a real number. Let be a matrix with norm 1 and with zeroes on the diagonal. An by principal minor is “good” if the norm of is less than .
Consider the following hypergraph :
The vertices correspond to indices . A set belongs to if the sub-matrix of is good.
Bourgain and Tzafriri showed that for every there is so that for every matrix we can find so that .
Moreover, they showed that for every nonnegative weights there is so that the sum of the weights in is at least times the total weight. In other words, (by LP duality, just as we saw above) the vertices of the hypergraph can be fractionally covered by $C(a)$ edges.
The “big question” is if there a real number so that for every matrix can be covered by good sets. Or, in other words, if the vertices of can be covered by edges.
This question is known to be equivalent to an old conjecture by Kadison and Singer.
In view of what was already proved by Bourgain and Tzafriri what is needed is to show that the covering number is bounded from above by a function of the fractional covering number or, in other words, a certain weak -net property.
Probably the main difficulty in the problem is analytical and not combinatorial. But general conditions that guarantee weak -net, can suggest what analytical properties one should look for. And vice versa – from a solution to this problem we may be able to extract such conditions which can apply to other problems.
6. Finding more examples of graphs and hypergraphs where the covering number is not bounded as a function of fractional covering number is also very interesting.
28 July, 2007 at 10:39 am
Gil Kalai
The theorem that follows from Larman’s et als. paper is this
(I only formulated a weaker version):
For a family of planar convex sets either
A) there is a family of which are pairwise disjoint or
or
B) the entire family can be divided to subfamilies whose members are pairwise intersecting.
29 April, 2008 at 1:46 am
Combinatorics, mathematics, academics, polemics, … « Combinatorics and more
[…] One was about the weak epsilon net problem. […]
8 December, 2008 at 5:56 am
Gil Kalai
Boris Buck, Jiri Matousek, and Gabriel Nivasch proved recently that . !!!
8 June, 2009 at 12:50 am
Gil Kalai
The link to Buck, Matousek, and Nivasch’s paper is
http://front.math.ucdavis.edu/0812.5039
3 May, 2011 at 12:01 pm
Sergiu
Is there a list of open problems for (weak) epsilon-nets?
2 July, 2014 at 10:10 am
Gil Kalai
Here are links to a lecture by Noga Alon: 48 minutes on strong and weak epsilon nets: https://www.youtube.com/watch?v=LyLC-Jbd2Do (Part I) and https://www.youtube.com/watch?v=_rF7qQFPX1A (Part II)
25 November, 2017 at 7:45 am
Basic Notions Seminar is Back! Helly Type Theorems and the Cascade Conjecture | Combinatorics and more
[…] the existence of weak epsilon nets and the basic question about finding bounds for their size. My first ever blog post over Terry Tao’s blog was about weak epsilon nets and since then there were exciting […]
4 April, 2018 at 11:03 pm
Nathan Rubin Improved the Bound for Planar Weak ε-Nets and Other News From Ein-Gedi | Combinatorics and more
[…] himself announced a great result with new upper bounds for planar weak ε-nets. In 2007 I wrote a guest post (my first ever blog post) on the topic on Terry Tao’s blog. The next section is taken from that old post with reference […]
17 August, 2018 at 1:50 am
An Improved Bound for Weak Epsilon-Nets in the Plane, by Nathan Rubin, is on the arXive | Combinatorics and more
[…] The paper An Improved Bound for Weak Epsilon-Nets in the Plane, by Nathan Rubin is now on the arxive. we mentioned the result in this post, and the problem in this post (my very first) over Tao’s blog What’s New. […]
17 August, 2018 at 3:08 am
Gil Kalai
Here is a link to a paper of Nathan Rubin with a major improvement of the upper bound for weak epsilon-nets in the plane, from to for every . https://arxiv.org/abs/1808.02686 .
16 February, 2019 at 8:06 am
Attila Por’s Universality Result for Tverberg Partitions | Combinatorics and more
[…] and Nivasch (Here are links to the paper, and to slides from a talk by Boris Bukh. See also this post and this one of the weak epsilon net problem.) Roughly the idea is this: consider a stretched grid […]