I’ve just uploaded to the arXiv my joint paper with Tim Austin, “On the testability and repair of hereditary hypergraph properties“, which has been submitted to Random Structures and Algorithms. In this paper we prove some positive and negative results for the testability (and the local repairability) of various properties of directed or undirected graphs and hypergraphs, which can be either monochromatic or multicoloured.

The negative results have already been discussed in a previous posting of mine, so today I will focus on the positive results. The property testing results here are finitary results, but it turns out to be rather convenient to use a certain correspondence principle (the hypergraph version of the Furstenberg correspondence principle) to convert the question into one about exchangeable probability measures on spaces of hypergraphs (i.e. on random hypergraphs whose probability distribution is invariant under exchange of vertices). Such objects are also closely related to the”graphons” and “hypergraphons” that emerge as graph limits, as studied by Lovasz-Szegedy, Elek-Szegedy, and others. Somewhat amusingly, once one does so, it then becomes convenient to keep track of objects indexed by vertex sets and how they are exchanged via the language of category theory, and in particular using the concept of a natural transformation to describe such objects as exchangeable measures, graph colourings, and local modification rules. I will try to sketch out some of these connections, after describing the main positive results.

— Statement of results —

Property testing in graph and hypergraph theory has primarily focused on undirected graphs $(V, E)$, where $E \subset \binom{V}{2} := \{ e \subset V: |e| = 2 \}$ is a set of undirected edges of a finite vertex set V. In this paper, we consider some more general objects than undirected graphs, in increasing order of generality:

1. Undirected hypergraphs $(V,E)$ of some order k, where $E \subset \binom{V}{k} := \{ e \subset V: |e| = k \}$ is a collection of unordered edges on a vertex set V of order k;
2. Directed hypergraphs $(V,E)$ of some order k, where $E \subset \hbox{Inj}([k],V)$ is a collection of injective maps from the k-element set ${}[k] := \{1,\ldots,k\}$ to V (which should be viewed as directed edges of order k, or k-edges for short);
3. Coloured directed hypergraphs $(V,G_k)$ of some order k, in which $G_k: \hbox{Inj}([k],V) \to K_k$ is a colouring of all k-edges by some finite palette $K_k$;
4. Multiple coloured directed hypergraphs $(V,G_0,\ldots,G_k)$ of order k, in which $G_j: \hbox{Inj}([j],V) \to K_j$ is a colouring of j-edges by some component $K_j$ of a palette $K = (K_0,\ldots,K_k)$. We’ll call such hypergraphs K-hypergraphs for short. [It is natural, in view of the hypergraph regularity lemma, to consider multiple orders of hypergraphs at once, even if initially one is only interested in hypergraphs of a certain fixed order. For instance, one might be interested in graphs (which are hypergraphs of order 2), but the regularity lemma would then require one to partition the vertices into classes (thus creating a coloured hypergraph of order 1).]

(We also consider a continuous analogue of these concepts, in which the vertex set is a probability space rather than a discrete set, and loops are allowed but for simplicity we will not discuss this generalisation here.)

We now fix a palette K. For any vertex set V, let $K^{(V)}$ be the collection of all K-hypergraphs on V. Observe that every injective map $\phi: V \to W$ induces a pullback map $K^{(\phi)}: K^{(W)} \to K^{(V)}$, which is contravariant in the sense that $K^{(\phi \circ \psi)} = K^{(\psi)} \circ K^{(\phi)}$ for any pair of injections $\psi: U \to V$ and $\phi: V \to W$. Thus one can view K as a contravariant functor from a category of vertex sets to a category of spaces (which we will later identify as a category of sub-Cantor spaces).

We’ll be considering various properties of K-hypergraphs, or K-properties for short. Formally, a K-property ${\mathcal P}$ is an assignment $V \mapsto {\mathcal P}^{(V)}$ of a collection ${\mathcal P}^{(V)} \subset K^{(V)}$ of K-hypergraphs for each vertex set V; one should view ${\mathcal P}^{(V)}$ as the collection of K-hypergraphs on V which satisfy the property ${\mathcal P}$. We will restrict attention to those properties which are hereditary, which means that they are preserved under pullbacks; thus any given pullback map $K^{(\phi)}: K^{(W)} \to K^{(V)}$ restricts to a map ${\mathcal P}^{(\phi)}: {\mathcal P}^{(W)} \to {\mathcal P}^{(V)}$. Thus hereditary properties can also be viewed as contravariant functors.

[Traditionally, hereditary properties are defined as those properties of (hyper)graphs which are preserved under (hyper)graph isomorphism as well as (hyper)graph restriction, but it is not hard to see that the above definition is equivalent.]

For graphs, examples of hereditary properties include that of being complete, of being bipartite, of being planar, of being four-colourable, of being triangle-free, etc.

It is convenient to extend a K-property ${\mathcal P}$ from finite hypergraphs to infinite ones by declaring an infinite hypergraph to obey ${\mathcal P}$ if and only if all of its finite induced subhypergraphs obey ${\mathcal P}$. With this convention, ${\mathcal P}^{(V)}$ becomes a closed subspace of $K^{(V)}$ (which we give the product topology), and in particular is compact.

We say that a K-property ${\mathcal P}$ is locally testable with one-sided error (or testable for short) if one can probabilistically test for this property on large (hyper)graphs by verifying the property on a small random portions of the (hyper)graph. More formally, we say that ${\mathcal P}$ is testable if for every $\varepsilon > 0$ there exists an N and $\delta > 0$ such that if $G \in K^{(V)}$ is a K-hypergraph on a large vertex set V (with $|V| > N$) which N-locally obeys ${\mathcal P}$ to accuracy $1-\delta$ in the sense that the restriction of G to a randomly chosen N-element subset obeys ${\mathcal P}$ with probability $1-\delta$, then G itself is $\varepsilon$-close to obeying ${\mathcal P}$ in the sense that there is a K-hypergraph $G' \in {\mathcal P}^{(V)}$ such that G and G’ agree on a randomly chosen k-element subset of V with probability at least $1-\varepsilon$. [This notion of testability has one-sided error because there are no false negatives: if a local portion of G disobeys ${\mathcal P}$, then G itself certainly disobeys ${\mathcal P}$, since ${\mathcal P}$ is hereditary. See this web page of Goldreich for more discussion of property testing in general.]

Our first main positive result is:

Theorem 1. Every hereditary K-property is testable.

This result was previously shown by Alon and Shapira for undirected monochromatic graph properties, by Rodl and Schacht for undirected monochromatic hypergraph properties, and by Ishigami for partite hypergraph properties. There is also a substantial body of earlier partial results on property testing of graph and hypergraph properties, and properties of other combinatorial objects, which are too numerous to be listed here. Remarkably, this theorem can be used to prove some results in additive combinatorics, most notably Szemerédi’s theorem on arithmetic progressions; see for instance my Montreal notes for further discussion.

Our other positive results go beyond testability to prove a stronger property, which we have dubbed “local repairability”. This notion, analogous to that of local correctability in the theory of error-correcting codes, takes a bit of notation to define properly. Define a modification rule to be a pair $T = (A,\overline{T})$, where A is a finite set of vertices, and for each vertex set V, $\overline{T}^{(V)}: K^{(A \uplus V)} \to K^{(V)}$ is a map which converts K-hypergraphs on the disjoint union $A \uplus V$ to K-hypergraphs on V; one should view the vertex set A as supplying some “test data” in which to determine how to modify the rest of the hypergraph. We say that the modification rule is local if the diagram

$\begin{array}{ccc} K^{(A \uplus W)} & \stackrel{K^{(id_A \uplus \phi)}}{\longrightarrow} & K^{(A \uplus V)} \\ \downarrow^{\overline{T}^{(W)}} & & \downarrow^{\overline{T}^{(V)}} \\ K^{(W)} & \stackrel{K^{(\phi)}}{\longrightarrow} & K^{(V)} \end{array}$

commutes for all injections $\phi: V \to W$. (Apologies for the clunky diagram; I have had some difficulties with doing advanced LaTeX in wordpress.) Informally, a local modification rule will modify each edge based only on the connectivity of the vertices of that edge with A, and will not care about the precise labels assigned to those vertices. We say that a local modification rule entails a K-property ${\mathcal P}$ if we have $\overline{T}^{(V)}: K^{(A \uplus V)} \subset {\mathcal P}^{(V)}$ for all vertex sets V.

One can view a local modification rule T as defining a natural transformation $\overline{T}$ from the contravariant functor $K^{\uplus A}$ (which assigns the space $K^{(A \uplus V)}$ to each vertex set V) to the contravariant functor K. (We found it convenient to use the concept of a natural transformation in our paper, and in particular the standard fact that the composition of natural transformations is again a natural transformation, as it allowed us to easily localise or permute the vertex sets one was working on.)

Perhaps an example will illustrate what a “local modification rule” is. Suppose one has a graph G which is nearly complete bipartite, in the sense that one can partition the vertex set V into two vertex sets W and W’, such that at least 99% (say) of the edges between W and W’ lie in G, and at most 1% of the edges within W or within W’ lie in G. We would like to locally modify this graph to one which really is complete bipartite. One way to do this is to select a special vertex a inside V at random. One can then create a modified graph $T_a(G)$ on $V_a := V \backslash \{a\}$, defined by partitioning $V_a = W_a \cup W'_a$ into the vertices $W_a$ connected to a, and the vertices $W'_a$ not connected to a, and then forming the complete bipartite graph between $W_a$ and $W'_a$. One can view $T_a(G)$ as the action of a modification rule $(\{1\},T)$ on a pullback $K^{(\phi)}$ of G, where $\phi: \{1\} \uplus V_a \to V$ is the map that sends 1 to a and is the identity on $V_a$. The rule T entails the property of being complete bipartite. Also, observe that if a is chosen randomly, then $T_a(G)$ is close to G in the sense that any edge in G has at least a 95% chance (say) of being an edge in $T_a(G)$ (if V is large enough), and similarly any non-edge in G has at least a 95% chance of being a non-edge of $T_a(G)$.

We say that a K-property ${\mathcal P}$ is locally repairable if for every $\varepsilon > 0$ there exists $\delta > 0$, an N, and finite set A such that whenever $G \in K^{(V)}$ is a sufficiently large K-hypergraph which N-locally obeys ${\mathcal P}$ to accuracy $1-\delta$, then there exists local modification rule $T = (A, \overline{T})$ such that if $\phi: A \to V$ is a random injection, then the colour of any given edge in G will match the same edge in $T_\phi(G) := T^{(V \backslash \phi(A))}( K^{(\phi \cup \hbox{id}_{V \backslash \phi(A)})}(G) )$ with probability at least $1-\varepsilon$, where $\phi \cup \hbox{id}_{V \backslash \phi(A)}: A \uplus (V \backslash \phi(A)) \to V$ is the bijection that equals $\phi$ on A and the identity on $V \backslash \phi(A)$.

The above definition was quite convoluted, but informally, if a property ${\mathcal P}$ is locally repairable, then it is possible to probabilistically “repair” with high probability any large hypergraph G on V which “almost” obeys ${\mathcal P}$ by first selecting a bounded number of random test vertices $\phi(A)$, and modifying the colour of each edge outside of $\phi(A)$ by looking only at how the vertices of that edge connect with $\phi(A)$.

Local repairability is stronger than testability. Testability asserts that hypergraphs which nearly obey ${\mathcal P}$ can be modified to actually obey ${\mathcal P}$, but do not specify how to locate this modification; local repairability shows that such a modification can be built using a local modification rule. [Another indication of how repairability is stronger than testability is that we can use the repairability of various properties to deduce Ramsey's theorem, which does not seem to be the case for testability.]

With all this notation, we can state our next three theorems:

Theorem 2. Every undirected graph property is locally repairable.

Theorem 3. Every monotone hypergraph property is locally repairable.

Theorem 4. Every partite hypergraph property is locally repairable.

(There are precise definitions of what it means for a property to be undirected, monotone, or partite, which are given in our paper; but I will not give them here as they are not particularly illuminating.) These hypotheses are close to being sharp; we have counterexamples to local repairability for directed graph properties or undirected hypergraph properties once all assumptions of monotonicity or partiteness are dropped, as discussed in my previous post. I should remark that Theorem 4 is also implicitly present in a paper of Ishigami.

— The correspondence principle —

Our proofs of the above theorems use the correspondence principle between statements concerning finite hypergraphs, and statements concerning exchangeable measures on spaces of infinite hypergraphs (or equivalently, concerning infinite exchangeable random hypergraphs).

The basic way the correspondence principle is invoked is via the following “compactness and contradiction” argument, which also appears in many other places in mathematics (e.g. in PDE):

1. Suppose we are trying to prove some asymptotic statement about finite hypergraphs. We assume for contradiction that the claim is false, thus obtaining a sequence of hypergraphs $G_n \in K^{(V_n)}$ which form increasingly strong counterexamples to that statement. (Typically, the size of these graphs will go to infinity as n goes to infinity.)
2. We sample randomly from each of the finite graphs $G_n \in K^{(V_n)}$ to create a sequence of infinite random graphs $\tilde G_n \in K^{({\Bbb Z})}$, by randomly mapping ${\Bbb Z}$ into $V_n$ and pulling back the graph. (There will be some collisions when multiple integers map to the same vertex, but as the original graph gets larger, these collisions get rarer, so it will actually not make much difference what we do in the case of collisions.) One can view each infinite random graph $\tilde G_n$ as describing a Borel probability measure $\mu_n$ on the Cantor space $K^{({\Bbb Z})}$. This measure is exchangeable, in the sense that it is unchanged if we permute the underlying set of vertices ${\Bbb Z}$. Informally, this measure describes all the “local” or “statistical” information one can observe about a given hypergraph.
3. We then exploit the fact that the space of Borel probability measures on a given (standard Borel) space is vaguely sequentially compact, so that (after passing to a subsequence) we can extract a vague limit $\mu$ of the $\mu_n$. Since the $\mu_n$ are the laws of exchangeable random hypergraphs which are increasingly strong counterexamples to the initial statement one is trying to prove, the infinite random hypergraph corresponding to $\mu$ should be a “perfect” counterexample to this statement.
4. We then work in the setting of infinite exchangeable random hypergraphs to show that no such perfect counterexamples in fact exist.

Actually, in order to use the concept of a natural transformation effectively, it is convenient not to work with a single probability measure $\mu$ on a single space of hypergraphs, but instead to work with a separate probability measure $\mu^{(V)}$ on $K^{(V)}$ for each vertex set V, which is consistent with the pullback maps $K^{(\phi)}$. One can view $\mu$ as a natural transformation from a trivial functor $V \mapsto \hbox{pt}$ to the functor $V \mapsto K^{(V)}$, in which we allow morphisms to be probability kernels rather than continuous maps. (A probability kernel $P: X \rightsquigarrow Y$ between two standard Borel spaces is a map $x \mapsto P(x)$ which assigns a Borel probability measure P(x) on Y to every point in x in a weakly measurable manner, thus the map $x \mapsto P(x)(E)$ is measurable for every Borel set E in Y. One can think of a probability kernel as describing a random function from X to Y.)

By using the above correspondence principle, combined with a structural theorem of Austin for exchangeable random hypergraphs (which is the infinitary analogue of the hypergraph regularity lemma, and also generalises a classical theorem of de Finetti on exchangeable measures; it is closely related to similar results of Aldous and of Kallenberg) one can reformulate the concepts of testability and local repairability of a given K-property ${\mathcal P}$ in terms of natural transformations on sub-Cantor spaces (closed subspaces of Cantor spaces), where we allow morphisms on these spaces to either be deterministic (functions, possibly continuous), or probabilistic (i.e. probability kernels). The precise formulation is technical (requiring even more notation than just presented, in particular the concept of a regularised exchangeable recipe $V \mapsto \mu^{(V)}$, which is the infinitary analogue of a regularised graph or hypergraph), but roughly speaking, a property ${\mathcal P}$ is locally repairable if every regularised exchangeable recipe which almost surely obeys ${\mathcal P}$ can be modified by a deterministically continuous natural transformation to surely obey ${\mathcal P}$ while only modifying a small number of edges on average, whereas a${\mathcal P}$ is testable if every regularised exchangeable recipe with almost surely obeys ${\mathcal P}$ can be modified by a probabilistically continuous natural transformation (i.e. the morphisms are probability kernels which are continuous in the weak topology) whose output almost surely obeys ${\mathcal P}$ for each input, and which also only modifying a small number of edges on average.