The spectral proof of the Szemeredi regularity lemma

3 December, 2012 in expository, math.CO, math.SP | Tags: eigenvalues, eigenvectors, regularity lemma, szemeredi regularity lemma | by Terence Tao

Perhaps the most important structural result about general large dense graphs is the Szemerédi regularity lemma. Here is a standard formulation of that lemma:

Lemma 1 (Szemerédi regularity lemma) Let ${G = (V,E)}$ be a graph on ${n}$ vertices, and let ${\epsilon > 0}$ . Then there exists a partition ${V = V_1 \cup \ldots \cup V_M}$ for some ${M \leq M(\epsilon)}$ with the property that for all but at most ${\epsilon M^2}$ of the pairs ${1 \leq i \leq j \leq M}$ , the pair ${V_i, V_j}$ is ${\epsilon}$ -regular in the sense that

$\displaystyle | d( A, B ) - d( V_i, V_j ) | \leq \epsilon$

whenever ${A \subset V_i, B \subset V_j}$ are such that ${|A| \geq \epsilon |V_i|}$ and ${|B| \geq \epsilon |V_j|}$ , and ${d(A,B) := |\{ (a,b) \in A \times B: \{a,b\} \in E \}|/|A| |B|}$ is the edge density between ${A}$ and ${B}$ . Furthermore, the partition is equitable in the sense that ${||V_i| - |V_j|| \leq 1}$ for all ${1 \leq i \leq j \leq M}$ .
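To make the definitions concrete, here is a minimal Python sketch, assuming the graph is given as a symmetric 0/1 NumPy adjacency matrix. The names `edge_density` and `regularity_defect` are illustrative and do not come from the post. The second function only samples random subsets, so it can give a lower bound for the worst deviation but can never certify that a pair is ${\epsilon}$ -regular.

```python
import numpy as np

def edge_density(adj, A, B):
    """d(A, B): number of (a, b) in A x B with {a, b} an edge, divided by |A||B|."""
    A = np.asarray(A)
    B = np.asarray(B)
    return adj[np.ix_(A, B)].sum() / (len(A) * len(B))

def regularity_defect(adj, Vi, Vj, eps, trials=200, seed=0):
    """Largest |d(A,B) - d(Vi,Vj)| seen over random A in Vi, B in Vj
    with |A| >= eps|Vi| and |B| >= eps|Vj| (a heuristic lower bound only)."""
    rng = np.random.default_rng(seed)
    Vi = np.asarray(Vi)
    Vj = np.asarray(Vj)
    base = edge_density(adj, Vi, Vj)
    min_a = max(1, int(np.ceil(eps * len(Vi))))
    min_b = max(1, int(np.ceil(eps * len(Vj))))
    worst = 0.0
    for _ in range(trials):
        # sample admissible sizes, then uniformly random subsets of those sizes
        a = rng.integers(min_a, len(Vi) + 1)
        b = rng.integers(min_b, len(Vj) + 1)
        A = rng.choice(Vi, size=a, replace=False)
        B = rng.choice(Vj, size=b, replace=False)
        worst = max(worst, abs(edge_density(adj, A, B) - base))
    return worst
```

If `regularity_defect` returns a value above ${\epsilon}$ , the pair is certainly not ${\epsilon}$ -regular; a small value is merely consistent with regularity.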

There are many proofs of this lemma, which is actually not that difficult to establish; see for instance these previous blog posts for some examples. In this post I would like to record one further proof, based on the spectral decomposition of the adjacency matrix of ${G}$ , which is essentially due to Frieze and Kannan. (Strictly speaking, Frieze and Kannan used a variant of this argument to establish a weaker form of the regularity lemma, but it is not difficult to modify the Frieze-Kannan argument to obtain the usual form of the regularity lemma instead. Some closely related spectral regularity lemmas were also developed by Szegedy.) I found recently (while speaking at the Abel conference in honour of this year’s laureate, Endre Szemerédi) that this particular argument is not as widely known among graph theory experts as I had thought, so I thought I would record it here.

For reasons of exposition, it is convenient to first establish a slightly weaker form of the lemma, in which one drops the hypothesis of equitability (but then has to weight the cells ${V_i}$ by their magnitude when counting bad pairs):

Lemma 2 (Szemerédi regularity lemma, weakened variant) Let ${G = (V,E)}$ be a graph on ${n}$ vertices, and let ${\epsilon > 0}$ . Then there exists a partition ${V = V_1 \cup \ldots \cup V_M}$ for some ${M \leq M(\epsilon)}$ with the property that for all pairs ${(i,j) \in \{1,\ldots,M\}^2}$ outside of an exceptional set ${\Sigma}$ , one has

$\displaystyle | E(A,B) - d_{ij} |A| |B| | \ll \epsilon |V_i| |V_j| \ \ \ \ \ (1)$

whenever ${A \subset V_i, B \subset V_j}$ , for some real number ${d_{ij}}$ , where ${E(A,B) := |\{ (a,b) \in A \times B: \{a,b\} \in E \}|}$ is the number of edges between ${A}$ and ${B}$ . Furthermore, we have

$\displaystyle \sum_{(i,j) \in \Sigma} |V_i| |V_j| \ll \epsilon |V|^2. \ \ \ \ \ (2)$

Let us now prove Lemma 2. We enumerate ${V}$ (after relabeling) as ${V = \{1,\ldots,n\}}$ . The adjacency matrix ${T}$ of the graph ${G}$ is then a self-adjoint ${n \times n}$ matrix, and thus admits an eigenvalue decomposition

$\displaystyle T = \sum_{i=1}^n \lambda_i u_i^* u_i$

for some orthonormal basis ${u_1,\ldots,u_n}$ of ${{\bf C}^n}$ and some eigenvalues ${\lambda_1,\ldots,\lambda_n \in {\bf R}}$ , which we arrange in decreasing order of magnitude:

$\displaystyle |\lambda_1| \geq \ldots \geq |\lambda_n|.$

We can compute the trace of ${T^2}$ as

$\displaystyle \hbox{tr}(T^2) = \sum_{i=1}^n |\lambda_i|^2.$

But we also have ${\hbox{tr}(T^2) = 2|E| \leq n^2}$ , so

$\displaystyle \sum_{i=1}^n |\lambda_i|^2 \leq n^2. \ \ \ \ \ (3)$

Among other things, this implies that

$\displaystyle |\lambda_i| \leq \frac{n}{\sqrt{i}} \ \ \ \ \ (4)$

for all ${i \geq 1}$ .
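The step from (3) to (4) uses the ordering of the eigenvalues: since the magnitudes are decreasing, ${i |\lambda_i|^2 \leq \sum_{j \leq i} |\lambda_j|^2 \leq n^2}$ . The following Python sketch checks the trace identity, (3) and (4) numerically on a random graph. The helper names, the graph size and the edge probability are arbitrary choices for illustration, not part of the argument.

```python
import numpy as np

def random_graph_adjacency(n, p=0.5, seed=0):
    """Symmetric 0/1 adjacency matrix with zero diagonal (no loops)."""
    rng = np.random.default_rng(seed)
    upper = np.triu(rng.random((n, n)) < p, k=1)
    return (upper | upper.T).astype(float)

def sorted_spectrum(T):
    """Eigenpairs of a real symmetric T, ordered so |lam[0]| >= |lam[1]| >= ..."""
    lam, U = np.linalg.eigh(T)
    order = np.argsort(-np.abs(lam))
    return lam[order], U[:, order]

n = 60
T = random_graph_adjacency(n)
lam, U = sorted_spectrum(T)

num_edges = int(T.sum()) // 2          # each edge appears twice in T
trace_T2 = np.trace(T @ T)             # equals 2|E| and also sum of lam^2
bound_4 = n / np.sqrt(np.arange(1, n + 1))

print("tr(T^2) =", trace_T2, " 2|E| =", 2 * num_edges)
print("(4) holds for every i:", bool(np.all(np.abs(lam) <= bound_4 + 1e-9)))
```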

Let ${F: {\bf N} \rightarrow {\bf N}}$ be a function (depending on ${\epsilon}$ ) to be chosen later, with ${F(i) \geq i}$ for all ${i}$ . Applying (3) and the pigeonhole principle (or the finite convergence principle, see this blog post), we can find ${J \leq C(F,\epsilon)}$ such that

$\displaystyle \sum_{J \leq i < F(J)} |\lambda_i|^2 \leq \epsilon^3 n^2.$

(Indeed, the bound on ${J}$ is basically ${F}$ iterated ${1/\epsilon^3}$ times.) We can now split

$\displaystyle T = T_1 + T_2 + T_3, \ \ \ \ \ (5)$

where ${T_1}$ is the “structured” component

$\displaystyle T_1 := \sum_{i < J} \lambda_i u_i^* u_i, \ \ \ \ \ (6)$

${T_2}$ is the “small” component

$\displaystyle T_2 := \sum_{J \leq i < F(J)} \lambda_i u_i^* u_i, \ \ \ \ \ (7)$

and ${T_3}$ is the “pseudorandom” component

$\displaystyle T_3 := \sum_{i \geq F(J)} \lambda_i u_i^* u_i. \ \ \ \ \ (8)$
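The selection of ${J}$ and the splitting (5) can be sketched in Python as follows. This is an illustration under stated assumptions, not the post's own code: the growth function `F` and the value of `eps` are hypothetical choices, the search simply walks along ${1, F(1), F(F(1)), \ldots}$ as in the pigeonhole argument, and the pseudorandom piece takes every index from ${F(J)}$ onward so that the three pieces add up to ${T}$ .

```python
import numpy as np

def sorted_spectrum(T):
    """Eigenpairs of a real symmetric T, ordered so |lam[0]| >= |lam[1]| >= ..."""
    lam, U = np.linalg.eigh(T)
    order = np.argsort(-np.abs(lam))
    return lam[order], U[:, order]

def find_J(lam, F, eps):
    """First J in 1, F(1), F(F(1)), ... whose window J <= i < F(J)
    carries at most eps^3 n^2 of the squared spectrum (lam[i-1] is lambda_i)."""
    n = len(lam)
    sq = np.abs(lam) ** 2
    J = 1
    while True:
        hi = min(F(J), n + 1)              # no eigenvalues beyond index n
        if sq[J - 1:hi - 1].sum() <= eps ** 3 * n ** 2:
            return J
        J = F(J)                           # windows are disjoint, so this stops

def split(lam, U, J, FJ):
    """Structured, small and pseudorandom pieces of T = U diag(lam) U^T."""
    idx = np.arange(1, len(lam) + 1)
    def piece(mask):
        return (U[:, mask] * lam[mask]) @ U[:, mask].T
    return piece(idx < J), piece((idx >= J) & (idx < FJ)), piece(idx >= FJ)

# Demo on a random graph; n, eps and F are arbitrary illustrative choices.
rng = np.random.default_rng(0)
n = 80
upper = np.triu(rng.random((n, n)) < 0.5, k=1)
T = (upper | upper.T).astype(float)
eps = 0.5
F = lambda J: 2 * J + 1                    # hypothetical growth function
lam, U = sorted_spectrum(T)
J = find_J(lam, F, eps)
T1, T2, T3 = split(lam, U, J, F(J))
print("J =", J, " reconstruction error =", np.abs(T1 + T2 + T3 - T).max())
```

The loop terminates because the windows ${[J, F(J))}$ visited are disjoint and, by (3), at most ${1/\epsilon^3}$ of them can each carry more than ${\epsilon^3 n^2}$ of the total.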

We now design a vertex partition to make ${T_1}$ approximately constant on most cells. For each ${i < J}$ , we partition ${V}$ into ${O_{J,\epsilon}(1)}$ cells on which ${u_i}$ (viewed as a function from ${V}$ to ${{\bf C}}$ ) only fluctuates by ${O(\epsilon n^{-1/2} /J)}$ , plus an exceptional cell of size ${O( \frac{\epsilon}{J} |V|)}$ coming from the values where ${|u_i|}$ is excessively large (larger than ${\sqrt{\frac{J}{\epsilon}} n^{-1/2}}$ ). Combining all these partitions together, we can write ${V = V_1 \cup \ldots \cup V_{M-1} \cup V_M}$ for some ${M = O_{J,\epsilon}(1)}$ , where ${V_M}$ has cardinality at most ${\epsilon |V|}$ , and for each ${1 \leq i \leq M-1}$ , the eigenfunctions ${u_1,\ldots,u_{J-1}}$ all fluctuate by at most ${O(\epsilon n^{-1/2}/J)}$ on ${V_i}$ . In particular, if ${1 \leq i,j \leq M-1}$ , then (by (4) and (6)) the entries of ${T_1}$ fluctuate by at most ${O(\epsilon)}$ on each block ${V_i \times V_j}$ . If we let ${d_{ij}}$ be the mean value of these entries on ${V_i \times V_j}$ , we thus have

$\displaystyle 1_B^* T_1 1_A = d_{ij} |A| |B| + O( \epsilon |V_i| |V_j| ) \ \ \ \ \ (9)$

for any ${1 \leq i,j \leq M-1}$ and ${A \subset V_i, B \subset V_j}$ , where we view the indicator functions ${1_A, 1_B}$ as column vectors of dimension ${n}$ .
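One possible way to realise the partition just described is sketched below in Python. It assumes real eigenvectors (as returned by `numpy.linalg.eigh` for a real symmetric matrix), uses the thresholds from the text with all implied constants set to 1, and bins each eigenvector on a grid of width ${\epsilon n^{-1/2}/J}$ ; the function name and the demo parameters `J` and `eps` are hypothetical.

```python
import numpy as np
from collections import defaultdict

def eigenvector_partition(U, J, eps):
    """U: n x n matrix whose columns are orthonormal real eigenvectors,
    sorted by decreasing |eigenvalue|.  Returns (cells, exceptional):
    on each cell, u_1..u_{J-1} each vary by less than eps / (J sqrt(n));
    `exceptional` collects vertices where some |u_i| is too large."""
    n = U.shape[0]
    lead = U[:, :J - 1]                          # u_1, ..., u_{J-1}
    big = np.sqrt(J / eps) / np.sqrt(n)          # "excessively large" threshold
    delta = eps / (J * np.sqrt(n))               # allowed fluctuation per cell
    exceptional_mask = (np.abs(lead) > big).any(axis=1)
    labels = np.floor(lead / delta).astype(np.int64)
    groups = defaultdict(list)
    for v in np.flatnonzero(~exceptional_mask):
        groups[tuple(labels[v])].append(int(v))  # same grid box for every u_i
    cells = [np.array(c) for c in groups.values()]
    return cells, np.flatnonzero(exceptional_mask)

# Demo; the graph, J and eps are arbitrary illustrative choices.
rng = np.random.default_rng(1)
n = 200
upper = np.triu(rng.random((n, n)) < 0.5, k=1)
T = (upper | upper.T).astype(float)
lam, U = np.linalg.eigh(T)
order = np.argsort(-np.abs(lam))
lam, U = lam[order], U[:, order]
J, eps = 4, 0.5
cells, exceptional = eigenvector_partition(U, J, eps)
print(len(cells), "cells,", len(exceptional), "exceptional vertices")
```

Since each ${u_i}$ is a unit vector, fewer than ${\epsilon n/J}$ vertices can exceed the threshold for a given ${i}$ , which is why the exceptional set has size at most ${\epsilon n}$ . On a graph this small most cells will be tiny; the number of cells is bounded in terms of ${J}$ and ${\epsilon}$ only, and that bound is far larger than ${n}$ here.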

Next, we observe from (7) and the choice of ${J}$ that ${\hbox{tr} T_2^2 \leq \epsilon^3 n^2}$ . If we let ${x_{ab}}$ be the coefficients of ${T_2}$ , we thus have

$\displaystyle \sum_{a,b \in V} |x_{ab}|^2 \leq \epsilon^3 n^2$

and hence by Markov’s inequality we have

$\displaystyle \sum_{a \in V_i} \sum_{b \in V_j} |x_{ab}|^2 \leq \epsilon^2 |V_i| |V_j| \ \ \ \ \ (10)$

for all pairs ${(i,j) \in \{1,\ldots,M-1\}^2}$ outside of an exceptional set ${\Sigma_1}$ with

$\displaystyle \sum_{(i,j) \in \Sigma_1} |V_i| |V_j| \leq \epsilon |V|^2.$

If ${(i,j) \in \{1,\ldots,M-1\}^2}$ avoids ${\Sigma_1}$ , we thus have

$\displaystyle 1_B^* T_2 1_A = O( \epsilon |V_i| |V_j| ) \ \ \ \ \ (11)$

for any ${A \subset V_i, B \subset V_j}$ , by (10) and the Cauchy-Schwarz inequality.
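For completeness, here is one way to spell out the two steps just used; this is a sketch of the standard computation and adds nothing beyond it. Take ${\Sigma_1}$ to be the set of pairs ${(i,j) \in \{1,\ldots,M-1\}^2}$ for which (10) fails. Then

$\displaystyle \epsilon^2 \sum_{(i,j) \in \Sigma_1} |V_i| |V_j| \leq \sum_{(i,j) \in \Sigma_1} \sum_{a \in V_i} \sum_{b \in V_j} |x_{ab}|^2 \leq \sum_{a,b \in V} |x_{ab}|^2 \leq \epsilon^3 n^2,$

and dividing by ${\epsilon^2}$ gives the stated bound on ${\Sigma_1}$ . For (11), if ${(i,j)}$ avoids ${\Sigma_1}$ and ${A \subset V_i, B \subset V_j}$ , then Cauchy-Schwarz and (10) give

$\displaystyle |1_B^* T_2 1_A| = \Big| \sum_{a \in A} \sum_{b \in B} x_{ba} \Big| \leq (|A| |B|)^{1/2} \Big( \sum_{a \in V_i} \sum_{b \in V_j} |x_{ab}|^2 \Big)^{1/2} \leq (|V_i| |V_j|)^{1/2} \cdot \epsilon (|V_i| |V_j|)^{1/2} = \epsilon |V_i| |V_j|,$

where we used ${|x_{ba}| = |x_{ab}|}$ (as ${T_2}$ is self-adjoint) together with ${|A| \leq |V_i|}$ and ${|B| \leq |V_j|}$ .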

Finally, to control ${T_3}$ we see from (4) and (8) that ${T_3}$ has an operator norm of at most ${n/\sqrt{F(J)}}$ . In particular, we have from the Cauchy-Schwarz inequality that

$\displaystyle 1_B^* T_3 1_A = O( n^2 / \sqrt{F(J)} ) \ \ \ \ \ (12)$

for any ${A, B \subset V}$ .
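The two claims in this step can be unpacked as follows (again a sketch of the standard computation). If ${F(J) > n}$ then ${T_3 = 0}$ and there is nothing to prove. Otherwise ${T_3}$ is a combination of orthogonal rank one projections whose coefficients ${\lambda_i}$ all have index at least ${F(J)}$ , so by the ordering of the eigenvalues and (4),

$\displaystyle \|T_3\|_{op} \leq |\lambda_{F(J)}| \leq \frac{n}{\sqrt{F(J)}}.$

Since ${\|1_A\| = |A|^{1/2} \leq n^{1/2}}$ and similarly for ${1_B}$ , Cauchy-Schwarz then gives

$\displaystyle |1_B^* T_3 1_A| \leq \|1_B\| \|T_3\|_{op} \|1_A\| \leq |B|^{1/2} \frac{n}{\sqrt{F(J)}} |A|^{1/2} \leq \frac{n^2}{\sqrt{F(J)}}.$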

Let ${\Sigma}$ be the set of all pairs ${(i,j) \in \{1,\ldots,M\}^2}$ where either ${(i,j) \in \Sigma_1}$ , ${i = M}$ , ${j=M}$ , or

$\displaystyle \min(|V_i|, |V_j|) \leq \frac{\epsilon}{M} n.$

One easily verifies that (2) holds. If ${(i,j) \in \{1,\ldots,M\}^2}$ is not in ${\Sigma}$ , then by summing (9), (11), (12) and using (5), we see that

$\displaystyle 1_B^* T 1_A = d_{ij} |A| |B| + O( \epsilon |V_i| |V_j| ) + O( n^2 / \sqrt{F(J)} ) \ \ \ \ \ (13)$

for all ${A \subset V_i, B \subset V_j}$ . The left-hand side is just ${E(A,B)}$ . As ${(i,j) \not \in \Sigma}$ , we have

$\displaystyle |V_i|, |V_j| > \frac{\epsilon}{M} n$

and so (since ${M = O_{J,\epsilon}(1)}$ )

$\displaystyle n^2 / \sqrt{F(J)} \ll_{J,\epsilon} |V_i| |V_j| / \sqrt{F(J)}.$

If we let ${F}$ be a sufficiently rapidly growing function of ${J}$ that depends on ${\epsilon}$ , the second error term in (13) can be absorbed in the first, and (1) follows. This concludes the proof of Lemma 2.
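For the record, here is one way to carry out the verification of (2) that was left to the reader in the proof; the constant ${5}$ below is simply what this particular accounting produces. The pairs in ${\Sigma}$ are of three types. Those in ${\Sigma_1}$ contribute at most ${\epsilon |V|^2}$ to the left-hand side of (2). Those with ${i = M}$ or ${j = M}$ contribute at most

$\displaystyle 2 |V_M| \sum_{j=1}^M |V_j| = 2 |V_M| |V| \leq 2 \epsilon |V|^2,$

since ${|V_M| \leq \epsilon |V|}$ . Finally, those with ${\min(|V_i|,|V_j|) \leq \frac{\epsilon}{M} n}$ contribute at most

$\displaystyle 2 \sum_{i: |V_i| \leq \epsilon n/M} |V_i| \sum_{j=1}^M |V_j| \leq 2 M \cdot \frac{\epsilon n}{M} \cdot n = 2 \epsilon |V|^2.$

Summing the three contributions gives ${\sum_{(i,j) \in \Sigma} |V_i| |V_j| \leq 5 \epsilon |V|^2}$ , which is (2).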

To prove Lemma 1, one argues similarly (after modifying ${\epsilon}$ as necessary), except that the initial partition ${V_1,\ldots,V_M}$ of ${V}$ constructed above needs to be subdivided further into equitable components (of size ${\epsilon |V|/M+O(1)}$ ), plus some remainder sets which can be aggregated into an exceptional component of size ${O( \epsilon |V| )}$ (and which can then be redistributed amongst the other components to arrive at a truly equitable partition). We omit the details.
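The size bookkeeping in this subdivision step can be illustrated with a short Python sketch. It is only a sketch under simplifying assumptions: the function name and the `part_size` parameter (playing the role of ${\epsilon |V|/M}$ ) are hypothetical, the leftover vertices are redistributed round-robin, and nothing here checks that regularity survives the refinement, which is the part of the argument the post omits.

```python
def equitable_refinement(cells, part_size):
    """Chop each cell into parts of exactly `part_size` vertices, pool the
    remainders, then spread the pooled vertices round-robin over the parts.
    The returned parts have sizes differing by at most 1."""
    parts, leftover = [], []
    for cell in cells:
        cell = list(cell)
        full = len(cell) // part_size            # number of full-size parts
        parts.extend(cell[k * part_size:(k + 1) * part_size] for k in range(full))
        leftover.extend(cell[full * part_size:]) # remainder of this cell
    if not parts:                                # every cell was too small
        return [leftover]
    for k, v in enumerate(leftover):
        parts[k % len(parts)].append(v)
    return parts

# Example: three cells of sizes 7, 3 and 11, chopped into parts of size 3.
cells = [list(range(0, 7)), list(range(7, 10)), list(range(10, 21))]
parts = equitable_refinement(cells, 3)
print(sorted(len(p) for p in parts))
```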

Remark 1 It is easy to verify that ${F}$ needs to be growing exponentially in ${J}$ in order for the above argument to work, which leads to tower-exponential bounds in the number of cells ${M}$ in the partition. It was shown by Gowers that a tower-exponential bound is actually necessary here. By varying ${F}$ , one basically obtains the strong regularity lemma first established by Alon, Fischer, Krivelevich, and Szegedy; in the opposite direction, setting ${F(J) := J}$ essentially gives the weak regularity lemma of Frieze and Kannan.

Remark 2 If we specialise to a Cayley graph, in which ${V = (V,+)}$ is a finite abelian group and ${E = \{ (a,b): a-b \in A \}}$ for some (symmetric) subset ${A}$ of ${V}$ , then the eigenvectors are characters, and one essentially recovers the arithmetic regularity lemma of Green, in which the vertex partition classes ${V_i}$ are given by Bohr sets (and one can then place additional regularity properties on these Bohr sets with some additional arguments). The components ${T_1, T_2, T_3}$ of ${T}$ , representing high, medium, and low eigenvalues of ${T}$ , then become a decomposition associated to high, medium, and low Fourier coefficients of ${A}$ .

Remark 3 The use of spectral theory here is parallel to the use of Fourier analysis to establish results such as Roth’s theorem on arithmetic progressions of length three. In analogy with this, one could view hypergraph regularity as being a sort of “higher order spectral theory”, although this spectral perspective is not as convenient as it is in the graph case.

15 comments


3 December, 2012 at 7:52 pm

Shubhendu Trivedi

I have a general question about Remark 3 (but concerning Szemeredi Regular Partitions). I apologize if it is too elementary.

A simple algorithm to generate a regular partition by Frieze-Kannan is over here http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.44.721 (they mention the paper that you cite over here as well). Note that here they only consider the first singular value to check for regularity at each stage.

My question is: Why wouldn’t this work for the hypergraph case? I understand that there is no analog of the “SVD” for tensors. That is, the problem $Approx(A,r)$ of finding the best rank $r$ approximation of an order $k$ tensor $A$ has no solution in general. The notable exceptions are two specific cases (rank one approximations, i.e. $r = 1$ , and when $A$ is a matrix, i.e. $k = 2$ ). Since in the graph case the first singular value is all that we need, shouldn’t this work for the hypergraph case as well?

4 December, 2012 at 2:57 pm

Terence Tao

There is a subtlety in the hypergraph case which has to do with the fact that there are several different types of hypergraph regularity one could ask for. In particular there is an “order 1” hypergraph regularity which does indeed fit well with rank one approximation, but in many applications one actually needs a higher order notion of hypergraph regularity which does not interact well with bounded rank approximations.

To explain this, begin with the graph case. An edge set E can be viewed as an indicator function $1_E(v,w)$ of two vertex-valued variables $v, w$ . Graph regularity has to do with understanding sums such as

$\sum_{v \in V} \sum_{w \in V} 1_E(v,w) 1_A(v) 1_B(w)$

and this can be done by using the SVD to split $1_E(v,w)$ into finite rank components such as $f(v) g(w)$ .

Now consider a 3-uniform hypergraph, which now comes with an indicator function $1_E(u,v,w)$ . If one were interested in counting “order 1” sums such as

$\sum_{u,v,w \in V} 1_E(u,v,w) 1_A(u) 1_B(v) 1_C(w)$

then a bounded rank approximation to $1_E$ would be effective; this would correspond, roughly speaking, to the hypergraph regularity lemma of Chung. But in practice, this type of regularity is insufficient; one often needs to count expressions such as

$\sum_{u,v,w \in V} 1_E(u,v,w) 1_F(u,v) 1_G(v,w) 1_H(w,u)$ (*)

for some graphs F,G,H, and bounded rank approximations can be quite terrible for this purpose. (See for instance the discussion in this paper of Gowers.) In particular, there are 3-uniform hypergraphs with no bounded rank approximations which have nontrivial behaviour with respect to sums such as (*). Instead one has to consider “order 2” bounded rank approximations to $1_E(u,v,w)$ , using linear combinations of functions of the form $f(u,v) g(v,w) h(u,w)$ rather than $f(u) g(v) h(w)$ . (These two-variable functions $f(u,v), g(v,w), h(u,w)$ in turn then need to be approximated by “order 1” bounded rank functions.) This can still be done, but one is now quite far from the classical theory of rank.

5 December, 2012 at 2:05 pm

Shubhendu Trivedi

Couldn’t thank you enough for your answer!

So if I understood correctly, there should still be a way to obtain Chung’s version in a simple algorithmic way, as in the graph case suggested by Frieze and Kannan.

More precisely, there should be an analog of Lemma 2 in http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.44.721 for the higher order case.

That is, something along these lines must hold:

/// Let $W$ be a $k$ dimensional tensor on $V_1 \times V_2 \times \dots \times V_k$ , with $|V_i| = p_i$ and $W(i,j, \dots, k) \leq 1$ . If $\gamma$ is a small positive real number, then

(a) if there exists $A_i \subseteq V_i$ such that $|A_i| \geq \gamma p_i$ and $|W(A_1, A_2, \dots, A_k)| \geq \gamma |A_1||A_2| \dots |A_k|$ then $\sigma(W) \geq \gamma^{k+1} \sqrt{p_1 p_2 \dots p_k}$

(b) if $\sigma(W) \geq \gamma\sqrt{p_1 p_2 \dots p_k}$ then there exist $A_i \subseteq V_i$ such that $|A_i| \geq \gamma' p_i$ and $|W(A_1, A_2, \dots A_k)| \geq \gamma'|A_1||A_2| \dots |A_k|$ where $\gamma' = c \gamma^{k+1}$ and $c$ is a constant. Such $A_1, A_2, \dots A_k$ might be constructed in polynomial time. ///

Is this a correct way of thinking about it?

5 December, 2012 at 2:47 pm

Terence Tao

Well, I don’t know how exactly you will be defining the greatest singular value of a tensor, but something along these lines should be possible. Note that one can also essentially prove Chung’s regularity lemma by viewing a hypergraph $E \subset V_1 \times V_2 \times V_3$ (say) first as a graph between $V_1$ and $V_2 \times V_3$ , applying the usual regularity lemma to this, and then applying the regularity lemma one further time to the components of $V_2 \times V_3$ obtained from the first application of that lemma. (Actually, to make all this work properly, it is more convenient to work with a strong version of the regularity lemma that has an additional function parameter F.)

5 December, 2012 at 2:50 pm

Shubhendu Trivedi

Thanks a lot again for your reply!

PS: Minor comment. One tag is misspelled as “eigenvales”.

[Corrected, thanks -T.]

3 December, 2012 at 8:19 pm

David Roberts

A couple of orphaned/mismatched html tags:

In the paragraph after Lemma 1:

[Fixed, thanks – T.]

4 December, 2012 at 12:57 am

Craig

Hi Terry,

I am not a mathematician or anything. Just a college undergrad studying their gen eds…

But reading a quote from Paul Graham, from the link on the left of your blog, which was an amazing read by the way,

“To a newly arrived undergraduate, all university departments look much the same. The professors all seem forbiddingly intellectual and publish papers unintelligible to outsiders. But while in some fields the papers are unintelligible because they’re full of hard ideas, in others they’re deliberately written in an obscure way to seem as if they’re saying something important. This may seem a scandalous proposition, but it has been experimentally verified, in the famous Social Text affair. Suspecting that the papers published by literary theorists were often just intellectual-sounding nonsense, a physicist deliberately wrote a paper full of intellectual-sounding nonsense, and submitted it to a literary theory journal, which published it.”

I hope you are not writing anything in an obscure way to seem as if what you are writing is important :(

Sincerely,

Craig

16 December, 2012 at 2:25 pm

davetweed

Note that the quote’s logic isn’t quite right. Submitting a paper full of nonsense (which has been done in other disciplines, eg, theoretical physics) isn’t capable of experimentally verifying that a substantial number of the papers in a discipline are nonsense, only that papers in the discipline can’t be reliably distinguished from nonsense. (The proposition may well still be true, but that’s not established by this test.)

Even with my cynics hat on, I’m more inclined towards the weaker interpretation: referees have no strong incentive to actually understand a paper rather than just see if anything it says looks “obviously wrong”.

10 December, 2012 at 5:47 pm

Anonymous

In (1), Lemma 2, should $d_{i,j} |V_i| |V_j|$ be $d_{i,j} |A| |B|$ ?

[Corrected, thanks – T.]

28 December, 2012 at 6:07 am

Balazs Szegedy

Dear Terry,

For the sake of completeness let me mention that I wrote a paper about the spectral proof of the strong regularity lemma which clarifies the same issues that you discuss (see below). It may also contain some new interesting aspects (for example, connections to graph limits and group theory).

Limits of kernel operators and the spectral regularity lemma

http://arxiv.org/abs/1003.5588

European J. of Comb, Volume 32, Issue 7, October 2011, p. 1156-1167

[Thanks, I’ve added a link to this in the main post. Good to see that we now have spectral proofs of the weak, strong, and standard regularity lemmas in the literature – T.]

15 January, 2013 at 3:15 pm

Anonymous

Don’t want to sound uncaring for your precious time at all. If at all it comes across like that, I apologize for it. But I was wondering if it would ever be possible to have an expository post for the paper Balazs just posted?
I have spent a couple of months trying to understand it but it seems impregnable but at the same time very important and deep.

Regards.

29 October, 2013 at 8:09 pm

A spectral theory proof of the algebraic regularity lemma | What's new

[…] group. It turns out that the spectral proof of the Szemerédi regularity lemma (discussed in this previous blog post) adapts very nicely to this setting. The key fact needed about definable sets over finite fields is […]

29 March, 2014 at 5:31 am

Anonymous

two lines below inequality (10): $\Sigma$ should be $\Sigma_1$ ?

[Corrected, thanks – T.]

24 April, 2014 at 3:11 pm

A proof of Roth’s theorem | What's new

[…] and “highly pseudorandom” components, as is common in the subject (e.g. in this previous blog post), but even though we crucially need to retain non-negativity of one of the components in this […]

13 November, 2015 at 1:16 pm

The spectral proof of the Szemeredi regularity lemma | Systems, Networks and Control

[…] Source: The spectral proof of the Szemeredi regularity lemma […]
