The inclusion-exclusion principle for commuting projections

23 February, 2021 in expository, math.CO, math.OA, math.RA | Tags: idempotents, inclusion-exclusion principle, Mobius function, order theory, posets | by Terence Tao

The (classical) Möbius function ${\mu: {\bf N} \rightarrow {\bf Z}}$ is the unique function that obeys the classical Möbius inversion formula:

Proposition 1 (Classical Möbius inversion) Let ${f,g: {\bf N} \rightarrow A}$ be functions from the natural numbers to an additive group ${A}$ . Then the following two claims are equivalent:

(i) ${f(n) = \sum_{d|n} g(d)}$ for all ${n \in {\bf N}}$ .
(ii) ${g(n) = \sum_{d|n} \mu(n/d) f(d)}$ for all ${n \in {\bf N}}$ .
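As a quick numerical sanity check of Proposition 1 (a sketch only, with the group ${A}$ taken to be the integers and an arbitrarily chosen test function ${g}$ , both of which are assumptions made for illustration), one can build ${f}$ from ${g}$ via (i) and confirm that (ii) recovers ${g}$ :

```python
def mobius(n):
    """Classical Moebius function, computed by trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # n has a squared prime factor
            result = -result
        p += 1
    if n > 1:
        result = -result  # one remaining prime factor
    return result

def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]

def g(n):
    # Arbitrary integer-valued test function (an assumption, not from the post).
    return n * n - 3 * n + 7

def f(n):
    # Claim (i): f(n) = sum_{d | n} g(d)
    return sum(g(d) for d in divisors(n))

def g_recovered(n):
    # Claim (ii): g(n) = sum_{d | n} mu(n/d) f(d)
    return sum(mobius(n // d) * f(d) for d in divisors(n))

recovered = [g_recovered(n) for n in range(1, 51)]
expected = [g(n) for n in range(1, 51)]
print(recovered == expected)
```

This only tests the implication from (i) to (ii) on a finite range; it is an illustration rather than a proof.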

There is a generalisation of this formula to (finite) posets, due to Hall, in which one sums over chains ${n_0 > \dots > n_k}$ in the poset:

Proposition 2 (Poset Möbius inversion) Let ${{\mathcal N}}$ be a finite poset, and let ${f,g: {\mathcal N} \rightarrow A}$ be functions from that poset to an additive group ${A}$ . Then the following two claims are equivalent:

(i) ${f(n) = \sum_{d \leq n} g(d)}$ for all ${n \in {\mathcal N}}$ , where ${d}$ is understood to range in ${{\mathcal N}}$ .
(ii) ${g(n) = \sum_{k=0}^\infty (-1)^k \sum_{n = n_0 > n_1 > \dots > n_k} f(n_k)}$ for all ${n \in {\mathcal N}}$ , where in the inner sum ${n_0,\dots,n_k}$ are understood to range in ${{\mathcal N}}$ with the indicated ordering.
(Note from the finite nature of ${{\mathcal N}}$ that the inner sum in (ii) is vacuous for all but finitely many ${k}$ .)

Comparing Proposition 2 with Proposition 1, it is natural to refer to the function ${\mu(d,n) := \sum_{k=0}^\infty (-1)^k \sum_{n = n_0 > n_1 > \dots > n_k = d} 1}$ as the Möbius function of the poset; the condition (ii) can then be written as

$\displaystyle g(n) = \sum_{d \leq n} \mu(d,n) f(d).$

Proof: If (i) holds, then we have

$\displaystyle g(n) = f(n) - \sum_{d<n} g(d) \ \ \ \ \ (1)$

for any ${n \in {\mathcal N}}$ . Iterating this we obtain (ii). Conversely, from (ii) and separating out the ${k=0}$ term, and grouping all the other terms based on the value of ${d:=n_1}$ , we obtain (1), and hence (i). $\Box$
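The iteration of (1) in this proof can be sketched in a few lines of code. The test poset below (pairs in ${\{0,1,2\}^2}$ with the product order), the integer-valued ${g}$ , and all function names are assumptions chosen purely for illustration:

```python
from itertools import product

# Small test poset (an assumption): pairs in {0,1,2}^2 under the product order.
elements = list(product(range(3), repeat=2))

def less(d, n):
    """Strict order d < n."""
    return d != n and d[0] <= n[0] and d[1] <= n[1]

def signed_chain_sum(f, n):
    """Sum of (-1)^k f(n_k) over all chains n = n_0 > n_1 > ... > n_k.

    Grouping the chains with k >= 1 by the value of n_1 gives this
    recursion, which mirrors identity (1).
    """
    return f[n] - sum(signed_chain_sum(f, d) for d in elements if less(d, n))

def mu(d, n):
    """Signed count of chains n = n_0 > ... > n_k = d (poset Moebius function)."""
    if d == n:
        return 1
    if not less(d, n):
        return 0
    # Strip off n_0 = n; the next element m = n_1 satisfies d <= m < n.
    return -sum(mu(d, m) for m in elements
                if m == d or (less(d, m) and less(m, n)))

# Arbitrary integer-valued g, and f built from it via claim (i).
g = {e: 3 * e[0] - 5 * e[1] + e[0] * e[1] + 1 for e in elements}
f = {n: sum(g[d] for d in elements if d == n or less(d, n)) for n in elements}

recovered_by_chains = {n: signed_chain_sum(f, n) for n in elements}
recovered_by_mu = {n: sum(mu(d, n) * f[d] for d in elements) for n in elements}
print(recovered_by_chains == g, recovered_by_mu == g)
```

Both routes (the chain sum in (ii) and the formulation via ${\mu(d,n)}$ ) should recover ${g}$ on this poset.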

In fact it is not completely necessary that the poset ${{\mathcal N}}$ be finite; an inspection of the proof shows that it suffices that every element ${n}$ of the poset has only finitely many predecessors ${\{ d \in {\mathcal N}: d < n \}}$ .

It is not difficult to see that Proposition 2 includes Proposition 1 as a special case, after verifying the combinatorial fact that the quantity

$\displaystyle \sum_{k=0}^\infty (-1)^k \sum_{d=n_k | n_{k-1} | \dots | n_1 | n_0 = n} 1$

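is equal to ${\mu(n/d)}$ when ${d}$ divides ${n}$ , and vanishes otherwise (here the divisibility relations in the chain are understood to be strict, so that ${n_0 > n_1 > \dots > n_k}$ ).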
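This combinatorial fact is easy to test numerically. The following sketch (function names are my own, and the chains are taken to consist of distinct integers) compares the signed chain count with ${\mu(n/d)}$ for all ${1 \leq d, n \leq 60}$ :

```python
from functools import lru_cache

def mobius(n):
    """Classical Moebius function, computed by trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    if n > 1:
        result = -result
    return result

@lru_cache(maxsize=None)
def signed_divisor_chains(d, n):
    """Sum of (-1)^k over chains d = n_k | ... | n_1 | n_0 = n of distinct integers."""
    if n % d != 0:
        return 0  # no chain can join d to n
    total = 1 if d == n else 0  # the k = 0 chain
    for m in range(d, n):  # candidates for n_1: d | m, m | n, m < n
        if n % m == 0 and m % d == 0:
            total -= signed_divisor_chains(d, m)
    return total

matches = all(
    signed_divisor_chains(d, n) == (mobius(n // d) if n % d == 0 else 0)
    for n in range(1, 61)
    for d in range(1, 61)
)
print(matches)
```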

I recently discovered that Proposition 2 can also lead to a useful variant of the inclusion-exclusion principle. The classical version of this principle can be phrased in terms of indicator functions: if ${A_1,\dots,A_\ell}$ are subsets of some set ${X}$ , then

$\displaystyle \prod_{j=1}^\ell (1-1_{A_j}) = \sum_{k=0}^\ell (-1)^k \sum_{1 \leq j_1 < \dots < j_k \leq \ell} 1_{A_{j_1} \cap \dots \cap A_{j_k}}.$

In particular, if there is a finite measure ${\nu}$ on ${X}$ for which ${A_1,\dots,A_\ell}$ are all measurable, we have

$\displaystyle \nu(X \backslash \bigcup_{j=1}^\ell A_j) = \sum_{k=0}^\ell (-1)^k \sum_{1 \leq j_1 < \dots < j_k \leq \ell} \nu( A_{j_1} \cap \dots \cap A_{j_k} ).$
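For concreteness, here is a small sketch of the classical formula with ${\nu}$ taken to be counting measure on ${X = \{0,\dots,29\}}$ and ${A_j}$ the multiples of ${2, 3, 5}$ (these choices are assumptions made only for illustration). Note that the loop visits all ${2^\ell}$ subfamilies:

```python
from itertools import combinations

X = set(range(30))
A = [{x for x in X if x % q == 0} for q in (2, 3, 5)]

def complement_size_direct(X, A):
    """nu(X minus the union of the A_j), with nu = counting measure."""
    return len(X - set().union(*A))

def complement_size_inclusion_exclusion(X, A):
    """Right-hand side of the classical formula: 2^len(A) signed terms."""
    total = 0
    for k in range(len(A) + 1):
        for family in combinations(A, k):
            inter = set(X)  # the empty intersection is X
            for S in family:
                inter &= S
            total += (-1) ** k * len(inter)
    return total

direct = complement_size_direct(X, A)
via_formula = complement_size_inclusion_exclusion(X, A)
print(direct, via_formula)
```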

One drawback of this formula is that there are exponentially many terms on the right-hand side: ${2^\ell}$ of them, in fact. However, in many cases of interest there are “collisions” between the intersections ${A_{j_1} \cap \dots \cap A_{j_k}}$ (for instance, perhaps many of the pairwise intersections ${A_i \cap A_j}$ agree), in which case there is an opportunity to collect terms and hopefully achieve some cancellation. It turns out that one can use Proposition 2 to do this, so that one only needs to sum over chains in the resulting poset of intersections:

Proposition 3 (Hall-type inclusion-exclusion principle) Let ${A_1,\dots,A_\ell}$ be subsets of some set ${X}$ , and let ${{\mathcal N}}$ be the finite poset formed by intersections of some of the ${A_i}$ (with the convention that ${X}$ is the empty intersection), ordered by set inclusion. Then for any ${E \in {\mathcal N}}$ , one has
$\displaystyle 1_E \prod_{F \subsetneq E} (1 - 1_F) = \sum_{k=0}^\ell (-1)^k \sum_{E = E_0 \supsetneq E_1 \supsetneq \dots \supsetneq E_k} 1_{E_k} \ \ \ \ \ (2)$
where ${F, E_0,\dots,E_k}$ are understood to range in ${{\mathcal N}}$ . In particular (setting ${E}$ to be the empty intersection) if the ${A_j}$ are all proper subsets of ${X}$ then we have
$\displaystyle \prod_{j=1}^\ell (1-1_{A_j}) = \sum_{k=0}^\ell (-1)^k \sum_{X = E_0 \supsetneq E_1 \supsetneq \dots \supsetneq E_k} 1_{E_k}. \ \ \ \ \ (3)$
In particular, if there is a finite measure ${\nu}$ on ${X}$ for which ${A_1,\dots,A_\ell}$ are all measurable, we have
$\displaystyle \nu(X \backslash \bigcup_{j=1}^\ell A_j) = \sum_{k=0}^\ell (-1)^k \sum_{X = E_0 \supsetneq E_1 \supsetneq \dots \supsetneq E_k} \nu(E_k).$

Using the Möbius function ${\mu}$ on the poset ${{\mathcal N}}$ , one can write these formulae as

$\displaystyle 1_E \prod_{F \subsetneq E} (1 - 1_F) = \sum_{F \subseteq E} \mu(F,E) 1_F,$

$\displaystyle \prod_{j=1}^\ell (1-1_{A_j}) = \sum_F \mu(F,X) 1_F$

and

$\displaystyle \nu(X \backslash \bigcup_{j=1}^\ell A_j) = \sum_F \mu(F,X) \nu(F).$

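Proof: It suffices to establish (2) (to derive (3) from (2), observe that each ${A_j}$ is itself one of the ${F \subsetneq X}$ , and every ${F \subsetneq X}$ in ${{\mathcal N}}$ is contained in at least one of the ${A_j}$ , so the factor ${1-1_F}$ may be absorbed into ${1 - 1_{A_j}}$ ). Applying Proposition 2 with ${f(E) := 1_E}$ and ${g(E) := 1_E \prod_{F \subsetneq E} (1 - 1_F)}$ , this is equivalent to the assertion that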

$\displaystyle 1_E = \sum_{F \subseteq E} 1_F \prod_{G \subsetneq F} (1 - 1_G)$

for all ${E \in {\mathcal N}}$ . But this amounts to the assertion that for each ${x \in E}$ , there is precisely one ${F \subseteq E}$ in ${{\mathcal N}}$ with the property that ${x \in F}$ and ${x \not \in G}$ for any ${G \subsetneq F}$ in ${{\mathcal N}}$ , namely one can take ${F}$ to be the intersection of all ${G \subseteq E}$ in ${{\mathcal N}}$ such that ${G}$ contains ${x}$ . $\Box$
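One can also test (2) and (3) pointwise on a small example. In the sketch below, the sets ${A_j}$ are the multiples of ${2, 3, 4}$ inside ${X = \{0,\dots,11\}}$ (an assumption chosen so that some intersections collide), and the helper names are hypothetical:

```python
from itertools import combinations
from math import prod

X = frozenset(range(12))
A = [frozenset(x for x in X if x % q == 0) for q in (2, 3, 4)]

def intersections(X, A):
    """All intersections of subfamilies of A; the empty intersection is X."""
    poset = set()
    for r in range(len(A) + 1):
        for family in combinations(A, r):
            E = X
            for S in family:
                E = E & S
            poset.add(E)
    return poset

N = intersections(X, A)

def lhs(E, x):
    # 1_E(x) * prod_{F in N, F a proper subset of E} (1 - 1_F(x))
    value = int(x in E)
    for F in N:
        if F < E:
            value *= 1 - int(x in F)
    return value

def rhs(E, x):
    # Sum over chains E = E_0 > E_1 > ... > E_k in N of (-1)^k 1_{E_k}(x),
    # grouped by the value of E_1.
    return int(x in E) - sum(rhs(F, x) for F in N if F < E)

identity_2 = all(lhs(E, x) == rhs(E, x) for E in N for x in X)
identity_3 = all(prod(1 - int(x in S) for S in A) == rhs(X, x) for x in X)
print(len(N), identity_2, identity_3)
```

Here the poset has only six elements, since for instance ${A_1 \cap A_3 = A_3}$ and ${A_2 \cap A_3 = A_1 \cap A_2 \cap A_3}$ .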

Example 4 If ${A_1,A_2,A_3 \subsetneq X}$ with ${A_1 \cap A_2 = A_1 \cap A_3 = A_2 \cap A_3 = A_*}$ , and ${A_1,A_2,A_3,A_*}$ are all distinct, then we have for any finite measure ${\nu}$ on ${X}$ that makes ${A_1,A_2,A_3}$ measurable that
$\displaystyle \nu(X \backslash (A_1 \cup A_2 \cup A_3)) = \nu(X) - \nu(A_1) - \nu(A_2) - \nu(A_3) - \nu(A_*) + 3 \nu(A_*) \ \ \ \ \ (4)$
due to the four chains ${X \supsetneq A_1}$ , ${X \supsetneq A_2}$ , ${X \supsetneq A_3}$ , ${X \supsetneq A_*}$ of length one, and the three chains ${X \supsetneq A_1 \supsetneq A_*}$ , ${X \supsetneq A_2 \supsetneq A_*}$ , ${X \supsetneq A_3 \supsetneq A_*}$ of length two. Note that this expansion just has six terms in it, as opposed to the ${2^3=8}$ given by the usual inclusion-exclusion formula, though of course one can reduce the number of terms by combining the ${\nu(A_*)}$ terms. This may not seem particularly impressive, especially if one views the term ${3 \nu(A_*)}$ as really being three terms instead of one, but if we add a fourth set ${A_4 \subsetneq X}$ with ${A_i \cap A_j = A_*}$ for all ${1 \leq i < j \leq 4}$ , the formula now becomes
$\displaystyle \nu(X \backslash (A_1 \cup A_2 \cup A_3 \cup A_4)) = \nu(X) - \nu(A_1) - \nu(A_2) - \nu(A_3) - \nu(A_4) - \nu(A_*) + 4 \nu(A_*) \ \ \ \ \ (5)$
and we begin to see more cancellation as we now have just seven terms (or ten if we count ${4 \nu(A_*)}$ as four terms) instead of ${2^4 = 16}$ terms.
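To see (4) and (5) in action, here is a sketch with ${\nu}$ equal to counting measure and a concrete choice of sets sharing the common pairwise intersection ${A_* = \{0\}}$ (the specific sets and helper names are assumptions for illustration). It enumerates the chains explicitly and compares against a direct count:

```python
from itertools import combinations

def intersection_poset(X, sets):
    """Poset of all intersections of subfamilies; the empty intersection is X."""
    poset = {X}
    for r in range(1, len(sets) + 1):
        for family in combinations(sets, r):
            poset.add(frozenset.intersection(*family))
    return poset

def chains_from(top, poset):
    """Yield every chain top = E_0 > E_1 > ... > E_k in poset (as a tuple)."""
    yield (top,)
    for F in poset:
        if F < top:
            for tail in chains_from(F, poset):
                yield (top,) + tail

def hall_count(X, sets):
    """Sum of (-1)^k |E_k| over chains starting at X (counting measure)."""
    poset = intersection_poset(X, sets)
    return sum((-1) ** (len(c) - 1) * len(c[-1]) for c in chains_from(X, poset))

X = frozenset(range(10))
A1, A2 = frozenset({0, 1, 2}), frozenset({0, 3})
A3, A4 = frozenset({0, 4, 5, 6}), frozenset({0, 7})

direct3 = len(X - (A1 | A2 | A3))
hall3 = hall_count(X, [A1, A2, A3])
direct4 = len(X - (A1 | A2 | A3 | A4))
hall4 = hall_count(X, [A1, A2, A3, A4])
print(direct3, hall3, direct4, hall4)
```

In this example the net coefficient of ${\nu(A_*)}$ is ${-1+3 = 2}$ for three sets and ${-1+4 = 3}$ for four sets.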

Example 5 (Variant of Legendre sieve) If ${q_1,\dots,q_\ell > 1}$ are natural numbers, and ${a_1,a_2,\dots}$ is some sequence of complex numbers with only finitely many terms non-zero, then by applying the above proposition to the sets ${A_j := q_j {\bf N}}$ and with ${\nu}$ equal to counting measure weighted by the ${a_n}$ we obtain a variant of the Legendre sieve
$\displaystyle \sum_{n: q_1,\dots,q_\ell \nmid n} a_n = \sum_{k=0}^\ell (-1)^k \sum_{1 |' d_1 |' \dots |' d_k} \sum_{n: d_k |n} a_n$
where ${d_1,\dots,d_k}$ range over the set ${{\mathcal N}}$ formed by taking least common multiples of the ${q_j}$ (with the understanding that the empty least common multiple is ${1}$ ), and ${d |' n}$ denotes the assertion that ${d}$ divides ${n}$ but is strictly less than ${n}$ . I am curious to know if this version of the Legendre sieve already appears in the literature (and similarly for the other applications of Proposition 2 given here).
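Here is a small numerical sketch of this sieve identity. The moduli ${(2,3,6)}$ , the weights ${a_n}$ , and the helper names are assumptions for illustration. The left-hand side is computed as the sum over ${n}$ divisible by none of the ${q_j}$ , which is what the complement of the sets ${q_j {\bf N}}$ gives; for this particular choice of moduli that is the same as requiring ${n}$ to be coprime to ${q_1 \dots q_\ell}$ .

```python
from itertools import combinations
from math import gcd

def lcm(a, b):
    return a * b // gcd(a, b)

qs = (2, 3, 6)
# Arbitrary finitely supported weights a_n (an assumption for illustration).
a = {n: n * n + 1 for n in range(1, 61)}

# The poset N: least common multiples of subfamilies of the q_j (empty lcm = 1).
N = set()
for r in range(len(qs) + 1):
    for family in combinations(qs, r):
        d = 1
        for q in family:
            d = lcm(d, q)
        N.add(d)

def weight(d):
    """sum_{n : d | n} a_n"""
    return sum(v for n, v in a.items() if n % d == 0)

def signed_chain_sum(d):
    """Sum of (-1)^k weight(d_k) over chains d = d_0 |' d_1 |' ... |' d_k in N."""
    return weight(d) - sum(signed_chain_sum(m) for m in N
                           if m != d and m % d == 0)

sieved = sum(v for n, v in a.items() if all(n % q != 0 for q in qs))
via_chains = signed_chain_sum(1)
coprime_sum = sum(v for n, v in a.items() if gcd(n, 2 * 3 * 6) == 1)
print(sorted(N), sieved == via_chains)
```

Only four least common multiples occur here, so the chain sum involves far fewer distinct sets than the ${2^3}$ subfamilies of the moduli.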

If the poset ${{\mathcal N}}$ has bounded depth then the number of terms in Proposition 3 can end up being just polynomially large in ${\ell}$ rather than exponentially large. Indeed, if all chains ${X \supsetneq E_1 \supsetneq \dots \supsetneq E_k}$ in ${{\mathcal N}}$ have length ${k}$ at most ${k_0}$ then the number of terms here is at most ${1 + \ell + \dots + \ell^{k_0}}$ . (The examples (4), (5) are ones in which the depth is equal to two.) I hope to report in a later post on how this version of inclusion-exclusion with polynomially many terms can be useful in an application.
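To illustrate the contrast between polynomial and exponential growth, the sketch below counts the chains for a hypothetical family of ${\ell}$ sets ${A_j = \{0, j\}}$ in ${X = \{0,\dots,\ell+1\}}$ , all of whose pairwise intersections equal ${\{0\}}$ (so the depth is two, as in (4) and (5)). The family and helper names are assumptions made for illustration only:

```python
from itertools import combinations

def intersection_poset(X, sets):
    """Poset of all intersections of subfamilies; the empty intersection is X."""
    poset = {X}
    for r in range(1, len(sets) + 1):
        for family in combinations(sets, r):
            poset.add(frozenset.intersection(*family))
    return poset

def count_chains(top, poset):
    """Number of chains top = E_0 > E_1 > ... > E_k in poset, over all k >= 0."""
    return 1 + sum(count_chains(F, poset) for F in poset if F < top)

def common_core_family(ell):
    """Hypothetical family: A_j = {0, j} for j = 1..ell inside X = {0, ..., ell+1}."""
    X = frozenset(range(ell + 2))
    sets = [frozenset({0, j}) for j in range(1, ell + 1)]
    return X, sets

table = {}
for ell in range(2, 9):
    X, sets = common_core_family(ell)
    poset = intersection_poset(X, sets)
    # (number of chains, the bound 1 + ell + ell^2, number of classical terms)
    table[ell] = (count_chains(X, poset), 1 + ell + ell * ell, 2 ** ell)

for ell, row in table.items():
    print(ell, row)
```

For this family the number of chains works out to ${2\ell+2}$ , which stays within the bound ${1 + \ell + \ell^2}$ and is much smaller than ${2^\ell}$ once ${\ell}$ is moderately large. (Building the poset by brute force, as done here, of course still takes exponential time; the point is only the size of the resulting sum.)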

Actually in our application we need an abstraction of the above formula, in which the indicator functions are replaced by more abstract idempotents:

Proposition 6 (Hall-type inclusion-exclusion principle for idempotents) Let ${A_1,\dots,A_\ell}$ be pairwise commuting elements of some ring ${R}$ with identity, which are all idempotent (thus ${A_j A_j = A_j}$ for ${j=1,\dots,\ell}$ ). Let ${{\mathcal N}}$ be the finite poset formed by products of the ${A_i}$ (with the convention that ${1}$ is the empty product), ordered by declaring ${E \leq F}$ when ${EF = E}$ (note that all the elements of ${{\mathcal N}}$ are idempotent so this is a partial ordering). Then for any ${E \in {\mathcal N}}$ , one has
$\displaystyle E \prod_{F < E} (1-F) = \sum_{k=0}^\ell (-1)^k \sum_{E = E_0 > E_1 > \dots > E_k} E_k \ \ \ \ \ (6)$
where ${F, E_0,\dots,E_k}$ are understood to range in ${{\mathcal N}}$ . In particular (setting ${E=1}$ ) if none of the ${A_j}$ is equal to ${1}$ then we have
$\displaystyle \prod_{j=1}^\ell (1-A_j) = \sum_{k=0}^\ell (-1)^k \sum_{1 = E_0 > E_1 > \dots > E_k} E_k.$

Morally speaking this proposition is equivalent to the previous one after applying a “spectral theorem” to simultaneously diagonalise all of the ${A_j}$ , but it is quicker to just adapt the previous proof to establish this proposition directly. Using the Möbius function ${\mu}$ for ${{\mathcal N}}$ , we can rewrite these formulae as

$\displaystyle E \prod_{F < E} (1-F) = \sum_{F \leq E} \mu(F,E) F$

and

$\displaystyle \prod_{j=1}^\ell (1-A_j) = \sum_F \mu(F,1) F.$

Proof: Again it suffices to verify (6). Using Proposition 2 as before, it suffices to show that

$\displaystyle E = \sum_{F \leq E} F \prod_{G < F} (1 - G) \ \ \ \ \ (7)$

for all ${E \in {\mathcal N}}$ (all sums and products are understood to range in ${{\mathcal N}}$ ). We can expand

$\displaystyle E = E \prod_{G < E} (G + (1-G)) = \sum_{{\mathcal A}} (\prod_{G \in {\mathcal A}} G) (\prod_{G < E: G \not \in {\mathcal A}} (1-G)) \ \ \ \ \ (8)$

where ${{\mathcal A}}$ ranges over all subsets of ${\{ G \in {\mathcal N}: G \leq E \}}$ that contain ${E}$ . For such an ${{\mathcal A}}$ , if we write ${F := \prod_{G \in {\mathcal A}} G}$ , then ${F}$ is the greatest lower bound of ${{\mathcal A}}$ , and we observe that ${F (\prod_{G < E: G \not \in {\mathcal A}} (1-G))}$ vanishes whenever ${{\mathcal A}}$ fails to contain some ${G \in {\mathcal N}}$ with ${F \leq G \leq E}$ . Thus the only ${{\mathcal A}}$ that give non-zero contributions to (8) are the intervals of the form ${\{ G \in {\mathcal N}: F \leq G \leq E\}}$ for some ${F \leq E}$ (which then forms the greatest lower bound for that interval), and the claim (7) follows (after noting that ${F (1-G) = F (1-FG)}$ for any ${F,G \in {\mathcal N}}$ ). $\Box$
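As a sanity check of Proposition 6 in a setting that does not come from indicator functions in an obvious way, one can work in the commutative ring ${{\bf Z}/30{\bf Z}}$ , whose idempotents are ${0, 1, 6, 10, 15, 16, 21, 25}$ . The choice of ring, the test families, and the helper names below are all assumptions made for illustration; commutativity makes the pairwise commuting hypothesis automatic.

```python
from itertools import combinations

M = 30  # work in the ring Z/30Z (an assumption for illustration)

def product_poset(A):
    """All products of subfamilies of A modulo M; the empty product is 1."""
    N = set()
    for r in range(len(A) + 1):
        for family in combinations(A, r):
            E = 1
            for a in family:
                E = E * a % M
            N.add(E)
    return N

def is_below(F, E):
    """F < E in the ordering of Proposition 6: F != E and EF = F."""
    return F != E and (E * F) % M == F

def lhs(E, N):
    # E * prod_{F < E} (1 - F)
    value = E
    for F in N:
        if is_below(F, E):
            value = value * (1 - F) % M
    return value

def rhs(E, N):
    # Sum over chains E = E_0 > E_1 > ... > E_k of (-1)^k E_k, grouped by E_1.
    return (E - sum(rhs(F, N) for F in N if is_below(F, E))) % M

def check(A):
    """Verify (6) for every E in N, and the special case E = 1."""
    assert all(a * a % M == a and a != 1 for a in A)  # idempotent, not 1
    N = product_poset(A)
    general = all(lhs(E, N) == rhs(E, N) for E in N)
    top = 1
    for a in A:
        top = top * (1 - a) % M
    return general and top == rhs(1, N)

results = {A: check(A) for A in [(16, 21, 25), (6, 10, 15), (6, 16, 21)]}
print(results)
```

The second and third families have collisions among their products (for instance all pairwise products of ${6, 10, 15}$ vanish modulo ${30}$ ), so the poset is smaller than the full Boolean lattice in those cases.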

9 comments

23 February, 2021 at 6:57 pm

Sam Hopkins

Your propositions feel very similar to “Rota’s crosscut theorem.” See for instance Section 3.9 of Stanley’s EC1, available at http://www-math.mit.edu/~rstan/ec/ec1/.

26 February, 2021 at 10:43 pm

Aditya Guha Roy

I had a similar feeling at first glance. But proposition 6 is actually not exactly the same as the crosscut theorem; it deals with a different treatment.

24 February, 2021 at 6:17 am

allenknutson

My favorite statement about Möbius inversion is the computation of the Möbius function for a simplicial complex (i.e. the coefficients needed when trying to write the function 1 as a linear combination of the characteristic functions of the closed simplices). It says: the coefficient of a face is 1 – the Euler characteristic of the link of that face.

When the simplicial complex is a (shellable) ball, the links of interior faces are spheres, giving the coefficient (-1)^codimension. Whereas the links of exterior faces are hemispheres, giving the coefficient 0.

24 February, 2021 at 12:53 pm

Anonymous

Is it true that since only finitary operations (e.g. finite sums) are involved, all proofs of such results can be based on only formal(!) arguments?

26 February, 2021 at 10:58 pm

Aditya Guha Roy

In trying to extend the idea to infinite posets you may need additional conditions to deal with the infinite products and sums which you’ll encounter; for instance if you look at Proposition 2, then you can see how the sum in ii can bother you if you try to extend the idea to deal with an infinite poset.

25 February, 2021 at 8:42 am

Oliver Knill

Here is a topological angle to the Moebius picture (complementing Allen Knutson’s remark): a poset P has as its Barycentric refinement the finite abstract simplicial complex G in which the elements are the non-empty subsets of P. For a subset X of G, the number chi(X) = sum_{x in X} omega(x) is known as the Euler characteristic of X, where omega(x) = (-1)^dim(x), dim(x)=|x|-1 and |x| is cardinality. With the core W^-(x)={y, y x } one can write the Moebius function as mu(x,y) =chi(W^+(x) \cap W^-(y)). (I like to think of W^+ and W^- as unstable and stable manifolds and W^+ cap W^- as a heteroclinic point.) The Hall equations then read f(x) = sum_{y<x} g(y), g(x) = sum_{y A to a ring A, define chi(X)=sum_{x in X} h(x) and the symmetric matrices L(x,y)=chi(W^-(x) cap W^-(y)) and g(x,y) = omega(x) omega(y) chi(W^+(x) cap W^+(y)). The latter can be seen as Green functions because g is the inverse of the Laplacian L in the case where h takes values in the units. One should think of g(x,y) as the potential energy between x and y. As V(x) = sum_y g(x,y) is a potential or index, Poincare-Hopf then is chi(G) = sum_{x,y in G} g(x,y). This so far only uses the additive structure of A. If A is a ring, then det(L)=det(g)=prod_{x in G} h(x). In the topological case h(x)=omega(x), the number theoretically interesting h(x)=1 or if h(x) takes values in the unit circle of the complex plane, L and g^* are unimodular matrices which are inverses of each other. For h(x)=1, one gets positive definite integral quadratic forms which are inverses of each other. Inclusion-exclusion is used in the proof https://arxiv.org/abs/2010.09152 .

4 June, 2021 at 4:32 pm

Benjamin Steinberg

Proposition 6 is a slight variant of Theorem 1 of Louis Solomon’s classical paper The Burnside algebra of a finite group, Journal of Combinatorial Theory Volume 2, Issue 4, June 1967, Pages 603-615. Theorem 1 and the remark after it show that the primitive idempotents of the algebra of a finite meet semilattice $L$ are of the form $e_x=\sum_{y\leq x}y\mu(y,x)$ (he does something slightly more general). Your commuting projections generate a finite meet semilattice.

24 June, 2021 at 2:37 pm

Nishad T M

Prof Tao,
I am Nishad T M from Kerala, India. During Lockdown, I tried to bring the Mathematical Proof of Collatz Conjecture. Since June 2020 I tried, now it is completed. Initially I sent it to Annals of Pure and Applied Mathematics, India to get reviews. As per initial review, I did some slight modifications and submitted to another Journal The Albertian Journal of Pure and Applied Mathematics, publishing from Research Department of Mathematics, St Alberts College Ernakulam, Kerala. I expect its Review report at 30th July.
I hope the Proof is convincing to Mathematics Graduates and Post Graduates.
An Introduction to Soul Set, IJSER,
Soul Process,
Statistical Soul Process Control,
The Motto Of Sreenarayanaguru in View of Soul Analysis,
Origin IJSER
Soul Set of a Product
The Errors in awarded PhDs in Mathematics from Kerala State India and Errors in Mathematics Text Books prescribed by Indian Universities, etc
are some of my Independent effort to bring a new Branch in Mathematics that helps to connect Mathematics with Humanity.
The Theorem 1 in article Origin, IJSER is very Simple and very effective.
Thanks for Reading.

12 January, 2022 at 7:33 pm

Isaac Bernabé Duarte Alvarenga

I have a conjecture that I think is true and I think it is weak enough to be proven: for all n → Pn+3 - (Pn+2)(Pn+1)/Pn is smaller than 0, and for all n such that n is not divisible by 2 → Pn+2 - (Pn+1)^2/Pn is bigger than 0, and for all n such that n is divisible by 2 → Pn+2 - (Pn+1)^2/Pn is smaller than 0. That’s all; I hope someone who is reading this can solve it.
