You are currently browsing the tag archive for the ‘idempotents’ tag.

The (classical) Möbius function ${\mu: {\bf N} \rightarrow {\bf Z}}$ is the unique function that obeys the classical Möbius inversion formula:

Proposition 1 (Classical Möbius inversion) Let ${f,g: {\bf N} \rightarrow A}$ be functions from the natural numbers to an additive group ${A}$. Then the following two claims are equivalent:
• (i) ${f(n) = \sum_{d|n} g(d)}$ for all ${n \in {\bf N}}$.
• (ii) ${g(n) = \sum_{d|n} \mu(n/d) f(d)}$ for all ${n \in {\bf N}}$.

There is a generalisation of this formula to (finite) posets, due to Hall, in which one sums over chains ${n_0 > \dots > n_k}$ in the poset:

Proposition 2 (Poset Möbius inversion) Let ${{\mathcal N}}$ be a finite poset, and let ${f,g: {\mathcal N} \rightarrow A}$ be functions from that poset to an additive group ${A}$. Then the following two claims are equivalent:
• (i) ${f(n) = \sum_{d \leq n} g(d)}$ for all ${n \in {\mathcal N}}$, where ${d}$ is understood to range in ${{\mathcal N}}$.
• (ii) ${g(n) = \sum_{k=0}^\infty (-1)^k \sum_{n = n_0 > n_1 > \dots > n_k} f(n_k)}$ for all ${n \in {\mathcal N}}$, where in the inner sum ${n_0,\dots,n_k}$ are understood to range in ${{\mathcal N}}$ with the indicated ordering.
(Note from the finite nature of ${{\mathcal N}}$ that the inner sum in (ii) is vacuous for all but finitely many ${k}$.)

Comparing Proposition 2 with Proposition 1, it is natural to refer to the function ${\mu(d,n) := \sum_{k=0}^\infty (-1)^k \sum_{n = n_0 > n_1 > \dots > n_k = d} 1}$ as the Möbius function of the poset; the condition (ii) can then be written as

$\displaystyle g(n) = \sum_{d \leq n} \mu(d,n) f(d).$

Proof: If (i) holds, then we have

$\displaystyle g(n) = f(n) - \sum_{d

for any ${n \in {\mathcal N}}$. Iterating this we obtain (ii). Conversely, from (ii) and separating out the ${k=0}$ term, and grouping all the other terms based on the value of ${d:=n_1}$, we obtain (1), and hence (i). $\Box$

In fact it is not completely necessary that the poset ${{\mathcal N}}$ be finite; an inspection of the proof shows that it suffices that every element ${n}$ of the poset has only finitely many predecessors ${\{ d \in {\mathcal N}: d < n \}}$.

It is not difficult to see that Proposition 2 includes Proposition 1 as a special case, after verifying the combinatorial fact that the quantity

$\displaystyle \sum_{k=0}^\infty (-1)^k \sum_{d=n_k | n_{k-1} | \dots | n_1 | n_0 = n} 1$

is equal to ${\mu(n/d)}$ when ${d}$ divides ${n}$, and vanishes otherwise.

I recently discovered that Proposition 2 can also lead to a useful variant of the inclusion-exclusion principle. The classical version of this principle can be phrased in terms of indicator functions: if ${A_1,\dots,A_\ell}$ are subsets of some set ${X}$, then

$\displaystyle \prod_{j=1}^\ell (1-1_{A_j}) = \sum_{k=0}^\ell (-1)^k \sum_{1 \leq j_1 < \dots < j_k \leq \ell} 1_{A_{j_1} \cap \dots \cap A_{j_k}}.$

In particular, if there is a finite measure ${\nu}$ on ${X}$ for which ${A_1,\dots,A_\ell}$ are all measurable, we have

$\displaystyle \nu(X \backslash \bigcup_{j=1}^\ell A_j) = \sum_{k=0}^\ell (-1)^k \sum_{1 \leq j_1 < \dots < j_k \leq \ell} \nu( A_{j_1} \cap \dots \cap A_{j_k} ).$

One drawback of this formula is that there are exponentially many terms on the right-hand side: ${2^\ell}$ of them, in fact. However, in many cases of interest there are “collisions” between the intersections ${A_{j_1} \cap \dots \cap A_{j_k}}$ (for instance, perhaps many of the pairwise intersections ${A_i \cap A_j}$ agree), in which case there is an opportunity to collect terms and hopefully achieve some cancellation. It turns out that it is possible to use Proposition 2 to do this, in which one only needs to sum over chains in the resulting poset of intersections:

Proposition 3 (Hall-type inclusion-exclusion principle) Let ${A_1,\dots,A_\ell}$ be subsets of some set ${X}$, and let ${{\mathcal N}}$ be the finite poset formed by intersections of some of the ${A_i}$ (with the convention that ${X}$ is the empty intersection), ordered by set inclusion. Then for any ${E \in {\mathcal N}}$, one has

$\displaystyle 1_E \prod_{F \subsetneq E} (1 - 1_F) = \sum_{k=0}^\ell (-1)^k \sum_{E = E_0 \supsetneq E_1 \supsetneq \dots \supsetneq E_k} 1_{E_k} \ \ \ \ \ (2)$

where ${F, E_0,\dots,E_k}$ are understood to range in ${{\mathcal N}}$. In particular (setting ${E}$ to be the empty intersection) if the ${A_j}$ are all proper subsets of ${X}$ then we have

$\displaystyle \prod_{j=1}^\ell (1-1_{A_j}) = \sum_{k=0}^\ell (-1)^k \sum_{X = E_0 \supsetneq E_1 \supsetneq \dots \supsetneq E_k} 1_{E_k}. \ \ \ \ \ (3)$

In particular, if there is a finite measure ${\nu}$ on ${X}$ for which ${A_1,\dots,A_\ell}$ are all measurable, we have

$\displaystyle \mu(X \backslash \bigcup_{j=1}^\ell A_j) = \sum_{k=0}^\ell (-1)^k \sum_{X = E_0 \supsetneq E_1 \supsetneq \dots \supsetneq E_k} \mu(E_k).$

Using the Möbius function ${\mu}$ on the poset ${{\mathcal N}}$, one can write these formulae as

$\displaystyle 1_E \prod_{F \subsetneq E} (1 - 1_F) = \sum_{F \subseteq E} \mu(F,E) 1_F,$

$\displaystyle \prod_{j=1}^\ell (1-1_{A_j}) = \sum_F \mu(F,X) 1_F$

and

$\displaystyle \nu(X \backslash \bigcup_{j=1}^\ell A_j) = \sum_F \mu(F,X) \nu(F).$

Proof: It suffices to establish (2) (to derive (3) from (2) observe that all the ${F \subsetneq X}$ are contained in one of the ${A_j}$, so the effect of ${1-1_F}$ may be absorbed into ${1 - 1_{A_j}}$). Applying Proposition 2, this is equivalent to the assertion that

$\displaystyle 1_E = \sum_{F \subseteq E} 1_F \prod_{G \subsetneq F} (1 - 1_G)$

for all ${E \in {\mathcal N}}$. But this amounts to the assertion that for each ${x \in E}$, there is precisely one ${F \subseteq E}$ in ${{\mathcal n}}$ with the property that ${x \in F}$ and ${x \not \in G}$ for any ${G \subsetneq F}$ in ${{\mathcal N}}$, namely one can take ${F}$ to be the intersection of all ${G \subseteq E}$ in ${{\mathcal N}}$ such that ${G}$ contains ${x}$. $\Box$

Example 4 If ${A_1,A_2,A_3 \subsetneq X}$ with ${A_1 \cap A_2 = A_1 \cap A_3 = A_2 \cap A_3 = A_*}$, and ${A_1,A_2,A_3,A_*}$ are all distinct, then we have for any finite measure ${\nu}$ on ${X}$ that makes ${A_1,A_2,A_3}$ measurable that

$\displaystyle \nu(X \backslash (A_1 \cup A_2 \cup A_3)) = \nu(X) - \nu(A_1) - \nu(A_2) \ \ \ \ \ (4)$

$\displaystyle - \nu(A_3) - \nu(A_*) + 3 \nu(A_*)$

due to the four chains ${X \supsetneq A_1}$, ${X \supsetneq A_2}$, ${X \supsetneq A_3}$, ${X \supsetneq A_*}$ of length one, and the three chains ${X \supsetneq A_1 \supsetneq A_*}$, ${X \supsetneq A_2 \supsetneq A_*}$, ${X \supsetneq A_3 \supsetneq A_*}$ of length two. Note that this expansion just has six terms in it, as opposed to the ${2^3=8}$ given by the usual inclusion-exclusion formula, though of course one can reduce the number of terms by combining the ${\nu(A_*)}$ factors. This may not seem particularly impressive, especially if one views the term ${3 \mu(A_*)}$ as really being three terms instead of one, but if we add a fourth set ${A_4 \subsetneq X}$ with ${A_i \cap A_j = A_*}$ for all ${1 \leq i < j \leq 4}$, the formula now becomes

$\displaystyle \nu(X \backslash (A_1 \cup A_2 \cup A_3 \cap A_4)) = \nu(X) - \nu(A_1) - \nu(A_2) \ \ \ \ \ (5)$

$\displaystyle - \nu(A_3) - \nu(A_4) - \nu(A_*) + 4 \nu(A_*)$

and we begin to see more cancellation as we now have just seven terms (or ten if we count ${4 \nu(A_*)}$ as four terms) instead of ${2^4 = 16}$ terms.

Example 5 (Variant of Legendre sieve) If ${q_1,\dots,q_\ell > 1}$ are natural numbers, and ${a_1,a_2,\dots}$ is some sequence of complex numbers with only finitely many terms non-zero, then by applying the above proposition to the sets ${A_j := q_j {\bf N}}$ and with ${\nu}$ equal to counting measure weighted by the ${a_n}$ we obtain a variant of the Legendre sieve

$\displaystyle \sum_{n: (n,q_1 \dots q_\ell) = 1} a_n = \sum_{k=0}^\ell (-1)^k \sum_{1 |' d_1 |' \dots |' d_k} \sum_{n: d_k |n} a_n$

where ${d_1,\dots,d_k}$ range over the set ${{\mathcal N}}$ formed by taking least common multiples of the ${q_j}$ (with the understanding that the empty least common multiple is ${1}$), and ${d |' n}$ denotes the assertion that ${d}$ divides ${n}$ but is strictly less than ${n}$. I am curious to know of this version of the Legendre sieve already appears in the literature (and similarly for the other applications of Proposition 2 given here).

If the poset ${{\mathcal N}}$ has bounded depth then the number of terms in Proposition 3 can end up being just polynomially large in ${\ell}$ rather than exponentially large. Indeed, if all chains ${X \supsetneq E_1 \supsetneq \dots \supsetneq E_k}$ in ${{\mathcal N}}$ have length ${k}$ at most ${k_0}$ then the number of terms here is at most ${1 + \ell + \dots + \ell^{k_0}}$. (The examples (4), (5) are ones in which the depth is equal to two.) I hope to report in a later post on how this version of inclusion-exclusion with polynomially many terms can be useful in an application.

Actually in our application we need an abstraction of the above formula, in which the indicator functions are replaced by more abstract idempotents:

Proposition 6 (Hall-type inclusion-exclusion principle for idempotents) Let ${A_1,\dots,A_\ell}$ be pairwise commuting elements of some ring ${R}$ with identity, which are all idempotent (thus ${A_j A_j = A_j}$ for ${j=1,\dots,\ell}$). Let ${{\mathcal N}}$ be the finite poset formed by products of the ${A_i}$ (with the convention that ${1}$ is the empty product), ordered by declaring ${E \leq F}$ when ${EF = E}$ (note that all the elements of ${{\mathcal N}}$ are idempotent so this is a partial ordering). Then for any ${E \in {\mathcal N}}$, one has

$\displaystyle E \prod_{F < E} (1-F) = \sum_{k=0}^\ell (-1)^k \sum_{E = E_0 > E_1 > \dots > E_k} E_k. \ \ \ \ \ (6)$

where ${F, E_0,\dots,E_k}$ are understood to range in ${{\mathcal N}}$. In particular (setting ${E=1}$) if all the ${A_j}$ are not equal to ${1}$ then we have

$\displaystyle \prod_{j=1}^\ell (1-A_j) = \sum_{k=0}^\ell (-1)^k \sum_{1 = E_0 > E_1 > \dots > E_k} E_k.$

Morally speaking this proposition is equivalent to the previous one after applying a “spectral theorem” to simultaneously diagonalise all of the ${A_j}$, but it is quicker to just adapt the previous proof to establish this proposition directly. Using the Möbius function ${\mu}$ for ${{\mathcal N}}$, we can rewrite these formulae as

$\displaystyle E \prod_{F < E} (1-F) = \sum_{F \leq E} \mu(F,E) 1_F$

and

$\displaystyle \prod_{j=1}^\ell (1-A_j) = \sum_F \mu(F,1) 1_F.$

Proof: Again it suffices to verify (6). Using Proposition 2 as before, it suffices to show that

$\displaystyle E = \sum_{F \leq E} F \prod_{G < F} (1 - G) \ \ \ \ \ (7)$

for all ${E \in {\mathcal N}}$ (all sums and products are understood to range in ${{\mathcal N}}$). We can expand

$\displaystyle E = E \prod_{G < E} (G + (1-G)) = \sum_{{\mathcal A}} (\prod_{G \in {\mathcal A}} G) (\prod_{G < E: G \not \in {\mathcal A}} (1-G)) \ \ \ \ \ (8)$

where ${{\mathcal A}}$ ranges over all subsets of ${\{ G \in {\mathcal N}: G \leq E \}}$ that contain ${E}$. For such an ${{\mathcal A}}$, if we write ${F := \prod_{G \in {\mathcal A}} G}$, then ${F}$ is the greatest lower bound of ${{\mathcal A}}$, and we observe that ${F (\prod_{G < E: G \not \in {\mathcal A}} (1-G))}$ vanishes whenever ${{\mathcal A}}$ fails to contain some ${G \in {\mathcal N}}$ with ${F \leq G \leq E}$. Thus the only ${{\mathcal A}}$ that give non-zero contributions to (8) are the intervals of the form ${\{ G \in {\mathcal N}: F \leq G \leq E\}}$ for some ${F \leq E}$ (which then forms the greatest lower bound for that interval), and the claim (7) follows (after noting that ${F (1-G) = F (1-FG)}$ for any ${F,G \in {\mathcal N}}$). $\Box$

A finite group ${G=(G,\cdot)}$ is said to be a Frobenius group if there is a non-trivial subgroup ${H}$ of ${G}$ (known as the Frobenius complement of ${G}$) such that the conjugates ${gHg^{-1}}$ of ${H}$ are “disjoint as possible” in the sense that ${H \cap gHg^{-1} = \{1\}}$ whenever ${g \not \in H}$. This gives a decomposition

$\displaystyle G = \bigcup_{gH \in G/H} (gHg^{-1} \backslash \{1\}) \cup K \ \ \ \ \ (1)$

where the Frobenius kernel ${K}$ of ${G}$ is defined as the identity element ${1}$ together with all the non-identity elements that are not conjugate to any element of ${H}$. Taking cardinalities, we conclude that

$\displaystyle |G| = \frac{|G|}{|H|} (|H| - 1) + |K|$

and hence

$\displaystyle |H| |K| = |G|. \ \ \ \ \ (2)$

A remarkable theorem of Frobenius gives an unexpected amount of structure on ${K}$ and hence on ${G}$:

Theorem 1 (Frobenius’ theorem) Let ${G}$ be a Frobenius group with Frobenius complement ${H}$ and Frobenius kernel ${K}$. Then ${K}$ is a normal subgroup of ${G}$, and hence (by (2) and the disjointness of ${H}$ and ${K}$ outside the identity) ${G}$ is the semidirect product ${K \rtimes H}$ of ${H}$ and ${K}$.

I discussed Frobenius’ theorem and its proof in this recent blog post. This proof uses the theory of characters on a finite group ${G}$, in particular relying on the fact that a character on a subgroup ${H}$ can induce a character on ${G}$, which can then be decomposed into irreducible characters with natural number coefficients. Remarkably, even though a century has passed since Frobenius’ original argument, there is no proof known of this theorem which avoids character theory entirely; there are elementary proofs known when the complement ${H}$ has even order or when ${H}$ is solvable (we review both of these cases below the fold), which by the Feit-Thompson theorem does cover all the cases, but the proof of the Feit-Thompson theorem involves plenty of character theory (and also relies on Theorem 1). (The answers to this MathOverflow question give a good overview of the current state of affairs.)

I have been playing around recently with the problem of finding a character-free proof of Frobenius’ theorem. I didn’t succeed in obtaining a completely elementary proof, but I did find an argument which replaces character theory (which can be viewed as coming from the representation theory of the non-commutative group algebra ${{\bf C} G \equiv L^2(G)}$) with the Fourier analysis of class functions (i.e. the representation theory of the centre ${Z({\bf C} G) \equiv L^2(G)^G}$ of the group algebra), thus replacing non-commutative representation theory by commutative representation theory. This is not a particularly radical depature from the existing proofs of Frobenius’ theorem, but it did seem to be a new proof which was technically “character-free” (even if it was not all that far from character-based in spirit), so I thought I would record it here.

The main ideas are as follows. The space ${L^2(G)^G}$ of class functions can be viewed as a commutative algebra with respect to the convolution operation ${*}$; as the regular representation is unitary and faithful, this algebra contains no nilpotent elements. As such, (Gelfand-style) Fourier analysis suggests that one can analyse this algebra through the idempotents: class functions ${\phi}$ such that ${\phi*\phi = \phi}$. In terms of characters, idempotents are nothing more than sums of the form ${\sum_{\chi \in \Sigma} \chi(1) \chi}$ for various collections ${\Sigma}$ of characters, but we can perform a fair amount of analysis on idempotents directly without recourse to characters. In particular, it turns out that idempotents enjoy some important integrality properties that can be established without invoking characters: for instance, by taking traces one can check that ${\phi(1)}$ is a natural number, and more generally we will show that ${{\bf E}_{(a,b) \in S} {\bf E}_{x \in G} \phi( a x b^{-1} x^{-1} )}$ is a natural number whenever ${S}$ is a subgroup of ${G \times G}$ (see Corollary 4 below). For instance, the quantity

$\displaystyle \hbox{rank}(\phi) := {\bf E}_{a \in G} {\bf E}_{x \in G} \phi(a xa^{-1} x^{-1})$

is a natural number which we will call the rank of ${\phi}$ (as it is also the linear rank of the transformation ${f \mapsto f*\phi}$ on ${L^2(G)}$).

In the case that ${G}$ is a Frobenius group with kernel ${K}$, the above integrality properties can be used after some elementary manipulations to establish that for any idempotent ${\phi}$, the quantity

$\displaystyle \frac{1}{|G|} \sum_{a \in K} {\bf E}_{x \in G} \phi( axa^{-1}x^{-1} ) - \frac{1}{|G| |K|} \sum_{a,b \in K} \phi(ab^{-1}) \ \ \ \ \ (3)$

is an integer. On the other hand, one can also show by elementary means that this quantity lies between ${0}$ and ${\hbox{rank}(\phi)}$. These two facts are not strong enough on their own to impose much further structure on ${\phi}$, unless one restricts attention to minimal idempotents ${\phi}$. In this case spectral theory (or Gelfand theory, or the fundamental theorem of algebra) tells us that ${\phi}$ has rank one, and then the integrality gap comes into play and forces the quantity (3) to always be either zero or one. This can be used to imply that the convolution action of every minimal idempotent ${\phi}$ either preserves ${\frac{|G|}{|K|} 1_K}$ or annihilates it, which makes ${\frac{|G|}{|K|} 1_K}$ itself an idempotent, which makes ${K}$ normal.

In this lecture, we use topological dynamics methods to prove some other Ramsey-type theorems, and more specifically the polynomial van der Waerden theorem, the hypergraph Ramsey theorem, Hindman’s theorem, and the Hales-Jewett theorem. In proving these statements, I have decided to focus on the ultrafilter-based proofs, rather than the combinatorial or topological proofs, though of course these styles of proof are also available for each of the above theorems.