
Let ${G = (G,+)}$, ${H = (H,+)}$ be additive groups (i.e., groups with an abelian addition group law). A map ${f: G \rightarrow H}$ is a homomorphism if one has

$\displaystyle f(x+y) - f(x) - f(y) = 0$

for all ${x,y \in G}$. A map ${f: G \rightarrow H}$ is an affine homomorphism if one has

$\displaystyle f(x_1) - f(x_2) + f(x_3) - f(x_4) = 0 \ \ \ \ \ (1)$

for all additive quadruples ${(x_1,x_2,x_3,x_4)}$ in ${G}$, by which we mean that ${x_1,x_2,x_3,x_4 \in G}$ and ${x_1-x_2+x_3-x_4=0}$. The two notions are closely related; it is easy to verify that ${f}$ is an affine homomorphism if and only if ${f}$ is the sum of a homomorphism and a constant.
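For readers who want the verification spelled out, here is a short sketch of the equivalence (the symbols ${\tilde f}$, ${\phi}$ and ${c}$ are notation introduced only for this check). Since ${(x+y, x, 0, y)}$ is an additive quadruple, (1) gives

$\displaystyle f(x+y) - f(x) + f(0) - f(y) = 0,$

so that ${\tilde f := f - f(0)}$ obeys ${\tilde f(x+y) = \tilde f(x) + \tilde f(y)}$ and is thus a homomorphism. Conversely, if ${f = \phi + c}$ with ${\phi}$ a homomorphism and ${c \in H}$ a constant, then for any additive quadruple one has

$\displaystyle f(x_1) - f(x_2) + f(x_3) - f(x_4) = \phi(x_1 - x_2 + x_3 - x_4) + (c - c + c - c) = 0.$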

Now suppose that ${H}$ also has a translation-invariant metric ${d}$. A map ${f: G \rightarrow H}$ is said to be a quasimorphism if one has

$\displaystyle f(x+y) - f(x) - f(y) = O(1) \ \ \ \ \ (2)$

for all ${x,y \in G}$, where ${O(1)}$ denotes a quantity at a bounded distance from the origin. Similarly, ${f: G \rightarrow H}$ is an affine quasimorphism if

$\displaystyle f(x_1) - f(x_2) + f(x_3) - f(x_4) = O(1) \ \ \ \ \ (3)$

for all additive quadruples ${(x_1,x_2,x_3,x_4)}$ in ${G}$. Again, one can check that ${f}$ is an affine quasimorphism if and only if it is the sum of a quasimorphism and a constant (with the implied constant of the quasimorphism controlled by the implied constant of the affine quasimorphism). (Since every constant is itself a quasimorphism, it is in fact the case that affine quasimorphisms are quasimorphisms, but now the implied constant in the latter is not controlled by the implied constant of the former.)

“Trivial” examples of quasimorphisms include the sum of a homomorphism and a bounded function. Are there others? In some cases, the answer is no. For instance, suppose we have a quasimorphism ${f: {\bf Z} \rightarrow {\bf R}}$. Iterating (2), we see that ${f(kx) = kf(x) + O(k)}$ for any integer ${x}$ and natural number ${k}$, which we can rewrite as ${f(kx)/kx = f(x)/x + O(1/|x|)}$ for non-zero ${x}$. Also, ${f}$ is Lipschitz. Sending ${k \rightarrow \infty}$, we can verify that ${f(x)/x}$ is a Cauchy sequence as ${x \rightarrow \infty}$ and thus tends to some limit ${\alpha}$; we have ${\alpha = f(x)/x + O(1/x)}$ for ${x \geq 1}$, hence ${f(x) = \alpha x + O(1)}$ for positive ${x}$, and then one can use (2) one last time to obtain ${f(x) = \alpha x + O(1)}$ for all ${x}$. Thus ${f}$ is the sum of the homomorphism ${x \mapsto \alpha x}$ and a bounded sequence.
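As a small numerical illustration of this argument (not part of the original discussion), the following Python sketch takes the hypothetical example ${f(x) = \lfloor \sqrt{2} x \rfloor}$, checks that the defect in (2) only takes the values ${0}$ and ${1}$ on a finite window, and recovers ${\alpha = \sqrt{2}}$ from ${f(x)/x}$ for a large ${x}$. The function names and the choice of ${\sqrt{2}}$ are assumptions made for the example; integer square roots are used so that the floor is computed exactly.

```python
from math import isqrt, sqrt

def f(x: int) -> int:
    """Return floor(x * sqrt(2)), computed exactly with integer arithmetic."""
    if x >= 0:
        return isqrt(2 * x * x)
    # sqrt(2) is irrational, so for x < 0 the value x*sqrt(2) is never an
    # integer and floor(x*sqrt(2)) = -floor(|x|*sqrt(2)) - 1.
    return -isqrt(2 * x * x) - 1

def defect(x: int, y: int) -> int:
    """The quantity f(x+y) - f(x) - f(y) appearing in (2)."""
    return f(x + y) - f(x) - f(y)

# All defects observed on a finite window of integers.
defects = {defect(x, y) for x in range(-200, 201) for y in range(-200, 201)}

# f(x)/x for a large x approximates the limiting slope alpha.
slope_estimate = f(10**6) / 10**6

print(sorted(defects), slope_estimate, sqrt(2))
```

Here ${f}$ differs from the homomorphism ${x \mapsto \sqrt{2} x}$ by a function bounded by ${1}$, consistent with the conclusion ${f(x) = \alpha x + O(1)}$.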

In general, one can phrase this problem in the language of group cohomology (discussed in this previous post). Call a map ${f: G \rightarrow H}$ a ${0}$-cocycle. A ${1}$-cocycle is a map ${\rho: G \times G \rightarrow H}$ obeying the identity

$\displaystyle \rho(x,y+z) + \rho(y,z) = \rho(x,y) + \rho(x+y,z)$

for all ${x,y,z \in G}$. Given a ${0}$-cocycle ${f: G \rightarrow H}$, one can form its derivative ${\partial f: G \times G \rightarrow H}$ by the formula

$\displaystyle \partial f(x,y) := f(x+y)-f(x)-f(y).$

Such functions are called ${1}$-coboundaries. It is easy to see that the abelian group of ${1}$-coboundaries is a subgroup of the abelian group of ${1}$-cocycles. The quotient of these two groups is the first group cohomology of ${G}$ with coefficients in ${H}$, and is denoted ${H^1(G; H)}$.
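The claim that every ${1}$-coboundary is a ${1}$-cocycle can be checked by expanding both sides of the cocycle identity; the following Python sketch does this by brute force in a toy case. The choice ${G = {\bf Z}/7{\bf Z}}$, ${H = {\bf Z}}$, and the randomly generated ${f}$ are assumptions made purely for illustration.

```python
import random

n = 7                      # G = Z/7Z (a hypothetical small example)
rng = random.Random(0)
# An arbitrary map f: G -> H = Z, i.e. a 0-cocycle in the terminology above.
f = [rng.randint(-10, 10) for _ in range(n)]

def df(x: int, y: int) -> int:
    """The derivative (partial f)(x, y) = f(x+y) - f(x) - f(y)."""
    return f[(x + y) % n] - f[x % n] - f[y % n]

def is_cocycle(rho) -> bool:
    """Check rho(x,y+z) + rho(y,z) == rho(x,y) + rho(x+y,z) for all x,y,z in G."""
    return all(
        rho(x, (y + z) % n) + rho(y, z) == rho(x, y) + rho((x + y) % n, z)
        for x in range(n) for y in range(n) for z in range(n)
    )

print(is_cocycle(df))
```

Both sides of the identity reduce to ${f(x+y+z) - f(x) - f(y) - f(z)}$, which is why the check succeeds for every choice of ${f}$.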

If a ${0}$-cocycle is bounded then its derivative is a bounded ${1}$-coboundary. The quotient of the group of bounded ${1}$-cocycles by the derivatives of bounded ${0}$-cocycles is called the bounded first group cohomology of ${G}$ with coefficients in ${H}$, and is denoted ${H^1_b(G; H)}$. There is an obvious homomorphism ${\phi}$ from ${H^1_b(G; H)}$ to ${H^1(G; H)}$, formed by taking a coset of the space of derivatives of bounded ${0}$-cocycles, and enlarging it to a coset of the space of ${1}$-coboundaries. By chasing all the definitions, we see that all quasimorphisms from ${G}$ to ${H}$ are the sum of a homomorphism and a bounded function if and only if this homomorphism ${\phi}$ is injective; in fact the quotient of the space of quasimorphisms by the sum of homomorphisms and bounded functions is isomorphic to the kernel of ${\phi}$.

In additive combinatorics, one is often working with functions which only have additive structure a fraction of the time, thus for instance (1) or (3) might only hold “${1\%}$ of the time”. This makes it somewhat difficult to directly interpret the situation in terms of group cohomology. However, thanks to tools such as the Balog-Szemerédi-Gowers lemma, one can upgrade this sort of ${1\%}$-structure to ${100\%}$-structure – at the cost of restricting the domain to a smaller set. Here I record one such instance of this phenomenon, thus giving a tentative link between additive combinatorics and group cohomology. (I thank Yuval Wigderson for suggesting the problem of locating such a link.)

Theorem 1 Let ${G = (G,+)}$, ${H = (H,+)}$ be additive groups with ${|G|=N}$, let ${S}$ be a subset of ${H}$, let ${E \subset G}$, and let ${f: E \rightarrow H}$ be a function such that

$\displaystyle f(x_1) - f(x_2) + f(x_3) - f(x_4) \in S$

for ${\geq K^{-1} N^3}$ additive quadruples ${(x_1,x_2,x_3,x_4)}$ in ${E}$. Then there exists a subset ${A}$ of ${G}$ containing ${0}$ with ${|A| \gg K^{-O(1)} N}$, a subset ${X}$ of ${H}$ with ${|X| \ll K^{O(1)}}$, and a function ${g: 4A-4A \rightarrow H}$ such that

$\displaystyle g(x+y) - g(x)-g(y) \in X + 496S - 496S \ \ \ \ \ (4)$

for all ${x, y \in 2A-2A}$ (thus, the derivative ${\partial g}$ takes values in ${X + 496 S - 496 S}$ on ${2A - 2A}$), and such that for each ${h \in A}$, one has

$\displaystyle f(x+h) - f(x) - g(h) \in 8S - 8S \ \ \ \ \ (5)$

for ${\gg K^{-O(1)} N}$ values of ${x \in E}$.

Presumably the constants ${8}$ and ${496}$ can be improved further, but we have not attempted to optimise these constants. We chose ${2A-2A}$ as the domain on which one has a bounded derivative, as one can use the Bogolyubov lemma (see e.g. Proposition 4.39 of my book with Van Vu) to find a large Bohr set inside ${2A-2A}$. In applications, the set ${S}$ need not have bounded size, or even bounded doubling; for instance, in the inverse ${U^4}$ theory over a small finite field ${F}$, one would be interested in the situation where ${H}$ is the group of ${n \times n}$ matrices with coefficients in ${F}$ (for some large ${n}$), and ${S}$ is the subset consisting of those matrices of rank bounded by some bound ${C = O(1)}$.
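To make the hypothesis of Theorem 1 concrete, here is a minimal Python sketch that counts the additive quadruples in ${E}$ for which the alternating sum of ${f}$ lies in ${S}$. The function name `count_good_quadruples`, the choice ${G = {\bf Z}/12{\bf Z}}$, ${H = {\bf Z}}$, and the map ${f}$ (the standard lift of a residue class to ${\{0,\dots,N-1\}}$) are all assumptions made for the example, not anything taken from the theorem itself.

```python
from itertools import product

def count_good_quadruples(N, E, f, S):
    """Count additive quadruples (x1,x2,x3,x4) in E^4 (so x1-x2+x3-x4 = 0 in Z/N)
    for which f(x1) - f(x2) + f(x3) - f(x4) lies in S."""
    E = set(E)
    count = 0
    for x1, x2, x3 in product(E, repeat=3):
        x4 = (x1 - x2 + x3) % N          # x4 is determined by the other three
        if x4 in E and f(x1) - f(x2) + f(x3) - f(x4) in S:
            count += 1
    return count

N = 12

def lift(x):
    """Standard lift of a residue class mod N to an integer in {0, ..., N-1}."""
    return x % N

# With S = {-N, 0, N}, every additive quadruple qualifies (so K = 1);
# with S = {0}, only a fraction of them do.
total = count_good_quadruples(N, range(N), lift, {-N, 0, N})
exact = count_good_quadruples(N, range(N), lift, {0})
print(total, exact, total / exact)
```

The ratio printed at the end plays the role of ${K}$ for the choice ${S = \{0\}}$, since there are exactly ${N^3}$ additive quadruples in ${G}$.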

Proof: By hypothesis, there are ${\geq K^{-1} N^3}$ triples ${(h,x,y) \in G^3}$ such that ${x,x+h,y,y+h \in E}$ and

$\displaystyle f(x+h) - f(x) \in f(y+h)-f(y) + S. \ \ \ \ \ (6)$

Thus, there is a set ${B \subset G}$ with ${|B| \gg K^{-1} N}$ such that for all ${h \in B}$, one has (6) for ${\gg K^{-1} N^2}$ pairs ${(x,y) \in G^2}$ with ${x,x+h,y,y+h \in E}$; in particular, there exists ${y = y(h) \in E \cap (E-h)}$ such that (6) holds for ${\gg K^{-1} N}$ values of ${x \in E \cap (E-h)}$. Setting ${g_0(h) := f(y(h)+h) - f(y(h))}$, we conclude that for each ${h \in B}$, one has

$\displaystyle f(x+h) - f(x) \in g_0(h) + S \ \ \ \ \ (7)$

for ${\gg K^{-1} N}$ values of ${x \in E \cap (E-h)}$.

Consider the bipartite graph whose vertex sets are two copies of ${E}$, and ${x}$ and ${x+h}$ connected by a (directed) edge if ${h \in B}$ and (7) holds. Then this graph has ${\gg K^{-2} N^2}$ edges. Applying (a slight modification of) the Balog-Szemerédi-Gowers theorem (for instance by modifying the proof of Corollary 5.19 of my book with Van Vu), we can then find a subset ${C}$ of ${E}$ with ${|C| \gg K^{-O(1)} N}$ with the property that for any ${x_1,x_3 \in C}$, there exist ${\gg K^{-O(1)} N^3}$ triples ${(x_2,y_1,y_2) \in E^3}$ such that the edges ${(x_1,y_1), (x_2,y_1), (x_2,y_2), (x_3,y_2)}$ all lie in this bipartite graph. This implies that, for all ${x_1,x_3 \in C}$, there exist ${\gg K^{-O(1)} N^7}$ septuples ${(x_2,y_1,y_2,z_{11},z_{21},z_{22},z_{32}) \in G^7}$ obeying the constraints

$\displaystyle f(y_j) - f(x_i), f(y_j+z_{ij}) - f(x_i+z_{ij}) \in g_0(y_j-x_i) + S$

and ${y_j, x_i, y_j+z_{ij}, x_i+z_{ij} \in E}$ for ${ij = 11, 21, 22, 32}$. These constraints imply in particular that

$\displaystyle f(x_3) - f(x_1) \in f(x_3+z_{32}) - f(y_2+z_{32}) + f(y_2+z_{22}) - f(x_2+z_{22}) + f(x_2+z_{21}) - f(y_1+z_{21}) + f(y_1+z_{11}) - f(x_1+z_{11}) + 4S - 4S.$

Also observe that

$\displaystyle x_3 - x_1 = (x_3+z_{32}) - (y_2+z_{32}) + (y_2+z_{22}) - (x_2+z_{22}) + (x_2+z_{21}) - (y_1+z_{21}) + (y_1+z_{11}) - (x_1+z_{11}).$

Thus, if ${h \in G}$ and ${x_3,x_1 \in C}$ are such that ${x_3-x_1 = h}$, we see that

$\displaystyle f(w_1) - f(w_2) + f(w_3) - f(w_4) + f(w_5) - f(w_6) + f(w_7) - f(w_8) \in f(x_3) - f(x_1) + 4S - 4S$

for ${\gg K^{-O(1)} N^7}$ octuples ${(w_1,w_2,w_3,w_4,w_5,w_6,w_7,w_8) \in E^8}$ in the hyperplane

$\displaystyle h = w_1 - w_2 + w_3 - w_4 + w_5 - w_6 + w_7 - w_8.$

By the pigeonhole principle, this implies that for any fixed ${h \in G}$, there can be at most ${O(K^{O(1)})}$ sets of the form ${f(x_3)-f(x_1) + 4S-4S}$ with ${x_3-x_1=h}$, ${x_1,x_3 \in C}$ that are pairwise disjoint. Using a greedy algorithm, we conclude that there is a set ${W_h}$ of cardinality ${O(K^{O(1)})}$, such that each set ${f(x_3) - f(x_1) + 4S-4S}$ with ${x_3-x_1=h}$, ${x_1,x_3 \in C}$ intersects ${w+4S -4S}$ for some ${w \in W_h}$, or in other words that

$\displaystyle f(x_3) - f(x_1) \in W_{x_3-x_1} + 8S-8S \ \ \ \ \ (8)$

whenever ${x_1,x_3 \in C}$. In particular,

$\displaystyle \sum_{h \in G} \sum_{w \in W_h} | \{ (x_1,x_3) \in C^2: x_3-x_1 = h; f(x_3) - f(x_1) \in w + 8S-8S \}| \geq |C|^2 \gg K^{-O(1)} N^2.$

This implies that there exists a subset ${A}$ of ${G}$ with ${|A| \gg K^{-O(1)} N}$, and an element ${g_1(h) \in W_h}$ for each ${h \in A}$, such that

$\displaystyle | \{ (x_1,x_3) \in C^2: x_3-x_1 = h; f(x_3) - f(x_1) \in g_1(h) + 8S-8S \}| \gg K^{-O(1)} N \ \ \ \ \ (9)$

for all ${h \in A}$. Note we may assume without loss of generality that ${0 \in A}$ and ${g_1(0)=0}$.
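The greedy selection used above to construct ${W_h}$ is a general device: from any family of sets one extracts a maximal pairwise disjoint subfamily, and maximality forces every set in the family to meet one of the selected sets. The following Python sketch illustrates this on toy data; the function name and the family of integer intervals are hypothetical stand-ins for the translates of ${4S-4S}$ appearing in the proof.

```python
def greedy_disjoint_subfamily(sets):
    """Greedily select a maximal pairwise disjoint subfamily of the given sets."""
    chosen = []
    for s in sets:
        # Keep s only if it is disjoint from everything selected so far.
        if all(s.isdisjoint(c) for c in chosen):
            chosen.append(s)
    return chosen

# Toy family: translates c + {-2, ..., 2} of a fixed set, standing in for
# the translates of 4S - 4S in the argument above.
centers = [0, 1, 3, 7, 8, 20, 22, 23]
family = [frozenset(range(c - 2, c + 3)) for c in centers]
chosen = greedy_disjoint_subfamily(family)

# Every set in the family meets at least one selected set.
covered = all(any(not s.isdisjoint(c) for c in chosen) for s in family)
print(len(chosen), covered)
```

In the proof, the pigeonhole bound on the number of pairwise disjoint sets is what controls the size of the selected subfamily, and hence the cardinality of ${W_h}$.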

Suppose that ${h_1,\dots,h_{16} \in A}$ are such that

$\displaystyle \sum_{i=1}^{16} (-1)^{i-1} h_i = 0. \ \ \ \ \ (10)$

By construction of ${A}$, and permuting labels, we can find ${\gg K^{-O(1)} N^{16}}$ 32-tuples ${(x_1,\dots,x_{16},y_1,\dots,y_{16}) \in C^{32}}$ such that

$\displaystyle y_i - x_i = (-1)^{i-1} h_i$

and

$\displaystyle f(y_i) - f(x_i) \in (-1)^{i-1} g_1(h_i) + 8S - 8S$

for ${i=1,\dots,16}$. We sum this to obtain

$\displaystyle f(y_1) + \sum_{i=1}^{15} (f(y_{i+1})-f(x_i)) - f(x_{16}) \in \sum_{i=1}^{16} (-1)^{i-1} g_1(h_i) + 128 S - 128 S$

and hence by (8)

$\displaystyle f(y_1) - f(x_{16}) + \sum_{i=1}^{15} W_{k_i} \in \sum_{i=1}^{16} (-1)^{i-1} g_1(h_i) + 248 S - 248 S$

where ${k_i := y_{i+1}-x_i}$. Since

$\displaystyle y_1 - x_{16} + \sum_{i=1}^{15} k_i = 0$

we see that there are only ${N^{16}}$ possible values of ${(y_1,x_{16},k_1,\dots,k_{15})}$. By the pigeonhole principle, we conclude that at most ${O(K^{O(1)})}$ of the sets ${\sum_{i=1}^{16} (-1)^{i-1} g_1(h_i) + 248 S - 248 S}$ can be disjoint. Arguing as before, we conclude that there exists a set ${X}$ of cardinality ${O(K^{O(1)})}$ such that

$\displaystyle \sum_{i=1}^{16} (-1)^{i-1} g_1(h_i) \in X + 496 S - 496 S \ \ \ \ \ (11)$

whenever (10) holds.
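For bookkeeping, the constants in the last few displays arise as follows (this is just the arithmetic implicit in the argument, as I read it). Summing the ${16}$ inclusions, each with an error of ${8S-8S}$, gives

$\displaystyle 16 \times 8 = 128,$

and replacing each of the ${15}$ differences ${f(y_{i+1}) - f(x_i)}$ using (8) costs a further ${8S-8S}$ apiece, so that

$\displaystyle 128 + 15 \times 8 = 248.$

Finally, if two translates ${c + 248S - 248S}$ and ${x + 248S - 248S}$ intersect, then ${c \in x + 496S - 496S}$, which accounts for

$\displaystyle 2 \times 248 = 496$

in (11).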

For any ${h \in 4A-4A}$, write ${h}$ arbitrarily as ${h = \sum_{i=1}^8 (-1)^{i-1} h_i}$ for some ${h_1,\dots,h_8 \in A}$ (with ${h_5=\dots=h_8=0}$ if ${h \in 2A-2A}$, and ${h_2 = \dots = h_8 = 0}$ if ${h \in A}$) and then set

$\displaystyle g(h) := \sum_{i=1}^8 (-1)^{i-1} g_1(h_i).$

Then from (11) we have (4). For ${h \in A}$ we have ${g(h) = g_1(h)}$, and (5) then follows from (9). $\Box$

The von Neumann ergodic theorem (the Hilbert space version of the mean ergodic theorem) asserts that if ${U: H \rightarrow H}$ is a unitary operator on a Hilbert space ${H}$, and ${v \in H}$ is a vector in that Hilbert space, then one has

$\displaystyle \lim_{N \rightarrow \infty} \frac{1}{N} \sum_{n=1}^N U^n v = \pi_{H^U} v$

in the strong topology, where ${H^U := \{ w \in H: Uw = w \}}$ is the ${U}$-invariant subspace of ${H}$, and ${\pi_{H^U}}$ is the orthogonal projection to ${H^U}$. (See e.g. these previous lecture notes for a proof.) The same proof extends to more general amenable groups: if ${G}$ is a countable amenable group acting on a Hilbert space ${H}$ by unitary transformations ${g: H \rightarrow H}$, and ${v \in H}$ is a vector in that Hilbert space, then one has

$\displaystyle \lim_{N \rightarrow \infty} \frac{1}{|\Phi_N|} \sum_{g \in \Phi_N} gv = \pi_{H^G} v \ \ \ \ \ (1)$

for any Følner sequence ${\Phi_N}$ of ${G}$, where ${H^G := \{ w \in H: gw = w \hbox{ for all }g \in G \}}$ is the ${G}$-invariant subspace. Thus one can interpret ${\pi_{H^G} v}$ as a certain average of elements of the orbit ${Gv := \{ gv: g \in G \}}$ of ${v}$.
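As a quick numerical sanity check of the von Neumann ergodic theorem in the simplest possible setting, the following Python sketch takes ${H = {\bf C}^2}$ and the diagonal unitary ${U = \hbox{diag}(e^{i\theta}, 1)}$, whose invariant subspace is the second coordinate axis (for ${\theta}$ not a multiple of ${2\pi}$). The function name and the specific choices of ${\theta}$, ${v}$ and ${N}$ are assumptions made for the illustration.

```python
import cmath

def ergodic_average(theta, v, N):
    """Return (1/N) * sum_{n=1}^N U^n v for U = diag(exp(i*theta), 1) on C^2."""
    first = sum(cmath.exp(1j * theta * n) * v[0] for n in range(1, N + 1)) / N
    second = sum(v[1] for _ in range(1, N + 1)) / N   # U fixes this coordinate
    return (first, second)

theta = 1.0
v = (1, 1)
avg = ergodic_average(theta, v, 10_000)

# The projection of v onto the invariant subspace is (0, 1); the first
# coordinate of the average is a geometric sum of size O(1/N).
print(abs(avg[0]), avg[1])
```

The first coordinate decays like ${O(1/N)}$ because the geometric sum ${\sum_{n=1}^N e^{in\theta}}$ is bounded by ${2/|1-e^{i\theta}|}$.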

I recently discovered that there is a simple variant of this ergodic theorem that holds even when the group ${G}$ is not amenable (or not discrete), using a more abstract notion of averaging:

Theorem 1 (Abstract ergodic theorem) Let ${G}$ be an arbitrary group acting unitarily on a Hilbert space ${H}$, and let ${v}$ be a vector in ${H}$. Then ${\pi_{H^G} v}$ is the element in the closed convex hull of ${Gv := \{ gv: g \in G \}}$ of minimal norm, and is also the unique element of ${H^G}$ in this closed convex hull.

Proof: As the closed convex hull of ${Gv}$ is closed, convex, and non-empty in a Hilbert space, it is a classical fact (see e.g. Proposition 1 of this previous post) that it has a unique element ${F}$ of minimal norm. If ${g F \neq F}$ for some ${g}$, then the midpoint of ${g F}$ and ${F}$ would be in the closed convex hull and be of smaller norm, a contradiction; thus ${F}$ is ${G}$-invariant. To finish the first claim, it suffices to show that ${v-F}$ is orthogonal to every element ${h}$ of ${H^G}$. But if this were not the case for some such ${h}$, we would have ${\langle g v - F, h \rangle = \langle v-F,h\rangle \neq 0}$ for all ${g \in G}$, and thus on taking convex hulls ${\langle F-F,h\rangle = \langle v-F,h\rangle \neq 0}$, a contradiction.

Finally, since ${g v - F}$ is orthogonal to ${H^G}$ for every ${g \in G}$, the same is true for ${F'-F}$ for any ${F'}$ in the closed convex hull of ${Gv}$, and this gives the second claim. $\Box$
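Theorem 1 can be tested by hand in a finite-dimensional toy case. In the Python sketch below, the symmetric group ${S_3}$ acts on ${{\bf R}^3}$ by permuting coordinates (a hypothetical example chosen for simplicity), so the invariant vectors are the constant vectors and ${\pi_{H^G} v}$ is the vector whose entries all equal the mean of ${v}$. The code checks that the uniform average over the orbit equals this projection, and that randomly sampled convex combinations of the orbit never have smaller norm.

```python
import random
from itertools import permutations

v = (3.0, -1.0, 4.0)
# Orbit of v under S_3 acting by permutation of coordinates.
orbit = [tuple(v[i] for i in p) for p in permutations(range(3))]

def combo(weights, vectors):
    """Linear combination sum_j weights[j] * vectors[j] of vectors in R^3."""
    return tuple(sum(w * u[j] for w, u in zip(weights, vectors)) for j in range(3))

def norm_sq(u):
    return sum(t * t for t in u)

mean = sum(v) / 3
projection = (mean, mean, mean)      # orthogonal projection onto constant vectors
uniform = combo([1 / 6] * 6, orbit)  # one particular element of the convex hull

rng = random.Random(1)

def random_weights(k):
    """Random non-negative weights summing to 1."""
    raw = [rng.random() for _ in range(k)]
    total = sum(raw)
    return [r / total for r in raw]

samples = [combo(random_weights(6), orbit) for _ in range(1000)]
min_sample = min(norm_sq(s) for s in samples)
print(uniform, norm_sq(projection), min_sample)
```

The reason no sample beats the projection is the one given in the proof: every element of the convex hull differs from ${\pi_{H^G} v}$ by a vector orthogonal to ${H^G}$.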

This result is due to Alaoglu and Birkhoff. It implies the amenable ergodic theorem (1); indeed, given any ${\varepsilon>0}$, Theorem 1 implies that there is a finite convex combination ${v_\varepsilon}$ of shifts ${gv}$ of ${v}$ which lies within ${\varepsilon}$ (in the ${H}$ norm) of ${\pi_{H^G} v}$. By the triangle inequality, all the averages ${\frac{1}{|\Phi_N|} \sum_{g \in \Phi_N} gv_\varepsilon}$ also lie within ${\varepsilon}$ of ${\pi_{H^G} v}$, but by the Følner property this implies that the averages ${\frac{1}{|\Phi_N|} \sum_{g \in \Phi_N} gv}$ are eventually within ${2\varepsilon}$ (say) of ${\pi_{H^G} v}$, giving the claim.

It turns out to be possible to use Theorem 1 as a substitute for the mean ergodic theorem in a number of contexts, thus removing the need for an amenability hypothesis. Here is a basic application:

Corollary 2 (Relative orthogonality) Let ${G}$ be a group acting unitarily on a Hilbert space ${H}$, and let ${V}$ be a ${G}$-invariant closed subspace of ${H}$. Then ${V}$ and ${H^G}$ are relatively orthogonal over their common subspace ${V^G}$, that is to say the restrictions of ${V}$ and ${H^G}$ to the orthogonal complement of ${V^G}$ are orthogonal to each other.

Proof: By Theorem 1, we have ${\pi_{H^G} v = \pi_{V^G} v}$ for all ${v \in V}$, and the claim follows. (Thanks to Gergely Harcos for this short argument.) $\Box$
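In more detail, here is one way to unpack this short argument (a sketch; the vectors ${v}$ and ${w}$ below are notation introduced for this check). For ${v \in V}$, the closed convex hull of ${Gv}$ lies in the closed invariant subspace ${V}$, so its element of minimal norm is the same whether Theorem 1 is applied in ${V}$ or in ${H}$; this gives ${\pi_{H^G} v = \pi_{V^G} v}$. Now if ${v \in V}$ is orthogonal to ${V^G}$ and ${w \in H^G}$, then

$\displaystyle \langle v, w \rangle = \langle \pi_{H^G} v, w \rangle = \langle \pi_{V^G} v, w \rangle = \langle 0, w \rangle = 0,$

where the first equality uses that ${v - \pi_{H^G} v}$ is orthogonal to ${H^G}$, and the third uses that ${v}$ is orthogonal to ${V^G}$.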

Now we give a more advanced application of Theorem 1, to establish some “Mackey theory” over arbitrary groups ${G}$. Define a ${G}$-system ${(X, {\mathcal X}, \mu, (T_g)_{g \in G})}$ to be a probability space ${X = (X, {\mathcal X}, \mu)}$ together with a measure-preserving action ${(T_g)_{g \in G}}$ of ${G}$ on ${X}$; this gives an action of ${G}$ on ${L^2(X) = L^2(X,{\mathcal X},\mu)}$, which by abuse of notation we also call ${T_g}$:

$\displaystyle T_g f := f \circ T_{g^{-1}}.$

(In this post we follow the usual convention of defining the ${L^p}$ spaces by quotienting out by almost everywhere equivalence.) We say that a ${G}$-system is ergodic if ${L^2(X)^G}$ consists only of the constants.

(A technical point: the theory becomes slightly cleaner if we interpret our measure spaces abstractly (or “pointlessly”), removing the underlying space ${X}$ and quotienting ${{\mathcal X}}$ by the ${\sigma}$-ideal of null sets, and considering maps such as ${T_g}$ only on this quotient ${\sigma}$-algebra (or on the associated von Neumann algebra ${L^\infty(X)}$ or Hilbert space ${L^2(X)}$). However, we will stick with the more traditional setting of classical probability spaces here to keep the notation familiar, but with the understanding that many of the statements below should be understood modulo null sets.)

A factor ${Y = (Y, {\mathcal Y}, \nu, (S_g)_{g \in G})}$ of a ${G}$-system ${X = (X,{\mathcal X},\mu, (T_g)_{g \in G})}$ is another ${G}$-system together with a factor map ${\pi: X \rightarrow Y}$ which commutes with the ${G}$-action (thus ${\pi \circ T_g = S_g \circ \pi}$ for all ${g \in G}$) and respects the measure in the sense that ${\mu(\pi^{-1}(E)) = \nu(E)}$ for all ${E \in {\mathcal Y}}$. For instance, the ${G}$-invariant factor ${Z^0_G(X) := (X, {\mathcal X}^G, \mu\downharpoonright_{{\mathcal X}^G}, (T_g)_{g \in G})}$, formed by restricting ${X}$ to the invariant algebra ${{\mathcal X}^G := \{ E \in {\mathcal X}: T_g E = E \hbox{ a.e. for all } g \in G \}}$, is a factor of ${X}$. (This factor is the first factor in an important hierarchy, the next element of which is the Kronecker factor ${Z^1_G(X)}$, but we will not discuss higher elements of this hierarchy further here.) If ${Y}$ is a factor of ${X}$, we refer to ${X}$ as an extension of ${Y}$.

From Corollary 2 we have

Corollary 3 (Relative independence) Let ${X}$ be a ${G}$-system for a group ${G}$, and let ${Y}$ be a factor of ${X}$. Then ${Y}$ and ${Z^0_G(X)}$ are relatively independent over their common factor ${Z^0_G(Y)}$, in the sense that the spaces ${L^2(Y)}$ and ${L^2(Z^0_G(X))}$ are relatively orthogonal over ${L^2(Z^0_G(Y))}$ when all these spaces are embedded into ${L^2(X)}$.

This has a simple consequence regarding the product ${X \times Y = (X \times Y, {\mathcal X} \times {\mathcal Y}, \mu \times \nu, (T_g \oplus S_g)_{g \in G})}$ of two ${G}$-systems ${X = (X, {\mathcal X}, \mu, (T_g)_{g \in G})}$ and ${Y = (Y, {\mathcal Y}, \nu, (S_g)_{g \in G})}$, in the case when the ${Y}$ action is trivial:

Lemma 4 If ${X,Y}$ are two ${G}$-systems, with the action of ${G}$ on ${Y}$ trivial, then ${Z^0_G(X \times Y)}$ is isomorphic to ${Z^0_G(X) \times Y}$ in the obvious fashion.

This lemma is immediate for countable ${G}$, since for a ${G}$-invariant function ${f}$, one can ensure that ${T_g f = f}$ holds simultaneously for all ${g \in G}$ outside of a null set, but is a little trickier for uncountable ${G}$.

Proof: It is clear that ${Z^0_G(X) \times Y}$ is a factor of ${Z^0_G(X \times Y)}$. To obtain the reverse inclusion, suppose that it fails, thus there is a non-zero ${f \in L^2(Z^0_G(X \times Y))}$ which is orthogonal to ${L^2(Z^0_G(X) \times Y)}$. In particular, we have ${fg}$ orthogonal to ${L^2(Z^0_G(X))}$ for any ${g \in L^\infty(Y)}$. Since ${fg}$ lies in ${L^2(Z^0_G(X \times Y))}$, we conclude from Corollary 3 (viewing ${X}$ as a factor of ${X \times Y}$) that ${fg}$ is also orthogonal to ${L^2(X)}$. Since ${g}$ is an arbitrary element of ${L^\infty(Y)}$, we conclude that ${f}$ is orthogonal to ${L^2(X \times Y)}$ and in particular is orthogonal to itself, a contradiction. (Thanks to Gergely Harcos for this argument.) $\Box$

Now we discuss the notion of a group extension.

Definition 5 (Group extension) Let ${G}$ be an arbitrary group, let ${Y = (Y, {\mathcal Y}, \nu, (S_g)_{g \in G})}$ be a ${G}$-system, and let ${K}$ be a compact metrisable group. A ${K}$-extension of ${Y}$ is an extension ${X = (X, {\mathcal X}, \mu, (T_g)_{g \in G})}$ whose underlying space is ${X = Y \times K}$ (with ${{\mathcal X}}$ the product of ${{\mathcal Y}}$ and the Borel ${\sigma}$-algebra on ${K}$), the factor map is ${\pi: (y,k) \mapsto y}$, and the shift maps ${T_g}$ are given by

$\displaystyle T_g ( y, k ) = (S_g y, \rho_g(y) k )$

where for each ${g \in G}$, ${\rho_g: Y \rightarrow K}$ is a measurable map (known as the cocycle associated to the ${K}$-extension ${X}$).

An important special case of a ${K}$-extension arises when the measure ${\mu}$ is the product of ${\nu}$ with the Haar measure ${dk}$ on ${K}$. In this case, ${X}$ also has a ${K}$-action ${k': (y,k) \mapsto (y,k(k')^{-1})}$ that commutes with the ${G}$-action, making ${X}$ a ${G \times K}$-system. More generally, ${\mu}$ could be the product of ${\nu}$ with the Haar measure ${dh}$ of some closed subgroup ${H}$ of ${K}$, with ${\rho_g}$ taking values in ${H}$; then ${X}$ is now a ${G \times H}$ system. In this latter case we will call ${X}$ ${H}$-uniform.

If ${X}$ is a ${K}$-extension of ${Y}$ and ${U: Y \rightarrow K}$ is a measurable map, we can define the gauge transform ${X_U}$ of ${X}$ to be the ${K}$-extension of ${Y}$ whose measure ${\mu_U}$ is the pushforward of ${\mu}$ under the map ${(y,k) \mapsto (y, U(y) k)}$, and whose cocycles ${\rho_{g,U}: Y \rightarrow K}$ for ${g \in G}$ are given by the formula

$\displaystyle \rho_{g,U}(y) := U(gy) \rho_g(y) U(y)^{-1}.$
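To see where this formula comes from, one can conjugate the shift ${T_g}$ by the gauge map ${(y,k) \mapsto (y, U(y) k)}$ and read off the resulting cocycle, interpreting ${gy}$ as ${S_g y}$. The following Python sketch verifies this intertwining relation by brute force in a toy case: a single shift ${S: y \mapsto y+1}$ on ${Y = {\bf Z}/5{\bf Z}}$ (so that ${G = {\bf Z}}$ is generated by one transformation), with ${K = S_3}$ represented as permutation tuples. All of these choices, and the randomly generated ${\rho}$ and ${U}$, are assumptions made for the example; measures play no role in this finite check.

```python
import random
from itertools import permutations

K = list(permutations(range(3)))      # the group S_3, as tuples p with p[i] = image of i

def mul(p, q):
    """Group law on K: (p q)(i) = p(q(i))."""
    return tuple(p[q[i]] for i in range(3))

def inv(p):
    """Inverse permutation."""
    r = [0] * 3
    for i, pi in enumerate(p):
        r[pi] = i
    return tuple(r)

n = 5                                 # Y = Z/5Z

def S(y):
    """Base shift on Y."""
    return (y + 1) % n

rng = random.Random(0)
rho = [rng.choice(K) for _ in range(n)]    # cocycle rho: Y -> K for the generator
U = [rng.choice(K) for _ in range(n)]      # gauge function U: Y -> K

# Gauge-transformed cocycle: rho_U(y) = U(S y) rho(y) U(y)^{-1}.
rho_U = [mul(mul(U[S(y)], rho[y]), inv(U[y])) for y in range(n)]

def T(y, k):
    return (S(y), mul(rho[y], k))

def T_U(y, k):
    return (S(y), mul(rho_U[y], k))

def gauge(y, k):
    return (y, mul(U[y], k))

# The gauge map intertwines T with T_U on every point of Y x K.
intertwines = all(T_U(*gauge(y, k)) == gauge(*T(y, k)) for y in range(n) for k in K)
print(intertwines)
```

The check succeeds for every choice of ${\rho}$ and ${U}$, since the factors ${U(y)^{-1} U(y)}$ cancel in ${T_U \circ \hbox{gauge}}$.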

It is easy to see that ${X_U}$ is a ${K}$-extension that is isomorphic to ${X}$ as a ${K}$-extension of ${Y}$; we will refer to ${X_U}$ and ${X}$ as equivalent systems, and ${\rho_{g,U}}$ as cohomologous to ${\rho_g}$. We then have the following fundamental result of Mackey and of Zimmer:

Theorem 6 (Mackey-Zimmer theorem) Let ${G}$ be an arbitrary group, let ${Y}$ be an ergodic ${G}$-system, and let ${K}$ be a compact metrisable group. Then every ergodic ${K}$-extension ${X}$ of ${Y}$ is equivalent to an ${H}$-uniform extension of ${Y}$ for some closed subgroup ${H}$ of ${K}$.

This theorem is usually stated for amenable groups ${G}$, but by using Theorem 1 (or more precisely, Corollary 3) the result is in fact also valid for arbitrary groups; we give the proof below the fold. (In the usual formulations of the theorem, ${X}$ and ${Y}$ are also required to be Lebesgue spaces, or at least standard Borel, but again with our abstract approach here, such hypotheses will be unnecessary.) Among other things, this theorem plays an important role in the Furstenberg-Zimmer structural theory of measure-preserving systems (as well as subsequent refinements of this theory by Host and Kra); see this previous blog post for some relevant discussion. One can obtain similar descriptions of non-ergodic extensions by working relative to the invariant factor (or via the ergodic decomposition, if one has enough separability hypotheses on the system), but the result becomes more complicated to state, and we will not do so here; see this paper of Austin for details.