You are currently browsing the monthly archive for May 2013.
A finite group is said to be a Frobenius group if there is a non-trivial subgroup
of
(known as the Frobenius complement of
) such that the conjugates
of
are “disjoint as possible” in the sense that
whenever
. This gives a decomposition
where the Frobenius kernel of
is defined as the identity element
together with all the non-identity elements that are not conjugate to any element of
. Taking cardinalities, we conclude that
A remarkable theorem of Frobenius gives an unexpected amount of structure on and hence on
:
Theorem 1 (Frobenius’ theorem) Let
be a Frobenius group with Frobenius complement
and Frobenius kernel
. Then
is a normal subgroup of
, and hence (by (2) and the disjointness of
and
outside the identity)
is the semidirect product
of
and
.
I discussed Frobenius’ theorem and its proof in this recent blog post. This proof uses the theory of characters on a finite group , in particular relying on the fact that a character on a subgroup
can induce a character on
, which can then be decomposed into irreducible characters with natural number coefficients. Remarkably, even though a century has passed since Frobenius’ original argument, there is no proof known of this theorem which avoids character theory entirely; there are elementary proofs known when the complement
has even order or when
is solvable (we review both of these cases below the fold), which by the Feit-Thompson theorem does cover all the cases, but the proof of the Feit-Thompson theorem involves plenty of character theory (and also relies on Theorem 1). (The answers to this MathOverflow question give a good overview of the current state of affairs.)
I have been playing around recently with the problem of finding a character-free proof of Frobenius’ theorem. I didn’t succeed in obtaining a completely elementary proof, but I did find an argument which replaces character theory (which can be viewed as coming from the representation theory of the non-commutative group algebra ) with the Fourier analysis of class functions (i.e. the representation theory of the centre
of the group algebra), thus replacing non-commutative representation theory by commutative representation theory. This is not a particularly radical depature from the existing proofs of Frobenius’ theorem, but it did seem to be a new proof which was technically “character-free” (even if it was not all that far from character-based in spirit), so I thought I would record it here.
The main ideas are as follows. The space of class functions can be viewed as a commutative algebra with respect to the convolution operation
; as the regular representation is unitary and faithful, this algebra contains no nilpotent elements. As such, (Gelfand-style) Fourier analysis suggests that one can analyse this algebra through the idempotents: class functions
such that
. In terms of characters, idempotents are nothing more than sums of the form
for various collections
of characters, but we can perform a fair amount of analysis on idempotents directly without recourse to characters. In particular, it turns out that idempotents enjoy some important integrality properties that can be established without invoking characters: for instance, by taking traces one can check that
is a natural number, and more generally we will show that
is a natural number whenever
is a subgroup of
(see Corollary 4 below). For instance, the quantity
is a natural number which we will call the rank of (as it is also the linear rank of the transformation
on
).
In the case that is a Frobenius group with kernel
, the above integrality properties can be used after some elementary manipulations to establish that for any idempotent
, the quantity
is an integer. On the other hand, one can also show by elementary means that this quantity lies between and
. These two facts are not strong enough on their own to impose much further structure on
, unless one restricts attention to minimal idempotents
. In this case spectral theory (or Gelfand theory, or the fundamental theorem of algebra) tells us that
has rank one, and then the integrality gap comes into play and forces the quantity (3) to always be either zero or one. This can be used to imply that the convolution action of every minimal idempotent
either preserves
or annihilates it, which makes
itself an idempotent, which makes
normal.
Vitaly Bergelson, Tamar Ziegler, and I have just uploaded to the arXiv our joint paper “Multiple recurrence and convergence results associated to -actions“. This paper is primarily concerned with limit formulae in the theory of multiple recurrence in ergodic theory. Perhaps the most basic formula of this type is the mean ergodic theorem, which (among other things) asserts that if
is a measure-preserving
-system (which, in this post, means that
is a probability space and
is measure-preserving and invertible, thus giving an action
of the integers), and
are functions, and
is ergodic (which means that
contains no
-invariant functions other than the constants (up to almost everywhere equivalence, of course)), then the average
converges as to the expression
see e.g. this previous blog post. Informally, one can interpret this limit formula as an equidistribution result: if is drawn at random from
(using the probability measure
), and
is drawn at random from
for some large
, then the pair
becomes uniformly distributed in the product space
(using product measure
) in the limit as
.
If we allow to be non-ergodic, then we still have a limit formula, but it is a bit more complicated. Let
be the
-invariant measurable sets in
; the
-system
can then be viewed as a factor of the original system
, which is equivalent (in the sense of measure-preserving systems) to a trivial system
(known as the invariant factor) in which the shift is trivial. There is then a projection map
to the invariant factor which is a factor map, and the average (1) converges in the limit to the expression
where is the pushforward map associated to the map
; see e.g. this previous blog post. We can interpret this as an equidistribution result. If
is a pair as before, then we no longer expect complete equidistribution in
in the non-ergodic, because there are now non-trivial constraints relating
with
; indeed, for any
-invariant function
, we have the constraint
; putting all these constraints together we see that
(for almost every
, at least). The limit (2) can be viewed as an assertion that this constraint
are in some sense the “only” constraints between
and
, and that the pair
is uniformly distributed relative to these constraints.
Limit formulae are known for multiple ergodic averages as well, although the statement becomes more complicated. For instance, consider the expression
for three functions ; this is analogous to the combinatorial task of counting length three progressions in various sets. For simplicity we assume the system
to be ergodic. Naively one might expect this limit to then converge to
which would roughly speaking correspond to an assertion that the triplet is asymptotically equidistributed in
. However, even in the ergodic case there can be additional constraints on this triplet that cannot be seen at the level of the individual pairs
,
. The key obstruction here is that of eigenfunctions of the shift
, that is to say non-trivial functions
that obey the eigenfunction equation
almost everywhere for some constant (or
-invariant)
. Each such eigenfunction generates a constraint
tying together ,
, and
. However, it turns out that these are in some sense the only constraints on
that are relevant for the limit (3). More precisely, if one sets
to be the sub-algebra of
generated by the eigenfunctions of
, then it turns out that the factor
is isomorphic to a shift system
known as the Kronecker factor, for some compact abelian group
and some (irrational) shift
; the factor map
pushes eigenfunctions forward to (affine) characters on
. It is then known that the limit of (3) is
where is the closed subgroup
and is the Haar probability measure on
; see this previous blog post. The equation
defining
corresponds to the constraint (4) mentioned earlier. Among other things, this limit formula implies Roth’s theorem, which in the context of ergodic theory is the assertion that the limit (or at least the limit inferior) of (3) is positive when
is non-negative and not identically vanishing.
If one considers a quadruple average
(analogous to counting length four progressions) then the situation becomes more complicated still, even in the ergodic case. In addition to the (linear) eigenfunctions that already showed up in the computation of the triple average (3), a new type of constraint also arises from quadratic eigenfunctions , which obey an eigenfunction equation
in which
is no longer constant, but is now a linear eigenfunction. For such functions,
behaves quadratically in
, and one can compute the existence of a constraint
between ,
,
, and
that is not detected at the triple average level. As it turns out, this is not the only type of constraint relevant for (5); there is a more general class of constraint involving two-step nilsystems which we will not detail here, but see e.g. this previous blog post for more discussion. Nevertheless there is still a similar limit formula to previous examples, involving a special factor
which turns out to be an inverse limit of two-step nilsystems; this limit theorem can be extracted from the structural theory in this paper of Host and Kra combined with a limit formula for nilsystems obtained by Lesigne, but will not be reproduced here. The pattern continues to higher averages (and higher step nilsystems); this was first done explicitly by Ziegler, and can also in principle be extracted from the structural theory of Host-Kra combined with nilsystem equidistribution results of Leibman. These sorts of limit formulae can lead to various recurrence results refining Roth’s theorem in various ways; see this paper of Bergelson, Host, and Kra for some examples of this.
The above discussion was concerned with -systems, but one can adapt much of the theory to measure-preserving
-systems for other discrete countable abelian groups
, in which one now has a family
of shifts indexed by
rather than a single shift, obeying the compatibility relation
. The role of the intervals
in this more general setting is replaced by that of Folner sequences. For arbitrary countable abelian
, the theory for double averages (1) and triple limits (3) is essentially identical to the
-system case. But when one turns to quadruple and higher limits, the situation becomes more complicated (and, for arbitrary
, still not fully understood). However one model case which is now well understood is the finite field case when
is an infinite-dimensional vector space over a finite field
(with the finite subspaces
then being a good choice for the Folner sequence). Here, the analogue of the structural theory of Host and Kra was worked out by Vitaly, Tamar, and myself in these previous papers (treating the high characteristic and low characteristic cases respectively). In the finite field setting, it turns out that nilsystems no longer appear, and one only needs to deal with linear, quadratic, and higher order eigenfunctions (known collectively as phase polynomials). It is then natural to look for a limit formula that asserts, roughly speaking, that if
is drawn at random from a
-system and
drawn randomly from a large subspace of
, then the only constraints between
are those that arise from phase polynomials. The main theorem of this paper is to establish this limit formula (which, again, is a little complicated to state explicitly and will not be done here). In particular, we establish for the first time that the limit actually exists (a result which, for
-systems, was one of the main results of this paper of Host and Kra).
As a consequence, we can recover finite field analogues of most of the results of Bergelson-Host-Kra, though interestingly some of the counterexamples demonstrating sharpness of their results for -systems (based on Behrend set constructions) do not seem to be present in the finite field setting (cf. this previous blog post on the cap set problem). In particular, we are able to largely settle the question of when one has a Khintchine-type theorem that asserts that for any measurable set
in an ergodic
-system and any
, one has
for a syndetic set of , where
are distinct residue classes. It turns out that Khintchine-type theorems always hold for
(and for
ergodicity is not required), and for
it holds whenever
form a parallelogram, but not otherwise (though the counterexample here was such a painful computation that we ended up removing it from the paper, and may end up putting it online somewhere instead), and for larger
we could show that the Khintchine property failed for generic choices of
, though the problem of determining exactly the tuples for which the Khintchine property failed looked to be rather messy and we did not completely settle it.
[This guest post is authored by Ingrid Daubechies, who is the current president of the International Mathematical Union, and (as she describes below) is heavily involved in planning for a next-generation digital mathematical library that can go beyond the current network of preprint servers (such as the arXiv), journal web pages, article databases (such as MathSciNet), individual author web pages, and general web search engines to create a more integrated and useful mathematical resource. I have lightly edited the post for this blog, mostly by adding additional hyperlinks. – T.]
This guest blog entry concerns the many roles a World Digital Mathematical Library (WDML) could play for the mathematical community worldwide. We seek input to help sketch how a WDML could be so much more than just a huge collection of digitally available mathematical documents. If this is of interest to you, please read on!
The “we” seeking input are the Committee on Electronic Information and Communication (CEIC) of the International Mathematical Union (IMU), and a special committee of the US National Research Council (NRC), charged by the Sloan Foundation to look into this matter. In the US, mathematicians may know the Sloan Foundation best for the prestigious early-career fellowships it awards annually, but the foundation plays a prominent role in other disciplines as well. For instance, the Sloan Digital Sky Survey (SDSS) has had a profound impact on astronomy, serving researchers in many more ways than even its ambitious original setup foresaw. The report being commissioned by the Sloan Foundation from the NRC study group could possibly be the basis for an equally ambitious program funded by the Sloan Foundation for a WDML with the potential to change the practice of mathematical research as profoundly as the SDSS did in astronomy. But to get there, we must formulate a vision that, like the original SDSS proposal, imagines at least some of those impacts. The members of the NRC committee are extremely knowledgeable, and have been picked judiciously so as to span collectively a wide range of expertise and connections. As president of the IMU, I was asked to co-chair this committee, together with Clifford Lynch, of the Coalition for Networked Information; Peter Olver, chair of the IMU’s CEIC, is also a member of the committee. But each of us is at least a quarter century older than the originators of MathOverflow or the ArXiv when they started. We need you, internet-savvy, imaginative, social-networking, young mathematicians to help us formulate the vision that may inspire the creation of a truly revolutionary WDML!
Some history first. Several years ago, an international initiative was started to create a World Digital Mathematical Library. The website for this library, hosted by the IMU, is now mostly a “ghost” website — nothing has been posted there for the last seven years. [It does provide useful links, however, to many sites that continue to be updated, such as the European Mathematical Information Service, which in turn links to many interesting journals, books and other websites featuring electronically available mathematical publications. So it is still worth exploring …] Many of the efforts towards building (parts of) the WDML as originally envisaged have had to grapple with business interests, copyright agreements, search obstructions, metadata secrecy, … and many an enterprising, idealistic effort has been slowly ground down by this. We are still dealing with these frustrations — as witnessed by, e.g., the CostofKnowledge initiative. They are real, important issues, and will need to be addressed.
The charge of the NRC committee, however, is to NOT focus on issues of copyright or open-access or who bears the cost of publishing, but instead on what could/can be done with documents that are (or once they are) freely electronically accessible, apart from simply finding and downloading them. Earlier this year, I posted a question about one possible use on MathOverflow and then on MathForge, about the possibility to “enrich” a paper by annotations from readers, which other readers could wish to consult (or not). These posts elicited some very useful comments. But this was but one way in which a WDML could be more than just an opportunity to find and download papers. Surely there are many more, that you, bloggers and blog-readers, can imagine, suggest, sketch. This is an opportunity: can we — no, YOU! — formulate an ambitious setup that would capture the imagination of sufficiently many of us, that would be workable and that would really make a difference?
Suppose that is a finite group of even order, thus
is a multiple of two. By Cauchy’s theorem, this implies that
contains an involution: an element
in
of order two. (Indeed, if no such involution existed, then
would be partitioned into doubletons
together with the identity, so that
would be odd, a contradiction.) Of course, groups of odd order have no involutions
, thanks to Lagrange’s theorem (since
cannot split into doubletons
).
The classical Brauer-Fowler theorem asserts that if a group has many involutions, then it must have a large non-trivial subgroup:
Theorem 1 (Brauer-Fowler theorem) Let
be a finite group with at least
involutions for some
. Then
contains a proper subgroup
of index at most
.
This theorem (which is Theorem 2F in the original paper of Brauer and Fowler, who in fact manage to sharpen slightly to
) has a number of quick corollaries which are also referred to as “the” Brauer-Fowler theorem. For instance, if
is a an involution of a group
, and the centraliser
has order
, then clearly
(as
contains
and
) and the conjugacy class
has order
(since the map
has preimages that are cosets of
). Every conjugate of an involution is again an involution, so by the Brauer-Fowler theorem
contains a subgroup of order at least
. In particular, we can conclude that every group
of even order contains a proper subgroup of order at least
.
Another corollary is that the size of a simple group of even order can be controlled by the size of a centraliser of one of its involutions:
Corollary 2 (Brauer-Fowler theorem) Let
be a finite simple group with an involution
, and suppose that
has order
. Then
has order at most
.
Indeed, by the previous discussion has a proper subgroup
of index less than
, which then gives a non-trivial permutation action of
on the coset space
. The kernel of this action is a proper normal subgroup of
and is thus trivial, so the action is faithful, and the claim follows.
If one assumes the Feit-Thompson theorem that all groups of odd order are solvable, then Corollary 2 suggests a strategy (first proposed by Brauer himself in 1954) to prove the classification of finite simple groups (CFSG) by induction on the order of the group. Namely, assume for contradiction that the CFSG failed, so that there is a counterexample of minimal order
to the classification. This is a non-abelian finite simple group; by the Feit-Thompson theorem, it has even order and thus has at least one involution
. Take such an involution and consider its centraliser
; this is a proper subgroup of
of some order
. As
is a minimal counterexample to the classification, one can in principle describe
in terms of the CFSG by factoring the group into simple components (via a composition series) and applying the CFSG to each such component. Now, the “only” thing left to do is to verify, for each isomorphism class of
, that all the possible simple groups
that could have this type of group as a centraliser of an involution obey the CFSG; Corollary 2 tells us that for each such isomorphism class for
, there are only finitely many
that could generate this class for one of its centralisers, so this task should be doable in principle for any given isomorphism class for
. That’s all one needs to do to prove the classification of finite simple groups!
Needless to say, this program turns out to be far more difficult than the above summary suggests, and the actual proof of the CFSG does not quite proceed along these lines. However, a significant portion of the argument is based on a generalisation of this strategy, in which the concept of a centraliser of an involution is replaced by the more general notion of a normaliser of a -group, and one studies not just a single normaliser but rather the entire family of such normalisers and how they interact with each other (and in particular, which normalisers of
-groups commute with each other), motivated in part by the theory of Tits buildings for Lie groups which dictates a very specific type of interaction structure between these
-groups in the key case when
is a (sufficiently high rank) finite simple group of Lie type over a field of characteristic
. See the text of Aschbacher, Lyons, Smith, and Solomon for a more detailed description of this strategy.
The Brauer-Fowler theorem can be proven by a nice application of character theory, of the type discussed in this recent blog post, ultimately based on analysing the alternating tensor power of representations; I reproduce a version of this argument (taken from this text of Isaacs) below the fold. (The original argument of Brauer and Fowler is more combinatorial in nature.) However, I wanted to record a variant of the argument that relies not on the fine properties of characters, but on the cruder theory of quasirandomness for groups, the modern study of which was initiated by Gowers, and is discussed for instance in this previous post. It gives the following slightly weaker version of Corollary 2:
Corollary 3 (Weak Brauer-Fowler theorem) Let
be a finite simple group with an involution
, and suppose that
has order
. Then
can be identified with a subgroup of the unitary group
.
One can get an upper bound on from this corollary using Jordan’s theorem, but the resulting bound is a bit weaker than that in Corollary 2 (and the best bounds on Jordan’s theorem require the CFSG!).
Proof: Let be the set of all involutions in
, then as discussed above
. We may assume that
has no non-trivial unitary representation of dimension less than
(since such representations are automatically faithful by the simplicity of
); thus, in the language of quasirandomness,
is
-quasirandom, and is also non-abelian. We have the basic convolution estimate
(see Exercise 10 from this previous blog post). In particular,
and so there are at least pairs
such that
, i.e. involutions
whose product is also an involution. But any such involutions necessarily commute, since
Thus there are at least pairs
of non-identity elements that commute, so by the pigeonhole principle there is a non-identity
whose centraliser
has order at least
. This centraliser cannot be all of
since this would make
central which contradicts the non-abelian simple nature of
. But then the quasiregular representation of
on
has dimension at most
, contradicting the quasirandomness.
Recent Comments