You are currently browsing the tag archive for the ‘characters’ tag.

As in previous posts, we use the following asymptotic notation: {x} is a parameter going off to infinity, and all quantities may depend on {x} unless explicitly declared to be “fixed”. The asymptotic notation {O(), o(), \ll} is then defined relative to this parameter. A quantity {q} is said to be of polynomial size if one has {q = O(x^{O(1)})}, and said to be bounded if {q=O(1)}. Another convenient notation: we write {X \lessapprox Y} for {X \ll x^{o(1)} Y}. Thus for instance the divisor bound asserts that if {q} has polynomial size, then the number of divisors of {q} is {\lessapprox 1}.

This post is intended to highlight a phenomenon unearthed in the ongoing polymath8 project (and is in fact a key component of Zhang’s proof that there are bounded gaps between primes infinitely often), namely that one can get quite good bounds on relatively short exponential sums when the modulus {q} is smooth, through the basic technique of Weyl differencing (ultimately based on the Cauchy-Schwarz inequality, and also related to the van der Corput lemma in equidistribution theory). Improvements in the case of smooth moduli have appeared before in the literature (e.g. in this paper of Heath-Brown, paper of Graham and Ringrose, this later paper of Heath-Brown, this paper of Chang, or this paper of Goldmakher); the arguments here are particularly close to that of the first paper of Heath-Brown. It now also appears that further optimisation of this Weyl differencing trick could lead to noticeable improvements in the numerology for the polymath8 project, so I am devoting this post to explaining this trick further.

To illustrate the method, let us begin with the classical problem in analytic number theory of estimating an incomplete character sum

\displaystyle  \sum_{M+1 \leq n \leq M+N} \chi(n)

where {\chi} is a primitive Dirichlet character of some conductor {q}, {M} is an integer, and {N} is some quantity between {1} and {q}. Clearly we have the trivial bound

\displaystyle  |\sum_{M+1 \leq n \leq M+N} \chi(n)| \leq N; \ \ \ \ \ (1)

we also have the classical Pólya-Vinogradov inequality

\displaystyle  |\sum_{M+1 \leq n \leq M+N} \chi(n)| \ll q^{1/2} \log q. \ \ \ \ \ (2)

This latter inequality gives improvements over the trivial bound when {N} is much larger than {q^{1/2}}, but not for {N} much smaller than {q^{1/2}}. The Pólya-Vinogradov inequality can be deduced via a little Fourier analysis from the completed exponential sum bound

\displaystyle  | \sum_{n \in {\bf Z}/q{\bf Z}} \chi(n) e_q( an )| \ll q^{1/2}

for any {a \in {\bf Z}/q{\bf Z}}, where {e_q(n) :=e^{2\pi i n/q}}. (In fact, from the classical theory of Gauss sums, this exponential sum is equal to {\tau(\chi) \overline{\chi(a)}} for some complex number {\tau(\chi)} of norm {\sqrt{q}}.)

In the case when {q} is a prime, improving upon the above two inequalities is an important but difficult problem, with only partially satisfactory results so far. To give just one indication of the difficulty, the seemingly modest improvement

\displaystyle |\sum_{M+1 \leq n \leq M+N} \chi(n)| \ll p^{1/2} \log \log p

to the Pólya-Vinogradov inequality when {q=p} was a prime required a 14-page paper in Inventiones by Montgomery and Vaughan to prove, and even then it was only conditional on the generalised Riemann hypothesis! See also this more recent paper of Granville and Soundararajan for an unconditional variant of this result in the case that {\chi} has odd order.

Another important improvement is the Burgess bound, which in our notation asserts that

\displaystyle  |\sum_{M+1 \leq n \leq M+N} \chi(n)| \lessapprox N^{1-1/r} q^{\frac{r+1}{4r^2}} \ \ \ \ \ (3)

for any fixed integer {r \geq 2}, assuming that {q} is square-free (for simplicity) and of polynomial size; see this previous post for a discussion of the Burgess argument. This is non-trivial for {N} as small as {q^{1/4+o(1)}}.

In the case when {q} is prime, there has been very little improvement to the Burgess bound (or its Fourier dual, which can give bounds for {N} as large as {q^{3/4-o(1)}}) in the last fifty years; an improvement to the exponents in (3) in this case (particularly anything that gave a power saving for {N} below {q^{1/4}}) would in fact be rather significant news in analytic number theory.

However, in the opposite case when {q} is smooth – that is to say, all of its factors are much smaller than {q} – then one can do better than the Burgess bound in some regimes. This fact has been observed in several places in the literature (in particular, in the papers of Heath-Brown, Graham-Ringrose, Chang, and Goldmakher mentioned previously), but also turns out to (implicitly) be a key insight in Zhang’s paper on bounded prime gaps. In the case of character sums, one such improved estimate (closely related to Theorem 2 of the Heath-Brown paper) is as follows:

Proposition 1 Let {q} be square-free with a factorisation {q = q_1 q_2} and of polynomial size, and let {M,N} be integers with {1 \leq N \leq q}. Then for any primitive character {\chi} with conductor {q}, one has

\displaystyle  | \sum_{M+1 \leq n \leq M+N} \chi(n) | \lessapprox N^{1/2} q_1^{1/2} + N^{1/2} q_2^{1/4}.

This proposition is particularly powerful when {q} is smooth, as this gives many factorisations {q = q_1 q_2} with the ability to specify {q_1,q_2} with a fair amount of accuracy. For instance, if {q} is {y}-smooth (i.e. all prime factors are at most {y}), then by the greedy algorithm one can find a divisor {q_1} of {q} with {y^{-2/3} q^{1/3} \leq q_1 \leq y^{1/3} q^{1/3}}; if we set {q_2 := q/q_1}, then {y^{-1/3} q^{2/3} \leq q_2 \leq y^{2/3} q^{2/3}}, and the above proposition then gives

\displaystyle  | \sum_{M+1 \leq n \leq M+N} \chi(n) | \lessapprox y^{1/6} N^{1/2} q^{1/6}

which can improve upon the Burgess bound when {y} is small. For instance, if {N = q^{1/2}}, then this bound becomes {\lessapprox y^{1/6} q^{5/12}}; in contrast the Burgess bound only gives {\lessapprox q^{7/16}} for this value of {N} (using the optimal choice {r=2} for {r}), which is inferior for {y < q^{1/8}}.

The hypothesis that {q} be squarefree may be relaxed, but for applications to the Polymath8 project, it is only the squarefree moduli that are relevant.

Proof: If {N \ll q_1} then the claim follows from the trivial bound (1), while for {N \gg q_2} the claim follows from (2). Hence we may assume that

\displaystyle  q_1 < N < q_2.

We use the method of Weyl differencing, the key point being to difference in multiples of {q_1}.

Let {K := \lfloor N/q_1 \rfloor}, thus {K \geq 1}. For any {1 \leq k \leq K}, we have

\displaystyle  \sum_{M+1 \leq n \leq M+N} \chi(n) = \sum_n 1_{[M+1,M+N]}(n+kq_1) \chi(n+kq_1)

and thus on averaging

\displaystyle  \sum_{M+1 \leq n \leq M+N} \chi(n) = \frac{1}{K} \sum_n \sum_{k=1}^K 1_{[M+1,M+N]}(n+kq_1) \chi(n+kq_1). \ \ \ \ \ (4)

By the Chinese remainder theorem, we may factor

\displaystyle  \chi(n) = \chi_1(n) \chi_2(n)

where {\chi_1,\chi_2} are primitive characters of conductor {q_1,q_2} respectively. As {\chi_1} is periodic of period {q_1}, we thus have

\displaystyle  \chi(n+kq_1) = \chi_1(n) \chi_2(n+kq_2)

and so we can take {\chi_1} out of the inner summation of the right-hand side of (4) to obtain

\displaystyle  \sum_{M+1 \leq n \leq M+N} \chi(n) = \frac{1}{K} \sum_n \chi_1(n) \sum_{k=1}^K 1_{[M+1,M+N]}(n+kq_1) \chi_2(n+kq_1)

and hence by the triangle inequality

\displaystyle  |\sum_{M+1 \leq n \leq M+N} \chi(n)| \leq \frac{1}{K} \sum_n |\sum_{k=1}^K 1_{[M+1,M+N]}(n+kq_1) \chi_2(n+kq_1)|.

Note how the characters on the right-hand side only have period {q_2} rather than {q=q_1 q_2}. This reduction in the period is ultimately the source of the saving over the Pólya-Vinogradov inequality.

Note that the inner sum vanishes unless {n \in [M+1-Kq_1,M+N]}, which is an interval of length {O(N)} by choice of {K}. Thus by Cauchy-Schwarz one has

\displaystyle  | \sum_{M+1 \leq n \leq M+N} \chi(n) | \ll

\displaystyle  \frac{N^{1/2}}{K} (\sum_n |\sum_{k=1}^K 1_{[M+1,M+N]}(n+kq_1) \chi_2(n+kq_1)|^2)^{1/2}.

We expand the right-hand side as

\displaystyle  \frac{N^{1/2}}{K} |\sum_{1 \leq k,k' \leq K} \sum_n

\displaystyle  1_{[M+1,M+N]}(n+kq_1) 1_{[M+1,M+N]}(n+k'q_1) \chi_2(n+kq_1) \overline{\chi_2(n+k'q_1)}|^{1/2}.

We first consider the diagonal contribution {k=k'}. In this case we use the trivial bound {O(N)} for the inner summation, and we soon see that the total contribution here is {O( K^{-1/2} N ) = O( N^{1/2}q_1^{1/2} )}.

Now we consider the off-diagonal case; by symmetry we can take {k < k'}. Then the indicator functions {1_{[M+1,M+N]}(n+kq_1) 1_{[M+1,M+N]}(n+k'q_1)} restrict {n} to the interval {[M+1-kq_1, M+N-k'q_1]}. On the other hand, as a consequence of the Weil conjectures for curves one can show that

\displaystyle  |\sum_{n \in {\bf Z}/q_2{\bf Z}} \chi_2(n+kq_1) \overline{\chi_2(n+k'q_1)} e_{q_2}(an)| \lessapprox q_2^{1/2} (k-k',q_2)^{1/2}

for any {a \in {\bf Z}/q_2{\bf Z}}; indeed one can use the Chinese remainder theorem and the square-free nature of {q_2} to reduce to the case when {q_2} is prime, in which case one can apply (for instance) the original paper of Weil to establish this bound, noting also that {q_1} and {q_2} are coprime since {q} is squarefree. Applying the method of completion of sums (or the Parseval formula), this shows that

\displaystyle  |\sum_n 1_{[M+1,M+N]}(n+kq_1) 1_{[M+1,M+N]}(n+k'q_1) \chi_2(n+kq_1) \overline{\chi_2(n+k'q_1)}|

\displaystyle  \lessapprox q_2^{1/2} (k-k',q_2)^{1/2}.

Summing in {k,k'} (using Lemma 5 from this previous post) we see that the total contribution to the off-diagonal case is

\displaystyle  \lessapprox \frac{N^{1/2}}{K} ( K^2 q_2^{1/2} )^{1/2}

which simplifies to {\lessapprox N^{1/2} q_2^{1/4}}. The claim follows. \Box

A modification of the above argument (using more complicated versions of the Weil conjectures) allows one to replace the summand {\chi(n)} by more complicated summands such as {\chi(f(n)) e_q(g(n))} for some polynomials or rational functions {f,g} of bounded degree and obeying a suitable non-degeneracy condition (after restricting of course to those {n} for which the arguments {f(n),g(n)} are well-defined). We will not detail this here, but instead turn to the question of estimating slightly longer exponential sums, such as

\displaystyle  \sum_{1 \leq n \leq N} e_{d_1}( \frac{c_1}{n} ) e_{d_2}( \frac{c_2}{n+l} )

where {N} should be thought of as a little bit larger than {(d_1d_2)^{1/2}}.

Read the rest of this entry »

A finite group {G=(G,\cdot)} is said to be a Frobenius group if there is a non-trivial subgroup {H} of {G} (known as the Frobenius complement of {G}) such that the conjugates {gHg^{-1}} of {H} are “disjoint as possible” in the sense that {H \cap gHg^{-1} = \{1\}} whenever {g \not \in H}. This gives a decomposition

\displaystyle  G = \bigcup_{gH \in G/H} (gHg^{-1} \backslash \{1\}) \cup K \ \ \ \ \ (1)

where the Frobenius kernel {K} of {G} is defined as the identity element {1} together with all the non-identity elements that are not conjugate to any element of {H}. Taking cardinalities, we conclude that

\displaystyle  |G| = \frac{|G|}{|H|} (|H| - 1) + |K|

and hence

\displaystyle  |H| |K| = |G|. \ \ \ \ \ (2)

A remarkable theorem of Frobenius gives an unexpected amount of structure on {K} and hence on {G}:

Theorem 1 (Frobenius’ theorem) Let {G} be a Frobenius group with Frobenius complement {H} and Frobenius kernel {K}. Then {K} is a normal subgroup of {G}, and hence (by (2) and the disjointness of {H} and {K} outside the identity) {G} is the semidirect product {K \rtimes H} of {H} and {K}.

I discussed Frobenius’ theorem and its proof in this recent blog post. This proof uses the theory of characters on a finite group {G}, in particular relying on the fact that a character on a subgroup {H} can induce a character on {G}, which can then be decomposed into irreducible characters with natural number coefficients. Remarkably, even though a century has passed since Frobenius’ original argument, there is no proof known of this theorem which avoids character theory entirely; there are elementary proofs known when the complement {H} has even order or when {H} is solvable (we review both of these cases below the fold), which by the Feit-Thompson theorem does cover all the cases, but the proof of the Feit-Thompson theorem involves plenty of character theory (and also relies on Theorem 1). (The answers to this MathOverflow question give a good overview of the current state of affairs.)

I have been playing around recently with the problem of finding a character-free proof of Frobenius’ theorem. I didn’t succeed in obtaining a completely elementary proof, but I did find an argument which replaces character theory (which can be viewed as coming from the representation theory of the non-commutative group algebra {{\bf C} G \equiv L^2(G)}) with the Fourier analysis of class functions (i.e. the representation theory of the centre {Z({\bf C} G) \equiv L^2(G)^G} of the group algebra), thus replacing non-commutative representation theory by commutative representation theory. This is not a particularly radical depature from the existing proofs of Frobenius’ theorem, but it did seem to be a new proof which was technically “character-free” (even if it was not all that far from character-based in spirit), so I thought I would record it here.

The main ideas are as follows. The space {L^2(G)^G} of class functions can be viewed as a commutative algebra with respect to the convolution operation {*}; as the regular representation is unitary and faithful, this algebra contains no nilpotent elements. As such, (Gelfand-style) Fourier analysis suggests that one can analyse this algebra through the idempotents: class functions {\phi} such that {\phi*\phi = \phi}. In terms of characters, idempotents are nothing more than sums of the form {\sum_{\chi \in \Sigma} \chi(1) \chi} for various collections {\Sigma} of characters, but we can perform a fair amount of analysis on idempotents directly without recourse to characters. In particular, it turns out that idempotents enjoy some important integrality properties that can be established without invoking characters: for instance, by taking traces one can check that {\phi(1)} is a natural number, and more generally we will show that {{\bf E}_{(a,b) \in S} {\bf E}_{x \in G} \phi( a x b^{-1} x^{-1} )} is a natural number whenever {S} is a subgroup of {G \times G} (see Corollary 4 below). For instance, the quantity

\displaystyle  \hbox{rank}(\phi) := {\bf E}_{a \in G} {\bf E}_{x \in G} \phi(a xa^{-1} x^{-1})

is a natural number which we will call the rank of {\phi} (as it is also the linear rank of the transformation {f \mapsto f*\phi} on {L^2(G)}).

In the case that {G} is a Frobenius group with kernel {K}, the above integrality properties can be used after some elementary manipulations to establish that for any idempotent {\phi}, the quantity

\displaystyle  \frac{1}{|G|} \sum_{a \in K} {\bf E}_{x \in G} \phi( axa^{-1}x^{-1} ) - \frac{1}{|G| |K|} \sum_{a,b \in K} \phi(ab^{-1}) \ \ \ \ \ (3)

is an integer. On the other hand, one can also show by elementary means that this quantity lies between {0} and {\hbox{rank}(\phi)}. These two facts are not strong enough on their own to impose much further structure on {\phi}, unless one restricts attention to minimal idempotents {\phi}. In this case spectral theory (or Gelfand theory, or the fundamental theorem of algebra) tells us that {\phi} has rank one, and then the integrality gap comes into play and forces the quantity (3) to always be either zero or one. This can be used to imply that the convolution action of every minimal idempotent {\phi} either preserves {\frac{|G|}{|K|} 1_K} or annihilates it, which makes {\frac{|G|}{|K|} 1_K} itself an idempotent, which makes {K} normal.

Read the rest of this entry »

Suppose that {G = (G,\cdot)} is a finite group of even order, thus {|G|} is a multiple of two. By Cauchy’s theorem, this implies that {G} contains an involution: an element {g} in {G} of order two. (Indeed, if no such involution existed, then {G} would be partitioned into doubletons {\{g,g^{-1}\}} together with the identity, so that {|G|} would be odd, a contradiction.) Of course, groups of odd order have no involutions {g}, thanks to Lagrange’s theorem (since {G} cannot split into doubletons {\{ h, hg \}}).

The classical Brauer-Fowler theorem asserts that if a group {G} has many involutions, then it must have a large non-trivial subgroup:

Theorem 1 (Brauer-Fowler theorem) Let {G} be a finite group with at least {|G|/n} involutions for some {n > 1}. Then {G} contains a proper subgroup {H} of index at most {n^2}.

This theorem (which is Theorem 2F in the original paper of Brauer and Fowler, who in fact manage to sharpen {n^2} slightly to {n(n+2)/2}) has a number of quick corollaries which are also referred to as “the” Brauer-Fowler theorem. For instance, if {g} is a an involution of a group {G}, and the centraliser {C_G(g) := \{ h \in G: gh = hg\}} has order {n}, then clearly {n \geq 2} (as {C_G(g)} contains {1} and {g}) and the conjugacy class {\{ aga^{-1}: a \in G \}} has order {|G|/n} (since the map {a \mapsto aga^{-1}} has preimages that are cosets of {C_G(g)}). Every conjugate of an involution is again an involution, so by the Brauer-Fowler theorem {G} contains a subgroup of order at least {\max( n, |G|/n^2)}. In particular, we can conclude that every group {G} of even order contains a proper subgroup of order at least {|G|^{1/3}}.

Another corollary is that the size of a simple group of even order can be controlled by the size of a centraliser of one of its involutions:

Corollary 2 (Brauer-Fowler theorem) Let {G} be a finite simple group with an involution {g}, and suppose that {C_G(g)} has order {n}. Then {G} has order at most {(n^2)!}.

Indeed, by the previous discussion {G} has a proper subgroup {H} of index less than {n^2}, which then gives a non-trivial permutation action of {G} on the coset space {G/H}. The kernel of this action is a proper normal subgroup of {G} and is thus trivial, so the action is faithful, and the claim follows.

If one assumes the Feit-Thompson theorem that all groups of odd order are solvable, then Corollary 2 suggests a strategy (first proposed by Brauer himself in 1954) to prove the classification of finite simple groups (CFSG) by induction on the order of the group. Namely, assume for contradiction that the CFSG failed, so that there is a counterexample {G} of minimal order {|G|} to the classification. This is a non-abelian finite simple group; by the Feit-Thompson theorem, it has even order and thus has at least one involution {g}. Take such an involution and consider its centraliser {C_G(g)}; this is a proper subgroup of {G} of some order {n < |G|}. As {G} is a minimal counterexample to the classification, one can in principle describe {C_G(g)} in terms of the CFSG by factoring the group into simple components (via a composition series) and applying the CFSG to each such component. Now, the “only” thing left to do is to verify, for each isomorphism class of {C_G(g)}, that all the possible simple groups {G} that could have this type of group as a centraliser of an involution obey the CFSG; Corollary 2 tells us that for each such isomorphism class for {C_G(g)}, there are only finitely many {G} that could generate this class for one of its centralisers, so this task should be doable in principle for any given isomorphism class for {C_G(g)}. That’s all one needs to do to prove the classification of finite simple groups!

Needless to say, this program turns out to be far more difficult than the above summary suggests, and the actual proof of the CFSG does not quite proceed along these lines. However, a significant portion of the argument is based on a generalisation of this strategy, in which the concept of a centraliser of an involution is replaced by the more general notion of a normaliser of a {p}-group, and one studies not just a single normaliser but rather the entire family of such normalisers and how they interact with each other (and in particular, which normalisers of {p}-groups commute with each other), motivated in part by the theory of Tits buildings for Lie groups which dictates a very specific type of interaction structure between these {p}-groups in the key case when {G} is a (sufficiently high rank) finite simple group of Lie type over a field of characteristic {p}. See the text of Aschbacher, Lyons, Smith, and Solomon for a more detailed description of this strategy.

The Brauer-Fowler theorem can be proven by a nice application of character theory, of the type discussed in this recent blog post, ultimately based on analysing the alternating tensor power of representations; I reproduce a version of this argument (taken from this text of Isaacs) below the fold. (The original argument of Brauer and Fowler is more combinatorial in nature.) However, I wanted to record a variant of the argument that relies not on the fine properties of characters, but on the cruder theory of quasirandomness for groups, the modern study of which was initiated by Gowers, and is discussed for instance in this previous post. It gives the following slightly weaker version of Corollary 2:

Corollary 3 (Weak Brauer-Fowler theorem) Let {G} be a finite simple group with an involution {g}, and suppose that {C_G(g)} has order {n}. Then {G} can be identified with a subgroup of the unitary group {U_{4n^3}({\bf C})}.

One can get an upper bound on {|G|} from this corollary using Jordan’s theorem, but the resulting bound is a bit weaker than that in Corollary 2 (and the best bounds on Jordan’s theorem require the CFSG!).

Proof: Let {A} be the set of all involutions in {G}, then as discussed above {|A| \geq |G|/n}. We may assume that {G} has no non-trivial unitary representation of dimension less than {4n^3} (since such representations are automatically faithful by the simplicity of {G}); thus, in the language of quasirandomness, {G} is {4n^3}-quasirandom, and is also non-abelian. We have the basic convolution estimate

\displaystyle  \|1_A * 1_A * 1_A - \frac{|A|^3}{|G|} \|_{\ell^\infty(G)} \leq (4n^3)^{-1/2} |G|^{1/2} |A|^{3/2}

(see Exercise 10 from this previous blog post). In particular,

\displaystyle  1_A * 1_A * 1_A(0) \geq \frac{|A|^3}{|G|} - (4n^3)^{-1/2} |G|^{1/2} |A|^{3/2} \geq \frac{1}{2n^3} |G|^2

and so there are at least {|G|^2/2n^3} pairs {(g,h) \in A \times A} such that {gh \in A^{-1} = A}, i.e. involutions {g,h} whose product is also an involution. But any such involutions necessarily commute, since

\displaystyle  g (gh) h = g^2 h^2 = 1 = (gh)^2 = g (hg) h.

Thus there are at least {|G|^2/2n^3} pairs {(g,h) \in G \times G} of non-identity elements that commute, so by the pigeonhole principle there is a non-identity {g \in G} whose centraliser {C_G(g)} has order at least {|G|/2n^3}. This centraliser cannot be all of {G} since this would make {g} central which contradicts the non-abelian simple nature of {G}. But then the quasiregular representation of {G} on {G/C_G(g)} has dimension at most {2n^3}, contradicting the quasirandomness. \Box

Read the rest of this entry »

The classification of finite simple groups (CFSG), first announced in 1983 but only fully completed in 2004, is one of the monumental achievements of twentieth century mathematics. Spanning hundreds of papers and tens of thousands of pages, it has been called the “enormous theorem”. A “second generation” proof of the theorem is nearly completed which is a little shorter (estimated at about five thousand pages in length), but currently there is no reasonably sized proof of the classification.

An important precursor of the CFSG is the Feit-Thompson theorem from 1962-1963, which asserts that every finite group of odd order is solvable, or equivalently that every non-abelian finite simple group has even order. This is an immediate consequence of CFSG, and conversely the Feit-Thompson theorem is an essential starting point in the proof of the classification, since it allows one to reduce matters to groups of even order for which key additional tools (such as the Brauer-Fowler theorem) become available. The original proof of the Feit-Thompson theorem is 255 pages long, which is significantly shorter than the proof of the CFSG, but still far from short. While parts of the proof of the Feit-Thompson theorem have been simplified (and it has recently been converted, after six years of effort, into an argument that has been verified by the proof assistant Coq), the available proofs of this theorem are still extremely lengthy by any reasonable standard.

However, there is a significantly simpler special case of the Feit-Thompson theorem that was established previously by Suzuki in 1957, which was influential in the proof of the more general Feit-Thompson theorem (and thus indirectly to the proof of CFSG). Define a CA-group to be a group {G} with the property that the centraliser {C_G(x) := \{ g \in G: gx=xg \}} of any non-identity element {x \in G} is abelian; equivalently, the commuting relation {x \sim y} (defined as the relation that holds when {x} commutes with {y}, thus {xy=yx}) is an equivalence relation on the non-identity elements {G \backslash \{1\}} of {G}. Trivially, every abelian group is CA. A non-abelian example of a CA-group is the {ax+b} group of invertible affine transformations {x \mapsto ax+b} on a field {F}. A little less obviously, the special linear group {SL_2(F_q)} over a finite field {F_q} is a CA-group when {q} is a power of two. The finite simple groups of Lie type are not, in general, CA-groups, but when the rank is bounded they tend to behave as if they were “almost CA”; the centraliser of a generic element in {SL_d(F_q)}, for instance, when {d} is bounded and {q} is large), is typically a maximal torus (because most elements in {SL_d(F_q)} are regular semisimple) which is certainly abelian. In view of the CFSG, we thus see that CA or nearly CA groups form an important subclass of the simple groups, and it is thus of interest to study them separately. To this end, we have

Theorem 1 (Suzuki’s theorem on CA-groups) Every finite CA-group of odd order is solvable.

Of course, this theorem is superceded by the more general Feit-Thompson theorem, but Suzuki’s proof is substantially shorter (the original proof is nine pages) and will be given in this post. (See this survey of Solomon for some discussion of the link between Suzuki’s argument and the Feit-Thompson argument.) Suzuki’s analysis can be pushed further to give an essentially complete classification of all the finite CA-groups (of either odd or even order), but we will not pursue these matters here.

Moving even further down the ladder of simple precursors of CSFG is the following theorem of Frobenius from 1901. Define a Frobenius group to be a finite group {G} which has a subgroup {H} (called the Frobenius complement) with the property that all the non-trivial conjugates {gHg^{-1}} of {H} for {g \in G \backslash H}, intersect {H} only at the origin. For instance the {ax+b} group is also a Frobenius group (take {H} to be the affine transformations that fix a specified point {x_0 \in F}, e.g. the origin). This example suggests that there is some overlap between the notions of a Frobenius group and a CA group. Indeed, note that if {G} is a CA-group and {H} is a maximal abelian subgroup of {G}, then any conjugate {gHg^{-1}} of {H} that is not identical to {H} will intersect {H} only at the origin (because {H} and each of its conjugates consist of equivalence classes under the commuting relation {\sim}, together with the identity). So if a maximal abelian subgroup {H} of a CA-group is its own normaliser (thus {N(H) := \{ g \in G: gH=Hg\}} is equal to {H}), then the group is a Frobenius group.

Frobenius’ theorem places an unexpectedly strong amount of structure on a Frobenius group:

Theorem 2 (Frobenius’ theorem) Let {G} be a Frobenius group with Frobenius complement {H}. Then there exists a normal subgroup {K} of {G} (called the Frobenius kernel of {G}) such that {G} is the semi-direct product {H \ltimes K} of {H} and {K}.

Roughly speaking, this theorem indicates that all Frobenius groups “behave” like the {ax+b} example (which is a quintessential example of a semi-direct product).

Note that if every CA-group of odd order was either Frobenius or abelian, then Theorem 2 would imply Theorem 1 by an induction on the order of {G}, since any subgroup of a CA-group is clearly again a CA-group. Indeed, the proof of Suzuki’s theorem does basically proceed by this route (Suzuki’s arguments do indeed imply that CA-groups of odd order are Frobenius or abelian, although we will not quite establish that fact here).

Frobenius’ theorem can be reformulated in the following concrete combinatorial form:

Theorem 3 (Frobenius’ theorem, equivalent version) Let {G} be a group of permutations acting transitively on a finite set {X}, with the property that any non-identity permutation in {G} fixes at most one point in {X}. Then the set of permutations in {G} that fix no points in {X}, together with the identity, is closed under composition.

Again, a good example to keep in mind for this theorem is when {G} is the group of affine permutations on a field {F} (i.e. the {ax+b} group for that field), and {X} is the set of points on that field. In that case, the set of permutations in {G} that do not fix any points are the non-trivial translations.

To deduce Theorem 3 from Theorem 2, one applies Theorem 2 to the stabiliser of a single point in {X}. Conversely, to deduce Theorem 2 from Theorem 3, set {X := G/H = \{ gH: g \in G \}} to be the space of left-cosets of {H}, with the obvious left {G}-action; one easily verifies that this action is faithful, transitive, and each non-identity element {g} of {G} fixes at most one left-coset of {H} (basically because it lies in at most one conjugate of {H}). If we let {K} be the elements of {G} that do not fix any point in {X}, plus the identity, then by Theorem 3 {K} is closed under composition; it is also clearly closed under inverse and conjugation, and is hence a normal subgroup of {G}. From construction {K} is the identity plus the complement of all the {|G|/|H|} conjugates of {H}, which are all disjoint except at the identity, so by counting elements we see that

\displaystyle |K| = |G| - \frac{|G|}{|H|}(|H|-1) = |G|/|H|.

As {H} normalises {K} and is disjoint from {K}, we thus see that {KH = H \ltimes K} is all of {G}, giving Theorem 2.

Despite the appealingly concrete and elementary form of Theorem 3, the only known proofs of that theorem (or equivalently, Theorem 2) in its full generality proceed via the machinery of group characters (which one can think of as a version of Fourier analysis for nonabelian groups). On the other hand, once one establishes the basic theory of these characters (reviewed below the fold), the proof of Frobenius’ theorem is very short, which gives quite a striking example of the power of character theory. The proof of Suzuki’s theorem also proceeds via character theory, and is basically a more involved version of the Frobenius argument; again, no character-free proof of Suzuki’s theorem is currently known. (The proofs of Feit-Thompson and CFSG also involve characters, but those proofs also contain many other arguments of much greater complexity than the character-based portions of the proof.)

It seems to me that the above four theorems (Frobenius, Suzuki, Feit-Thompson, and CFSG) provide a ladder of sorts (with exponentially increasing complexity at each step) to the full classification, and that any new approach to the classification might first begin by revisiting the earlier theorems on this ladder and finding new proofs of these results first (in particular, if one had a “robust” proof of Suzuki’s theorem that also gave non-trivial control on “almost CA-groups” – whatever that means – then this might lead to a new route to classifying the finite simple groups of Lie type and bounded rank). But even for the simplest two results on this ladder – Frobenius and Suzuki – it seems remarkably difficult to find any proof that is not essentially the character-based proof. (Even trying to replace character theory by its close cousin, representation theory, doesn’t seem to work unless one gives in to the temptation to take traces everywhere and put the characters back in; it seems that rather than abandon characters altogether, one needs to find some sort of “robust” generalisation of existing character-based methods.) In any case, I am recording here the standard character-based proofs of the theorems of Frobenius and Suzuki below the fold. There is nothing particularly novel here, but I wanted to collect all the relevant material in one place, largely for my own benefit.

Read the rest of this entry »

In these notes we lay out the basic theory of the Fourier transform, which is of course the most fundamental tool in harmonic analysis and also of major importance in related fields (functional analysis, complex analysis, PDE, number theory, additive combinatorics, representation theory, signal processing, etc.). The Fourier transform, in conjunction with the Fourier inversion formula, allows one to take essentially arbitrary (complex-valued) functions on a group {G} (or more generally, a space {X} that {G} acts on, e.g. a homogeneous space {G/H}), and decompose them as a (discrete or continuous) superposition of much more symmetric functions on the domain, such as characters {\chi: G \rightarrow S^1}; the precise superposition is given by Fourier coefficients {\hat f(\xi)}, which take values in some dual object such as the Pontryagin dual {\hat G} of {G}. Characters behave in a very simple manner with respect to translation (indeed, they are eigenfunctions of the translation action), and so the Fourier transform tends to simplify any mathematical problem which enjoys a translation invariance symmetry (or an approximation to such a symmetry), and is somehow “linear” (i.e. it interacts nicely with superpositions). In particular, Fourier analytic methods are particularly useful for studying operations such as convolution {f, g \mapsto f*g} and set-theoretic addition {A, B \mapsto A+B}, or the closely related problem of counting solutions to additive problems such as {x = a_1 + a_2 + a_3} or {x = a_1 - a_2}, where {a_1, a_2, a_3} are constrained to lie in specific sets {A_1, A_2, A_3}. The Fourier transform is also a particularly powerful tool for solving constant-coefficient linear ODE and PDE (because of the translation invariance), and can also approximately solve some variable-coefficient (or slightly non-linear) equations if the coefficients vary smoothly enough and the nonlinear terms are sufficiently tame.

The Fourier transform {\hat f(\xi)} also provides an important new way of looking at a function {f(x)}, as it highlights the distribution of {f} in frequency space (the domain of the frequency variable {\xi}) rather than physical space (the domain of the physical variable {x}). A given property of {f} in the physical domain may be transformed to a rather different-looking property of {\hat f} in the frequency domain. For instance:

  • Smoothness of {f} in the physical domain corresponds to decay of {\hat f} in the Fourier domain, and conversely. (More generally, fine scale properties of {f} tend to manifest themselves as coarse scale properties of {\hat f}, and conversely.)
  • Convolution in the physical domain corresponds to pointwise multiplication in the Fourier domain, and conversely.
  • Constant coefficient differential operators such as {d/dx} in the physical domain corresponds to multiplication by polynomials such as {2\pi i \xi} in the Fourier domain, and conversely.
  • More generally, translation invariant operators in the physical domain correspond to multiplication by symbols in the Fourier domain, and conversely.
  • Rescaling in the physical domain by an invertible linear transformation corresponds to an inverse (adjoint) rescaling in the Fourier domain.
  • Restriction to a subspace (or subgroup) in the physical domain corresponds to projection to the dual quotient space (or quotient group) in the Fourier domain, and conversely.
  • Frequency modulation in the physical domain corresponds to translation in the frequency domain, and conversely.

(We will make these statements more precise below.)

On the other hand, some operations in the physical domain remain essentially unchanged in the Fourier domain. Most importantly, the {L^2} norm (or energy) of a function {f} is the same as that of its Fourier transform, and more generally the inner product {\langle f, g \rangle} of two functions {f} is the same as that of their Fourier transforms. Indeed, the Fourier transform is a unitary operator on {L^2} (a fact which is variously known as the Plancherel theorem or the Parseval identity). This makes it easier to pass back and forth between the physical domain and frequency domain, so that one can combine techniques that are easy to execute in the physical domain with other techniques that are easy to execute in the frequency domain. (In fact, one can combine the physical and frequency domains together into a product domain known as phase space, and there are entire fields of mathematics (e.g. microlocal analysis, geometric quantisation, time-frequency analysis) devoted to performing analysis on these sorts of spaces directly, but this is beyond the scope of this course.)

In these notes, we briefly discuss the general theory of the Fourier transform, but will mainly focus on the two classical domains for Fourier analysis: the torus {{\Bbb T}^d := ({\bf R}/{\bf Z})^d}, and the Euclidean space {{\bf R}^d}. For these domains one has the advantage of being able to perform very explicit algebraic calculations, involving concrete functions such as plane waves {x \mapsto e^{2\pi i x \cdot \xi}} or Gaussians {x \mapsto A^{d/2} e^{-\pi A |x|^2}}.

Read the rest of this entry »

Archives