Klaus Roth, who made fundamental contributions to analytic number theory, died this Tuesday, aged 90.
I never met or communicated with Roth personally, but was certainly influenced by his work; he wrote relatively few papers, but they tended to have outsized impact. For instance, he was one of the key people (together with Bombieri) to work on simplifying and generalising the large sieve, taking it from the technically formidable original formulation of Linnik and Rényi to the clean and general almost orthogonality principle that we have today (discussed for instance in these lecture notes of mine). The paper of Roth that had the most impact on my own personal work was his three-page paper proving what is now known as Roth’s theorem on arithmetic progressions:
Theorem 1 (Roth’s theorem on arithmetic progressions) Let $A$ be a set of natural numbers of positive upper density (thus $\limsup_{N \to \infty} |A \cap \{1,\dots,N\}|/N > 0$). Then $A$ contains infinitely many arithmetic progressions $a, a+r, a+2r$ of length three (with $r$ non-zero of course).
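To make the statement concrete, here is a minimal Python sketch (the function names are illustrative, not taken from any library) that lists the three-term progressions inside a finite set and greedily builds a progression-free subset of $\{1,\dots,N\}$. The greedy set is only one example of a progression-free set, and its thinning out as $N$ grows is consistent with, though of course no substitute for, the theorem.

```python
from itertools import combinations

def three_term_progressions(A):
    """Return all (a, a+r, a+2r) with r > 0 whose three terms lie in A."""
    S = set(A)
    found = []
    for a, c in combinations(sorted(S), 2):
        # a < c are the endpoints; the midpoint must be an integer in S
        if (a + c) % 2 == 0 and (a + c) // 2 in S:
            found.append((a, (a + c) // 2, c))
    return found

def greedy_progression_free(N):
    """Greedily build a subset of {1..N} containing no 3-term progression."""
    A = []
    for n in range(1, N + 1):
        if not three_term_progressions(A + [n]):
            A.append(n)
    return A

if __name__ == "__main__":
    for N in (10, 100, 300):
        A = greedy_progression_free(N)
        print(N, len(A), len(A) / N)  # the density of the greedy set shrinks
```

The brute-force search is quadratic in the size of the set, which is fine for experimentation but not for large $N$.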
At the heart of Roth’s elegant argument was the following (surprising at the time) dichotomy: if $A$ had some moderately large density within some arithmetic progression $P$, either one could use Fourier-analytic methods to detect the presence of an arithmetic progression of length three inside $A \cap P$, or else one could locate a long subprogression $P'$ of $P$ on which $A$ had increased density. Iterating this dichotomy by an argument now known as the density increment argument, one eventually obtains Roth’s theorem, no matter which side of the dichotomy actually holds. This argument (and its many descendants), based on various “dichotomies between structure and randomness”, became essential in many other results of this type, most famously perhaps in Szemerédi’s proof of his celebrated theorem on arithmetic progressions that generalised Roth’s theorem to progressions of arbitrary length. More recently, my work on the Chowla and Elliott conjectures, which was a crucial component of the solution of the Erdös discrepancy problem, relies on an entropy decrement argument which was directly inspired by the density increment argument of Roth.
The Erdös discrepancy problem is also connected with another well-known theorem of Roth:
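Theorem 2 (Roth’s discrepancy theorem for arithmetic progressions) Let $f(1),\dots,f(n)$ be a sequence in $\{-1,+1\}$. Then there exists an arithmetic progression $a+r, a+2r, \dots, a+dr$ in $\{1,\dots,n\}$ with $r$ positive such that
$$\left| \sum_{j=1}^d f(a+jr) \right| \geq c n^{1/4}$$
for an absolute constant $c > 0$.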
In fact, Roth proved a stronger estimate regarding mean square discrepancy, which I am not writing down here; as with Roth’s theorem on arithmetic progressions, his proof was short and Fourier-analytic in nature (although non-Fourier-analytic proofs have since been found, for instance the semidefinite programming proof of Lovasz). The exponent $1/4$ is known to be sharp (a result of Matousek and Spencer).
As a particular corollary of the above theorem, for an infinite sequence $f(1), f(2), \dots$ of signs, the sums $|\sum_{j=1}^d f(a+jr)|$ are unbounded in $a, d, r$. The Erdös discrepancy problem asks whether the same statement holds when $a$ is restricted to be zero. (Roth also established discrepancy theorems for other sets, such as rectangles, which will not be discussed here.)
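For small sign sequences, both kinds of discrepancy can be computed by brute force. The sketch below is purely illustrative (the function name and the `homogeneous` flag are my own): it takes the maximum of $|\sum_{j=1}^d f(a+jr)|$ over all progressions $a+r, \dots, a+dr$ inside $\{1,\dots,n\}$, or only over those with $a = 0$ when `homogeneous=True`.

```python
def discrepancy(f, homogeneous=False):
    """Largest |f(a+r) + f(a+2r) + ... + f(a+dr)| over progressions in {1..n}.

    f is a list of +1/-1 values, with f[k-1] holding the value at position k.
    With homogeneous=True only a = 0 is allowed (progressions r, 2r, 3r, ...).
    """
    n = len(f)
    best = 0
    for r in range(1, n + 1):
        # a + r is the first term, so a ranges over 1 - r, ..., n - 1
        offsets = [0] if homogeneous else range(1 - r, n)
        for a in offsets:
            total = 0
            k = a + r
            while k <= n:
                total += f[k - 1]
                best = max(best, abs(total))  # every partial sum is a progression sum
                k += r
    return best

if __name__ == "__main__":
    f = [1, -1, -1, 1]
    print(discrepancy(f), discrepancy(f, homogeneous=True))
```

The search is cubic in $n$, so it is only meant for experimenting with short sequences.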
Finally, one has to mention Roth’s most famous result, cited for instance in his Fields medal citation:
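Theorem 3 (Roth’s theorem on Diophantine approximation) Let $\alpha$ be an irrational algebraic number. Then for any $\varepsilon > 0$ there is a quantity $c_{\alpha,\varepsilon} > 0$ such that
$$\left| \alpha - \frac{a}{q} \right| > \frac{c_{\alpha,\varepsilon}}{q^{2+\varepsilon}}$$
for all rational numbers $a/q$ (with $q$ a positive integer).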
From the Dirichlet approximation theorem (or from the theory of continued fractions) we know that the exponent $2+\varepsilon$ in the denominator cannot be reduced to $2$ or below. A classical and easy theorem of Liouville gives the claim with the exponent $2+\varepsilon$ replaced by the degree of the algebraic number $\alpha$; work of Thue and Siegel reduced this exponent, but Roth was the one who obtained the near-optimal result. An important point is that the constant $c_{\alpha,\varepsilon}$ is ineffective: it is a major open problem in Diophantine approximation to produce any bound significantly stronger than Liouville’s theorem with effective constants. This is because the proof of Roth’s theorem does not exclude any single rational $a/q$ from being close to $\alpha$, but instead very ingeniously shows that one cannot have two different rationals $a/q$, $a'/q'$ that are unusually close to $\alpha$, even when the denominators $q, q'$ are very different in size. (I refer to this sort of argument as a “dueling conspiracies” argument; they are strangely prevalent throughout analytic number theory.)
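As a numerical illustration of why the exponent $2$ is the natural barrier, one can look at the quadratic irrational $\alpha = \sqrt{2}$ (my choice of example, not one taken from the discussion above). Its continued fraction convergents $p/q$ satisfy $p^2 - 2q^2 = \pm 1$, and the rescaled error $q^2 |\sqrt{2} - p/q|$ stays bounded both above (in the spirit of Dirichlet) and away from zero (in the spirit of Liouville in degree two). The sketch uses floating point, so it is only trustworthy for small denominators.

```python
import math

def sqrt2_convergents(count):
    """First `count` continued-fraction convergents p/q of sqrt(2)."""
    p, q = 1, 1
    out = []
    for _ in range(count):
        out.append((p, q))
        p, q = p + 2 * q, p + q  # standard recurrence for sqrt(2)
    return out

def scaled_error(p, q):
    """q^2 * |sqrt(2) - p/q| in floating point (adequate for small q)."""
    return q * q * abs(math.sqrt(2) - p / q)

if __name__ == "__main__":
    for p, q in sqrt2_convergents(8):
        print(f"{p}/{q}: q^2 * error = {scaled_error(p, q):.4f}")
```

For algebraic numbers of degree three or more, no such elementary two-sided picture is available, which is where the theorems of Thue, Siegel and Roth come in.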
We will usually omit the subscript from the Lie bracket $[\cdot,\cdot]_{\mathfrak{g}}$ when this will not cause ambiguity. A homomorphism $\phi: \mathfrak{g} \to \mathfrak{h}$ between two Lie algebras $\mathfrak{g}, \mathfrak{h}$ is a linear map that respects the Lie bracket, thus $\phi([x,y]_{\mathfrak{g}}) = [\phi(x),\phi(y)]_{\mathfrak{h}}$ for all $x, y \in \mathfrak{g}$. As with many other classes of mathematical objects, the class of Lie algebras together with their homomorphisms then form a category. One can of course also consider Lie algebras in infinite dimension or over other fields, but we will restrict attention throughout these notes to the finite-dimensional complex case. The trivial, zero-dimensional Lie algebra is denoted $0$; Lie algebras of positive dimension will be called non-trivial.
Lie algebras come up in many contexts in mathematics, in particular arising as the tangent space of complex Lie groups. It is thus very profitable to think of Lie algebras as being the infinitesimal component of a Lie group, and in particular almost all of the notation and concepts that are applicable to Lie groups (e.g. nilpotence, solvability, extensions, etc.) have infinitesimal counterparts in the category of Lie algebras (often with exactly the same terminology). See this previous blog post for more discussion about the connection between Lie algebras and Lie groups (that post was focused over the reals instead of the complexes, but much of the discussion carries over to the complex case).
A particular example of a Lie algebra is the general linear Lie algebra $\mathfrak{gl}(V)$ of linear transformations on a finite-dimensional complex vector space (or vector space for short) $V$, with the commutator Lie bracket $[A,B] := AB - BA$; one easily verifies that this is indeed an abstract Lie algebra. We will define a concrete Lie algebra to be a Lie algebra that is a subalgebra of $\mathfrak{gl}(V)$ for some vector space $V$, and similarly define a representation of a Lie algebra $\mathfrak{g}$ to be a homomorphism $\rho: \mathfrak{g} \to \mathfrak{h}$ into a concrete Lie algebra $\mathfrak{h}$. It is a deep theorem of Ado (discussed in this previous post) that every abstract Lie algebra is in fact isomorphic to a concrete one (or equivalently, that every abstract Lie algebra has a faithful representation), but we will not need or prove this fact here.
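The claim that the commutator bracket satisfies the Lie algebra axioms can be spot-checked on explicit matrices. The following pure-Python sketch (helper names are my own) verifies antisymmetry and the Jacobi identity on sample integer matrices, and records the familiar relations between three standard $2 \times 2$ matrices; it is a sanity check on examples, not a proof.

```python
def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def bracket(A, B):
    """Commutator [A, B] = AB - BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    n = len(A)
    return [[AB[i][j] - BA[i][j] for j in range(n)] for i in range(n)]

def jacobi_defect(X, Y, Z):
    """[X,[Y,Z]] + [Y,[Z,X]] + [Z,[X,Y]]; the Jacobi identity says this is zero."""
    terms = [bracket(X, bracket(Y, Z)),
             bracket(Y, bracket(Z, X)),
             bracket(Z, bracket(X, Y))]
    n = len(X)
    return [[sum(T[i][j] for T in terms) for j in range(n)] for i in range(n)]

if __name__ == "__main__":
    e = [[0, 1], [0, 0]]
    f = [[0, 0], [1, 0]]
    h = [[1, 0], [0, -1]]
    print(bracket(e, f) == h)      # [e, f] = h
    print(jacobi_defect(e, f, h))  # zero matrix
```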
Even without Ado’s theorem, though, the structure of abstract Lie algebras is very well understood. As with objects in many other algebraic categories, a basic way to understand a Lie algebra $\mathfrak{g}$ is to factor it into two simpler algebras $\mathfrak{h}, \mathfrak{k}$ via a short exact sequence
$$0 \to \mathfrak{h} \to \mathfrak{g} \to \mathfrak{k} \to 0, \qquad (3)$$
thus one has an injective homomorphism from $\mathfrak{h}$ to $\mathfrak{g}$ and a surjective homomorphism from $\mathfrak{g}$ to $\mathfrak{k}$ such that the image of the former homomorphism is the kernel of the latter. (To be pedantic, a short exact sequence in a general category requires these homomorphisms to be monomorphisms and epimorphisms respectively, but in the category of Lie algebras these turn out to reduce to the more familiar concepts of injectivity and surjectivity respectively.) Given such a sequence, one can (non-uniquely) identify $\mathfrak{g}$ with the vector space $\mathfrak{h} \times \mathfrak{k}$ equipped with a Lie bracket of the form
$$[(t,x),(s,y)]_{\mathfrak{g}} = ([t,s]_{\mathfrak{h}} + A(s,x) - A(t,y) + B(x,y), [x,y]_{\mathfrak{k}}) \qquad (4)$$
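for some bilinear maps $A: \mathfrak{h} \times \mathfrak{k} \to \mathfrak{h}$ and $B: \mathfrak{k} \times \mathfrak{k} \to \mathfrak{h}$ that obey some Jacobi-type identities which we will not record here. Understanding exactly what maps $A, B$ are possible here (up to coordinate change) can be a difficult task (and is one of the key objectives of Lie algebra cohomology), but in principle at least, the problem of understanding $\mathfrak{g}$ can be reduced to that of understanding its factors $\mathfrak{h}, \mathfrak{k}$. To emphasise this, I will (perhaps idiosyncratically) express the existence of a short exact sequence (3) by the ATLAS-type notation
$$\mathfrak{g} = \mathfrak{h} . \mathfrak{k} \qquad (5)$$
although one should caution that for given $\mathfrak{h}$ and $\mathfrak{k}$, there can be multiple non-isomorphic $\mathfrak{g}$ that can form a short exact sequence with $\mathfrak{h}, \mathfrak{k}$, so that $\mathfrak{h} . \mathfrak{k}$ is not a uniquely defined combination of $\mathfrak{h}$ and $\mathfrak{k}$; one could emphasise this by writing $\mathfrak{h} ._{A,B} \mathfrak{k}$ instead of $\mathfrak{h} . \mathfrak{k}$, though we will not do so here. We will refer to $\mathfrak{g}$ as an extension of $\mathfrak{k}$ by $\mathfrak{h}$, and read the notation (5) as “$\mathfrak{g}$ is $\mathfrak{h}$-by-$\mathfrak{k}$”; confusingly, these two notations reverse the subject and object of “by”, but unfortunately both notations are well entrenched in the literature. We caution that the operation $.$ is not commutative, and it is only partly associative: every Lie algebra of the form $\mathfrak{k} . (\mathfrak{h} . \mathfrak{l})$ is also of the form $(\mathfrak{k} . \mathfrak{h}) . \mathfrak{l}$, but the converse is not true (see this previous blog post for some related discussion). As we are working in the infinitesimal world of Lie algebras (which have an additive group operation) rather than Lie groups (in which the group operation is usually written multiplicatively), it may help to think of $\mathfrak{h} . \mathfrak{k}$ as a (twisted) “sum” of $\mathfrak{h}$ and $\mathfrak{k}$ rather than a “product”; for instance, we have $\mathfrak{g} = 0 . \mathfrak{g}$ and $\mathfrak{g} = \mathfrak{g} . 0$, and also $\dim(\mathfrak{h} . \mathfrak{k}) = \dim \mathfrak{h} + \dim \mathfrak{k}$.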
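Special examples of extensions of $\mathfrak{k}$ by $\mathfrak{h}$ include the direct sum (or direct product) $\mathfrak{h} \oplus \mathfrak{k}$ (also denoted $\mathfrak{h} \times \mathfrak{k}$), which is given by the construction (4) with $A$ and $B$ both vanishing, and the split extension (or semidirect product) $\mathfrak{h} : \mathfrak{k} = \mathfrak{h} :_\rho \mathfrak{k}$ (also denoted $\mathfrak{h} \rtimes \mathfrak{k} = \mathfrak{h} \rtimes_\rho \mathfrak{k}$), which is given by the construction (4) with $B$ vanishing and the bilinear map $A: \mathfrak{h} \times \mathfrak{k} \to \mathfrak{h}$ taking the form
$$A(t,x) = \rho(x)(t)$$
for some representation $\rho: \mathfrak{k} \to \mathrm{Der}(\mathfrak{h})$ of $\mathfrak{k}$ in the concrete Lie algebra $\mathrm{Der}(\mathfrak{h})$ of derivations of $\mathfrak{h}$, that is to say the algebra of linear maps $D: \mathfrak{h} \to \mathfrak{h}$ that obey the Leibniz rule
$$D[s,t]_{\mathfrak{h}} = [Ds,t]_{\mathfrak{h}} + [s,Dt]_{\mathfrak{h}}$$
for all $s, t \in \mathfrak{h}$. (The derivation algebra $\mathrm{Der}(\mathfrak{g})$ of a Lie algebra $\mathfrak{g}$ is analogous to the automorphism group $\mathrm{Aut}(G)$ of a Lie group $G$, with the two concepts being intertwined by the tangent space functor from Lie groups to Lie algebras (i.e. the derivation algebra is the infinitesimal version of the automorphism group). Of course, this functor also intertwines the Lie algebra and Lie group versions of most of the other concepts discussed here, such as extensions, semidirect products, etc.)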
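There are two general ways to factor a Lie algebra $\mathfrak{g}$ as an extension $\mathfrak{h} . \mathfrak{k}$ of a smaller Lie algebra $\mathfrak{k}$ by another smaller Lie algebra $\mathfrak{h}$. One is to locate a Lie algebra ideal (or ideal for short) $\mathfrak{h}$ in $\mathfrak{g}$, thus $[\mathfrak{h},\mathfrak{g}] \subset \mathfrak{h}$, where $[\mathfrak{h},\mathfrak{g}]$ denotes the Lie algebra generated by $\{ [x,y]: x \in \mathfrak{h}, y \in \mathfrak{g} \}$, and then take $\mathfrak{k}$ to be the quotient space $\mathfrak{g}/\mathfrak{h}$ in the usual manner; one can check that $\mathfrak{h}$, $\mathfrak{k}$ are also Lie algebras and that we do indeed have a short exact sequence
$$\mathfrak{g} = \mathfrak{h} . (\mathfrak{g}/\mathfrak{h}).$$
Conversely, whenever one has a factorisation $\mathfrak{g} = \mathfrak{h} . \mathfrak{k}$, one can identify $\mathfrak{h}$ with an ideal in $\mathfrak{g}$, and $\mathfrak{k}$ with the quotient of $\mathfrak{g}$ by $\mathfrak{h}$.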
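The other general way to obtain such a factorisation is to start with a homomorphism $\rho: \mathfrak{g} \to \mathfrak{m}$ of $\mathfrak{g}$ into another Lie algebra $\mathfrak{m}$, take $\mathfrak{k}$ to be the image $\rho(\mathfrak{g})$ of $\mathfrak{g}$, and $\mathfrak{h}$ to be the kernel $\ker(\rho)$. Again, it is easy to see that this does indeed create a short exact sequence:
$$\mathfrak{g} = \ker(\rho) . \rho(\mathfrak{g}).$$
Conversely, whenever one has a factorisation $\mathfrak{g} = \mathfrak{h} . \mathfrak{k}$, one can identify $\mathfrak{k}$ with the image of $\mathfrak{g}$ under some homomorphism, and $\mathfrak{h}$ with the kernel of that homomorphism. Note that if a representation $\rho: \mathfrak{g} \to \mathfrak{m}$ is faithful (i.e. injective), then the kernel is trivial and $\mathfrak{g}$ is isomorphic to $\rho(\mathfrak{g})$.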
Now we consider some examples of factoring some class of Lie algebras into simpler Lie algebras. The easiest examples of Lie algebras to understand are the abelian Lie algebras $\mathfrak{g}$, in which the Lie bracket identically vanishes. Every one-dimensional Lie algebra is automatically abelian, and thus isomorphic to the scalar algebra $\mathbb{C}$. Conversely, by using an arbitrary linear basis of $\mathfrak{g}$, we see that an abelian Lie algebra is isomorphic to the direct sum of one-dimensional algebras. Thus, a Lie algebra is abelian if and only if it is isomorphic to the direct sum of finitely many copies of $\mathbb{C}$.
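Now consider a Lie algebra $\mathfrak{g}$ that is not necessarily abelian. We then form the derived algebra $[\mathfrak{g},\mathfrak{g}]$; this algebra is trivial if and only if $\mathfrak{g}$ is abelian. It is easy to see that $[\mathfrak{h},\mathfrak{k}]$ is an ideal whenever $\mathfrak{h}, \mathfrak{k}$ are ideals, so in particular the derived algebra $[\mathfrak{g},\mathfrak{g}]$ is an ideal and we thus have the short exact sequence
$$\mathfrak{g} = [\mathfrak{g},\mathfrak{g}] . (\mathfrak{g}/[\mathfrak{g},\mathfrak{g}]).$$
The algebra $\mathfrak{g}/[\mathfrak{g},\mathfrak{g}]$ is the maximal abelian quotient of $\mathfrak{g}$, and is known as the abelianisation of $\mathfrak{g}$. If it is trivial, we call the Lie algebra perfect. If instead it is non-trivial, then the derived algebra has strictly smaller dimension than $\mathfrak{g}$. From this, it is natural to associate two series to any Lie algebra $\mathfrak{g}$, the lower central series
$$\mathfrak{g}_1 := \mathfrak{g}; \quad \mathfrak{g}_2 := [\mathfrak{g}, \mathfrak{g}_1]; \quad \mathfrak{g}_3 := [\mathfrak{g}, \mathfrak{g}_2]; \quad \ldots$$
and the derived series
$$\mathfrak{g}^{(1)} := \mathfrak{g}; \quad \mathfrak{g}^{(2)} := [\mathfrak{g}^{(1)}, \mathfrak{g}^{(1)}]; \quad \mathfrak{g}^{(3)} := [\mathfrak{g}^{(2)}, \mathfrak{g}^{(2)}]; \quad \ldots$$
By induction we see that these are both decreasing series of ideals of $\mathfrak{g}$, with the derived series being slightly smaller ($\mathfrak{g}^{(k)} \subseteq \mathfrak{g}_k$ for all $k$). We say that a Lie algebra is nilpotent if its lower central series is eventually trivial, and solvable if its derived series eventually becomes trivial. Thus, abelian Lie algebras are nilpotent, and nilpotent Lie algebras are solvable, but the converses are not necessarily true. For instance, in the general linear Lie algebra $\mathfrak{gl}_n = \mathfrak{gl}(\mathbb{C}^n)$, which can be identified with the Lie algebra of $n \times n$ complex matrices, the subalgebra of strictly upper triangular matrices is nilpotent (but not abelian for $n \geq 3$), while the subalgebra of upper triangular matrices is solvable (but not nilpotent for $n \geq 2$). It is also clear that any subalgebra of a nilpotent algebra is nilpotent, and similarly for solvable or abelian algebras.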
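The two series can be computed mechanically for small matrix Lie algebras. The sketch below (all helper names are illustrative) represents a subalgebra of $n \times n$ matrices by a spanning list, computes spans of brackets by exact row reduction over the rationals, and returns the dimensions along the lower central or derived series until they stabilise. It assumes the input list really spans a Lie subalgebra and does not check this.

```python
from fractions import Fraction

def unit(n, i, j):
    """The n x n matrix with a single 1 in row i, column j (0-indexed)."""
    return [[1 if (r, c) == (i, j) else 0 for c in range(n)] for r in range(n)]

def bracket(A, B):
    """Commutator AB - BA."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] - B[i][k] * A[k][j] for k in range(n))
             for j in range(n)] for i in range(n)]

def independent(matrices):
    """Row-reduce the flattened matrices and return a basis of their span."""
    n = len(matrices[0]) if matrices else 0
    rows = []  # pairs (pivot index, reduced vector with a 1 at the pivot)
    for M in matrices:
        v = [Fraction(x) for row in M for x in row]
        for pivot, r in rows:
            if v[pivot] != 0:
                c = v[pivot]
                v = [a - c * b for a, b in zip(v, r)]
        nz = next((k for k, x in enumerate(v) if x != 0), None)
        if nz is not None:
            lead = v[nz]
            rows.append((nz, [x / lead for x in v]))
    return [[r[i * n:(i + 1) * n] for i in range(n)] for _, r in rows]

def series_dimensions(g, derived):
    """Dimensions along the derived series (derived=True) or lower central series."""
    dims, current = [], independent(g)
    while True:
        dims.append(len(current))
        other = current if derived else g
        nxt = independent([bracket(u, v) for u in current for v in other])
        if len(nxt) == len(current):  # the series has stabilised
            return dims
        current = nxt

if __name__ == "__main__":
    n = 3
    strict = [unit(n, i, j) for i in range(n) for j in range(n) if i < j]
    upper = [unit(n, i, j) for i in range(n) for j in range(n) if i <= j]
    print(series_dimensions(strict, derived=False))  # reaches 0: nilpotent
    print(series_dimensions(upper, derived=False))   # stalls above 0: not nilpotent
    print(series_dimensions(upper, derived=True))    # reaches 0: solvable
```

A series that ends at dimension $0$ certifies nilpotence (lower central) or solvability (derived); one that stabilises at a positive dimension certifies the opposite.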
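From the above discussion we see that a Lie algebra is solvable if and only if it can be represented by a tower of abelian extensions, thus
$$\mathfrak{g} = \mathfrak{a}_1 . (\mathfrak{a}_2 . \ldots (\mathfrak{a}_{k-1} . \mathfrak{a}_k) \ldots )$$
for some abelian $\mathfrak{a}_1, \dots, \mathfrak{a}_k$. Similarly, a Lie algebra $\mathfrak{g}$ is nilpotent if it is expressible as a tower of central extensions (so that in all the extensions $\mathfrak{h} . \mathfrak{k}$ in the above factorisation, $\mathfrak{h}$ is central in $\mathfrak{h} . \mathfrak{k}$, where we say that $\mathfrak{h}$ is central in $\mathfrak{g}$ if $[\mathfrak{h},\mathfrak{g}] = 0$). We also see that an extension $\mathfrak{h} . \mathfrak{k}$ is solvable if and only if both factors $\mathfrak{h}, \mathfrak{k}$ are solvable. Splitting abelian algebras into cyclic (i.e. one-dimensional) ones, we thus see that a finite-dimensional Lie algebra is solvable if and only if it is polycyclic, i.e. it can be represented by a tower of cyclic extensions.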
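For our next fundamental example of using short exact sequences to split a general Lie algebra into simpler objects, we observe that every abstract Lie algebra $\mathfrak{g}$ has an adjoint representation $\mathrm{ad}: \mathfrak{g} \to \mathfrak{gl}(\mathfrak{g})$, where for each $x \in \mathfrak{g}$, $\mathrm{ad}\, x \in \mathfrak{gl}(\mathfrak{g})$ is the linear map $(\mathrm{ad}\, x)(y) := [x,y]$; one easily verifies that this is indeed a representation (indeed, (2) is equivalent to the assertion that $\mathrm{ad}\, [x,y] = [\mathrm{ad}\, x, \mathrm{ad}\, y]$ for all $x, y \in \mathfrak{g}$). The kernel of this representation is the center $Z(\mathfrak{g}) := \{ x \in \mathfrak{g}: [x,\mathfrak{g}] = 0 \}$, which is the maximal central subalgebra of $\mathfrak{g}$. We thus have the short exact sequence
$$\mathfrak{g} = Z(\mathfrak{g}) . \mathrm{ad}(\mathfrak{g}).$$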
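For our next fundamental decomposition of Lie algebras, we need some more definitions. A Lie algebra $\mathfrak{g}$ is simple if it is non-abelian and has no ideals other than $0$ and $\mathfrak{g}$; thus simple Lie algebras cannot be factored $\mathfrak{g} = \mathfrak{h} . \mathfrak{k}$ into strictly smaller algebras $\mathfrak{h}, \mathfrak{k}$. In particular, simple Lie algebras are automatically perfect and centerless. We have the following fundamental theorem: for a Lie algebra $\mathfrak{g}$, the following are equivalent.
- (i) $\mathfrak{g}$ does not contain any non-trivial solvable ideal.
- (ii) $\mathfrak{g}$ does not contain any non-trivial abelian ideal.
- (iii) The Killing form $K: \mathfrak{g} \times \mathfrak{g} \to \mathbb{C}$, defined as the bilinear form $K(x,y) := \mathrm{tr}_{\mathfrak{g}}( (\mathrm{ad}\, x)(\mathrm{ad}\, y) )$, is non-degenerate on $\mathfrak{g}$.
- (iv) $\mathfrak{g}$ is isomorphic to the direct sum of finitely many non-abelian simple Lie algebras.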
We review the proof of this theorem later in these notes. A Lie algebra obeying any (and hence all) of the properties (i)-(iv) is known as a semisimple Lie algebra. The statement (iv) is usually taken as the definition of semisimplicity; the equivalence of (iv) and (i) is a special case of Weyl’s complete reducibility theorem (see Theorem 32), and the equivalence of (iv) and (iii) is known as the Cartan semisimplicity criterion. (The equivalence of (i) and (ii) is easy.)
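Criterion (iii) is easy to test by hand on the smallest interesting example. The sketch below is an illustration under my own choice of example and helper names: it takes the traceless $2 \times 2$ matrices with basis $E, H, F$, builds the matrices of the adjoint maps in that basis, and computes the Gram matrix of the Killing form. Its determinant being non-zero is what non-degeneracy means here.

```python
def bracket(A, B):
    """Commutator AB - BA of square matrices."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] - B[i][k] * A[k][j] for k in range(n))
             for j in range(n)] for i in range(n)]

E = [[0, 1], [0, 0]]
H = [[1, 0], [0, -1]]
F = [[0, 0], [1, 0]]
BASIS = [E, H, F]

def coords(M):
    """Coordinates of a traceless matrix [[a, b], [c, -a]] in the basis (E, H, F)."""
    return [M[0][1], M[0][0], M[1][0]]

def ad(X):
    """Matrix of ad X = [X, .] in the basis (E, H, F); column j is coords([X, BASIS[j]])."""
    cols = [coords(bracket(X, B)) for B in BASIS]
    return [[cols[j][i] for j in range(3)] for i in range(3)]

def killing(X, Y):
    """Killing form tr(ad X ad Y)."""
    A, B = ad(X), ad(Y)
    return sum(A[i][k] * B[k][i] for i in range(3) for k in range(3))

def det3(M):
    """Determinant of a 3 x 3 matrix by cofactor expansion."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

gram = [[killing(X, Y) for Y in BASIS] for X in BASIS]

if __name__ == "__main__":
    print(gram)
    print(det3(gram))  # non-zero, so the Killing form is non-degenerate
```

The `coords` helper is specific to this three-dimensional example; for a general Lie algebra one would instead solve a linear system to express each bracket in the chosen basis.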
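If $\mathfrak{h}$ and $\mathfrak{k}$ are solvable ideals of a Lie algebra $\mathfrak{g}$, then it is not difficult to see that the vector sum $\mathfrak{h} + \mathfrak{k}$ is also a solvable ideal (because on quotienting by $\mathfrak{h}$ we see that the derived series of $\mathfrak{h} + \mathfrak{k}$ must eventually fall inside $\mathfrak{h}$, and thence must eventually become trivial by the solvability of $\mathfrak{h}$). As our Lie algebras are finite dimensional, we conclude that $\mathfrak{g}$ has a unique maximal solvable ideal, known as the radical $\mathrm{rad}\, \mathfrak{g}$ of $\mathfrak{g}$. The quotient $\mathfrak{g} / \mathrm{rad}\, \mathfrak{g}$ is then a Lie algebra with trivial radical, and is thus semisimple by the above theorem, giving the Levi decomposition
$$\mathfrak{g} = \mathrm{rad}\, \mathfrak{g} . (\mathfrak{g} / \mathrm{rad}\, \mathfrak{g})$$
expressing an arbitrary Lie algebra as an extension of a semisimple Lie algebra $\mathfrak{g} / \mathrm{rad}\, \mathfrak{g}$ by a solvable algebra $\mathrm{rad}\, \mathfrak{g}$ (and it is not hard to see that this is the only possible such extension up to isomorphism). Indeed, a deep theorem of Levi allows one to upgrade this decomposition to a split extension
$$\mathfrak{g} = \mathrm{rad}\, \mathfrak{g} : (\mathfrak{g} / \mathrm{rad}\, \mathfrak{g})$$
although we will not need or prove this result here.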
In view of the above decompositions, we see that we can factor any Lie algebra (using a suitable combination of direct sums and extensions) into a finite number of simple Lie algebras and the scalar algebra $\mathbb{C}$. In principle, this means that one can understand an arbitrary Lie algebra once one understands all the simple Lie algebras (which, being defined over $\mathbb{C}$, are somewhat confusingly referred to as simple complex Lie algebras in the literature). Amazingly, this latter class of algebras is completely classified: up to isomorphism, every simple Lie algebra is of one of the following forms.
- $A_n = \mathfrak{sl}_{n+1}$ for some $n \geq 1$.
- $B_n = \mathfrak{so}_{2n+1}$ for some $n \geq 2$.
- $C_n = \mathfrak{sp}_{2n}$ for some $n \geq 3$.
- $D_n = \mathfrak{so}_{2n}$ for some $n \geq 4$.
- $E_6, E_7, E_8, F_4$, or $G_2$.
(The precise definition of the classical Lie algebras $A_n, B_n, C_n, D_n$ and the exceptional Lie algebras $E_6, E_7, E_8, F_4, G_2$ will be recalled later.)
(One can extend the families $A_n, B_n, C_n, D_n$ of classical Lie algebras a little bit to smaller values of $n$, but the resulting algebras are either isomorphic to other algebras on this list, or cease to be simple; see this previous post for further discussion.)
This classification is a basic starting point for the classification of many other related objects, including Lie algebras and Lie groups over more general fields (e.g. the reals $\mathbb{R}$), as well as finite simple groups. Being so fundamental to the subject, this classification is covered in almost every basic textbook on Lie algebras, and I myself learned it many years ago in an honours undergraduate course back in Australia. The proof is rather lengthy, though, and I have always had difficulty keeping it straight in my head. So I have decided to write some notes on the classification in this blog post, aiming to be self-contained (though moving rapidly). There is no new material in this post, though; it is all drawn from standard reference texts (I relied particularly on Fulton and Harris’s text, which I highly recommend). In fact it seems remarkably hard to deviate from the standard routes given in the literature to the classification; I would be interested in knowing about other ways to reach the classification (or substeps in that classification) that are genuinely different from the orthodox route.
This week I am in Bremen, where the 50th International Mathematical Olympiad is being held. A number of former Olympians (Béla Bollobás, Tim Gowers, Laci Lovasz, Stas Smirnov, Jean-Christophe Yoccoz, and myself) were invited to give a short talk (20 minutes in length) at the celebratory event for this anniversary. I chose to talk on a topic I have spoken about several times before, on “Structure and randomness in the prime numbers“. Given the time constraints, there was a limit as to how much substance I could put into the talk; but I try to describe, in very general terms, what we know about the primes, and what we suspect to be true, but cannot yet establish. As I have mentioned in previous talks, the key problem is that we suspect the distribution of the primes to obey no significant patterns (other than “local” structure, such as having a strong tendency to be odd (which is local information at the 2 place), or obeying the prime number theorem (which is local information at the infinity place)), but we still do not have fully satisfactory tools for establishing the absence of a pattern. (This is in contrast with many types of Olympiad problems, where the key to solving a problem often lies in discovering the right pattern or structure in the problem to exploit.)
The PDF of the talk is here; I decided to try out the Beamer LaTeX package for a change.
One of the most important topological concepts in analysis is that of compactness (as discussed for instance in my Companion article on this topic). There are various flavours of this concept, but let us focus on sequential compactness: a subset E of a topological space X is sequentially compact if every sequence in E has a convergent subsequence whose limit is also in E. This property allows one to do many things with the set E. For instance, it allows one to maximise a functional on E:
Proposition 1. (Existence of extremisers) Let E be a non-empty sequentially compact subset of a topological space X, and let $F: E \to \mathbb{R}$ be a continuous function. Then the supremum $\sup_{x \in E} F(x)$ is attained at at least one point $x_* \in E$, thus $F(x) \leq F(x_*)$ for all $x \in E$. (In particular, this supremum is finite.) Similarly for the infimum.
Proof. Let $L$ be the supremum $L := \sup_{x \in E} F(x)$. By the definition of supremum (and the axiom of (countable) choice), one can find a sequence $x_n$ in E such that $F(x_n) \to L$. By compactness, we can refine this sequence to a subsequence (which, by abuse of notation, we shall continue to call $x_n$) such that $x_n$ converges to a limit x in E. Since we still have $F(x_n) \to L$, and F is continuous at x, we conclude that F(x)=L, and the claim for the supremum follows. The claim for the infimum is similar.
Remark 1. An inspection of the argument shows that one can relax the continuity hypothesis on F somewhat: to attain the supremum, it suffices that F be upper semicontinuous, and to attain the infimum, it suffices that F be lower semicontinuous.
We thus see that sequential compactness is useful, among other things, for ensuring the existence of extremisers. In finite-dimensional spaces (such as the vector spaces $\mathbb{R}^n$), compact sets are plentiful; indeed, the Heine-Borel theorem asserts that every closed and bounded set is compact. However, once one moves to infinite-dimensional spaces, such as function spaces, the Heine-Borel theorem fails quite dramatically; most of the closed and bounded sets one encounters in a topological vector space are non-compact, if one insists on using a reasonably “strong” topology. This causes a difficulty in (among other things) calculus of variations, which is often concerned with finding extremisers of a functional $F: E \to \mathbb{R}$ on a subset E of an infinite-dimensional function space X.
In recent decades, mathematicians have found a number of ways to get around this difficulty. One of them is to weaken the topology to recover compactness, taking advantage of such results as the Banach-Alaoglu theorem (or its sequential counterpart). Of course, there is a tradeoff: weakening the topology makes compactness easier to attain, but makes the continuity of F harder to establish. Nevertheless, if F enjoys enough “smoothing” or “cancellation” properties, one can hope to obtain continuity in the weak topology, allowing one to do things such as locate extremisers. (The phenomenon that cancellation can lead to continuity in the weak topology is sometimes referred to as compensated compactness.)
Another option is to abandon trying to make all sequences have convergent subsequences, and settle just for extremising sequences to have convergent subsequences, as this would still be enough to retain Proposition 1. Pursuing this line of thought leads to the Palais-Smale condition, which is a substitute for compactness in some calculus of variations situations.
But in many situations, one cannot weaken the topology to the point where the domain E becomes compact, without destroying the continuity (or semi-continuity) of F, though one can often at least find an intermediate topology (or metric) in which F is continuous, but for which E is still not quite compact. Thus one can find sequences $x_n$ in E which do not have any subsequences that converge to a constant element $x \in E$, even in this intermediate metric. (As we shall see shortly, one major cause of this failure of compactness is the existence of a non-trivial action of a non-compact group G on E; such a group action can cause compensated compactness or the Palais-Smale condition to fail also.) Because of this, it is a priori conceivable that a continuous function F need not attain its supremum or infimum.
Nevertheless, even though a sequence $x_n$ does not have any subsequences that converge to a constant x, it may have a subsequence (which we also call $x_n$) which converges to some non-constant sequence $y_n$ (in the sense that the distance $d(x_n, y_n)$ between the subsequence and the new sequence in this intermediate metric tends to zero), where the approximating sequence $y_n$ is of a very structured form (e.g. “concentrating” to a point, or “travelling” off to infinity, or a superposition of several concentrating or travelling profiles of this form). This weaker form of compactness, in which superpositions of a certain type of profile completely describe all the failures (or defects) of compactness, is known as concentration compactness, and the decomposition of the subsequence is known as the profile decomposition. In many applications, it is a sufficiently good substitute for compactness that one can still do things like locate extremisers for functionals F, though one often has to make some additional assumptions on F to compensate for the more complicated nature of the compactness. This phenomenon was systematically studied by P.L. Lions in the 80s, and found great application in calculus of variations and nonlinear elliptic PDE. More recently, concentration compactness has been a crucial and powerful tool in the non-perturbative analysis of nonlinear dispersive PDE, in particular being used to locate “minimal energy blowup solutions” or “minimal mass blowup solutions” for such a PDE (analogously to how one can use calculus of variations to find minimal energy solutions to a nonlinear elliptic equation); see for instance this recent survey by Killip and Visan.
In typical applications, the concentration compactness phenomenon is exploited in moderately sophisticated function spaces (such as Sobolev spaces or Strichartz spaces), with the failure of traditional compactness being connected to a moderately complicated group G of symmetries (e.g. the group generated by translations and dilations). Because of this, concentration compactness can appear to be a rather complicated and technical concept when it is first encountered. In this note, I would like to illustrate concentration compactness in a simple toy setting, namely in the space $X = \ell^1(\mathbb{Z})$ of absolutely summable sequences, with the uniform ($\ell^\infty$) metric playing the role of the intermediate metric, and the translation group $\mathbb{Z}$ playing the role of the symmetry group G. This toy setting is significantly simpler than any model that one would actually use in practice [for instance, in most applications X is a Hilbert space], but hopefully it serves to illuminate this useful concept in a less technical fashion.
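The simplest failure of compactness in this toy setting is a single bump travelling off to infinity. The sketch below is a hypothetical illustration (the dictionary representation and all function names are my own): it models finitely supported sequences on the integers, and checks that distinct translates of a unit bump stay at uniform distance 1 from each other, so no subsequence can converge uniformly, while translating each term back recovers one fixed profile.

```python
def bump(position):
    """Finitely supported sequence on the integers: mass 1 at a single position."""
    return {position: 1.0}

def translate(x, h):
    """Shift the sequence x by h places."""
    return {k + h: v for k, v in x.items()}

def sup_distance(x, y):
    """Uniform distance between two finitely supported sequences."""
    keys = set(x) | set(y)
    return max((abs(x.get(k, 0.0) - y.get(k, 0.0)) for k in keys), default=0.0)

def l1_norm(x):
    """Sum of absolute values of the entries."""
    return sum(abs(v) for v in x.values())

if __name__ == "__main__":
    travelling = [bump(n) for n in range(1, 6)]           # x_n = unit bump at n
    print([l1_norm(x) for x in travelling])               # bounded in l^1
    print(sup_distance(travelling[0], travelling[3]))     # distinct terms stay far apart
    print([translate(x, -n) for n, x in enumerate(travelling, start=1)])  # one fixed profile
```

In this example the symmetry group alone accounts for the loss of compactness: the sequence is a fixed profile composed with translations that escape to infinity.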
The last two lectures of this course will be on Ratner’s theorems on equidistribution of orbits on homogeneous spaces. Due to lack of time, I will not be able to cover all the material here that I had originally planned; in particular, for an introduction to this family of results, and its connections with number theory, I will have to refer readers to my previous blog post on these theorems. In this course, I will discuss two special cases of Ratner-type theorems. In this lecture, I will talk about Ratner-type theorems for discrete actions (of the integers $\mathbb{Z}$) on nilmanifolds; this case is much simpler than the general case, because there is a simple criterion in the nilmanifold case to test whether any given orbit is equidistributed or not. Ben Green and I recently needed to develop quantitative versions of such theorems for a number-theoretic application. In the next and final lecture of this course, I will discuss Ratner-type theorems for actions of $SL_2(\mathbb{R})$, which is simpler in a different way (due to the semisimplicity of $SL_2(\mathbb{R})$, and lack of compact factors).
In this lecture – the final one on general measure-preserving dynamics – we put together the results from the past few lectures to establish the Furstenberg-Zimmer structure theorem for measure-preserving systems, and then use this to finish the proof of the Furstenberg recurrence theorem.
In Lecture 11, we studied compact measure-preserving systems – those systems $(X, \mathcal{X}, \mu, T)$ in which every function $f \in L^2(X)$ was almost periodic, which meant that their orbit $\{ T^n f: n \in \mathbb{Z} \}$ was precompact in the $L^2(X)$ topology. Among other things, we were able to easily establish the Furstenberg recurrence theorem (Theorem 1 from Lecture 11) for such systems.
In this lecture, we generalise these results to a “relative” or “conditional” setting, in which we study systems $(X, \mathcal{X}, \mu, T)$ which are compact relative to some factor $(Y, \mathcal{Y}, \nu, S)$ of $(X, \mathcal{X}, \mu, T)$. Such systems are to compact systems as isometric extensions are to isometric systems in topological dynamics. The main result we establish here is that the Furstenberg recurrence theorem holds for such compact extensions whenever the theorem holds for the base. The proof is essentially the same as in the compact case; the main new trick is not to work in the Hilbert spaces $L^2(X, \mathcal{X}, \mu)$ over the complex numbers, but rather in the Hilbert module $L^2(X, \mathcal{X}, \mu | Y, \mathcal{Y}, \nu)$ over the (commutative) von Neumann algebra $L^\infty(Y, \mathcal{Y}, \nu)$. (Modules are to rings as vector spaces are to fields.) Because of the compact nature of the extension, it turns out that results from topological dynamics (and in particular, van der Waerden’s theorem) can be exploited to good effect in this argument.
[Note: this operator-algebraic approach is not the only way to understand these extensions; one can also proceed by disintegrating $\mu$ into fibre measures $\mu_y$ for almost every $y \in Y$ and working fibre by fibre. We will discuss the connection between the two approaches below.]
This Thursday I was at the University of Sydney, Australia, giving a public lecture on a favourite topic of mine, “Structure and randomness in the prime numbers“. My slides here are a merge between my slides for a Royal Society meeting and the slides I gave for the UCLA Science Colloquium; now that I have figured out how to use Powerpoint a little bit better, I was able to make the latter a bit more colourful (and the former less abridged).
In our final lecture on topological dynamics, we discuss a remarkable theorem of Furstenberg that classifies a major type of topological dynamical system – distal systems – in terms of highly structured (from an algebraic point of view) systems, namely towers of isometric extensions. This theorem is also a model for an important analogous result in ergodic theory, the Furstenberg-Zimmer structure theorem, which we will turn to in a few lectures. We will not be able to prove Furstenberg’s structure theorem for distal systems here in full, but we hope to illustrate some of the key points and ideas.