Let $(X,T,\mu)$ be a measure-preserving system, that is, a probability space $(X,\mu)$ equipped with a measure-preserving translation $T$ (which for simplicity of discussion we shall assume to be invertible). We will informally think of two points $x, y$ in this space as being “close” if $y = T^n x$ for some $n$ that is not too large; this allows one to distinguish between “local” structure at a point $x$ (in which one only looks at nearby points $T^n x$ for moderately large $n$) and “global” structure (in which one looks at the entire space $X$). The local/global distinction is also known as the time-averaged/space-averaged distinction in ergodic theory.
A measure-preserving system is said to be ergodic if all the invariant sets are either zero measure or full measure. An equivalent form of this statement is that any measurable function $f: X \to \mathbb{R}$ which is locally essentially constant in the sense that $f(Tx) = f(x)$ for $\mu$-almost every $x$, is necessarily globally essentially constant in the sense that there is a constant $c$ such that $f(x) = c$ for $\mu$-almost every $x$. A basic consequence of ergodicity is the mean ergodic theorem: if $f \in L^2(X,\mu)$, then the averages $x \mapsto \frac{1}{N} \sum_{n=1}^N f(T^n x)$ converge in $L^2$ norm to the mean $\int_X f\ d\mu$. (The mean ergodic theorem also applies to other $L^p$ spaces with $1 < p < \infty$, though it is usually proven first in the Hilbert space $L^2$.) Informally: in ergodic systems, time averages are asymptotically equal to space averages. Specialising to the case of indicator functions, this implies in particular that $\frac{1}{N} \sum_{n=1}^N \mu(E \cap T^{-n} E)$ converges to $\mu(E)^2$ for any measurable set $E$.
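The "time averages equal space averages" principle can be checked numerically in a simple ergodic system. The sketch below is only an illustration under stated assumptions: it uses the rotation $x \mapsto x + \alpha \bmod 1$ of the unit circle with an irrational $\alpha$ (a standard ergodic system, not one discussed above), and the names `time_average`, `indicator`, `ALPHA` and `SPACE_AVERAGE` are hypothetical. For this particular system the averages converge for every starting point (Weyl equidistribution), which is stronger than the $L^2$ convergence asserted by the mean ergodic theorem.

```python
import math

# Assumed example system: rotation of the circle [0, 1) by an irrational angle.
ALPHA = (math.sqrt(5) - 1) / 2


def time_average(f, x0, alpha, n_steps):
    """Average f over the orbit points T x0, T^2 x0, ..., T^n x0 of the rotation."""
    total = 0.0
    x = x0
    for _ in range(n_steps):
        x = (x + alpha) % 1.0
        total += f(x)
    return total / n_steps


def indicator(x):
    """Indicator function of the interval [0.2, 0.5), which has measure 0.3."""
    return 1.0 if 0.2 <= x < 0.5 else 0.0


SPACE_AVERAGE = 0.3  # integral of the indicator against Lebesgue measure

for n_steps in (10, 100, 1000, 10000):
    print(n_steps, time_average(indicator, 0.1, ALPHA, n_steps))
```

The printed time averages should approach the space average 0.3 as the number of steps grows.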
In this short note I would like to use the mean ergodic theorem to show that ergodic systems also have the property that “somewhat locally constant” functions are necessarily “somewhat globally constant”; this is not a deep observation, and probably already in the literature, but I found it a cute statement that I had not previously seen. More precisely:
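Corollary 1 Let $(X,T,\mu)$ be an ergodic measure-preserving system, and let $f: X \to \mathbb{R}$ be measurable. Suppose that

$$\limsup_{N \to \infty} \frac{1}{N} \sum_{n=1}^N \mu( \{ x \in X: f(x) = f(T^n x) \} ) \geq \delta \ \ \ \ \ (1)$$

for some $0 \leq \delta \leq 1$. Then there exists a constant $c$ such that $f(x) = c$ for $x$ in a set of measure at least $\delta$.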
Informally: if $f$ is locally constant on pairs $x, T^n x$ at least $\delta$ of the time, then $f$ is globally constant at least $\delta$ of the time. Of course the claim fails if the ergodicity hypothesis is dropped, as one can simply take $f$ to be an invariant function that is not essentially constant, such as the indicator function of an invariant set of intermediate measure. This corollary can be viewed as a manifestation of the general principle that ergodic systems have the same “global” (or “space-averaged”) behaviour as “local” (or “time-averaged”) behaviour, in contrast to non-ergodic systems in which local properties do not automatically transfer over to their global counterparts.
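Proof: By composing $f$ with (say) the arctangent function, we may assume without loss of generality that $f$ is bounded. Let $m \geq 1$ be a natural number, and partition $X$ as $\bigcup_{k \in \mathbb{Z}} E_{k,m}$, where $E_{k,m}$ is the level set

$$E_{k,m} := \{ x \in X: \tfrac{k}{m} \leq f(x) < \tfrac{k+1}{m} \}.$$

For each $m$, only finitely many of the $E_{k,m}$ are non-empty. By (1), one has

$$\limsup_{N \to \infty} \sum_k \frac{1}{N} \sum_{n=1}^N \mu( E_{k,m} \cap T^{-n} E_{k,m} ) \geq \delta.$$

Using the ergodic theorem, we conclude that

$$\sum_k \mu(E_{k,m})^2 \geq \delta.$$

On the other hand, $\sum_k \mu(E_{k,m}) = 1$. Thus there exists $k_m$ such that $\mu(E_{k_m,m}) \geq \delta$, thus

$$\mu( \{ x \in X: \tfrac{k_m}{m} \leq f(x) < \tfrac{k_m+1}{m} \} ) \geq \delta.$$

By the Bolzano-Weierstrass theorem, we may pass to a subsequence where $k_m/m$ converges to a limit $c$; then for every $\varepsilon > 0$ we have $E_{k_m,m} \subset \{ x \in X: |f(x) - c| \leq \varepsilon \}$ for infinitely many $m$, so that $\mu(\{ x \in X: |f(x) - c| \leq \varepsilon \}) \geq \delta$, and hence (sending $\varepsilon \to 0$)

$$\mu( \{ x \in X: f(x) = c \} ) \geq \delta.$$

The claim follows.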
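Let $G = (G,+)$, $H = (H,+)$ be additive groups (i.e., groups with an abelian addition group law). A map $f: G \to H$ is a homomorphism if one has

$$f(x+y) = f(x) + f(y) \ \ \ \ \ (1)$$

for all $x, y \in G$. A map $f: G \to H$ is an affine homomorphism if one has

$$f(x_1) + f(x_2) = f(x_3) + f(x_4)$$

for all additive quadruples $(x_1,x_2,x_3,x_4)$ in $G$, by which we mean that $x_1,x_2,x_3,x_4 \in G$ and $x_1+x_2 = x_3+x_4$. The two notions are closely related; it is easy to verify that $f$ is an affine homomorphism if and only if $f$ is the sum of a homomorphism and a constant.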
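For completeness, here is a short worked verification of the last claim. It is a sketch that assumes the affine homomorphism condition reads $f(x_1)+f(x_2)=f(x_3)+f(x_4)$ whenever $x_1+x_2=x_3+x_4$; the symbols $c$ and $g$ are introduced only for this derivation.

```latex
\begin{align*}
&\text{Let } c := f(0) \text{ and } g(x) := f(x) - c. \\
&\text{Since } (x,\, y,\, x+y,\, 0) \text{ is an additive quadruple, } f(x) + f(y) = f(x+y) + f(0). \\
&\text{Subtracting } 2c \text{ from both sides: } g(x) + g(y) = g(x+y),
  \text{ so } g \text{ is a homomorphism and } f = g + c. \\
&\text{Conversely, if } f = g + c \text{ with } g \text{ a homomorphism and } x_1 + x_2 = x_3 + x_4, \text{ then} \\
&f(x_1) + f(x_2) = g(x_1 + x_2) + 2c = g(x_3 + x_4) + 2c = f(x_3) + f(x_4).
\end{align*}
```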
Now suppose that $H$ also has a translation-invariant metric $d$. A map $f: G \to H$ is said to be a quasimorphism if one has

$$f(x+y) = f(x) + f(y) + O(1) \ \ \ \ \ (2)$$

for all $x, y \in G$, where $O(1)$ denotes a quantity at a bounded distance from the origin. Similarly, $f$ is an affine quasimorphism if

$$f(x_1) + f(x_2) = f(x_3) + f(x_4) + O(1) \ \ \ \ \ (3)$$

for all additive quadruples $(x_1,x_2,x_3,x_4)$ in $G$. Again, one can check that $f$ is an affine quasimorphism if and only if it is the sum of a quasimorphism and a constant (with the implied constant of the quasimorphism controlled by the implied constant of the affine quasimorphism). (Since every constant is itself a quasimorphism, it is in fact the case that affine quasimorphisms are quasimorphisms, but now the implied constant in the latter is not controlled by the implied constant of the former.)
“Trivial” examples of quasimorphisms include the sum of a homomorphism and a bounded function. Are there others? In some cases, the answer is no. For instance, suppose we have a quasimorphism $f: \mathbb{Z} \to \mathbb{R}$. Iterating (2), we see that $f(kx) = k f(x) + O(k)$ for any integer $x$ and natural number $k$, which we can rewrite as $f(kx)/kx = f(x)/x + O(1/|x|)$ for non-zero $x$. Also, $f$ is Lipschitz. Sending $k \to \infty$, we can verify that $f(x)/x$ is a Cauchy sequence as $x \to \infty$ and thus tends to some limit $\alpha$; we have $\alpha = f(x)/x + O(1/x)$ for $x \geq 1$, hence $f(x) = \alpha x + O(1)$ for positive $x$, and then one can use (2) one last time to obtain $f(x) = \alpha x + O(1)$ for all $x$. Thus $f$ is the sum of the homomorphism $x \mapsto \alpha x$ and a bounded sequence.
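The conclusion above can be illustrated numerically. The following is a minimal sketch with a hypothetical example not taken from the text: the map $n \mapsto \lfloor \sqrt{2}\, n \rfloor$ from the integers to the reals, which fails to be additive by an error of at most 1. The names `f`, `defect`, `ALPHA` and the search range are assumptions made for the illustration.

```python
import math

ALPHA = math.sqrt(2)


def f(n):
    """Hypothetical quasimorphism Z -> R: the homomorphism n -> ALPHA*n, rounded down."""
    return math.floor(ALPHA * n)


def defect(x, y):
    """Failure of additivity f(x+y) - f(x) - f(y); bounded for a quasimorphism."""
    return f(x + y) - f(x) - f(y)


SAMPLE = range(-200, 201)

# The additivity defect stays bounded over the sample.
max_defect = max(abs(defect(x, y)) for x in SAMPLE for y in SAMPLE)

# The slope f(x)/x for large x recovers the underlying homomorphism.
slope = f(10**6) / 10**6

# f differs from the linear map x -> slope*x by a bounded amount on the sample.
max_error = max(abs(f(n) - slope * n) for n in SAMPLE)

print(max_defect, slope, max_error)
```

Here the ratio `f(x)/x` at a large value of `x` plays the role of the limit in the argument above, and `max_error` measures how far `f` is from the corresponding homomorphism.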
In general, one can phrase this problem in the language of group cohomology (discussed in this previous post). Call a map $f: G \to H$ a $0$-cocycle. A $1$-cocycle is a map $\rho: G \times G \to H$ obeying the identity

$$\rho(x,y+z) + \rho(y,z) = \rho(x,y) + \rho(x+y,z)$$

for all $x,y,z \in G$. Given a $0$-cocycle $f: G \to H$, one can form its derivative $\partial f: G \times G \to H$ by the formula

$$\partial f(x,y) := f(x+y) - f(x) - f(y).$$

Such functions are called $1$-coboundaries. It is easy to see that the abelian group of $1$-coboundaries is a subgroup of the abelian group of $1$-cocycles. The quotient of these two groups is the first group cohomology of $G$ with coefficients in $H$, and is denoted $H^1(G; H)$.
If a $0$-cocycle is bounded then its derivative is a bounded $1$-coboundary. The quotient of the group of bounded $1$-cocycles by the derivatives of bounded $0$-cocycles is called the bounded first group cohomology of $G$ with coefficients in $H$, and is denoted $H^1_b(G; H)$. There is an obvious homomorphism $\phi$ from $H^1_b(G; H)$ to $H^1(G; H)$, formed by taking a coset of the space of derivatives of bounded $0$-cocycles, and enlarging it to a coset of the space of $1$-coboundaries. By chasing all the definitions, we see that all quasimorphisms from $G$ to $H$ are the sum of a homomorphism and a bounded function if and only if this homomorphism $\phi$ is injective; in fact the quotient of the space of quasimorphisms by the sum of homomorphisms and bounded functions is isomorphic to the kernel of $\phi$.
In additive combinatorics, one is often working with functions which only have additive structure a fraction of the time, thus for instance (1) or (3) might only hold “$1\%$ of the time”. This makes it somewhat difficult to directly interpret the situation in terms of group cohomology. However, thanks to tools such as the Balog-Szemerédi-Gowers lemma, one can upgrade this sort of $1\%$-structure to $100\%$-structure, at the cost of restricting the domain to a smaller set. Here I record one such instance of this phenomenon, thus giving a tentative link between additive combinatorics and group cohomology. (I thank Yuval Wigderson for suggesting the problem of locating such a link.)
Theorem 1 Let
,
be additive groups with
, let
be a subset of
, let
, and let
be a function such that
for
additive quadruples
in
. Then there exists a subset
of
containing
with
, a subset
of
with
, and a function
such that
for all
(thus, the derivative
takes values in
on
), and such that for each
, one has
for
values of
.
Presumably the constants and
can be improved further, but we have not attempted to optimise these constants. We chose
as the domain on which one has a bounded derivative, as one can use the Bogolyubov lemma (see e.g., Proposition 4.39 of my book with Van Vu) to find a large Bohr set inside
. In applications, the set
need not have bounded size, or even bounded doubling; for instance, in the inverse
theory over a small finite field
, one would be interested in the situation where
is the group of
matrices with coefficients in
(for some large
), and
being the subset consisting of those matrices of rank bounded by some bound
.
Proof: By hypothesis, there are triples
such that
and
Thus, there is a set with
such that for all
, one has (6) for
pairs
with
; in particular, there exists
such that (6) holds for
values of
. Setting
, we conclude that for each
, one has
for values of
.
Consider the bipartite graph whose vertex sets are two copies of , and
and
connected by a (directed) edge if
and (7) holds. Then this graph has
edges. Applying (a slight modification of) the Balog-Szemerédi-Gowers theorem (for instance by modifying the proof of Corollary 5.19 of my book with Van Vu), we can then find a subset
of
with
with the property that for any
, there exist
triples
such that the edges
all lie in this bipartite graph. This implies that, for all
, there exist
septuples
obeying the constraints
and for
. These constraints imply in particular that
Also observe that
Thus, if and
are such that
, we see that
for octuples
in the hyperplane
By the pigeonhole principle, this implies that for any fixed , there can be at most
sets of the form
with
,
that are pairwise disjoint. Using a greedy algorithm, we conclude that there is a set
of cardinality
, such that each set
with
,
intersects
for some
, or in other words that
whenever . In particular,
This implies that there exists a subset of
with
, and an element
for each
, such that
for all . Note we may assume without loss of generality that
and
.
By construction of , and permuting labels, we can find
16-tuples
such that
and
for . We sum this to obtain
and hence by (8)
where . Since
we see that there are only possible values of
. By the pigeonhole principle, we conclude that at most
of the sets
can be disjoint. Arguing as before, we conclude that there exists a set
of cardinality
such that
whenever (10) holds.
For any , write
arbitrarily as
for some
(with
if
, and
if
) and then set
Then from (11) we have (4). For we have
, and (5) then follows from (9).
Klaus Roth, who made fundamental contributions to analytic number theory, died this Tuesday, aged 90.
I never met or communicated with Roth personally, but was certainly influenced by his work; he wrote relatively few papers, but they tended to have outsized impact. For instance, he was one of the key people (together with Bombieri) to work on simplifying and generalising the large sieve, taking it from the technically formidable original formulation of Linnik and Rényi to the clean and general almost orthogonality principle that we have today (discussed for instance in these lecture notes of mine). The paper of Roth that had the most impact on my own personal work was his three-page paper proving what is now known as Roth’s theorem on arithmetic progressions:
Theorem 1 (Roth’s theorem on arithmetic progressions) Let $A$ be a set of natural numbers of positive upper density (thus $\limsup_{N \to \infty} |A \cap \{1,\dots,N\}|/N > 0$). Then $A$ contains infinitely many arithmetic progressions $a, a+r, a+2r$ of length three (with $r$ non-zero of course).
At the heart of Roth’s elegant argument was the following (surprising at the time) dichotomy: if $A$ had some moderately large density within some arithmetic progression $P$, either one could use Fourier-analytic methods to detect the presence of an arithmetic progression of length three inside $A \cap P$, or else one could locate a long subprogression $P'$ of $P$ on which $A$ had increased density. Iterating this dichotomy by an argument now known as the density increment argument, one eventually obtains Roth’s theorem, no matter which side of the dichotomy actually holds. This argument (and the many descendants of it), based on various “dichotomies between structure and randomness”, became essential in many other results of this type, most famously perhaps in Szemerédi’s proof of his celebrated theorem on arithmetic progressions that generalised Roth’s theorem to progressions of arbitrary length. More recently, my work on the Chowla and Elliott conjectures, which was a crucial component of the solution of the Erdős discrepancy problem, relies on an entropy decrement argument which was directly inspired by the density increment argument of Roth.
The Erdős discrepancy problem is also connected with another well-known theorem of Roth:
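Theorem 2 (Roth’s discrepancy theorem for arithmetic progressions) Let $f(1),\dots,f(n)$ be a sequence in $\{-1,+1\}$. Then there exists an arithmetic progression $a+r, a+2r, \dots, a+kr$ in $\{1,\dots,n\}$ with $r$ positive such that

$$\left| \sum_{j=1}^k f(a+jr) \right| \geq c n^{1/4}$$

for an absolute constant $c > 0$.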
In fact, Roth proved a stronger estimate regarding mean square discrepancy, which I am not writing down here; as with Roth’s theorem on arithmetic progressions, his proof was short and Fourier-analytic in nature (although non-Fourier-analytic proofs have since been found, for instance the semidefinite programming proof of Lovász). The exponent $1/4$ is known to be sharp (a result of Matoušek and Spencer).
As a particular corollary of the above theorem, for an infinite sequence $f(1), f(2), \dots$ of signs, the sums $|\sum_{j=1}^k f(a+jr)|$ are unbounded in $a, r, k$. The Erdős discrepancy problem asks whether the same statement holds when $a$ is restricted to be zero. (Roth also established discrepancy theorems for other sets, such as rectangles, which will not be discussed here.)
Finally, one has to mention Roth’s most famous result, cited for instance in his Fields medal citation:
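Theorem 3 (Roth’s theorem on Diophantine approximation) Let $\alpha$ be an irrational algebraic number. Then for any $\varepsilon > 0$ there is a quantity $c_{\alpha,\varepsilon} > 0$ such that

$$\left| \alpha - \frac{a}{q} \right| > \frac{c_{\alpha,\varepsilon}}{q^{2+\varepsilon}}$$

for all rational numbers $a/q$.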
From the Dirichlet approximation theorem (or from the theory of continued fractions) we know that the exponent $2+\varepsilon$ in the denominator cannot be reduced to $2$ or below. A classical and easy theorem of Liouville gives the claim with the exponent $2+\varepsilon$ replaced by the degree of the algebraic number $\alpha$; work of Thue and Siegel reduced this exponent, but Roth was the one who obtained the near-optimal result. An important point is that the constant $c_{\alpha,\varepsilon}$ is ineffective: it is a major open problem in Diophantine approximation to produce any bound significantly stronger than Liouville’s theorem with effective constants. This is because the proof of Roth’s theorem does not exclude any single rational $a/q$ from being close to $\alpha$, but instead very ingeniously shows that one cannot have two different rationals $a/q$, $a'/q'$ that are unusually close to $\alpha$, even when the denominators $q, q'$ are very different in size. (I refer to this sort of argument as a “dueling conspiracies” argument; they are strangely prevalent throughout analytic number theory.)
An abstract finite-dimensional complex Lie algebra, or Lie algebra for short, is a finite-dimensional complex vector space $\mathfrak{g}$ together with an anti-symmetric bilinear form $[,] = [,]_{\mathfrak{g}}: \mathfrak{g} \times \mathfrak{g} \to \mathfrak{g}$ that obeys the Jacobi identity

$$[[x,y],z] + [[y,z],x] + [[z,x],y] = 0 \ \ \ \ \ (1)$$

for all $x,y,z \in \mathfrak{g}$; by anti-symmetry one can also rewrite the Jacobi identity as

$$[x,[y,z]] = [[x,y],z] + [y,[x,z]]. \ \ \ \ \ (2)$$

We will usually omit the subscript $\mathfrak{g}$ from the Lie bracket when this will not cause ambiguity. A homomorphism $\phi: \mathfrak{g} \to \mathfrak{h}$ between two Lie algebras $\mathfrak{g}, \mathfrak{h}$ is a linear map that respects the Lie bracket, thus $\phi([x,y]_{\mathfrak{g}}) = [\phi(x),\phi(y)]_{\mathfrak{h}}$ for all $x,y \in \mathfrak{g}$. As with many other classes of mathematical objects, the class of Lie algebras together with their homomorphisms then form a category. One can of course also consider Lie algebras in infinite dimension or over other fields, but we will restrict attention throughout these notes to the finite-dimensional complex case. The trivial, zero-dimensional Lie algebra is denoted $0$; Lie algebras of positive dimension will be called non-trivial.
Lie algebras come up in many contexts in mathematics, in particular arising as the tangent space of complex Lie groups. It is thus very profitable to think of Lie algebras as being the infinitesimal component of a Lie group, and in particular almost all of the notation and concepts that are applicable to Lie groups (e.g. nilpotence, solvability, extensions, etc.) have infinitesimal counterparts in the category of Lie algebras (often with exactly the same terminology). See this previous blog post for more discussion about the connection between Lie algebras and Lie groups (that post was focused over the reals instead of the complexes, but much of the discussion carries over to the complex case).
A particular example of a Lie algebra is the general linear Lie algebra $\mathfrak{gl}(V)$ of linear transformations $x: V \to V$ on a finite-dimensional complex vector space (or vector space for short) $V$, with the commutator Lie bracket $[x,y] := xy - yx$; one easily verifies that this is indeed an abstract Lie algebra. We will define a concrete Lie algebra to be a Lie algebra that is a subalgebra of $\mathfrak{gl}(V)$ for some vector space $V$, and similarly define a representation of a Lie algebra $\mathfrak{g}$ to be a homomorphism $\rho: \mathfrak{g} \to \mathfrak{h}$ into a concrete Lie algebra $\mathfrak{h}$. It is a deep theorem of Ado (discussed in this previous post) that every abstract Lie algebra is in fact isomorphic to a concrete one (or equivalently, that every abstract Lie algebra has a faithful representation), but we will not need or prove this fact here.
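The claim that the matrix commutator obeys the Lie algebra axioms can be spot-checked on examples. The sketch below assumes the commutator bracket $xy - yx$ on square matrices and checks anti-symmetry and the Jacobi identity on random integer matrices (exact arithmetic, so no rounding is involved); the helper names `matmul`, `bracket`, `jacobi_defect` and `random_matrix` are hypothetical. A check on examples is of course not a proof.

```python
import random


def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(a, b):
    """Entrywise sum of two matrices."""
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def sub(a, b):
    """Entrywise difference of two matrices."""
    return [[p - q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def bracket(x, y):
    """Commutator bracket xy - yx."""
    return sub(matmul(x, y), matmul(y, x))


def jacobi_defect(x, y, z):
    """[[x,y],z] + [[y,z],x] + [[z,x],y], which should be the zero matrix."""
    return add(add(bracket(bracket(x, y), z), bracket(bracket(y, z), x)),
               bracket(bracket(z, x), y))


def random_matrix(n, rng):
    """Random n x n matrix with small integer entries."""
    return [[rng.randint(-5, 5) for _ in range(n)] for _ in range(n)]


rng = random.Random(0)
x, y, z = (random_matrix(3, rng) for _ in range(3))
zero = [[0] * 3 for _ in range(3)]

print(jacobi_defect(x, y, z) == zero)              # Jacobi identity
print(add(bracket(x, y), bracket(y, x)) == zero)   # anti-symmetry
```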
Even without Ado’s theorem, though, the structure of abstract Lie algebras is very well understood. As with objects in many other algebraic categories, a basic way to understand a Lie algebra is to factor it into two simpler algebras
via a short exact sequence
thus one has an injective homomorphism from to
and a surjective homomorphism from
to
such that the image of the former homomorphism is the kernel of the latter. (To be pedantic, a short exact sequence in a general category requires these homomorphisms to be monomorphisms and epimorphisms respectively, but in the category of Lie algebras these turn out to reduce to the more familiar concepts of injectivity and surjectivity respectively.) Given such a sequence, one can (non-uniquely) identify
with the vector space
equipped with a Lie bracket of the form
for some bilinear maps and
that obey some Jacobi-type identities which we will not record here. Understanding exactly what maps
are possible here (up to coordinate change) can be a difficult task (and is one of the key objectives of Lie algebra cohomology), but in principle at least, the problem of understanding
can be reduced to that of understanding that of its factors
. To emphasise this, I will (perhaps idiosyncratically) express the existence of a short exact sequence (3) by the ATLAS-type notation
although one should caution that for given and
, there can be multiple non-isomorphic
that can form a short exact sequence with
, so that
is not a uniquely defined combination of
and
; one could emphasise this by writing
instead of
, though we will not do so here. We will refer to
as an extension of
by
, and read the notation (5) as “
is
-by-
“; confusingly, these two notations reverse the subject and object of “by”, but unfortunately both notations are well entrenched in the literature. We caution that the operation
is not commutative, and it is only partly associative: every Lie algebra of the form
is also of the form
, but the converse is not true (see this previous blog post for some related discussion). As we are working in the infinitesimal world of Lie algebras (which have an additive group operation) rather than Lie groups (in which the group operation is usually written multiplicatively), it may help to think of
as a (twisted) “sum” of
and
rather than a “product”; for instance, we have
and
, and also
.
Special examples of extensions of
by
include the direct sum (or direct product)
(also denoted
), which is given by the construction (4) with
and
both vanishing, and the split extension (or semidirect product)
(also denoted
), which is given by the construction (4) with
vanishing and the bilinear map
taking the form
for some representation of
in the concrete Lie algebra of derivations
of
, that is to say the algebra of linear maps
that obey the Leibniz rule
for all . (The derivation algebra
of a Lie algebra
is analogous to the automorphism group
of a Lie group
, with the two concepts being intertwined by the tangent space functor
from Lie groups to Lie algebras (i.e. the derivation algebra is the infinitesimal version of the automorphism group). Of course, this functor also intertwines the Lie algebra and Lie group versions of most of the other concepts discussed here, such as extensions, semidirect products, etc.)
There are two general ways to factor a Lie algebra as an extension
of a smaller Lie algebra
by another smaller Lie algebra
. One is to locate a Lie algebra ideal (or ideal for short)
in
, thus
, where
denotes the Lie algebra generated by
, and then take
to be the quotient space
in the usual manner; one can check that
,
are also Lie algebras and that we do indeed have a short exact sequence
Conversely, whenever one has a factorisation , one can identify
with an ideal in
, and
with the quotient of
by
.
The other general way to obtain such a factorisation is to start with a homomorphism of
into another Lie algebra
, take
to be the image
of
, and
to be the kernel
. Again, it is easy to see that this does indeed create a short exact sequence:
Conversely, whenever one has a factorisation , one can identify
with the image of
under some homomorphism, and
with the kernel of that homomorphism. Note that if a representation
is faithful (i.e. injective), then the kernel is trivial and
is isomorphic to
.
Now we consider some examples of factoring some class of Lie algebras into simpler Lie algebras. The easiest examples of Lie algebras to understand are the abelian Lie algebras $\mathfrak{g}$, in which the Lie bracket identically vanishes. Every one-dimensional Lie algebra is automatically abelian, and thus isomorphic to the scalar algebra $\mathbb{C}$. Conversely, by using an arbitrary linear basis of $\mathfrak{g}$, we see that an abelian Lie algebra is isomorphic to the direct sum of one-dimensional algebras. Thus, a Lie algebra is abelian if and only if it is isomorphic to the direct sum of finitely many copies of $\mathbb{C}$.
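Now consider a Lie algebra $\mathfrak{g}$ that is not necessarily abelian. We then form the derived algebra $[\mathfrak{g},\mathfrak{g}]$; this algebra is trivial if and only if $\mathfrak{g}$ is abelian. It is easy to see that $[\mathfrak{h},\mathfrak{k}]$ is an ideal whenever $\mathfrak{h},\mathfrak{k}$ are ideals, so in particular the derived algebra $[\mathfrak{g},\mathfrak{g}]$ is an ideal and we thus have the short exact sequence

$$0 \to [\mathfrak{g},\mathfrak{g}] \to \mathfrak{g} \to \mathfrak{g}/[\mathfrak{g},\mathfrak{g}] \to 0.$$

The algebra $\mathfrak{g}/[\mathfrak{g},\mathfrak{g}]$ is the maximal abelian quotient of $\mathfrak{g}$, and is known as the abelianisation of $\mathfrak{g}$. If it is trivial, we call the Lie algebra perfect. If instead it is non-trivial, then the derived algebra has strictly smaller dimension than $\mathfrak{g}$. From this, it is natural to associate two series to any Lie algebra $\mathfrak{g}$, the lower central series

$$\mathfrak{g}_1 := \mathfrak{g}; \quad \mathfrak{g}_2 := [\mathfrak{g}, \mathfrak{g}_1]; \quad \mathfrak{g}_3 := [\mathfrak{g}, \mathfrak{g}_2]; \quad \dots$$

and the derived series

$$\mathfrak{g}^{(1)} := \mathfrak{g}; \quad \mathfrak{g}^{(2)} := [\mathfrak{g}^{(1)}, \mathfrak{g}^{(1)}]; \quad \mathfrak{g}^{(3)} := [\mathfrak{g}^{(2)}, \mathfrak{g}^{(2)}]; \quad \dots$$

By induction we see that these are both decreasing series of ideals of $\mathfrak{g}$, with the derived series being slightly smaller ($\mathfrak{g}^{(k)} \subseteq \mathfrak{g}_k$ for all $k$). We say that a Lie algebra is nilpotent if its lower central series is eventually trivial, and solvable if its derived series eventually becomes trivial. Thus, abelian Lie algebras are nilpotent, and nilpotent Lie algebras are solvable, but the converses are not necessarily true. For instance, in the general linear Lie algebra $\mathfrak{gl}_n = \mathfrak{gl}(\mathbb{C}^n)$, which can be identified with the Lie algebra of $n \times n$ complex matrices, the subalgebra $\mathfrak{n}$ of strictly upper triangular matrices is nilpotent (but not abelian for $n \geq 3$), while the subalgebra $\mathfrak{b}$ of upper triangular matrices is solvable (but not nilpotent for $n \geq 2$). It is also clear that any subalgebra of a nilpotent algebra is nilpotent, and similarly for solvable or abelian algebras.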
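The two matrix examples can be verified by direct computation for small sizes. The sketch below makes several assumptions for illustration: it works with $3 \times 3$ matrices, spans each term of a series by the brackets of basis elements, and finds dimensions by exact Gaussian elimination over the rationals. All function names (`unit`, `independent_subset`, `bracket_span`, `lower_central_dims`, `derived_dims`) are hypothetical. Each function returns the list of dimensions of the successive terms, stopping once the series becomes zero or stops shrinking.

```python
from fractions import Fraction


def unit(n, i, j):
    """Elementary n x n matrix with a 1 in position (i, j)."""
    m = [[0] * n for _ in range(n)]
    m[i][j] = 1
    return m


def bracket(x, y):
    """Commutator bracket xy - yx of two square matrices."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] - y[i][k] * x[k][j] for k in range(n))
             for j in range(n)] for i in range(n)]


def independent_subset(mats):
    """Maximal linearly independent sublist of mats (exact Gaussian elimination)."""
    rows, chosen = [], []
    for m in mats:
        v = [Fraction(e) for row in m for e in row]
        for pivot, r in rows:
            if v[pivot] != 0:
                c = v[pivot] / r[pivot]
                v = [a - c * b for a, b in zip(v, r)]
        pivot = next((i for i, e in enumerate(v) if e != 0), None)
        if pivot is not None:
            rows.append((pivot, v))
            chosen.append(m)
    return chosen


def bracket_span(a_basis, b_basis):
    """Basis of the span of all brackets [a, b], a in a_basis, b in b_basis."""
    return independent_subset([bracket(a, b) for a in a_basis for b in b_basis])


def lower_central_dims(basis):
    """Dimensions of g, [g, g], [g, [g, g]], ... until zero or stable."""
    dims, current = [], basis
    while True:
        dims.append(len(current))
        nxt = bracket_span(basis, current)
        if len(nxt) == len(current):
            return dims
        current = nxt


def derived_dims(basis):
    """Dimensions of g, [g, g], [[g, g], [g, g]], ... until zero or stable."""
    dims, current = [], basis
    while True:
        dims.append(len(current))
        nxt = bracket_span(current, current)
        if len(nxt) == len(current):
            return dims
        current = nxt


N = 3
strict_upper = [unit(N, i, j) for i in range(N) for j in range(i + 1, N)]
upper = [unit(N, i, j) for i in range(N) for j in range(i, N)]

print(lower_central_dims(strict_upper))  # ends in 0: nilpotent
print(lower_central_dims(upper))         # stabilises above 0: not nilpotent
print(derived_dims(upper))               # ends in 0: solvable
```

A series that ends in dimension 0 has become trivial; one that stops at a positive dimension has stabilised at a non-trivial ideal.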
From the above discussion we see that a Lie algebra is solvable if and only if it can be represented by a tower of abelian extensions, thus
for some abelian . Similarly, a Lie algebra
is nilpotent if it is expressible as a tower of central extensions (so that in all the extensions
in the above factorisation,
is central in
, where we say that
is central in
if
). We also see that an extension
is solvable if and only if both factors
are solvable. Splitting abelian algebras into cyclic (i.e. one-dimensional) ones, we thus see that a finite-dimensional Lie algebra is solvable if and only if it is polycyclic, i.e. it can be represented by a tower of cyclic extensions.
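For our next fundamental example of using short exact sequences to split a general Lie algebra into simpler objects, we observe that every abstract Lie algebra $\mathfrak{g}$ has an adjoint representation $\mathrm{ad}: \mathfrak{g} \to \mathrm{ad}(\mathfrak{g}) \subset \mathfrak{gl}(\mathfrak{g})$, where for each $x \in \mathfrak{g}$, $\mathrm{ad}\, x: \mathfrak{g} \to \mathfrak{g}$ is the linear map $(\mathrm{ad}\, x)(y) := [x,y]$; one easily verifies that this is indeed a representation (indeed, (2) is equivalent to the assertion that $\mathrm{ad}\, [x,y] = [\mathrm{ad}\, x, \mathrm{ad}\, y]$ for all $x,y \in \mathfrak{g}$). The kernel of this representation is the center $Z(\mathfrak{g}) := \{ x \in \mathfrak{g}: [x,\mathfrak{g}] = 0 \}$, which is the maximal central subalgebra of $\mathfrak{g}$. We thus have the short exact sequence

$$0 \to Z(\mathfrak{g}) \to \mathfrak{g} \to \mathrm{ad}(\mathfrak{g}) \to 0$$

which, among other things, shows that every abstract Lie algebra is a central extension of a concrete Lie algebra (which can serve as a cheap substitute for Ado’s theorem mentioned earlier).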
For our next fundamental decomposition of Lie algebras, we need some more definitions. A Lie algebra is simple if it is non-abelian and has no ideals other than
and
; thus simple Lie algebras cannot be factored
into strictly smaller algebras
. In particular, simple Lie algebras are automatically perfect and centerless. We have the following fundamental theorem:
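Theorem 1 (Equivalent definitions of semisimplicity) Let $\mathfrak{g}$ be a Lie algebra. Then the following are equivalent:
- (i) $\mathfrak{g}$ does not contain any non-trivial solvable ideal.
- (ii) $\mathfrak{g}$ does not contain any non-trivial abelian ideal.
- (iii) The Killing form $K: \mathfrak{g} \times \mathfrak{g} \to \mathbb{C}$, defined as the bilinear form $K(x,y) := \mathrm{tr}_{\mathfrak{g}}( (\mathrm{ad}\, x) (\mathrm{ad}\, y) )$, is non-degenerate on $\mathfrak{g}$.
- (iv) $\mathfrak{g}$ is isomorphic to the direct sum of finitely many non-abelian simple Lie algebras.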
We review the proof of this theorem later in these notes. A Lie algebra obeying any (and hence all) of the properties (i)-(iv) is known as a semisimple Lie algebra. The statement (iv) is usually taken as the definition of semisimplicity; the equivalence of (iv) and (i) is a special case of Weyl’s complete reducibility theorem (see Theorem 44), and the equivalence of (iv) and (iii) is known as the Cartan semisimplicity criterion. (The equivalence of (i) and (ii) is easy.)
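As a concrete instance of criterion (iii), one can compute the Killing form of the Lie algebra of traceless $2 \times 2$ matrices by hand or by machine. The sketch below assumes the standard definition of the Killing form as the trace of the composition of two adjoint maps, and uses the usual basis $E, F, H$ of this algebra; the helper names `coords`, `ad`, `killing` and `det3` are hypothetical.

```python
# Standard basis of the traceless 2 x 2 matrices.
E = [[0, 1], [0, 0]]
F = [[0, 0], [1, 0]]
H = [[1, 0], [0, -1]]
BASIS = [E, F, H]


def bracket(x, y):
    """Commutator bracket xy - yx of 2 x 2 matrices."""
    return [[sum(x[i][k] * y[k][j] - y[i][k] * x[k][j] for k in range(2))
             for j in range(2)] for i in range(2)]


def coords(m):
    """Coordinates of a traceless matrix [[a, b], [c, -a]] in the basis (E, F, H)."""
    return [m[0][1], m[1][0], m[0][0]]


def ad(x):
    """Matrix of the map y -> [x, y] in the basis (E, F, H)."""
    cols = [coords(bracket(x, b)) for b in BASIS]
    return [[cols[j][i] for j in range(3)] for i in range(3)]


def killing(x, y):
    """Trace of (ad x)(ad y)."""
    ax, ay = ad(x), ad(y)
    return sum(ax[i][k] * ay[k][i] for i in range(3) for k in range(3))


def det3(m):
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


gram = [[killing(x, y) for y in BASIS] for x in BASIS]
print(gram)        # Gram matrix of the Killing form in the basis (E, F, H)
print(det3(gram))  # non-zero determinant means the form is non-degenerate
```

The Gram matrix has non-zero determinant, so the Killing form is non-degenerate, consistent with this algebra being simple.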
If and
are solvable ideals of a Lie algebra
, then it is not difficult to see that the vector sum
is also a solvable ideal (because on quotienting by
we see that the derived series of
must eventually fall inside
, and thence must eventually become trivial by the solvability of
). As our Lie algebras are finite dimensional, we conclude that
has a unique maximal solvable ideal, known as the radical
of
. The quotient
is then a Lie algebra with trivial radical, and is thus semisimple by the above theorem, giving the Levi decomposition
expressing an arbitrary Lie algebra as an extension of a semisimple Lie algebra by a solvable algebra
(and it is not hard to see that this is the only possible such extension up to isomorphism). Indeed, a deep theorem of Levi allows one to upgrade this decomposition to a split extension
although we will not need or prove this result here.
In view of the above decompositions, we see that we can factor any Lie algebra (using a suitable combination of direct sums and extensions) into a finite number of simple Lie algebras and the scalar algebra $\mathbb{C}$. In principle, this means that one can understand an arbitrary Lie algebra once one understands all the simple Lie algebras (which, being defined over $\mathbb{C}$, are somewhat confusingly referred to as simple complex Lie algebras in the literature). Amazingly, this latter class of algebras is completely classified:
Theorem 2 (Classification of simple Lie algebras) Up to isomorphism, every simple Lie algebra is of one of the following forms:
- $A_n = \mathfrak{sl}_{n+1}$ for some $n \geq 1$.
- $B_n = \mathfrak{so}_{2n+1}$ for some $n \geq 2$.
- $C_n = \mathfrak{sp}_{2n}$ for some $n \geq 3$.
- $D_n = \mathfrak{so}_{2n}$ for some $n \geq 4$.
- $E_6, E_7$, or $E_8$.
- $F_4$.
- $G_2$.
(The precise definition of the classical Lie algebras $A_n, B_n, C_n, D_n$ and the exceptional Lie algebras $E_6, E_7, E_8, F_4, G_2$ will be recalled later.)
(One can extend the families of classical Lie algebras a little bit to smaller values of $n$, but the resulting algebras are either isomorphic to other algebras on this list, or cease to be simple; see this previous post for further discussion.)
This classification is a basic starting point for the classification of many other related objects, including Lie algebras and Lie groups over more general fields (e.g. the reals $\mathbb{R}$), as well as finite simple groups. Being so fundamental to the subject, this classification is covered in almost every basic textbook in Lie algebras, and I myself learned it many years ago in an honours undergraduate course back in Australia. The proof is rather lengthy, though, and I have always had difficulty keeping it straight in my head. So I have decided to write some notes on the classification in this blog post, aiming to be self-contained (though moving rapidly). There is no new material in this post, though; it is all drawn from standard reference texts (I relied particularly on Fulton and Harris’s text, which I highly recommend). In fact it seems remarkably hard to deviate from the standard routes given in the literature to the classification; I would be interested in knowing about other ways to reach the classification (or substeps in that classification) that are genuinely different from the orthodox route.
This week I am in Bremen, where the 50th International Mathematical Olympiad is being held. A number of former Olympians (Béla Bollobás, Tim Gowers, Laci Lovász, Stas Smirnov, Jean-Christophe Yoccoz, and myself) were invited to give a short talk (20 minutes in length) at the celebratory event for this anniversary. I chose to talk on a topic I have spoken about several times before, on “Structure and randomness in the prime numbers”. Given the time constraints, there was a limit as to how much substance I could put into the talk; but I tried to describe, in very general terms, what we know about the primes, and what we suspect to be true, but cannot yet establish. As I have mentioned in previous talks, the key problem is that we suspect the distribution of the primes to obey no significant patterns (other than “local” structure, such as having a strong tendency to be odd (which is local information at the 2 place), or obeying the prime number theorem (which is local information at the infinity place)), but we still do not have fully satisfactory tools for establishing the absence of a pattern. (This is in contrast with many types of Olympiad problems, where the key to solving a problem often lies in discovering the right pattern or structure in the problem to exploit.)
The PDF of the talk is here; I decided to try out the Beamer LaTeX package for a change.
One of the most important topological concepts in analysis is that of compactness (as discussed for instance in my Companion article on this topic). There are various flavours of this concept, but let us focus on sequential compactness: a subset E of a topological space X is sequentially compact if every sequence in E has a convergent subsequence whose limit is also in E. This property allows one to do many things with the set E. For instance, it allows one to maximise a functional on E:
Proposition 1. (Existence of extremisers) Let E be a non-empty sequentially compact subset of a topological space X, and let $F: E \to \mathbb{R}$ be a continuous function. Then the supremum $\sup_{x \in E} F(x)$ is attained at at least one point $x_* \in E$, thus $F(x) \leq F(x_*)$ for all $x \in E$. (In particular, this supremum is finite.) Similarly for the infimum.
Proof. Let $-\infty < L \leq +\infty$ be the supremum $L := \sup_{x \in E} F(x)$. By the definition of supremum (and the axiom of (countable) choice), one can find a sequence $x_n$ in E such that $F(x_n) \to L$. By compactness, we can refine this sequence to a subsequence (which, by abuse of notation, we shall continue to call $x_n$) such that $x_n$ converges to a limit x in E. Since we still have $F(x_n) \to L$, and F is continuous at x, we conclude that F(x)=L, and the claim for the supremum follows. The claim for the infimum is similar.
Remark 1. An inspection of the argument shows that one can relax the continuity hypothesis on F somewhat: to attain the supremum, it suffices that F be upper semicontinuous, and to attain the infimum, it suffices that F be lower semicontinuous.
We thus see that sequential compactness is useful, among other things, for ensuring the existence of extremisers. In finite-dimensional spaces (such as the vector spaces $\mathbb{R}^n$), compact sets are plentiful; indeed, the Heine-Borel theorem asserts that every closed and bounded set is compact. However, once one moves to infinite-dimensional spaces, such as function spaces, then the Heine-Borel theorem fails quite dramatically; most of the closed and bounded sets one encounters in a topological vector space are non-compact, if one insists on using a reasonably “strong” topology. This causes a difficulty in (among other things) calculus of variations, which is often concerned with finding extremisers of a functional F on a subset E of an infinite-dimensional function space X.
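A standard example of this failure (not taken from the text above) is the sequence of coordinate unit vectors in the sequence space $\ell^2$: they all lie in the closed unit ball, yet any two of them are at distance $\sqrt{2}$, so no subsequence can converge in norm. The sketch below illustrates this numerically. As an assumption made for computation, each vector is truncated to its first `DIM` coordinates, which does not change the distances among the first `DIM` unit vectors; the names `basis_vector`, `dist` and `DIM` are hypothetical.

```python
import math

DIM = 50  # assumed truncation length for the illustration


def basis_vector(n, dim):
    """The n-th coordinate unit vector, truncated to dim coordinates."""
    return [1.0 if i == n else 0.0 for i in range(dim)]


def dist(u, v):
    """Euclidean (l^2) distance between two finite vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


vectors = [basis_vector(n, DIM) for n in range(DIM)]
origin = [0.0] * DIM

# Every vector lies on the unit sphere, so the sequence is bounded.
norms = [dist(v, origin) for v in vectors]

# Distinct vectors stay a fixed distance apart, so no subsequence is Cauchy.
gaps = [dist(vectors[i], vectors[j])
        for i in range(DIM) for j in range(i + 1, DIM)]

print(max(norms), min(gaps), max(gaps))
```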
In recent decades, mathematicians have found a number of ways to get around this difficulty. One of them is to weaken the topology to recover compactness, taking advantage of such results as the Banach-Alaoglu theorem (or its sequential counterpart). Of course, there is a tradeoff: weakening the topology makes compactness easier to attain, but makes the continuity of F harder to establish. Nevertheless, if F enjoys enough “smoothing” or “cancellation” properties, one can hope to obtain continuity in the weak topology, allowing one to do things such as locate extremisers. (The phenomenon that cancellation can lead to continuity in the weak topology is sometimes referred to as compensated compactness.)
Another option is to abandon trying to make all sequences have convergent subsequences, and settle just for extremising sequences to have convergent subsequences, as this would still be enough to retain Theorem 1. Pursuing this line of thought leads to the Palais-Smale condition, which is a substitute for compactness in some calculus of variations situations.
But in many situations, one cannot weaken the topology to the point where the domain E becomes compact, without destroying the continuity (or semi-continuity) of F, though one can often at least find an intermediate topology (or metric) in which F is continuous, but for which E is still not quite compact. Thus one can find sequences in E which do not have any subsequences that converge to a constant element $x \in E$, even in this intermediate metric. (As we shall see shortly, one major cause of this failure of compactness is the existence of a non-trivial action of a non-compact group G on E; such a group action can cause compensated compactness or the Palais-Smale condition to fail also.) Because of this, it is a priori conceivable that a continuous function F need not attain its supremum or infimum.
Nevertheless, even though a sequence $x_n$ does not have any subsequences that converge to a constant x, it may have a subsequence (which we also call $x_n$) which converges to some non-constant sequence $y_n$ (in the sense that the distance $d(x_n, y_n)$ between the subsequence and the new sequence in this intermediate metric tends to zero), where the approximating sequence $y_n$ is of a very structured form (e.g. “concentrating” to a point, or “travelling” off to infinity, or a superposition $y_n = \sum_j y_n^{(j)}$ of several concentrating or travelling profiles of this form). This weaker form of compactness, in which superpositions of a certain type of profile completely describe all the failures (or defects) of compactness, is known as concentration compactness, and the decomposition $x_n \approx \sum_j y_n^{(j)}$ of the subsequence is known as the profile decomposition. In many applications, it is a sufficiently good substitute for compactness that one can still do things like locate extremisers for functionals F – though one often has to make some additional assumptions on F to compensate for the more complicated nature of the compactness. This phenomenon was systematically studied by P.L. Lions in the 1980s, and found great application in calculus of variations and nonlinear elliptic PDE. More recently, concentration compactness has been a crucial and powerful tool in the non-perturbative analysis of nonlinear dispersive PDE, in particular being used to locate “minimal energy blowup solutions” or “minimal mass blowup solutions” for such a PDE (analogously to how one can use calculus of variations to find minimal energy solutions to a nonlinear elliptic equation); see for instance this recent survey by Killip and Visan.
In typical applications, the concentration compactness phenomenon is exploited in moderately sophisticated function spaces (such as Sobolev spaces or Strichartz spaces), with the failure of traditional compactness being connected to a moderately complicated group G of symmetries (e.g. the group generated by translations and dilations). Because of this, concentration compactness can appear to be a rather complicated and technical concept when it is first encountered. In this note, I would like to illustrate concentration compactness in a simple toy setting, namely in the space $\ell^1(\mathbb{Z})$ of absolutely summable sequences, with the uniform ($\ell^\infty$) metric playing the role of the intermediate metric, and the translation group $\mathbb{Z}$ playing the role of the symmetry group G. This toy setting is significantly simpler than any model that one would actually use in practice [for instance, in most applications X is a Hilbert space], but hopefully it serves to illuminate this useful concept in a less technical fashion.
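The toy setting can be explored numerically. The following is a minimal sketch, assuming finitely supported sequences stored as `{index: value}` dictionaries; the helper names (`translate`, `l1_norm`, `linf_dist`, `dispersing`) and the particular profile are hypothetical choices made for illustration, not part of the original note. It shows a "travelling" sequence of translates that has no convergent subsequence in the uniform metric but becomes constant once the translation is undone, together with a "dispersing" sequence of unit absolutely summable norm that tends to zero uniformly.

```python
# Hypothetical helpers: finitely supported sequences on the integers are
# stored as {index: value} dictionaries (an assumption of this sketch).

def translate(x, h):
    """Shift x by h places, so the entry at index k moves to index k + h."""
    return {k + h: v for k, v in x.items()}

def l1_norm(x):
    """Sum of the absolute values of the entries."""
    return sum(abs(v) for v in x.values())

def linf_dist(x, y):
    """Uniform distance: the largest |x[k] - y[k]| over all indices k."""
    keys = set(x) | set(y)
    return max((abs(x.get(k, 0.0) - y.get(k, 0.0)) for k in keys), default=0.0)

# A fixed profile and its translates x_n, which "travel off to infinity".
profile = {0: 1.0, 1: 0.5}
travelling = [translate(profile, n) for n in range(10)]

# Distinct translates stay at uniform distance 1 from one another, so no
# subsequence of the translates can converge in the uniform metric.
gaps = {linf_dist(travelling[n], travelling[m])
        for n in range(10) for m in range(10) if n != m}

# Undoing the translation recovers the profile exactly.
recentred = [translate(x, -n) for n, x in enumerate(travelling)]

def dispersing(n):
    """Mass 1/n on each of n consecutive sites: l^1 norm 1, sup norm 1/n."""
    return {k: 1.0 / n for k in range(n)}

print(gaps)
print(all(x == profile for x in recentred))
print([linf_dist(dispersing(n), {}) for n in (1, 10, 100)])
```

The dictionary representation was chosen so that translation by any integer is exact and needs no fixed window; an array-based version would have to handle entries leaving the window.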
The last two lectures of this course will be on Ratner’s theorems on equidistribution of orbits on homogeneous spaces. Due to lack of time, I will not be able to cover all the material here that I had originally planned; in particular, for an introduction to this family of results, and its connections with number theory, I will have to refer readers to my previous blog post on these theorems. In this course, I will discuss two special cases of Ratner-type theorems. In this lecture, I will talk about Ratner-type theorems for discrete actions (of the integers) on nilmanifolds; this case is much simpler than the general case, because there is a simple criterion in the nilmanifold case to test whether any given orbit is equidistributed or not. Ben Green and I recently needed to develop quantitative versions of such theorems for a number-theoretic application. In the next and final lecture of this course, I will discuss Ratner-type theorems for actions of
$SL_2(\mathbb{R})$, which is simpler in a different way (due to the semisimplicity of $SL_2(\mathbb{R})$, and lack of compact factors).
In this lecture – the final one on general measure-preserving dynamics – we put together the results from the past few lectures to establish the Furstenberg-Zimmer structure theorem for measure-preserving systems, and then use this to finish the proof of the Furstenberg recurrence theorem.
In Lecture 11, we studied compact measure-preserving systems – those systems in which every function $f \in L^2(X)$ was almost periodic, which meant that its orbit $\{ T^n f : n \in \mathbb{Z} \}$ was precompact in the $L^2(X)$ topology. Among other things, we were able to easily establish the Furstenberg recurrence theorem (Theorem 1 from Lecture 11) for such systems.
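As a concrete instance of almost periodicity, consider a circle rotation. This example and its conventions are assumptions of the sketch rather than something quoted from Lecture 11: the space is $\mathbb{R}/\mathbb{Z}$ with Lebesgue measure, the shift is $Tx := x + \alpha$, and $T^n f$ is taken to mean $f \circ T^n$ (the opposite convention only changes the sign of $n$).

```latex
% Assumed example: X = \mathbb{R}/\mathbb{Z}, T x := x + \alpha, T^n f := f \circ T^n.
\[
f(x) := e^{2\pi i k x}
\quad \Longrightarrow \quad
T^n f(x) = e^{2\pi i k (x + n\alpha)} = e^{2\pi i k n \alpha} f(x),
\]
\[
\{ T^n f : n \in \mathbb{Z} \} \subset \{ c f : c \in \mathbb{C},\ |c| = 1 \}.
\]
```

The set on the right is the continuous image of the unit circle, hence compact in the norm topology of square-integrable functions, so the orbit of each such character f is precompact and f is almost periodic.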
In this lecture, we generalise these results to a “relative” or “conditional” setting, in which we study systems which are compact relative to some factor $Y$ of $X$. Such systems are to compact systems as isometric extensions are to isometric systems in topological dynamics. The main result we establish here is that the Furstenberg recurrence theorem holds for such compact extensions whenever the theorem holds for the base. The proof is essentially the same as in the compact case; the main new trick is not to work in the Hilbert spaces $L^2(X)$ over the complex numbers, but rather in the Hilbert module $L^2(X|Y)$ over the (commutative) von Neumann algebra $L^\infty(Y)$. (Modules are to rings as vector spaces are to fields.) Because of the compact nature of the extension, it turns out that results from topological dynamics (and in particular, van der Waerden’s theorem) can be exploited to good effect in this argument.
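For orientation, one common way to set up such a Hilbert module is through conditional expectation. The notation below is an assumption of this sketch ($Y$ for the factor, $\mathbb{E}(\cdot \mid Y)$ for conditional expectation onto it) and may differ in detail from the definitions used in the lecture itself.

```latex
% Assumed notation: Y is the factor, \mathbb{E}(\cdot \mid Y) the conditional expectation.
\[
\langle f, g \rangle_{L^2(X|Y)} := \mathbb{E}( f \overline{g} \mid Y ),
\qquad
\| f \|_{L^2(X|Y)} := \langle f, f \rangle_{L^2(X|Y)}^{1/2},
\]
\[
L^2(X|Y) := \{ f \in L^2(X) : \| f \|_{L^2(X|Y)} \in L^\infty(Y) \},
\]
\[
\langle c f, g \rangle_{L^2(X|Y)} = c \, \langle f, g \rangle_{L^2(X|Y)}
\quad \text{for all } c \in L^\infty(Y).
\]
```

The last identity is the module analogue of linearity: the "scalars" are now bounded functions on the factor rather than complex numbers, and the inner product takes values in functions on the factor rather than in the complex numbers.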
[Note: this operator-algebraic approach is not the only way to understand these extensions; one can also proceed by disintegrating into fibre measures $\mu_y$ for almost every point $y$ of the factor and working fibre by fibre. We will discuss the connection between the two approaches below.]