You are currently browsing the monthly archive for January 2009.

To progress further in our study of function spaces, we will need to develop the standard theory of metric spaces, and the closely related theory of topological spaces (i.e. point-set topology).  I will be assuming that students in my class will already have encountered these concepts in an undergraduate topology or real analysis course, but for the sake of completeness I will briefly review the basics of both spaces here.

I have recently finished a draft version of my blog book “Poincaré’s legacies: pages from year two of a mathematical blog”, which covers all the mathematical posts from my blog in 2008, excluding those posts which primarily originated from other authors or speakers.

The draft is much longer – 694 pages – than the analogous draft from 2007 (which was 374 pages using the same style files).  This is largely because of the two series of course lecture notes which dominate the book (and inspired its title), namely on ergodic theory and on the Poincaré conjecture.  I am talking with the AMS staff about the possibility of splitting the book into two volumes, one focusing on ergodic theory, number theory, and combinatorics, and the other focusing on geometry, topology, and PDE (though there will certainly be miscellaneous sections that will basically be divided arbitrarily amongst the two volumes).

The draft probably also needs an index, which I will attend to at some point before publication.

As in the previous book, those comments and corrections from readers which were of a substantive and mathematical nature have been acknowledged in the text.  In many cases, I was only able to refer to commenters by their internet handles; please email me if you wish to be attributed differently (or not to be attributed at all).

Any other suggestions, corrections, etc. are, of course, welcome.

I learned some technical tricks for HTML to LaTeX conversion which made the process significantly faster than last year’s, although still rather tedious and time consuming; I thought I might share them below as they may be of use to anyone else contemplating a similar conversion.

Notational convention: As in Notes 2, I will colour a statement red in this post if it assumes the axiom of choice.  We will, of course, rely on every other axiom of Zermelo-Fraenkel set theory here (and in the rest of the course).  $\diamond$

In this course we will often need to iterate some sort of operation “infinitely many times” (e.g. to create an infinite basis by choosing one basis element at a time).  In order to do this rigorously, we will rely on Zorn’s lemma:

Zorn’s Lemma. Let $(X, \leq)$ be a non-empty partially ordered set, with the property that every chain (i.e. a totally ordered subset) in X has an upper bound.  Then X contains a maximal element (i.e. an element with no larger element).

Indeed, we have used this lemma several times already in previous notes.  Given the other standard axioms of set theory, this lemma is logically equivalent to

Axiom of choice. Let X be a set, and let ${\mathcal F}$ be a collection of non-empty subsets of X.  Then there exists a choice function $f: {\mathcal F} \to X$, i.e. a function such that $f(A) \in A$ for all $A \in {\mathcal F}$.

One implication is easy:

Proof of axiom of choice using Zorn’s lemma. Define a partial choice function to be a pair $({\mathcal F}', f')$, where ${\mathcal F}'$ is a subset of ${\mathcal F}$ and $f': {\mathcal F}' \to X$ is a choice function for ${\mathcal F}'$.  We can partially order the collection of partial choice functions by writing $({\mathcal F}', f') \leq ({\mathcal F}'', f'')$ if ${\mathcal F}' \subset {\mathcal F}''$ and $f''$ extends $f'$.  The collection of partial choice functions is non-empty (since it contains the pair $(\emptyset, ())$ consisting of the empty set and the empty function), and it is easy to see that any chain of partial choice functions has an upper bound (formed by gluing all the partial choices together).  Hence, by Zorn’s lemma, there is a maximal partial choice function $({\mathcal F}_*, f_*)$.  But the domain ${\mathcal F}_*$ of this function must be all of ${\mathcal F}$, since otherwise one could enlarge ${\mathcal F}_*$ by a single set A and extend $f_*$ to A by choosing a single element of A.  (One does not need the axiom of choice to make a single choice, or finitely many choices; it is only when making infinitely many choices that the axiom becomes necessary.)  The claim follows. $\Box$
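The parenthetical remark about finitely many choices can be illustrated with a small Python sketch.  This is only an illustration under stated assumptions, not part of the proof: the name `finite_choice_function` is hypothetical, the family is assumed finite, and the "single choice" from each set is made by taking its smallest element (any other rule would do).

```python
def finite_choice_function(family):
    """Build a choice function on a finite family of non-empty sets.

    The partial choice function starts out empty and is extended by one
    set at a time, in the spirit of the enlargement step in the proof.
    """
    partial = {}  # the empty partial choice function
    for A in family:
        if not A:
            raise ValueError("every set in the family must be non-empty")
        # A single choice: pick one element of A (here, the smallest).
        partial[A] = min(A)
    return partial


# A hypothetical finite family of non-empty subsets of the integers.
family = [frozenset({3, 1, 2}), frozenset({5}), frozenset({4, 7})]
f = finite_choice_function(family)
```

The loop terminates only because the family is finite; for an infinite family no such step-by-step procedure is available in general, which is where Zorn’s lemma enters.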

In the rest of these notes I would like to supply the reverse implication, using the machinery of well-ordered sets.  Instead of giving the shortest or slickest proof of Zorn’s lemma here, I would like to take the opportunity to place the lemma in the context of several related topics, such as ordinals and transfinite induction, noting that much of this material is in fact independent of the axiom of choice.  The material here is standard, but for the purposes of this course one may simply take Zorn’s lemma as a “black box” and not worry about the proof, so this material is optional.

When studying a mathematical space X (e.g. a vector space, a topological space, a manifold, a group, an algebraic variety etc.), there are two fundamentally basic ways to try to understand the space:

1. By looking at subobjects in X, or more generally maps $f: Y \to X$ from some other space Y into X.  For instance, a point in a space X can be viewed as a map from $pt$ to X; a curve in a space X could be thought of as a map from ${}[0,1]$ to X; a group G can be studied via its subgroups K, and so forth.
2. By looking at objects on X, or more precisely maps $f: X \to Y$ from X into some other space Y.  For instance, one can study a topological space X via the real- or complex-valued continuous functions $f \in C(X)$ on X; one can study a group G via its quotient groups $\pi: G \to G/H$; one can study an algebraic variety V by studying the polynomials on V (and in particular, the ideal of polynomials that vanish identically on V); and so forth.

(There are also more sophisticated ways to study an object via its maps, e.g. by studying extensions, joinings, splittings, universal lifts, etc.  The general study of objects via the maps between them is formalised abstractly in modern mathematics as category theory, and is also closely related to homological algebra.)

A remarkable phenomenon in many areas of mathematics is that of (contravariant) duality: that the maps into and out of one type of mathematical object X can be naturally associated to the maps out of and into a dual object $X^*$ (note the reversal of arrows here!).  In some cases, the dual object $X^*$ looks quite different from the original object X.  (For instance, in Stone duality, discussed in Notes 4, X would be a Boolean algebra (or some other partially ordered set) and $X^*$ would be a compact totally disconnected Hausdorff space (or some other topological space).)   In other cases, most notably with Hilbert spaces as discussed in Notes 5, the dual object $X^*$ is essentially identical to X itself.

In these notes we discuss a third important case of duality, namely duality of normed vector spaces, which is of an intermediate nature to the previous two examples: the dual $X^*$ of a normed vector space turns out to be another normed vector space, but generally one which is not equivalent to X itself (except in the important special case when X is a Hilbert space, as mentioned above).  On the other hand, the double dual $(X^*)^*$ turns out to be closely related to X, and in several (but not all) important cases, is essentially identical to X.  One of the most important uses of dual spaces in functional analysis is that it allows one to define the transpose $T^*: Y^* \to X^*$ of a continuous linear operator $T: X \to Y$.
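In finite dimensions the transpose can be made concrete.  The sketch below assumes $X = {\Bbb R}^3$ and $Y = {\Bbb R}^2$, represents a linear functional $\lambda \in Y^*$ by its list of coefficients, and uses the standard defining identity $(T^* \lambda)(x) = \lambda(Tx)$; the helper names `apply`, `pair` and `transpose_apply` are hypothetical.

```python
def apply(T, x):
    """Apply the matrix T (a list of rows) to the vector x."""
    return [sum(row[j] * x[j] for j in range(len(x))) for row in T]


def pair(lam, y):
    """Evaluate the functional with coefficients lam on the vector y."""
    return sum(l * v for l, v in zip(lam, y))


def transpose_apply(T, lam):
    """Coefficients of T^* lam, i.e. (T^* lam)_j = sum_i lam_i T_ij."""
    n = len(T[0])
    return [sum(lam[i] * T[i][j] for i in range(len(T))) for j in range(n)]


T = [[1, 2, 0], [0, -1, 3]]   # a linear map from R^3 to R^2
x = [1, 4, -2]                # a vector in X = R^3
lam = [2, 5]                  # a functional on Y = R^2

lhs = pair(transpose_apply(T, lam), x)   # (T^* lam)(x)
rhs = pair(lam, apply(T, x))             # lam(T x)
```

Both `lhs` and `rhs` compute the same number, which is the statement that $T^* \lambda$ is the composition $\lambda \circ T$.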

A fundamental tool in understanding duality of normed vector spaces will be the Hahn-Banach theorem, which is an indispensable tool for exploring the dual of a vector space.  (Indeed, without this theorem, it is not clear at all that the dual of a non-trivial normed vector space is non-trivial!)  Thus, we shall study this theorem in detail in these notes concurrently with our discussion of duality.

Below the fold, I am giving some sample questions for the 245B midterm next week.  These are drawn from my previous 245A and 245B exams (with some modifications), and the solutions can be found by searching my previous web pages for those courses.  (The homework assignments are, of course, another good source of practice problems.)  Note that the actual midterm questions are likely to be somewhat shorter than the ones provided here (this is particularly the case for those questions with multiple parts).  More info on the midterm can be found at the class web page, of course.

(These questions are of course intended more for my students than for my regular blog readers; but anyone is welcome to comment if they wish.)

I’ve just uploaded to the arXiv my paper “The high exponent limit $p \to \infty$ for the one-dimensional nonlinear wave equation”, submitted to Analysis & PDE.  This paper concerns an under-explored limit for the Cauchy problem

$\displaystyle -\phi_{tt} + \phi_{xx} = |\phi|^{p-1} \phi; \quad \phi(0,x) = \phi_0(x); \quad \phi_t(0,x) = \phi_1(x)$ (1)

to the one-dimensional defocusing nonlinear wave equation, where $\phi: {\Bbb R} \times {\Bbb R} \to {\Bbb R}$ is the unknown scalar field, $p > 1$ is an exponent, and $\phi_0, \phi_1: {\Bbb R} \to {\Bbb R}$ are the initial position and velocity respectively, and the t and x subscripts denote differentiation in time and space.  To avoid some (extremely minor) technical difficulties let us assume that p is an odd integer, so that the nonlinearity is smooth; then standard energy methods, relying in particular on the conserved energy

$\displaystyle E(\phi)(t) = \int_{\Bbb R} \frac{1}{2} |\phi_t(t,x)|^2 + \frac{1}{2} |\phi_x(t,x)|^2 + \frac{1}{p+1} |\phi(t,x)|^{p+1}\ dx$, (2)

on finite speed of propagation, and on the one-dimensional Sobolev embedding $H^1({\Bbb R}) \subset L^\infty({\Bbb R})$, show that from any smooth initial data $\phi_0, \phi_1$, there is a unique global smooth solution $\phi$ to the Cauchy problem (1).
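For completeness, here is a sketch of the formal computation behind the conservation of the energy (2).  It assumes that $\phi$ is smooth and decays fast enough in x that the boundary terms in the integration by parts vanish (for instance, compactly supported data, combined with finite speed of propagation).

```latex
\begin{align*}
\frac{d}{dt} E(\phi)(t)
&= \int_{\Bbb R} \phi_t \phi_{tt} + \phi_x \phi_{xt} + |\phi|^{p-1} \phi\, \phi_t \ dx \\
&= \int_{\Bbb R} \phi_t \left( \phi_{tt} - \phi_{xx} + |\phi|^{p-1} \phi \right)\ dx \\
&= 0.
\end{align*}
```

The second line integrates the middle term by parts in x, and the last line uses (1) in the form $\phi_{tt} - \phi_{xx} + |\phi|^{p-1} \phi = 0$.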

It is then natural to ask how the solution $\phi$ behaves under various asymptotic limits.  Popular limits for these sorts of PDE include the asymptotic time limit $t \to \pm \infty$, the non-relativistic limit $c \to \infty$ (where we insert suitable powers of c into various terms in (1)), the small dispersion limit (where we place a small factor in front of the dispersive term $+\phi_{xx}$), the high-frequency limit (where we send the frequency of the initial data $\phi_0, \phi_1$ to infinity), and so forth.

Tristan Roy recently posed to me a different type of limit, which to the best of my knowledge has not been explored much in the literature (although some of the literature on limits of the Ginzburg-Landau equation has a somewhat similar flavour): the high exponent limit $p \to \infty$ (holding the initial data $\phi_0, \phi_1$ fixed).  From (1) it is intuitively plausible that as p increases, the nonlinearity gets “stronger” when $|\phi| > 1$ and “weaker” when $|\phi| < 1$; the “limiting equation”

$\displaystyle -\phi_{tt} + \phi_{xx} = |\phi|^{\infty} \phi; \quad \phi(0,x) = \phi_0(x); \quad \phi_t(0,x) = \phi_1(x)$ (3)

would then be expected to be linear when $|\phi| < 1$ and infinitely repulsive when $|\phi| > 1$ (i.e. in the limit, the solution should be confined to range in the interval [-1,1], much as is the case with linear wave and Schrödinger equations with an infinite barrier potential; though with the key difference that the nonlinear barrier in (3) is confining the range of $\phi$ rather than the domain.).
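The heuristic about the nonlinearity getting “stronger” or “weaker” can be seen numerically.  The following is a minimal sketch; the function name `nonlinearity` and the sample values 0.9 and 1.1 are illustrative choices only.

```python
def nonlinearity(phi, p):
    """The right-hand side |phi|^(p-1) * phi of (1)."""
    return abs(phi) ** (p - 1) * phi


# Odd exponents, as assumed in the text, so that the nonlinearity is smooth.
for p in (3, 11, 51, 201):
    inside = nonlinearity(0.9, p)    # a value with |phi| < 1
    outside = nonlinearity(1.1, p)   # a value with |phi| > 1
    print(p, inside, outside)
```

As p grows, the first column of values decays towards zero while the second blows up, consistent with a limiting equation that is linear when $|\phi| < 1$ and infinitely repulsive when $|\phi| > 1$.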

Of course, the equation (3) does not make rigorous sense as written; we need to formalise what an “infinite nonlinear barrier” is, and how the wave $\phi$ will react to that barrier (e.g. will it reflect off of it, or be absorbed?).  So the questions are to find the correct description of the limiting equation, and to rigorously demonstrate that solutions to (1) converge in some sense to that equation.

It is natural to require that $\phi_0$ stays away from the barrier, in the sense that $|\phi_0(x)| < 1$ for all x; in particular this implies that the energy (2) stays (locally) bounded as $p \to \infty$; it also ensures that (1) converges in a satisfactory sense to the free wave equation for sufficiently short times.  For technical reasons we also have to make a mild assumption that either of the null energy densities $\phi_1 \pm \partial_x \phi_0$ vanish on a set with at most finitely many connected components.  The main result is then that as $p \to \infty$, the solution $\phi = \phi^{(p)}$ to (1) converges locally uniformly to a Lipschitz, piecewise smooth limit $\phi = \phi^{(\infty)}$, which is restricted to take values in [-1,1], with $-\phi_{tt}+\phi_{xx}$ (interpreted in a weak sense) being a negative measure supported on $\{ \phi=+1\}$ plus a positive measure supported on $\{\phi = -1\}$.  Furthermore, we have the reflection conditions

$\displaystyle (\partial_t \pm \partial_x) |\phi_t \mp \phi_x| = 0$.

It turns out that the above conditions uniquely determine $\phi$, and one can even solve for $\phi$ explicitly for any given data; such solutions start off smooth but pick up an increasing number of (Lipschitz continuous) singularities over time as they reflect back and forth across the nonlinear barriers $\{\phi=+1\}$ and $\{\phi=-1\}$.  (An explicit example of such a reflection is given in the paper.)

[The above conditions vaguely resemble entropy conditions, as appear for instance in kinetic formulations of conservation laws, though I do not know of a precise connection in this regard.]

In the remainder of this post I would like to describe the strategy of proof and one of the key a priori bounds needed.  I also want to point out the connection to Liouville’s equation, which was discussed in the previous post.

As is well known, the linear one-dimensional wave equation

$\displaystyle -\phi_{tt}+\phi_{xx} = 0$, (1)

where $\phi: {\Bbb R} \times {\Bbb R} \to {\Bbb R}$ is the unknown field (which, for simplicity, we assume to be smooth), can be solved explicitly; indeed, the general solution to (1) takes the form

$\displaystyle \phi(t,x) = f( t+x ) + g(t-x)$ (2)

for some arbitrary (smooth) functions $f, g: {\Bbb R} \to {\Bbb R}$.  (One can of course determine f and g once one specifies enough initial data or other boundary conditions, but this is not the focus of my post today.)
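One can sanity-check the formula (2) numerically.  The sketch below picks two arbitrary smooth profiles f and g (the sine and Gaussian here are assumptions made purely for illustration) and evaluates $-\phi_{tt} + \phi_{xx}$ by central finite differences; the name `wave_residual` and the step size `h` are likewise hypothetical.

```python
import math


def f(s):
    return math.sin(s)          # an arbitrary smooth profile


def g(s):
    return math.exp(-s * s)     # another arbitrary smooth profile


def phi(t, x):
    """A solution of the form (2): f(t + x) + g(t - x)."""
    return f(t + x) + g(t - x)


def wave_residual(t, x, h=1e-3):
    """Finite-difference approximation to -phi_tt + phi_xx at (t, x)."""
    phi_tt = (phi(t + h, x) - 2 * phi(t, x) + phi(t - h, x)) / h**2
    phi_xx = (phi(t, x + h) - 2 * phi(t, x) + phi(t, x - h)) / h**2
    return -phi_tt + phi_xx


residuals = [wave_residual(t, x) for t in (0.0, 0.7, 2.0) for x in (-1.3, 0.0, 0.4)]
```

The residuals should be negligible compared with the sizes of $\phi_{tt}$ and $\phi_{xx}$ themselves, as one expects from a solution to (1).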

When one moves from linear wave equations to nonlinear wave equations, then in general one does not expect to have a closed-form solution such as (2).  So I was pleasantly surprised recently while playing with the nonlinear wave equation

$\displaystyle -\phi_{tt}+\phi_{xx} = e^\phi$, (3)

to discover that this equation can also be explicitly solved in closed form.  (I hope to explain why I was interested in (3) in the first place in a later post.)

A posteriori, I now know the reason for this explicit solvability; (3) is the limiting case $a = 0, b \to -\infty$ of the more general equation

$\displaystyle -\phi_{tt}+\phi_{xx} = e^{\phi+a} - e^{-\phi+b}$

which (after applying the simple transformation $\phi = \frac{b-a}{2} + \psi( \sqrt{2} e^{\frac{a+b}{4}} t, \sqrt{2} e^{\frac{a+b}{4}} x)$) becomes the sinh-Gordon equation

$\displaystyle -\psi_{tt} + \psi_{xx} = \sinh(\psi)$

(a close cousin of the more famous sine-Gordon equation $-\phi_{tt} + \phi_{xx} = \sin(\phi)$), which is known to be completely integrable, and exactly solvable.  However, I only realised this after the fact, and stumbled upon the explicit solution to (3) by much more classical and elementary means.  I thought I might share the computations here, as I found them somewhat cute, and they seem to serve as an example of how one might go about finding explicit solutions to PDE in general; accordingly, I will take a rather pedestrian approach to describing the hunt for the solution, rather than presenting the shortest or slickest route to the answer.
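The reduction to the sinh-Gordon equation can be checked directly, up to a constant factor that one normalises away.  In the sketch below, the abbreviation $\lambda := \sqrt{2} e^{\frac{a+b}{4}}$ is introduced only for this computation, and $\psi$ is evaluated at $(\lambda t, \lambda x)$ throughout.

```latex
\begin{align*}
-\phi_{tt} + \phi_{xx}
&= \lambda^2 \left( -\psi_{tt} + \psi_{xx} \right)
 = 2 e^{\frac{a+b}{2}} \left( -\psi_{tt} + \psi_{xx} \right), \\
e^{\phi+a} - e^{-\phi+b}
&= e^{\frac{a+b}{2}} \left( e^{\psi} - e^{-\psi} \right)
 = 2 e^{\frac{a+b}{2}} \sinh(\psi).
\end{align*}
```

Equating the two lines and cancelling the common factor $2 e^{\frac{a+b}{2}}$ gives $-\psi_{tt} + \psi_{xx} = \sinh(\psi)$.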

[The computations do seem to be very classical, though, and thus presumably already in the literature; if anyone knows of a place where the solvability of (3) is discussed, I would be very happy to learn of it.]  [Update, Jan 22: Patrick Dorey has pointed out that (3) is, indeed, extremely classical; it is known as Liouville’s equation and was solved by Liouville in J. Math. Pure et Appl. vol 18 (1853), 71-74, with essentially the same solution as presented here.]

Vitaly Bergelson, Tamar Ziegler, and I have just uploaded to the arXiv our paper “An inverse theorem for the uniformity seminorms associated with the action of $F^\infty_p$”. This paper establishes the ergodic inverse theorems that are needed in our other recent paper to establish the inverse conjecture for the Gowers norms over finite fields in high characteristic (and to establish a partial result in low characteristic), as follows:

Theorem. Let ${\Bbb F}$ be a finite field of characteristic p.  Suppose that $X = (X,{\mathcal B},\mu)$ is a probability space with an ergodic measure-preserving action $(T_g)_{g \in {\Bbb F}^\omega}$ of ${\Bbb F}^\omega$.  Let $f \in L^\infty(X)$ be such that the Gowers-Host-Kra seminorm $\|f\|_{U^k(X)}$ (defined in a previous post) is non-zero.

1. In the high-characteristic case $p \geq k$, there exists a phase polynomial g of degree <k (as defined in the previous post) such that $|\int_X f \overline{g}\ d\mu| > 0$.
2. In general characteristic, there exists a phase polynomial g of degree <C(k) for some C(k) depending only on k such that $|\int_X f \overline{g}\ d\mu| > 0$.

This theorem is closely analogous to a similar theorem of Host and Kra on ergodic actions of ${\Bbb Z}$, in which the role of phase polynomials is played by functions that arise from nilsystem factors of X.  Indeed, our arguments rely heavily on the machinery of Host and Kra.

The paper is rather technical (60+ pages!) and difficult to describe in detail here, but I will try to sketch out (in very broad brush strokes) what the key steps in the proof of part 2 of the theorem are.  (Part 1 is similar but requires a more delicate analysis at various stages, keeping more careful track of the degrees of various polynomials.)

In the next few lectures, we will be studying four major classes of function spaces. In decreasing order of generality, these classes are the topological vector spaces, the normed vector spaces, the Banach spaces, and the Hilbert spaces. In order to motivate the discussion of the more general classes of spaces, we will first focus on the most special class – that of (real and complex) Hilbert spaces. These spaces can be viewed as generalisations of (real and complex) Euclidean spaces such as ${\Bbb R}^n$ and ${\Bbb C}^n$ to infinite-dimensional settings, and indeed much of one’s Euclidean geometry intuition concerning lengths, angles, orthogonality, subspaces, etc. will transfer readily to arbitrary Hilbert spaces; in contrast, this intuition is not always accurate in the more general vector spaces mentioned above. In addition to Euclidean spaces, another fundamental example of Hilbert spaces comes from the Lebesgue spaces $L^2(X,{\mathcal X},\mu)$ of a measure space $(X,{\mathcal X},\mu)$. (There are of course many other Hilbert spaces of importance in complex analysis, harmonic analysis, and PDE, such as Hardy spaces ${\mathcal H}^2$, Sobolev spaces $H^s = W^{s,2}$, and the space $HS$ of Hilbert-Schmidt operators, but we will not discuss those spaces much in this course.  Complex Hilbert spaces also play a fundamental role in the foundations of quantum mechanics, being the natural space to hold all the possible states of a quantum system (possibly after projectivising the Hilbert space), but we will not discuss this subject here.)

Hilbert spaces are the natural abstract framework in which to study two important (and closely related) concepts: orthogonality and unitarity, allowing us to generalise familiar concepts and facts from Euclidean geometry such as the Cartesian coordinate system, rotations and reflections, and the Pythagorean theorem to Hilbert spaces. (For instance, the Fourier transform is a unitary transformation and can thus be viewed as a kind of generalised rotation.) Furthermore, the Hodge duality on Euclidean spaces has a partial analogue for Hilbert spaces, namely the Riesz representation theorem for Hilbert spaces, which makes the theory of duality and adjoints for Hilbert spaces especially simple (when compared with the more subtle theory of duality for, say, Banach spaces). Much later (next quarter, in fact), we will see that this duality allows us to extend the spectral theorem for self-adjoint matrices to that of self-adjoint operators on a Hilbert space.
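The remark that the Fourier transform is unitary has a simple finite-dimensional analogue that can be tested by hand.  The sketch below swaps in the discrete Fourier transform on ${\Bbb C}^n$ with the $1/\sqrt{n}$ normalisation (an assumption made here for illustration; it is not the transform on $L^2({\Bbb R})$), and compares the squared norms before and after; the names `unitary_dft` and `norm_sq` are hypothetical.

```python
import cmath
import math


def unitary_dft(x):
    """Discrete Fourier transform on C^n, normalised by 1/sqrt(n)."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * math.pi * j * k / n) for j in range(n))
        / math.sqrt(n)
        for k in range(n)
    ]


def norm_sq(v):
    """Squared Euclidean norm of a complex vector."""
    return sum(abs(z) ** 2 for z in v)


x = [1, 2 - 1j, 0, -3j, 0.5]   # an arbitrary vector in C^5
X = unitary_dft(x)
```

Up to rounding error, `norm_sq(x)` and `norm_sq(X)` agree, which is the length-preserving property one expects of a “generalised rotation”.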

These notes are only the most basic introduction to the theory of Hilbert spaces.  In particular, the theory of linear transformations between two Hilbert spaces, which is perhaps the most important aspect of the subject, is not covered much at all here (but I hope to discuss it further in future lectures.)

In his final lecture, Prof. Margulis talked about some of the ideas around the theory of unipotent flows on homogeneous spaces, culminating in the orbit closure, equidistribution, and measure classification theorems of Ratner in the subject.  Margulis also discussed the application to metric theory of Diophantine approximation which was not covered in the preceding lecture.