In the next few lectures, we will be studying four major classes of function spaces. In decreasing order of generality, these classes are the topological vector spaces, the normed vector spaces, the Banach spaces, and the Hilbert spaces. In order to motivate the discussion of the more general classes of spaces, we will first focus on the most special class – that of (real and complex) Hilbert spaces. These spaces can be viewed as generalisations of (real and complex) Euclidean spaces such as ${\Bbb R}^n$ and ${\Bbb C}^n$ to infinite-dimensional settings, and indeed much of one’s Euclidean geometry intuition concerning lengths, angles, orthogonality, subspaces, etc. will transfer readily to arbitrary Hilbert spaces; in contrast, this intuition is not always accurate in the more general vector spaces mentioned above. In addition to Euclidean spaces, another fundamental example of Hilbert spaces comes from the Lebesgue spaces $L^2(X,{\mathcal X},\mu)$ of a measure space $(X,{\mathcal X},\mu)$. (There are of course many other Hilbert spaces of importance in complex analysis, harmonic analysis, and PDE, such as Hardy spaces ${\mathcal H}^2$, Sobolev spaces $H^s = W^{s,2}$, and the space $HS$ of Hilbert-Schmidt operators, but we will not discuss those spaces much in this course.  Complex Hilbert spaces also play a fundamental role in the foundations of quantum mechanics, being the natural space to hold all the possible states of a quantum system (possibly after projectivising the Hilbert space), but we will not discuss this subject here.)

Hilbert spaces are the natural abstract framework in which to study two important (and closely related) concepts: orthogonality and unitarity, allowing us to generalise familiar concepts and facts from Euclidean geometry such as the Cartesian coordinate system, rotations and reflections, and the Pythagorean theorem to Hilbert spaces. (For instance, the Fourier transform is a unitary transformation and can thus be viewed as a kind of generalised rotation.) Furthermore, the Hodge duality on Euclidean spaces has a partial analogue for Hilbert spaces, namely the Riesz representation theorem for Hilbert spaces, which makes the theory of duality and adjoints for Hilbert spaces especially simple (when compared with the more subtle theory of duality for, say, Banach spaces). Much later (next quarter, in fact), we will see that this duality allows us to extend the spectral theorem for self-adjoint matrices to that of self-adjoint operators on a Hilbert space.

These notes are only the most basic introduction to the theory of Hilbert spaces.  In particular, the theory of linear transformations between two Hilbert spaces, which is perhaps the most important aspect of the subject, is not covered much at all here (but I hope to discuss it further in future lectures).

— Inner product spaces —

The Euclidean norm

$|(x_1,\ldots,x_n)| := \sqrt{x_1^2+\ldots+x_n^2}$ (1)

in real Euclidean space ${\Bbb R}^n$ can be expressed in terms of the dot product $\cdot: {\Bbb R}^n \times {\Bbb R}^n \to {\Bbb R}$, defined as

$(x_1,\ldots,x_n) \cdot (y_1,\ldots,y_n) := x_1 y_1 + \ldots + x_n y_n$ (2)

by the well-known formula

$|x| = (x \cdot x)^{1/2}$. (3)

In particular, we have the positivity property

$x\cdot x \geq 0$ (4)

with equality if and only if $x=0$.  One reason why it is more advantageous to work with the dot product than the norm is that while the norm function is only sublinear, the dot product is bilinear, thus

$(cx + dy) \cdot z = c (x \cdot z) + d (y \cdot z); z \cdot (cx+dy) = c (z \cdot x) + d (z \cdot y)$ (5)

for all vectors x,y and scalars c,d, and also symmetric,

$x \cdot y = y \cdot x$. (6)

These properties make the inner product easier to manipulate algebraically than the norm.

The above discussion was for the real vector space ${\Bbb R}^n$, but one can develop analogous statements for the complex vector space ${\Bbb C}^n$, in which the norm

$\|(z_1,\ldots,z_n)\| := \sqrt{ |z_1|^2 + \ldots + |z_n|^2 }$ (7)

can be represented in terms of the complex inner product $\langle, \rangle: {\Bbb C}^n \times {\Bbb C}^n \to {\Bbb C}$ defined by the formula

$\langle (z_1,\ldots,z_n), (w_1,\ldots,w_n) \rangle := z_1 \overline{w_1} + \ldots + z_n \overline{w_n}$ (8)

by the analogue of (3), namely

$\| x \| = (\langle x, x \rangle)^{1/2}$. (9)

In particular, as before with (4), we have the positivity property

$\langle x, x \rangle \geq 0$, (10)

with equality if and only if x=0.  The bilinearity property (5) is modified to the sesquilinearity property

$\langle cx + dy, z\rangle = c \langle x, z\rangle + d \langle y, z\rangle; \langle z, cx+dy \rangle = \overline{c} \langle z, x \rangle + \overline{d} \langle z, y \rangle$ (11)

while the symmetry property (6) needs to be replaced with

$\langle x, y \rangle = \overline{\langle y,x \rangle}$ (12)

in order to be compatible with sesquilinearity.

We can formalise all these properties axiomatically as follows.

Definition 1. (Inner product space)  A complex inner product space $(V, \langle,\rangle)$ is a complex vector space V, together with an inner product $\langle,\rangle: V \times V \to {\Bbb C}$ which is sesquilinear (i.e. (11) holds for all $x,y \in V$ and $c,d \in {\Bbb C}$) and symmetric in the sesquilinear sense (i.e. (12) holds for all $x,y \in V$), and obeys the positivity property (10) for all $x \in V$, with equality if and only if x=0.  We will usually abbreviate $(V, \langle,\rangle)$ as V.

A real inner product space is defined similarly, but with all references to ${\Bbb C}$ replaced by ${\Bbb R}$ (and all references to complex conjugation dropped).

Example 1. ${\Bbb R}^n$ with the standard dot product (2) is a real inner product space, and ${\Bbb C}^n$ with the complex inner product (8) is a complex inner product space.  $\diamond$
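As a quick numerical illustration of Example 1, here is a minimal Python sketch that checks the axioms (10), (11), (12) for the inner product (8) on ${\Bbb C}^2$ at a few sample vectors.  The helper names `inner`, `add`, `scale` and the sample data are hypothetical choices made for this illustration; a finite check of this sort is of course only a sanity check and not a proof.

```python
# Illustrative sketch: the inner product (8) on C^n, linear in the first
# argument and conjugate-linear in the second (the convention of these notes).

def inner(x, y):
    """<x, y> = sum_k x_k * conj(y_k)."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def add(x, y):
    return [a + b for a, b in zip(x, y)]

def scale(c, x):
    return [c * a for a in x]

# Arbitrary sample vectors and scalars (assumptions for this example).
x = [1 + 2j, 3 - 1j]
y = [0.5j, 2 + 0j]
z = [-1 + 1j, 4j]
c, d = 2 - 1j, 0.5 + 3j

# Sesquilinearity (11): linear in the first slot, antilinear in the second.
lhs1 = inner(add(scale(c, x), scale(d, y)), z)
rhs1 = c * inner(x, z) + d * inner(y, z)
lhs2 = inner(z, add(scale(c, x), scale(d, y)))
rhs2 = c.conjugate() * inner(z, x) + d.conjugate() * inner(z, y)

# Conjugate symmetry (12) and positivity (10).
sym_gap = abs(inner(x, y) - inner(y, x).conjugate())
norm_sq = inner(x, x)  # real and positive for x != 0
```

Here `lhs1` and `lhs2` should agree with `rhs1` and `rhs2` up to rounding error, and `norm_sq` should have vanishing imaginary part.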

Example 2. If $(X,{\mathcal X},\mu)$ is a measure space, then the complex $L^2$ space $L^2(X,{\mathcal X},\mu) = L^2(X,{\mathcal X},\mu; {\Bbb C})$ with the complex inner product

$\displaystyle \langle f, g \rangle := \int_X f \overline{g}\ d\mu$ (13)

(which is well defined, by the Cauchy-Schwarz inequality) is easily verified to be a complex inner product space, and similarly for the real $L^2$ space (with the complex conjugate signs dropped, of course).  Note that the finite dimensional examples ${\Bbb R}^n, {\Bbb C}^n$ can be viewed as the special case of the $L^2$ examples in which X is $\{1,\ldots,n\}$ with the discrete $\sigma$-algebra and counting measure.  $\diamond$

Example 3. Any subspace of a (real or complex) inner product space is again a (real or complex) inner product space, simply by restricting the inner product to the subspace. $\diamond$

Example 4. Also, any real inner product space V can be complexified into the complex inner product space $V_{\Bbb C}$, defined as the space of formal combinations $x+iy$ of vectors $x, y \in V$ (with the obvious complex vector space structure), and with inner product

$\displaystyle \langle a+ib, c+id \rangle := \langle a, c \rangle + i \langle b, c \rangle - i \langle a, d \rangle + \langle b, d \rangle$. $\diamond$ (14)

Example 5. Fix a probability space $(X, {\mathcal X}, \mu)$.  The space of square-integrable real-valued random variables of mean zero is an inner product space if one uses covariance as the inner product.  (What goes wrong if one drops the mean zero assumption?) $\diamond$

Given a (real or complex) inner product space V, we can define the norm $\|x\|$ of any vector $x \in V$ by the formula (9), which is well defined thanks to the positivity property; in the case of the $L^2$ spaces, this norm of course corresponds to the usual $L^2$ norm.  We have the following basic facts:

Lemma 1. Let V be a real or complex inner product space.

1. (Cauchy-Schwarz inequality) For any $x, y \in V$, we have $|\langle x, y \rangle| \leq \|x\| \|y\|$.
2. The function $x \mapsto \|x\|$ is a norm on V.  (Thus every inner product space is a normed vector space.)

Proof. We shall just verify the complex case, as the real case is similar (and slightly easier).  The positivity property tells us that the quadratic form $\langle ax+by, ax+by \rangle$ is non-negative for all complex numbers a, b.  Using sesquilinearity and symmetry, we can expand this form as

$\displaystyle |a|^2 \|x\|^2 + 2 \hbox{Re}(a \overline{b} \langle x, y \rangle) + |b|^2 \|y\|^2$. (15)

Optimising in a, b (see also my blog post on the amplification trick) we obtain the Cauchy-Schwarz inequality.  To verify the norm property, the only non-trivial verification is that of the triangle inequality $\|x+y\| \leq \|x\| + \|y\|$.  But on expanding $\|x+y\|^2 = \langle x+y, x+y \rangle$ we see that

$\displaystyle \|x+y\|^2 = \|x\|^2 + 2 \hbox{Re}( \langle x, y \rangle ) + \|y\|^2$ (16)

and the claim then follows from the Cauchy-Schwarz inequality. $\Box$
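The two inequalities in Lemma 1 are easy to test numerically in ${\Bbb C}^n$.  The following Python sketch (with hypothetical helper names, and with the sample size, dimension, and tolerance chosen arbitrarily) tests them on random pairs of vectors, and also looks at the equality case of Cauchy-Schwarz when y is a scalar multiple of x.

```python
import math
import random

def inner(x, y):
    """Inner product on C^n, conjugate-linear in the second argument."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def norm(x):
    return math.sqrt(inner(x, x).real)

def random_vector(rng, n):
    return [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]

def lemma1_holds(trials=1000, n=5, seed=0, tol=1e-9):
    """Test Cauchy-Schwarz and the triangle inequality on random pairs."""
    rng = random.Random(seed)
    for _ in range(trials):
        x, y = random_vector(rng, n), random_vector(rng, n)
        x_plus_y = [a + b for a, b in zip(x, y)]
        if abs(inner(x, y)) > norm(x) * norm(y) + tol:
            return False
        if norm(x_plus_y) > norm(x) + norm(y) + tol:
            return False
    return True

# Equality in Cauchy-Schwarz when y is a scalar multiple of x.
x = [1 + 1j, 2 - 1j, 0.5j]
y = [(2 - 3j) * a for a in x]
equality_gap = abs(abs(inner(x, y)) - norm(x) * norm(y))
```

One expects `lemma1_holds()` to return `True` and `equality_gap` to be zero up to rounding error.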

Observe from the Cauchy-Schwarz inequality that the inner product $\langle,\rangle: V \times V \to {\Bbb C}$ is continuous.

Exercise 1. Let $T: V \to W$ be a linear map from one (real or complex) inner product space to another.  Show that T preserves the inner product structure (i.e. $\langle Tx, Ty \rangle = \langle x, y \rangle$ for all $x,y \in V$) if and only if T is an isometry (i.e. $\|Tx\| = \|x\|$ for all $x \in V$).  [Hint: in the real case, express $\langle x, y \rangle$ in terms of $\|x+y\|^2$ and $\|x-y\|^2$.  In the complex case, use $x+y, x-y, x+iy, x-iy$ instead of $x+y, x-y$.]  $\diamond$
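One possible form of the identity suggested by the hint in the complex case (with the convention that the inner product is linear in the first variable) is $\langle x, y \rangle = \frac{1}{4}( \|x+y\|^2 - \|x-y\|^2 ) + \frac{i}{4}( \|x+iy\|^2 - \|x-iy\|^2 )$.  The Python sketch below checks this numerically in ${\Bbb C}^3$; the function name `polarize` and the sample vectors are hypothetical and serve only as an illustration.

```python
def inner(x, y):
    """Inner product on C^n, conjugate-linear in the second argument."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def norm_sq(x):
    return inner(x, x).real

def polarize(x, y):
    """Recover <x, y> from norms of x+y, x-y, x+iy, x-iy alone."""
    plus = [a + b for a, b in zip(x, y)]
    minus = [a - b for a, b in zip(x, y)]
    plus_i = [a + 1j * b for a, b in zip(x, y)]
    minus_i = [a - 1j * b for a, b in zip(x, y)]
    real_part = (norm_sq(plus) - norm_sq(minus)) / 4
    imag_part = (norm_sq(plus_i) - norm_sq(minus_i)) / 4
    return complex(real_part, imag_part)

# Arbitrary sample vectors (assumptions for this example).
x = [1 + 2j, -1j, 3 + 0j]
y = [2 - 1j, 0.5 + 0.5j, -1 + 4j]
gap = abs(polarize(x, y) - inner(x, y))
```

Since `polarize` only ever evaluates norms, any linear map that preserves norms must preserve the quantity it returns; this is the mechanism behind the exercise.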

Inspired by the above exercise, we say that two inner product spaces are isomorphic if there exists an invertible isometry from one space to the other; such invertible isometries are known as isomorphisms.

Exercise 2. Let V be a real or complex inner product space.  If $x_1,\ldots,x_n$ are a finite collection of vectors in V, show that the Gram matrix $( \langle x_i, x_j \rangle )_{1 \leq i,j \leq n}$ is Hermitian and positive semi-definite, and is positive definite if and only if the $x_1,\ldots,x_n$ are linearly independent. Conversely, given a Hermitian positive semi-definite matrix $(a_{ij})_{1 \leq i, j \leq n}$ with real (resp. complex) entries, show that there exists a real (resp. complex) inner product space V and vectors $x_1,\ldots,x_n$ such that $\langle x_i, x_j \rangle = a_{ij}$ for all $1 \leq i,j \leq n$. $\diamond$

In analogy with the Euclidean case, we say that two vectors x, y in a (real or complex) inner product space are orthogonal if $\langle x, y \rangle = 0$.  (With this convention, we see in particular that 0 is orthogonal to every vector, and is the only vector with this property.)

Exercise 3. (Pythagorean theorem)  Let V be a real or complex inner product space.  If $x_1,\ldots,x_n$ are a finite set of pairwise orthogonal vectors, then $\|x_1+\ldots+x_n\|^2 = \|x_1\|^2 + \ldots + \|x_n\|^2$.  In particular, we see that $\|x_1+x_2\| \geq \|x_1\|$ whenever $x_2$ is orthogonal to $x_1$. $\diamond$

A (possibly infinite) collection $(e_\alpha)_{\alpha \in A}$ of vectors in a (real or complex) inner product space is said to be orthonormal if they are pairwise orthogonal and all of unit length.

Exercise 4. Let $(e_\alpha)_{\alpha \in A}$ be an orthonormal system of vectors in a real or complex inner product space.  Show that this system is (algebraically) linearly independent (thus any non-trivial finite linear combination of vectors in this system is non-zero).  If x lies in the algebraic span of this system (i.e. it is a finite linear combination of vectors in the system), establish the inversion formula

$\displaystyle x = \sum_{\alpha \in A} \langle x, e_\alpha \rangle e_\alpha$ (17)

(with only finitely many of the terms non-zero) and the (finite) Plancherel formula

$\displaystyle \|x\|^2 = \sum_{\alpha \in A} |\langle x, e_\alpha \rangle|^2$. $\diamond$ (18)

Exercise 5. (Gram-Schmidt theorem)  Let $e_1,\ldots,e_n$ be a finite orthonormal system in a real or complex inner product space, and let v be a vector not in the span of $e_1,\ldots,e_n$.  Show that there exists a vector $e_{n+1}$ with $\hbox{span}(e_1,\ldots,e_n,e_{n+1}) = \hbox{span}(e_1,\ldots,e_n,v)$ such that $e_1,\ldots,e_{n+1}$ is an orthonormal system.  Conclude that an n-dimensional real or complex inner product space is isomorphic to ${\Bbb R}^n$ or ${\Bbb C}^n$ respectively.  Thus, any statement about inner product spaces which only involves a finite-dimensional subspace of that space can be verified just by checking it on Euclidean spaces. $\diamond$
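To illustrate the Gram-Schmidt procedure concretely, here is a minimal Python sketch in ${\Bbb C}^n$.  It uses the "modified" variant (subtracting projections one at a time from the running remainder), which is one standard way to organise the computation and is not necessarily the argument intended for the exercise; the function name `gram_schmidt`, the tolerance, and the input vectors are all hypothetical.

```python
import math

def inner(x, y):
    """Inner product on C^n, conjugate-linear in the second argument."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def norm(x):
    return math.sqrt(inner(x, x).real)

def gram_schmidt(vectors, tol=1e-12):
    """Return an orthonormal system with the same span as `vectors`."""
    basis = []
    for v in vectors:
        w = list(v)
        for e in basis:
            coeff = inner(w, e)  # component of w along e
            w = [a - coeff * b for a, b in zip(w, e)]
        length = norm(w)
        if length > tol:  # skip vectors already (numerically) in the span
            basis.append([a / length for a in w])
    return basis

v1 = [1 + 0j, 1j, 0j]
v2 = [1 + 0j, 0j, 1 + 0j]
v3 = [0j, 1 + 0j, 1 + 0j]
v4 = [a + b for a, b in zip(v1, v2)]  # lies in the span of v1 and v2

basis = gram_schmidt([v1, v2, v4, v3])
```

In this example the dependent vector `v4` contributes nothing, so `basis` consists of three orthonormal vectors.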

Exercise 6 (Parallelogram law).  For any inner product space V, establish the parallelogram law

$\displaystyle \|x+y\|^2 + \|x-y\|^2 = 2 \|x\|^2 + 2 \|y\|^2$. (19)

Show that this identity fails for $L^p(X,{\mathcal X},\mu)$ for $p \neq 2$ as soon as X contains at least two disjoint sets of non-zero finite measure.  On the other hand, establish the Hanner inequalities

$\displaystyle \|f+g\|_p^p + \|f-g\|_p^p \geq (\|f\|_p + \|g\|_p)^p + |\|f\|_p - \|g\|_p|^p$ (20)

and

$\displaystyle (\|f+g\|_p + \|f-g\|_p)^p + |\|f+g\|_p - \|f-g\|_p|^p \leq 2^p (\|f\|_p^p + \|g\|_p^p)$ (21)

for $1 \leq p \leq 2$, with the inequalities being reversed for $2 \leq p < \infty$.  (Hint: (21) can be deduced from (20) by a simple substitution.  For (20), reduce to the case when f,g are non-negative, and then exploit the inequality

$\displaystyle |x+y|^p + |x-y|^p \geq ((1+r)^{p-1} + (1-r)^{p-1}) x^p$

$+ ((1+r)^{p-1} - (1-r)^{p-1}) r^{1-p} y^p$ (22)

for all non-negative x,y, $0 < r < 1$, and $1 \leq p \leq 2$, with the inequality being reversed for $2 \leq p < \infty$, and with equality being attained when $y < x$ and $r=y/x$.) $\diamond$
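The failure of the parallelogram law (19) for $p \neq 2$ can already be seen on a two-point space with counting measure, taking f and g to be the indicators of the two points.  A minimal Python sketch of this computation follows (the names `p_norm` and `parallelogram_gap` are hypothetical); by hand, the gap works out to $2 \cdot 2^{2/p} - 4$, which vanishes only at $p=2$.

```python
def p_norm(x, p):
    """The l^p norm of a finite sequence, for 1 <= p < infinity."""
    return sum(abs(a) ** p for a in x) ** (1.0 / p)

def parallelogram_gap(f, g, p):
    """||f+g||^2 + ||f-g||^2 - 2||f||^2 - 2||g||^2 in the l^p norm."""
    s = [a + b for a, b in zip(f, g)]
    d = [a - b for a, b in zip(f, g)]
    return (p_norm(s, p) ** 2 + p_norm(d, p) ** 2
            - 2 * p_norm(f, p) ** 2 - 2 * p_norm(g, p) ** 2)

# Indicators of the two points of a two-point space.
f, g = [1.0, 0.0], [0.0, 1.0]
gaps = {p: parallelogram_gap(f, g, p) for p in (1.0, 1.5, 2.0, 3.0)}
```

The gap is positive for $p < 2$, zero (up to rounding) at $p = 2$, and negative for $p > 2$.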

— Hilbert spaces —

Thus far, our discussion of inner product spaces has been largely algebraic in nature; this is because we have not been able to take limits inside these spaces and do some actual analysis.  This can be rectified by adding an additional axiom.

Definition 2. (Hilbert spaces)  A (real or complex) Hilbert space is a (real or complex) inner product space which is complete (or equivalently, an inner product space which is also a Banach space).

Example 6. By Proposition 1 of Notes 3, (real or complex) $L^2(X,{\mathcal X},\mu)$ is a Hilbert space for any measure space $(X,{\mathcal X},\mu)$.  In particular, ${\Bbb R}^n$ and ${\Bbb C}^n$ are Hilbert spaces. $\diamond$

Exercise 7. Show that a subspace of a Hilbert space H will itself be a Hilbert space if and only if it is closed.  (In particular, proper dense subspaces of Hilbert spaces are not Hilbert spaces.)

Example 7. By Example 6, the space $\ell^2({\Bbb Z})$ of doubly infinite square-summable sequences is a Hilbert space.  Inside this space, the space $c_c({\Bbb Z})$ of sequences of finite support is a proper dense subspace (e.g. by Proposition 2 of Notes 3, though this can also be seen much more directly), and so cannot be a Hilbert space. $\diamond$

Exercise 8. Let V be an inner product space.  Show that there exists a Hilbert space $\overline{V}$ which contains a dense subspace isomorphic to V; we refer to $\overline{V}$ as a completion of V.  Furthermore, this space is essentially unique in the sense that if $\overline{V}$, $\overline{V}'$ are two such completions, then there exists an isomorphism from $\overline{V}$ to $\overline{V}'$ which is the identity on V (if one identifies V with the dense subspaces of $\overline{V}$ and $\overline{V}'$).  Because of this fact, inner product spaces are sometimes known as pre-Hilbert spaces, and can always be identified with dense subspaces of actual Hilbert spaces. $\diamond$

Exercise 9. Let H, H’ be two Hilbert spaces.  Define the direct sum $H \oplus H'$ of the two spaces to be the vector space $H \times H'$ with inner product $\langle (x,x'), (y,y') \rangle_{H \oplus H'} := \langle x, y \rangle_H + \langle x', y' \rangle_{H'}$.  Show that $H \oplus H'$ is also a Hilbert space. $\diamond$

Example 8. If H is a complex Hilbert space, one can define the complex conjugate $\overline{H}$ of that space to be the set of formal conjugates $\{ \overline{x}: x \in H \}$ of vectors in H, with complex vector space structure $\overline{x} + \overline{y} := \overline{x+y}$ and $c \overline{x} := \overline{\overline{c} x}$, and inner product $\langle \overline{x}, \overline{y} \rangle_{\overline{H}} := \langle y, x \rangle_H$.  One easily checks that $\overline{H}$ is again a complex Hilbert space.   Note the map $x \mapsto \overline{x}$ is not a complex linear isometry; instead, it is a complex antilinear isometry.  $\diamond$

A key application of the completeness axiom is to be able to define the “nearest point” from a vector to a closed convex body.

Proposition 1. (Existence of minimisers) Let H be a Hilbert space, let K be a non-empty closed convex subset of H, and let x be a point in H.  Then there exists a unique y in K that minimises the distance $\|y-x\|$ to x.  Furthermore, for any other z in K, we have $\hbox{Re} \langle z-y, y-x \rangle \geq 0$.

Proof. Observe from the parallelogram law (19) the (geometrically obvious) fact that if y and y’ are distinct and equidistant from x, then their midpoint $(y+y')/2$ is strictly closer to x than either of y or y’.   This ensures that the distance minimiser, if it exists, is unique.  Also, if y is the distance minimiser and z is in K, then $(1-\theta) y + \theta z$ is at least as distant from x as y is for any $0 < \theta < 1$, by convexity; squaring this and rearranging, we conclude that

$\displaystyle 2 \hbox{Re} \langle z-y, y-x \rangle + \theta \| z-y\|^2 \geq 0$. (23)

Letting $\theta \to 0$ we obtain the final claim in the proposition.

It remains to show existence.  Write $D := \inf_{y \in K} \|x-y\|$.  It is clear that D is finite and non-negative.  If the infimum is attained then we would be done.  We cannot conclude immediately that this is the case, but we can certainly find a sequence $y_n \in K$ such that $\|x-y_n\| \to D$.  On the other hand, the midpoints $\frac{y_n+y_m}{2}$ lie in K by convexity and so $\|x - \frac{y_n+y_m}{2}\| \geq D$.  Using the parallelogram law (19) we deduce that $\|y_n - y_m\| \to 0$ as $n, m \to \infty$ and so $y_n$ is Cauchy; by completeness, it converges to a limit y, which lies in K since K is closed.  From the triangle inequality we see that $\|x-y_n\| \to \|x-y\|$, and thus $\|x-y\| = D$, and so y is a distance minimiser. $\Box$
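As a concrete finite-dimensional instance of Proposition 1, take $H = {\Bbb R}^n$ and let K be the box $[lo,hi]^n$, which is closed and convex.  Since the squared distance splits as a sum over coordinates, the minimiser is obtained by clipping each coordinate to $[lo,hi]$.  The Python sketch below (with hypothetical function names and an arbitrary test point) computes this minimiser y and tests the inequality $\langle z-y, y-x \rangle \geq 0$ against randomly sampled points z of K.

```python
import math
import random

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def dist(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def project_onto_box(x, lo, hi):
    """Nearest point to x in the box [lo, hi]^n (coordinatewise clipping)."""
    return [min(max(a, lo), hi) for a in x]

def check_minimiser(x, lo, hi, trials=1000, seed=0, tol=1e-12):
    """Sample z in K; test <z-y, y-x> >= 0 and that z is no closer than y."""
    rng = random.Random(seed)
    y = project_onto_box(x, lo, hi)
    y_minus_x = [b - a for a, b in zip(x, y)]
    for _ in range(trials):
        z = [rng.uniform(lo, hi) for _ in x]
        z_minus_y = [c - b for b, c in zip(y, z)]
        if dot(z_minus_y, y_minus_x) < -tol:
            return False
        if dist(x, z) < dist(x, y) - tol:
            return False
    return True

x = [2.0, -0.5, 0.25, -3.0]
y = project_onto_box(x, -1.0, 1.0)
```

Here y is obtained from x by clipping the first and last coordinates to 1 and -1 respectively, and `check_minimiser(x, -1.0, 1.0)` is expected to return `True`.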

Exercise 10. Show by constructing counterexamples that the existence of the distance minimiser y can fail if either the closure or convexity hypothesis on K is dropped, or if H is merely an inner product space rather than a Hilbert space. (Hint: for the last case, let H be the inner product space $C([0,1]) \subset L^2([0,1])$, and let K be the subspace of continuous functions supported on $[0,1/2]$.) On the other hand, show that existence (but not uniqueness) can be recovered if K is assumed to be compact rather than convex.  $\diamond$

Exercise 11. Using the Hanner inequalities (Exercise 6), show that Proposition 1 also holds for the $L^p$ spaces as long as $1 < p < \infty$.  (The specific feature of the $L^p$ spaces that is allowing this is known as uniform convexity.) Give counterexamples to show that the proposition can fail for $L^1$ and for $L^\infty$. $\diamond$

This proposition has some importance in calculus of variations, but we will not pursue those applications here.

Since every subspace is necessarily convex, we have a corollary:

Exercise 12. (Orthogonal projections)  Let V be a closed subspace of a Hilbert space H.  Then for every $x \in H$ there exists a unique decomposition $x = x_V + x_{V^\perp}$, where $x_V \in V$ and $x_{V^\perp}$ is orthogonal to every element of V.  Furthermore, $x_V$ is the closest element of V to x. $\diamond$

Let $\pi_V: H \to V$ be the map $\pi_V: x \mapsto x_V$, where $x_V$ is given by the above exercise; we refer to $\pi_V$ as the orthogonal projection from H onto V.  It is not hard to see that $\pi_V$ is linear, and from the Pythagorean theorem we see that $\pi_V$ is a contraction (thus $\|\pi_V x\| \leq \|x\|$ for all $x \in H$).  In particular, $\pi_V$ is continuous.
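When V is spanned by a finite orthonormal system $e_1,\ldots,e_n$, the decomposition $x = x_V + x_{V^\perp}$ can be computed directly from the formula $x_V = \sum_k \langle x, e_k \rangle e_k$ (compare with the inversion formula (17)).  Here is a minimal Python sketch in ${\Bbb C}^3$, with a hypothetical helper `project` and an arbitrarily chosen two-dimensional subspace V.

```python
import math

def inner(x, y):
    """Inner product on C^n, conjugate-linear in the second argument."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def project(x, orthonormal_system):
    """x_V = sum_k <x, e_k> e_k, where V is the span of the e_k."""
    result = [0j] * len(x)
    for e in orthonormal_system:
        coeff = inner(x, e)
        result = [r + coeff * b for r, b in zip(result, e)]
    return result

s = 1 / math.sqrt(2)
e1 = [s + 0j, s + 0j, 0j]
e2 = [0j, 0j, 1 + 0j]
x = [3 + 0j, 1 + 0j, 2 + 0j]

x_V = project(x, [e1, e2])                # component in V
x_perp = [a - b for a, b in zip(x, x_V)]  # component orthogonal to V

# Pythagorean theorem: ||x||^2 = ||x_V||^2 + ||x_perp||^2.
pythagoras_gap = abs(inner(x, x).real
                     - inner(x_V, x_V).real - inner(x_perp, x_perp).real)
```

In this example $x_V = (2,2,2)$ and $x_{V^\perp} = (1,-1,0)$, so that $\|x_V\|^2 = 12 \leq 14 = \|x\|^2$, consistent with the contraction property.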

Exercise 13. (Orthogonal complement) Given a subspace V of a Hilbert space H, define the orthogonal complement $V^\perp$ of V to be the set of all vectors in H that are orthogonal to every element of V.  Establish the following claims:

1. $V^\perp$ is a closed subspace of H, and $(V^\perp)^\perp$ is the closure of V.
2. $V^\perp$ is the trivial subspace {0} if and only if V is dense.
3. If V is closed, then H is isomorphic to the direct sum of V and $V^\perp$.
4. If V, W are two closed subspaces of H, then $(V + W)^\perp = V^\perp \cap W^\perp$ and $(V \cap W)^\perp = \overline{V^\perp + W^\perp}$. $\diamond$

Every vector $v$ in a complex Hilbert space H gives rise to a continuous linear functional $\lambda_v: H \to {\Bbb C}$, defined by the formula $\lambda_v(w) := \langle w, v \rangle$ (the continuity follows from the Cauchy-Schwarz inequality).  The Riesz representation theorem for Hilbert spaces gives a converse:

Theorem 1. (Riesz representation theorem for Hilbert spaces)  Let H be a complex Hilbert space, and let $\lambda: H \to {\Bbb C}$ be a continuous linear functional on H.  Then there exists a unique v in H such that $\lambda=\lambda_v$.  A similar claim holds for real Hilbert spaces (replacing ${\Bbb C}$ by ${\Bbb R}$ throughout).

Proof. We just show the claim for complex Hilbert spaces, as the claim for real Hilbert spaces is very similar.  First, we show uniqueness: if $\lambda_v = \lambda_{v'}$, then $\lambda_{v-v'}=0$, and in particular $\langle v-v', v-v' \rangle=0$, and so v=v’.

Now we show existence.  We may assume that $\lambda$ is not identically zero, since the claim is obvious otherwise.  Observe that the kernel $V := \{ x \in H: \lambda(x) = 0 \}$ is then a proper subspace of H, which is closed since $\lambda$ is continuous.  By Exercise 13, the orthogonal complement $V^\perp$ must contain at least one non-trivial vector w, which we can normalise to have unit magnitude.  Since w doesn’t lie in V, $\lambda(w)$ is non-zero.  Now observe that for any x in H, $x - \frac{\lambda(x)}{\lambda(w)} w$ lies in the kernel of $\lambda$, i.e. it lies in V.  Taking inner products with w, we conclude that

$\displaystyle \langle x, w\rangle - \frac{\lambda(x)}{\lambda(w)} = 0$ (24)

and thus

$\displaystyle \lambda(x) = \langle x, \overline{\lambda(w)} w \rangle$ (25)

Thus we have $\lambda = \lambda_{\overline{\lambda(w)} w}$, and the claim follows. $\Box$
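In the model case $H = {\Bbb C}^n$ and $\lambda(x) = \sum_k a_k x_k$, the construction in the proof can be carried out explicitly: the unit vector $w = \overline{a}/\|a\|$ is orthogonal to the kernel of $\lambda$, and $\overline{\lambda(w)} w$ is simply $\overline{a}$.  The Python sketch below traces these steps; the names `lam` and `riesz_vector` and the coefficient vector a are hypothetical and chosen only for illustration.

```python
import math

def inner(x, y):
    """Inner product on C^n, conjugate-linear in the second argument."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def lam(a, x):
    """The linear functional x -> sum_k a_k x_k on C^n."""
    return sum(c * xk for c, xk in zip(a, x))

def riesz_vector(a):
    """Vector v with lam(a, x) = <x, v> for all x (assumes a is non-zero)."""
    length = math.sqrt(sum(abs(c) ** 2 for c in a))
    # Unit vector orthogonal to the kernel of the functional.
    w = [c.conjugate() / length for c in a]
    lam_w = lam(a, w)  # equals ||a|| up to rounding
    # As in (25): v = conj(lam(w)) * w.
    return [lam_w.conjugate() * wk for wk in w]

a = [1 + 1j, 2 - 3j, 0.5j]
v = riesz_vector(a)
x = [0.5 - 2j, 1j, 3 + 1j]
gap = abs(lam(a, x) - inner(x, v))
```

One expects `v` to agree with the entrywise conjugate of `a`, and `gap` to vanish up to rounding error.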

Remark 1. This result gives an alternate proof of the p=2 case of Theorem 1 from Notes 3, and by modifying Remark 10 from Notes 3, can be used to give an alternate proof of the Lebesgue-Radon-Nikodym theorem (this proof is due to von Neumann).  $\diamond$

Remark 2. In the next set of notes, when we define the notion of a dual space, we can reinterpret the Riesz representation theorem as providing a canonical isomorphism $H^* \equiv \overline{H}$. $\diamond$

Exercise 14. Using Exercise 11, give an alternate proof of the $1 < p < \infty$ case of Theorem 1 from Notes 3. $\diamond$

One important consequence of the Riesz representation theorem is the existence of adjoints:

Exercise 15. Let $T: H \to H'$ be a continuous linear transformation.  Show that there exists a unique continuous linear transformation $T^\dagger: H' \to H$ with the property that $\langle Tx, y \rangle = \langle x, T^\dagger y \rangle$ for all $x \in H$ and $y \in H'$.  The transformation $T^\dagger$ is called the (Hilbert space) adjoint of T; it is of course compatible with the notion of an adjoint matrix from linear algebra. $\diamond$
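In the finite-dimensional case $H = {\Bbb C}^n$, $H' = {\Bbb C}^m$, the adjoint of the transformation given by an $m \times n$ matrix is given by the conjugate transpose of that matrix.  The following Python sketch checks the defining identity $\langle Tx, y \rangle = \langle x, T^\dagger y \rangle$ on one example; the helper names `mat_vec` and `adjoint` and the sample matrix are hypothetical.

```python
def inner(x, y):
    """Inner product on C^n, conjugate-linear in the second argument."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def mat_vec(T, x):
    """Apply the matrix T (a list of rows) to the vector x."""
    return [sum(t * a for t, a in zip(row, x)) for row in T]

def adjoint(T):
    """Conjugate transpose of the matrix T."""
    rows, cols = len(T), len(T[0])
    return [[T[i][j].conjugate() for i in range(rows)] for j in range(cols)]

# A 2 x 3 matrix, viewed as a map from C^3 to C^2.
T = [[1 + 2j, 0j, 3 - 1j],
     [2j, 1 - 1j, 4 + 0j]]
x = [1 - 1j, 2 + 0j, 0.5j]   # vector in C^3
y = [3 + 1j, -2j]            # vector in C^2

lhs = inner(mat_vec(T, x), y)           # <Tx, y> in C^2
rhs = inner(x, mat_vec(adjoint(T), y))  # <x, T^dagger y> in C^3
```

Note that `adjoint(adjoint(T))` returns the original matrix, in line with the first part of Exercise 16.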

Exercise 16. Let $T: H \to H'$ be a continuous linear transformation.

1. Show that $(T^\dagger)^\dagger = T$.
2. Show that T is an isometry if and only if $T^\dagger T = \hbox{id}_H$.
3. Show that T is an isomorphism if and only if $T^\dagger T = \hbox{id}_H$ and $T{T^\dagger}={\hbox{id}}_{H'}$.
4. If $S: H' \to H''$ is another continuous linear transformation, show that $(ST)^\dagger = T^\dagger S^\dagger$. $\diamond$

Remark 3. An isomorphism of Hilbert spaces is also known as a unitary transformation.  (For real Hilbert spaces, the term orthogonal transformation is sometimes used instead.) Note that unitary and orthogonal $n \times n$ matrices generate unitary and orthogonal transformations on ${\Bbb C}^n$ and ${\Bbb R}^n$ respectively.  $\diamond$

Exercise 17. Show that the projection map $\pi_V: H \to V$ from a Hilbert space to a closed subspace is the adjoint of the inclusion map $\iota_V: V \to H$. $\diamond$

— Orthonormal bases —

In the section on inner product spaces, we studied finite linear combinations of orthonormal systems.  Now that we have completeness, we turn to infinite linear combinations.

We begin with countable linear combinations:

Exercise 18. Suppose that $e_1, e_2, e_3, \ldots$ is a countable orthonormal system in a complex Hilbert space H, and $c_1,c_2,\ldots$ is a sequence of complex numbers.  (As usual, similar statements will hold here for real Hilbert spaces and real numbers.)

1. Show that the series $\sum_{n=1}^\infty c_n e_n$ is conditionally convergent in H if and only if $c_n$ is square-summable.
2. If $c_n$ is square-summable, show that $\sum_{n=1}^\infty c_n e_n$ is unconditionally convergent in H, i.e. every permutation of the $c_n e_n$ sums to the same value.
3. Show that the map $(c_n)_{n=1}^\infty \mapsto \sum_{n=1}^\infty c_n e_n$ is an isometry from the Hilbert space $\ell^2({\Bbb N})$ to H.  The image V of this isometry is the smallest closed subspace of H that contains $e_1,e_2,\ldots$, and which we shall therefore call the (Hilbert space) span of $e_1,e_2,\ldots$.
4. Take adjoints of 3. and conclude that for any $x \in H$, we have $\pi_V(x) = \sum_{n=1}^\infty \langle x, e_n \rangle e_n$ and $\| \pi_V(x) \| = (\sum_{n=1}^\infty |\langle x, e_n \rangle|^2)^{1/2}$.   Conclude in particular the Bessel inequality $\sum_{n=1}^\infty |\langle x, e_n \rangle|^2 \leq \|x\|^2$.

Remark 4. Note the contrast here between conditional and unconditional summability (which needs only square-summability of the coefficients $c_n$) and absolute summability (which requires the stronger condition that the $c_n$ are absolutely summable).  In particular there exist non-absolutely summable series that are still unconditionally summable, in contrast to the situation for scalars, in which one has the Riemann rearrangement theorem. $\diamond$

Now we can handle arbitrary orthonormal systems $(e_\alpha)_{\alpha \in A}$.  If $(c_\alpha)_{\alpha \in A}$ is square-summable, then at most countably many of the $c_\alpha$ are non-zero (by Exercise 3 of Notes 3).  Using parts 1,2 of Exercise 18, we can then form the sum $\sum_{\alpha \in A} c_\alpha e_\alpha$ in an unambiguous manner.  It is not hard to use Exercise 18 to then conclude that this gives an isometric embedding of $\ell^2(A)$ into H.  The image of this isometry is the smallest closed subspace of H that contains the orthonormal system, which we call the (Hilbert space) span of that system.  (It is the closure of the algebraic span of the system.)

Exercise 19. Let $(e_\alpha)_{\alpha \in A}$ be an orthonormal system in H.  Show that the following statements are equivalent:

1. The Hilbert space span of $(e_\alpha)_{\alpha \in A}$ is all of H.
2. The algebraic span of $(e_\alpha)_{\alpha \in A}$ (i.e. the finite linear combinations of the $e_\alpha$) is dense in H.
3. One has the Parseval identity $\|x\|^2 = \sum_{\alpha \in A} |\langle x, e_\alpha \rangle|^2$ for all $x \in H$.
4. One has the inversion formula $x = \sum_{\alpha \in A} \langle x, e_\alpha \rangle e_\alpha$ for all $x \in H$ (in particular, the coefficients $\langle x, e_\alpha \rangle$ are square summable).
5. The only vector that is orthogonal to all the $e_\alpha$ is the zero vector.
6. There is an isomorphism from $\ell^2(A)$ to H that maps $\delta_\alpha$ to $e_\alpha$ for all $\alpha \in A$ (where $\delta_\alpha$ is the Kronecker delta at $\alpha$).

A system $(e_\alpha)_{\alpha \in A}$ obeying any (and hence all) of the properties in Exercise 19 is known as an orthonormal basis of the Hilbert space H.  All Hilbert spaces have such a basis:

Proposition 2. Every Hilbert space has at least one orthonormal basis.

Proof. We use the standard Zorn’s lemma argument.  Every Hilbert space has at least one orthonormal system, namely the empty system.  We order the orthonormal systems by inclusion, and observe that the union of any totally ordered set of orthonormal systems is again an orthonormal system.  By Zorn’s lemma, there must exist a maximal orthonormal system $(e_\alpha)_{\alpha \in A}$.  There cannot be any unit vector orthogonal to all the elements of this system, since otherwise one could add that vector to the system and contradict maximality.  Applying Exercise 19 in the contrapositive, we obtain an orthonormal basis as claimed. $\Box$

Exercise 19.5. Show that every vector space V has at least one algebraic (or Hamel) basis, i.e. a set of basis vectors such that every vector in V can be expressed uniquely as a finite linear combination of basis vectors. $\diamond$

Corollary 1. Every Hilbert space is isomorphic to $\ell^2(A)$ for some set A.

Exercise 20. Let A, B be sets.  Show that $\ell^2(A)$ and $\ell^2(B)$ are isomorphic iff A and B have the same cardinality.  (Hint: the case when A or B is finite is easy, so suppose A and B are both infinite.  If $\ell^2(A)$ and $\ell^2(B)$ are isomorphic, show that B can be covered by a family of at most countable sets indexed by A, and vice versa.  Then apply the Schröder-Bernstein theorem.) $\diamond$

We can now classify Hilbert spaces up to isomorphism by a single cardinal, the dimension of that space:

Exercise 21. Show that all orthonormal bases of a given Hilbert space H have the same cardinality.  This cardinality is called the (Hilbert space) dimension of the Hilbert space. $\diamond$

Exercise 22. Show that a Hilbert space is separable (i.e. has a countable dense subset) if and only if its dimension is at most countable.  Conclude in particular that up to isomorphism, there is exactly one separable infinite-dimensional Hilbert space. $\diamond$

Exercise 23. Let H, H’ be two complex Hilbert spaces.  Show that there exists another Hilbert space $H \otimes H'$, together with a map $\otimes: H \times H' \to H \otimes H'$ with the following properties:

1. The map $\otimes$ is bilinear, thus $(cx+dy) \otimes x' = c (x \otimes x') + d (y \otimes x')$ and $x \otimes (cx'+dy') = c (x \otimes x') + d(x \otimes y')$ for all $x,y \in H, x',y' \in H', c,d \in {\Bbb C}$;
2. We have $\langle x \otimes x', y \otimes y' \rangle_{H \otimes H'} = \langle x, y \rangle_H \langle x', y' \rangle_{H'}$ for all $x,y \in H, x',y' \in H'$.
3. The (algebraic) span of $\{ x \otimes x': x \in H, x' \in H' \}$ is dense in $H \otimes H'$.

Furthermore, show that $H \otimes H'$ and $\otimes$ are unique up to isomorphism in the sense that if $H \tilde \otimes H'$ and $\tilde \otimes: H \times H' \to H \tilde \otimes H'$ are another pair of objects obeying the above properties, then there exists an isomorphism $\Phi: H \otimes H' \to H \tilde \otimes H'$ such that $x \tilde \otimes x' = \Phi( x \otimes x' )$ for all $x \in H, x' \in H'$.  (Hint: to prove existence, create orthonormal bases for H and H’ and take formal tensor products of these bases.)  The space $H \otimes H'$ is called the (Hilbert space) tensor product of H and H’, and $x \otimes x'$ is the tensor product of x and x’. $\diamond$

Exercise 24. Let $(X, {\mathcal X}, \mu)$ and $(Y, {\mathcal Y}, \nu)$ be measure spaces.  Show that $L^2( X \times Y, {\mathcal X} \times {\mathcal Y}, \mu \times \nu )$ is the tensor product of $L^2(X, {\mathcal X}, \mu )$ and $L^2( Y, {\mathcal Y}, \nu )$, if one defines the tensor product $f \otimes g$ of $f \in L^2(X, {\mathcal X}, \mu)$ and $g \in L^2(Y, {\mathcal Y}, \nu)$ as $f \otimes g(x,y) := f(x) g(y)$. $\diamond$

We do not yet have enough theory in other areas to give the really useful applications of Hilbert space theory, but let us just illustrate a simple one, namely the development of Fourier series on the unit circle ${\Bbb R}/{\Bbb Z}$.  We can give this space the usual Lebesgue measure (identifying the unit circle with [0,1), if one wishes), giving rise to the complex Hilbert space $L^2({\Bbb R}/{\Bbb Z})$.  On this space we can form the characters $e_n(x) := e^{2\pi i nx}$ for all integers n; one easily verifies that $(e_n)_{n \in {\Bbb Z}}$ is an orthonormal system.  We claim that it is in fact an orthonormal basis.  By Exercise 19, it suffices to show that the algebraic span of the $e_n$, i.e. the space of trigonometric polynomials, is dense in $L^2({\Bbb R}/{\Bbb Z})$.  But from an explicit computation (e.g. using Fejér kernels) one can show that the indicator function of any interval can be approximated to arbitrary accuracy in $L^2$ norm by trigonometric polynomials, and is thus in the closure of the trigonometric polynomials.  By linearity, the same is then true of an indicator function of a finite union of intervals; since Lebesgue measurable sets in ${\Bbb R}/{\Bbb Z}$ can be approximated to arbitrary accuracy by finite unions of intervals, the same is true for indicators of measurable sets.  By linearity, the same is true for simple functions, and by density (Proposition 2 of Notes 3) the same is true for arbitrary $L^2$ functions, and the claim follows.

The Fourier transform $\hat f: {\Bbb Z} \to {\Bbb C}$ of a function $f \in L^2({\Bbb R}/{\Bbb Z})$ is defined as

$\displaystyle \hat f(n) := \langle f, e_n \rangle = \int_0^1 f(x) e^{-2\pi i nx}\ dx$.  (26)

From Exercise 19, we obtain the Parseval identity

$\displaystyle \sum_{n \in {\Bbb Z}} |\hat f(n)|^2 = \int_{{\Bbb R}/{\Bbb Z}} |f(x)|^2\ dx$

(in particular, $\hat f \in \ell^2({\Bbb Z})$) and the inversion formula

$\displaystyle f = \sum_{n \in {\Bbb Z}} \hat f(n) e_n$

where the right-hand side is unconditionally convergent.   Indeed, the Fourier transform $f \mapsto \hat f$ is a unitary transformation between $L^2({\Bbb R}/{\Bbb Z})$ and $\ell^2({\Bbb Z})$.  (These facts are collectively referred to as Plancherel’s theorem for the unit circle.)  We will develop Fourier analysis on other spaces than the unit circle later in this course (or next quarter).
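To see the Parseval identity in a concrete case, consider the function $f(x) := x$ on $[0,1)$.  An integration by parts in (26) gives $\hat f(0) = 1/2$ and $\hat f(n) = \frac{i}{2\pi n}$ for $n \neq 0$, while $\int_0^1 |f(x)|^2\ dx = 1/3$.  The Python sketch below (with hypothetical function names; the midpoint rule and the truncation points are arbitrary choices) checks one coefficient against a numerical integral and computes partial sums of $\sum_n |\hat f(n)|^2$, which should increase towards 1/3 without exceeding it.

```python
import cmath
import math

def exact_coefficient(n):
    """hat f(n) for f(x) = x on [0,1), obtained by integration by parts."""
    if n == 0:
        return 0.5 + 0j
    return 1j / (2 * math.pi * n)

def numeric_coefficient(n, samples=20000):
    """Midpoint-rule approximation to the integral in (26) for f(x) = x."""
    h = 1.0 / samples
    total = 0j
    for k in range(samples):
        x = (k + 0.5) * h
        total += x * cmath.exp(-2j * math.pi * n * x)
    return total * h

def partial_parseval_sum(N):
    """Sum of |hat f(n)|^2 over -N <= n <= N."""
    return sum(abs(exact_coefficient(n)) ** 2 for n in range(-N, N + 1))

norm_sq = 1.0 / 3.0  # integral of x^2 over [0,1]
sums = [partial_parseval_sum(N) for N in (1, 10, 100, 1000)]
```

The shortfall `norm_sq - partial_parseval_sum(N)` decays roughly like $1/(2\pi^2 N)$, reflecting the slow decay of the Fourier coefficients of a function with a jump discontinuity on the circle.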

Remark 5. Of course, much of the theory here generalises the corresponding theory in finite-dimensional linear algebra; we will continue this theme much later in the course when we turn to the spectral theorem.  However, not every aspect of finite-dimensional linear algebra will carry over so easily.  For instance, it turns out to be quite difficult to take the determinant or trace of a linear transformation from a Hilbert space to itself in general (unless the transformation is particularly well behaved, e.g. of trace class).  The Jordan normal form also does not translate to the infinite-dimensional setting, leading to the notorious invariant subspace problem in the subject.  It is also worth cautioning that while the theory of orthonormal bases in finite-dimensional Euclidean spaces generalises very nicely to the Hilbert space setting, the more general theory of bases in finite dimensions becomes much more subtle in infinite dimensional Hilbert spaces, unless the basis is “almost orthonormal” in some sense (e.g. if it forms a frame).  $\diamond$

[Update, Jan 21: notation for Hilbert space adjoint changed from $T^*$ to $T^\dagger$, for reasons that will become clearer in the next notes.]

[Update, Jan 22: Exercise 19.5 added.]