When studying a mathematical space X (e.g. a vector space, a topological space, a manifold, a group, an algebraic variety etc.), there are two fundamentally basic ways to try to understand the space:

  1. By looking at subobjects in X, or more generally maps f: Y \to X from some other space Y into X.  For instance, a point in a space X can be viewed as a map from pt to X; a curve in a space X could be thought of as a map from [0,1] to X; a group G can be studied via its subgroups K, and so forth.
  2. By looking at objects on X, or more precisely maps f: X \to Y from X into some other space Y.  For instance, one can study a topological space X via the real- or complex-valued continuous functions f \in C(X) on X; one can study a group G via its quotient groups \pi: G \to G/H; one can study an algebraic variety V by studying the polynomials on V (and in particular, the ideal of polynomials that vanish identically on V); and so forth.

(There are also more sophisticated ways to study an object via its maps, e.g. by studying extensions, joinings, splittings, universal lifts, etc.  The general study of objects via the maps between them is formalised abstractly in modern mathematics as category theory, and is also closely related to homological algebra.)

A remarkable phenomenon in many areas of mathematics is that of (contravariant) duality: that the maps into and out of one type of mathematical object X can be naturally associated to the maps out of and into a dual object X^* (note the reversal of arrows here!).  In some cases, the dual object X^* looks quite different from the original object X.  (For instance, in Stone duality, discussed in Notes 4, X would be a Boolean algebra (or some other partially ordered set) and X^* would be a compact totally disconnected Hausdorff space (or some other topological space).)   In other cases, most notably with Hilbert spaces as discussed in Notes 5, the dual object X^* is essentially identical to X itself.

In these notes we discuss a third important case of duality, namely duality of normed vector spaces, which is of an intermediate nature to the previous two examples: the dual X^* of a normed vector space turns out to be another normed vector space, but generally one which is not equivalent to X itself (except in the important special case when X is a Hilbert space, as mentioned above).  On the other hand, the double dual (X^*)^* turns out to be closely related to X, and in several (but not all) important cases, is essentially identical to X.  One of the most important uses of dual spaces in functional analysis is that it allows one to define the transpose T^*: Y^* \to X^* of a continuous linear operator T: X \to Y.

A fundamental tool in understanding duality of normed vector spaces will be the Hahn-Banach theorem, which is indispensable for exploring the dual of a normed vector space.  (Indeed, without this theorem, it is not clear at all that the dual of a non-trivial normed vector space is non-trivial!)  Thus, we shall study this theorem in detail in these notes concurrently with our discussion of duality.

— Duality —

In the category of normed vector spaces, the natural notion of a “map” (or morphism) between two such spaces is that of a continuous linear transformation T: X \to Y between two normed vector spaces X, Y.  By Lemma 1 from Notes 3, any such linear transformation is bounded, in the sense that there exists a constant C such that \|Tx\|_Y \leq C\|x\|_X for all x \in X.  The least such constant C is known as the operator norm of T, and is denoted \|T\|_{op} or simply \|T\|.
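For concreteness, the operator norm just defined can be rewritten as a supremum.  This is a standard reformulation rather than anything new, stated here for non-trivial X:

```latex
\[
\|T\|_{op} = \sup_{x \in X,\ x \neq 0} \frac{\|Tx\|_Y}{\|x\|_X} = \sup_{x \in X,\ \|x\|_X \leq 1} \|Tx\|_Y .
\]
% Illustrative example (not from the notes): on {\Bbb R}^2 with the Euclidean norm,
% the diagonal map T(x_1,x_2) := (2x_1, 3x_2) obeys
\[
\|Tx\|^2 = 4x_1^2 + 9x_2^2 \leq 9 (x_1^2 + x_2^2) = 9 \|x\|^2 ,
\]
% with equality at x = (0,1), so that \|T\|_{op} = 3.
```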

Two normed vector spaces X, Y are equivalent if there is an invertible continuous linear transformation T: X \to Y from X to Y, thus T is bijective and there exist constants C, c > 0 such that c\|x\|_X \leq \|Tx\|_Y \leq C\|x\|_X for all x \in X.  If one can take C=c=1, then T is an isometry, and X and Y are called isomorphic.    When one has two norms \|\|_1, \| \|_2 on the same vector space X, we say that the norms are equivalent if the identity from (X, \| \|_1) to (X, \| \|_2) is an invertible continuous transformation, i.e. that there exist constants C, c > 0 such that c\|x\|_1 \leq\|x\|_2 \leq C \|x\|_1 for all x \in X.
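As a concrete instance of equivalent norms (a standard example, not taken from these notes), one can compare the uniform and Euclidean norms on {\Bbb R}^n:

```latex
\[
\|x\|_{\ell^\infty} := \max_{1 \leq i \leq n} |x_i|, \qquad
\|x\|_{\ell^2} := \Big( \sum_{i=1}^n |x_i|^2 \Big)^{1/2},
\]
\[
\|x\|_{\ell^\infty} \leq \|x\|_{\ell^2} \leq \sqrt{n}\, \|x\|_{\ell^\infty}
\quad \hbox{for all } x \in {\Bbb R}^n .
\]
```

Thus one can take c=1 and C=\sqrt{n} here; the vector (1,\ldots,1) shows that the constant \sqrt{n} cannot be improved, so C necessarily grows with the dimension.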

Exercise 1. Show that all linear transformations from a finite-dimensional space to a normed vector space are continuous.  Conclude that all norms on a finite-dimensional space are equivalent.  \diamond

Let B(X \to Y) denote the space of all continuous linear transformations from X to Y.  (This space is also denoted by many other names, e.g. {\mathcal L}(X,Y), \hbox{Hom}(X \to Y), etc.)  This has the structure of a vector space: the sum S+T: x \mapsto Sx+Tx of two continuous linear transformations is another continuous linear transformation, as is the scalar multiple cT: x \mapsto cTx of a linear transformation.

Exercise 2. Show that B(X \to Y) with the operator norm is a normed vector space.  If Y is complete (i.e. is a Banach space), show that B(X \to Y) is also complete (i.e. is also a Banach space).  \diamond

Exercise 3. Let X, Y, Z be Banach spaces.  Show that if T \in B(X \to Y) and S \in B(Y \to Z), then the composition ST: X \to Z lies in B(X \to Z) and \|ST\|_{op} \leq \|S\|_{op} \|T\|_{op}.  (As a consequence of this inequality, we see that B(X \to X) is a Banach algebra.) \diamond

Now we can define the notion of a dual space.

Definition 1. (Dual space)  Let X be a normed vector space.  The (continuous) dual space X^* of X is defined to be X^* := B( X \to {\Bbb R} ) if X is a real vector space, and X^* := B(X \to {\Bbb C}) if X is a complex vector space.  Elements of X^* are known as continuous linear functionals (or bounded linear functionals) on X.

Remark 1. If one drops the requirement that the linear functionals be continuous, we obtain the algebraic dual space of linear functionals on X.  This space does not play a significant role in functional analysis, though. \diamond

From Exercise 2, we see that the dual of any normed vector space is a Banach space, and so duality is arguably a Banach space notion rather than a normed vector space notion.  The following exercise reinforces this:

Exercise 4. We say that a normed vector space X has a completion \overline{X} if \overline{X} is a Banach space and X can be identified with a dense subspace of \overline{X} (cf. Exercise 8 of Notes 5).

  1. Show that every normed vector space X has at least one completion \overline{X}, and that any two completions \overline{X}, \overline{X}' are isomorphic in the sense that there exists an isomorphism from \overline{X} to \overline{X}' which is the identity on X.
  2. Show that the dual spaces X^* and (\overline{X})^* are isomorphic to each other. \diamond

The next few exercises are designed to give some intuition as to how dual spaces work.

Exercise 5. Let {\Bbb R}^n be given the Euclidean metric.  Show that ({\Bbb R}^n)^* is isomorphic to {\Bbb R}^n.  Establish the corresponding result for the complex spaces {\Bbb C}^n. \diamond

Exercise 6. Let c_c({\Bbb N}) be the vector space of sequences (a_n)_{n \in {\Bbb N}} of real or complex numbers which are compactly supported (i.e. at most finitely many of the a_n are non-zero).  We give c_c the uniform norm \| \|_{\ell^\infty}.

  1. Show that the dual space c_c({\Bbb N})^* is isomorphic to \ell^1({\Bbb N}).
  2. Show that the completion of c_c({\Bbb N}) is isomorphic to c_0({\Bbb N}), the space of sequences on {\Bbb N} that go to zero at infinity (again with the uniform norm); thus, by Exercise 4, the dual space of c_0({\Bbb N}) is isomorphic to \ell^1({\Bbb N}) also.
  3. On the other hand, show that the dual of \ell^1({\Bbb N}) is isomorphic to \ell^\infty({\Bbb N}), a space which is strictly larger than c_c({\Bbb N}) or c_0({\Bbb N}).  Thus we see that the double dual of a Banach space can be strictly larger than the space itself.  \diamond

Exercise 7. Let H be a real or complex Hilbert space.  Using the Riesz representation theorem for Hilbert spaces (Theorem 1 from Notes 5), show that the dual space H^* is isomorphic (as a normed vector space) to the conjugate space \overline{H} (see Example 8 from Notes 5), with an element g \in \overline{H} being identified with the linear functional f \mapsto \langle f, g \rangle.  Thus we see that Hilbert spaces are essentially self-dual (if we ignore the pesky conjugation sign). \diamond

Exercise 8. Let (X,{\mathcal X},\mu) be a \sigma-finite measure space, and let 1 \leq p < \infty.  Using Theorem 1 from Notes 3, show that the dual space of L^p(X, {\mathcal X},\mu) is isomorphic to L^{p'}(X, {\mathcal X}, \mu), where p' is the dual exponent of p (thus 1/p + 1/p' = 1), with an element g \in L^{p'}(X,{\mathcal X},\mu) being identified with the linear functional f \mapsto \int_X f g \ d\mu.  (The one tricky thing to verify is that the identification is an isometry, but this can be seen by a closer inspection of the proof of Theorem 1 from Notes 3.  The \sigma-finite hypothesis can be dropped when p>1, though we will not need this fact.) \diamond

One of the key purposes of introducing the notion of a dual space is that it allows one to define the notion of a transpose.

Definition 2. (Transpose) Let T: X \to Y be a continuous linear transformation from one normed vector space X to another Y.  The transpose T^*: Y^* \to X^* of T is defined to be the map that sends any continuous linear functional \lambda \in Y^* to the linear functional T^* \lambda := \lambda \circ T \in X^*, thus (T^* \lambda)( x ) = \lambda(Tx) for all x \in X.

Exercise 9. Show that the transpose T^* of a continuous linear transformation T between normed vector spaces is again a continuous linear transformation with \|T^*\|_{op} \leq \|T\|_{op}, thus the transpose operation is itself a linear map from B(X \to Y) to B(Y^* \to X^*).  (We will improve this result in Theorem 3 below.)  \diamond

Exercise 10. An n \times m matrix A with complex entries can be identified with a linear transformation L_A: {\Bbb C}^n \to {\Bbb C}^m.  Identifying the dual space of {\Bbb C}^n with itself as in Exercise 5, show that the transpose L_A^*: {\Bbb C}^m \to {\Bbb C}^n is equal to L_{A^t}, where A^t is the transpose matrix of A. \diamond

Exercise 11. Show that the transpose of a surjective continuous linear transformation between normed vector spaces is injective.  Show that the condition of surjectivity can be relaxed to that of having a dense image. \diamond

Remark 3. Observe that if T: X \to Y and S: Y \to Z are continuous linear transformations between normed vector spaces, then (ST)^* = T^* S^*.  In the language of category theory, this means that duality X \mapsto X^* of normed vector spaces, and transpose T \mapsto T^* of continuous linear transformations, form a contravariant functor from the category of normed vector spaces (or Banach spaces) to itself.  \diamond
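The identity (ST)^* = T^* S^* can be checked directly from the definition of the transpose; the following is a short verification, valid for every \lambda \in Z^* and x \in X:

```latex
\[
((ST)^* \lambda)(x) = \lambda(STx) = (S^* \lambda)(Tx) = (T^* (S^* \lambda))(x) .
\]
```

Note how the order of S and T is reversed, which is the contravariance referred to above.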

Remark 4. The transpose T^*: \overline{H'} \to \overline{H} of a continuous linear transformation T: H \to H' between complex Hilbert spaces is closely related to the adjoint T^\dagger: H' \to H of that transformation, as defined in Exercise 15 of Notes 5, by using the obvious (antilinear) identifications between H and \overline{H}, and between H' and \overline{H'}.  This is analogous to the linear algebra fact that the adjoint matrix is the complex conjugate of the transpose matrix.   One should note that in the literature, the transpose operator T^* is also (somewhat confusingly) referred to as the adjoint of T.  Of course, for real vector spaces, there is no distinction between transpose and adjoint.  \diamond

— The Hahn-Banach theorem —

Thus far, we have defined the dual space X^*, but apart from some concrete special cases (Hilbert spaces, L^p spaces, etc.) we have not been able to say much about what X^* consists of – it is not even clear yet that if X is non-trivial (i.e. not just \{0\}), that X^* is also non-trivial – for all one knows, there could be no non-trivial continuous linear functionals on X at all!  The Hahn-Banach theorem is used to resolve this, by providing a powerful means to construct continuous linear functionals as needed.

Theorem 1. (Hahn-Banach theorem)  Let X be a normed vector space, and let Y be a subspace of X.  Then any continuous linear functional \lambda \in Y^* on Y can be extended to a continuous linear functional \tilde \lambda \in X^* on X with the same operator norm; thus \tilde \lambda agrees with \lambda on Y and \| \tilde \lambda \|_{X^*} = \| \lambda \|_{Y^*}.  (Note: the extension \tilde \lambda is, in general, not unique.)

We prove this important theorem in stages.  We first handle the codimension one real case:

Proposition 1. The Hahn-Banach theorem is true when X, Y are real vector spaces, and X is spanned by Y and an additional vector v.

Proof. We can assume that v lies outside Y, since the claim is trivial otherwise.  We can also normalise \|\lambda \|_{Y^*} = 1 (the claim is of course trivial if \|\lambda \|_{Y^*} vanishes).  To specify the extension \tilde \lambda of \lambda, it suffices by linearity to specify the value of \tilde \lambda(v).  In order for the extension \tilde \lambda to continue to have operator norm 1, we require that

|\tilde \lambda( y + tv )| \leq \| y + tv \|_X

for all t \in {\Bbb R} and y \in Y.  This is automatic for t=0, so by homogeneity it suffices to attain this bound for t=1.  We rearrange this a bit as

\sup_{y' \in Y} -\lambda(y') - \|y'+v\|_X \leq \tilde \lambda(v) \leq \inf_{y \in Y} \| y+v \|_X - \lambda(y).

But as \lambda has operator norm 1, an application of the triangle inequality shows that the inf on the RHS is at least as large as the sup on the LHS, and so one can choose \tilde \lambda(v) obeying the required properties. \Box
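The two steps of this proof that were left to the reader can be written out as follows.  This is a sketch of the routine verifications, using only that \lambda has operator norm 1 on Y and that the norm on Y is the restriction of the norm on X:

```latex
% Homogeneity: for t \neq 0 and y \in Y, the bound at t=1 (applied to y/t \in Y) gives
\[
|\tilde \lambda(y + tv)| = |t| \, |\tilde \lambda(y/t + v)| \leq |t| \, \|y/t + v\|_X = \|y + tv\|_X .
\]
% Triangle inequality: for any y, y' \in Y,
\[
\lambda(y) - \lambda(y') = \lambda(y - y') \leq \|y - y'\|_X = \|(y+v) - (y'+v)\|_X \leq \|y+v\|_X + \|y'+v\|_X ,
\]
% which rearranges to
\[
-\lambda(y') - \|y'+v\|_X \leq \|y+v\|_X - \lambda(y) .
\]
```

Taking the supremum over y' and the infimum over y then shows that the interval of admissible values for \tilde \lambda(v) is non-empty.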

Corollary 1. The Hahn-Banach theorem is true when X, Y are real normed vector spaces.

Proof. This is a standard “Zorn’s lemma” argument.  Fix Y, X, \lambda.  Define a partial extension of \lambda to be a pair (Y', \lambda'), where Y' is an intermediate subspace between Y and X, and \lambda' is an extension of \lambda with the same operator norm as \lambda.  The set of all partial extensions is partially ordered by declaring (Y'',\lambda'') \geq (Y',\lambda') if Y'' contains Y' and \lambda'' extends \lambda'. It is easy to see that every chain of partial extensions has an upper bound; hence, by Zorn’s lemma, there must be a maximal partial extension (Y_*,\lambda_*).  If Y_*=X, we are done; otherwise, one can find v \in X \backslash Y_*.  By Proposition 1, we can then extend \lambda_* further to the larger space spanned by Y_* and v, contradicting the maximality of (Y_*,\lambda_*); and the claim follows. \Box
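The upper bound for a chain can be made explicit.  The following sketch assumes a non-empty chain indexed by a set I (the index set and the names Y_\alpha, \lambda_\alpha are introduced only for this illustration); the empty chain is bounded above by (Y,\lambda) itself:

```latex
\[
Y' := \bigcup_{\alpha \in I} Y_\alpha, \qquad
\lambda'(y) := \lambda_\alpha(y) \quad \hbox{whenever } y \in Y_\alpha .
\]
% Well-definedness: if y \in Y_\alpha \cap Y_\beta, then one of \lambda_\alpha, \lambda_\beta
% extends the other (chain property), so \lambda_\alpha(y) = \lambda_\beta(y).
% Norm bound: for y \in Y_\alpha,
\[
|\lambda'(y)| = |\lambda_\alpha(y)| \leq \|\lambda\|_{Y^*} \|y\|_X .
\]
```

The chain property also ensures that Y' is a subspace and that \lambda' is linear, since any two elements of Y' lie in a common Y_\alpha.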

Remark 5. Of course, this proof of the Hahn-Banach theorem relied on the axiom of choice (via Zorn’s lemma) and is thus non-constructive.  It turns out that this is, to some extent, necessary: it is not possible to prove the Hahn-Banach theorem if one deletes the axiom of choice from the axioms of set theory (although it is possible to deduce the theorem from slightly weaker versions of this axiom, such as the ultrafilter lemma). \diamond

Finally, we establish the complex case by leveraging the real case.

Proof of Hahn-Banach theorem (complex case). Let \lambda: Y \to {\Bbb C} be a continuous complex-linear functional, which we can normalise to have operator norm 1.  Then the real part \rho := \hbox{Re}(\lambda): Y \to {\Bbb R} is a continuous real-linear functional on Y (now viewed as a real normed vector space rather than a complex one), which has operator norm at most 1 (in fact, it is equal to 1, though we will not need this).  Applying Corollary 1, we can extend this real-linear functional \rho to a continuous real-linear functional \tilde \rho: X \to {\Bbb R} on X (again viewed now just as a real normed vector space) of norm at most 1.

To finish the job, we have to somehow complexify \tilde \rho to a complex-linear functional \tilde \lambda: X \to {\Bbb C} of norm at most 1 that agrees with \lambda on Y.  It is reasonable to expect that \hbox{Re} \tilde \lambda = \tilde \rho; a bit of playing around with complex linearity then forces

\tilde \lambda(x) := \tilde \rho(x) - i \tilde \rho(ix). (1)

Accordingly, we shall use (1) to define \tilde \lambda.  It is easy to see that \tilde \lambda is a continuous complex-linear functional agreeing with \lambda on Y.  Since \tilde \rho has norm at most 1, we have |\hbox{Re} \tilde \lambda(x)| \leq \|x\|_X for all x \in X.  We can amplify this by exploiting phase rotation symmetry, thus |\hbox{Re} \tilde \lambda(e^{i\theta} x)| \leq \|x\|_X for all \theta \in {\Bbb R}.  Optimising in \theta we see that \tilde \lambda has norm at most 1, as required. \Box
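The claims described as easy to see in this proof can be verified as follows; this is a sketch of the computations, with \theta chosen depending on x in the last step:

```latex
% Complex linearity: \tilde \lambda is real-linear by construction, and
\[
\tilde \lambda(ix) = \tilde \rho(ix) - i \tilde \rho(-x) = \tilde \rho(ix) + i \tilde \rho(x)
= i ( \tilde \rho(x) - i \tilde \rho(ix) ) = i \tilde \lambda(x) .
\]
% Agreement with \lambda on Y: for y \in Y we have
% \hbox{Im} \lambda(y) = - \hbox{Re}( i \lambda(y) ) = - \rho(iy), hence
\[
\lambda(y) = \rho(y) - i \rho(iy) = \tilde \rho(y) - i \tilde \rho(iy) = \tilde \lambda(y) .
\]
% Norm bound: choose \theta so that e^{i\theta} \tilde \lambda(x) = |\tilde \lambda(x)|; then
\[
|\tilde \lambda(x)| = \tilde \lambda(e^{i\theta} x) = \hbox{Re} \tilde \lambda(e^{i\theta} x)
\leq \|e^{i\theta} x\|_X = \|x\|_X .
\]
```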

Exercise 12. In the special case when X is a Hilbert space, give an alternate proof of the Hahn-Banach theorem using the material from Notes 5 that avoids Zorn’s lemma or the axiom of choice.  \diamond

Now we put this Hahn-Banach theorem to work in the study of duality and transposes.

Exercise 13. Let T: X \to Y be a continuous linear transformation which is bounded from below (i.e. there exists c > 0 such that \|Tx\| \geq c\|x\| for all x \in X); note that this ensures that X is equivalent to some subspace of Y.  Show that the transpose T^*: Y^* \to X^* is surjective. Give an example to show that the claim fails if T is merely assumed to be injective rather than bounded from below.  (Hint: consider the map (a_n)_{n=1}^\infty \to (a_n/n)_{n=1}^\infty on some suitable space of sequences.)   This should be compared with Exercise 11.  \diamond

Exercise 14. Let x be an element of a normed vector space X.  Show that there exists \lambda \in X^* such that \|\lambda\|_{X^*}=1 and \lambda(x) = \|x\|_X.  Conclude in particular that the dual of a non-trivial normed vector space is again non-trivial. \diamond

Given a normed vector space X, we can form its double dual (X^*)^*: the space of continuous linear functionals on X^*.  There is a very natural map \iota: X \to (X^*)^*, defined as

\iota(x) (\lambda) := \lambda(x) (2)

for all x \in X and \lambda \in X^*.  (This map is closely related to the Gelfand transform in the theory of operator algebras.) It is easy to see that \iota is a continuous linear transformation, with operator norm at most 1.  But the Hahn-Banach theorem gives a stronger statement:

Theorem 2. \iota is an isometry.

Proof. We need to show that \|\iota(x)\|_{X^{**}} = \|x\| for all x \in X.  The upper bound is clear; the lower bound follows from Exercise 14. \Box
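Written out, the two bounds in this proof take the following form.  Here \lambda_x denotes a functional of the kind provided by Exercise 14 (the name \lambda_x is introduced only for this illustration), and x is taken to be non-zero:

```latex
\[
\|\iota(x)\|_{X^{**}} = \sup_{\|\lambda\|_{X^*} \leq 1} |\lambda(x)| \leq \|x\|_X ,
\]
\[
\|\iota(x)\|_{X^{**}} \geq |\iota(x)(\lambda_x)| = |\lambda_x(x)| = \|x\|_X
\quad \hbox{where } \|\lambda_x\|_{X^*} = 1 .
\]
```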

Exercise 15. Let Y be a subspace of a normed vector space X.  Define the complement Y^\perp of Y to be the space of all \lambda \in X^* which vanish on Y.

  1. Show that Y^\perp is a closed subspace of X^*, and that \overline{Y} = \{ x \in X: \lambda(x)=0 \hbox{ for all } \lambda \in Y^\perp \}.  (Compare with Exercise 13 from Notes 5.)  In other words, \iota(\overline{Y}) = \iota(X) \cap Y^{\perp \perp}.
  2. Show that Y^\perp is trivial if and only if Y is dense, and Y^\perp = X^* if and only if Y is trivial.
  3. Show that Y^\perp is isomorphic to the dual of the quotient space X/\overline{Y} (which has the norm \|x+\overline{Y}\|_{X/\overline{Y}} := \inf_{y \in \overline{Y}} \|x+y\|_{X}).
  4. Show that Y^* is isomorphic to X^*/Y^\perp. \diamond

From Theorem 2, every normed vector space can be identified with a subspace of its double dual (and every Banach space is identified with a closed subspace of its double dual).  If \iota is surjective, then we have an isomorphism X \equiv X^{**}, and we say that X is reflexive in this case; since X^{**} is a Banach space, we conclude that only Banach spaces can be reflexive.  From linear algebra we see in particular that any finite-dimensional normed vector space is reflexive; from Exercises 7, 8 we see that any Hilbert space and any L^p space with 1 < p < \infty on a \sigma-finite space is also reflexive (and the hypothesis of \sigma-finiteness can in fact be dropped).  On the other hand, from Exercise 6 we see that the Banach space c_0({\Bbb N}) is not reflexive.

An important fact is that \ell^1({\Bbb N}) is also not reflexive: the dual of \ell^1({\Bbb N}) is equivalent to \ell^\infty({\Bbb N}), but the dual of \ell^\infty({\Bbb N}) is strictly larger than \ell^1({\Bbb N}).   Indeed, consider the subspace c({\Bbb N}) of \ell^\infty({\Bbb N}) consisting of bounded convergent sequences (equivalently, this is the space spanned by c_0({\Bbb N}) and the constant sequence (1)_{n \in {\Bbb N}}).  The limit functional (a_n)_{n=1}^\infty \mapsto \lim_{n \to \infty} a_n is a bounded linear functional on c({\Bbb N}), with operator norm 1, and thus by the Hahn-Banach theorem can be extended to a generalised limit functional \lambda: \ell^\infty({\Bbb N}) \to{\Bbb C} which is a continuous linear functional of operator norm 1.  As such generalised limit functionals annihilate all of c_0({\Bbb N}) but are still non-trivial, they do not correspond to any element of \ell^1({\Bbb N}) \equiv c_0({\Bbb N})^*.
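The last step can be spelled out as a short argument by contradiction.  In this sketch, e_k denotes the sequence which equals 1 at position k and 0 elsewhere (notation introduced only for this illustration), and \lambda is a generalised limit functional:

```latex
% Suppose \lambda were represented by some a = (a_n)_{n=1}^\infty \in \ell^1({\Bbb N}), i.e.
\[
\lambda( (x_n)_{n=1}^\infty ) = \sum_{n=1}^\infty a_n x_n
\quad \hbox{for all } (x_n)_{n=1}^\infty \in \ell^\infty({\Bbb N}) .
\]
% Since e_k \in c_0({\Bbb N}) has limit zero, testing on e_k gives
\[
a_k = \lambda(e_k) = 0 \quad \hbox{for every } k ,
\]
% so \lambda would vanish identically; but the constant sequence has limit 1:
\[
\lambda( (1)_{n=1}^\infty ) = 1 \neq 0 .
\]
```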

Exercise 16. Let \lambda: \ell^\infty({\Bbb N}) \to {\Bbb C} be a generalised limit functional (i.e. an extension of the limit functional of c({\Bbb N}) of operator norm 1) which is also an algebra homomorphism, i.e. \lambda( (x_n y_n)_{n=1}^\infty ) = \lambda( (x_n)_{n=1}^\infty ) \lambda( (y_n)_{n=1}^\infty ) for all sequences (x_n)_{n=1}^\infty, (y_n)_{n=1}^\infty \in \ell^\infty({\Bbb N}).  Show that there exists a unique non-principal ultrafilter p \in \beta {\Bbb N}\backslash {\Bbb N} (as defined for instance in this blog post) such that \lambda( (x_n)_{n=1}^\infty ) = \lim_{n \to p} x_n for all sequences (x_n)_{n=1}^\infty \in \ell^\infty({\Bbb N}).  Conversely, show that every non-principal ultrafilter generates a generalised limit functional that is also an algebra homomorphism.  (This exercise may become easier once one is acquainted with the Stone-Čech compactification, as discussed for instance in this lecture, and which we will discuss later in this course.  If the algebra homomorphism property is dropped, one has to consider probability measures on the space of non-principal ultrafilters instead.)  \diamond

Exercise 17. Show that any closed subspace of a reflexive space is again reflexive.   Also show that a Banach space X is reflexive if and only if its dual is reflexive.  Conclude that if (X,{\mathcal X}, \mu) is a measure space which contains a countably infinite sequence of disjoint sets of positive finite measure, then L^1(X,{\mathcal X},\mu) and L^\infty(X,{\mathcal X},\mu) are not reflexive.  (Hint: Reduce to the \sigma-finite case.  L^\infty will contain an isometric copy of \ell^\infty({\Bbb N}).) \diamond

Theorem 2 gives a classification of sorts for normed vector spaces:

Corollary 2. Every normed vector space X is isomorphic to a subspace of BC(Y), the space of bounded continuous functions on some bounded complete metric space Y, with the uniform norm.

Proof. Take Y to be the unit ball in X^*, then the map \iota identifies X with a subspace of BC(Y).  \Box
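The identification used in this proof can be unpacked as follows.  This is a sketch which assumes that Y carries the metric induced by the norm of X^*, and which writes f_x for the function on Y associated to x (a name introduced only for this illustration):

```latex
\[
Y := \{ \lambda \in X^*: \|\lambda\|_{X^*} \leq 1 \}, \qquad
d(\lambda, \mu) := \|\lambda - \mu\|_{X^*}, \qquad
f_x(\lambda) := \iota(x)(\lambda) = \lambda(x) .
\]
% Continuity of f_x on Y:
\[
|f_x(\lambda) - f_x(\mu)| = |(\lambda - \mu)(x)| \leq d(\lambda,\mu) \, \|x\|_X .
\]
% Isometry, by Theorem 2:
\[
\sup_{\lambda \in Y} |f_x(\lambda)| = \|\iota(x)\|_{X^{**}} = \|x\|_X .
\]
```

Here Y has diameter at most 2, and is complete because it is a closed subset of the Banach space X^*.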

Remark 6. If X is separable, it is known that one can take Y to just be the unit interval [0,1]; this is the Banach-Mazur theorem, which we will not prove here. \diamond

Next, we apply the Hahn-Banach theorem to the transpose operation, improving Exercise 9.

Theorem 3. Let T: X \to Y be a continuous linear transformation between normed vector spaces.  Then \|T^*\|_{op} = \|T\|_{op}; thus the transpose operation is an isometric embedding of B(X \to Y) into B(Y^* \to X^*).

Proof. By Exercise 9, it suffices to show that \|T^*\|_{op} \geq \|T\|_{op}.  Accordingly, let \alpha be any number strictly less than \|T\|_{op}, then we can find a non-zero x \in X such that \|Tx\|_Y \geq \alpha \|x\|.  By Exercise 14 we can then find \lambda \in Y^* such that \|\lambda\|_{Y^*}=1 and \lambda(Tx) = T^*\lambda(x) = \|Tx\|_Y \geq \alpha \|x\|, and thus \|T^* \lambda \|_{X^*} \geq \alpha.  This implies that \|T^*\|_{op} \geq \alpha; taking suprema over all \alpha strictly less than \|T\|_{op} we obtain the claim.  \Box

If we identify X and Y with subspaces of X^{**} and Y^{**} respectively, we thus see that T^{**}: X^{**} \to Y^{**} is an extension of T: X \to Y with the same operator norm.  In particular, if X and Y are reflexive, we see that T^{**} can be identified with T itself (exactly as in the finite-dimensional linear algebra setting).
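The extension property can be verified by unwinding the definitions.  In this sketch, \iota_X: X \to X^{**} and \iota_Y: Y \to Y^{**} denote the maps defined by (2) for X and Y respectively (the subscripts are added only for this illustration), and \mu ranges over Y^*:

```latex
\[
(T^{**} \iota_X(x))(\mu) = \iota_X(x)(T^* \mu) = (T^* \mu)(x) = \mu(Tx) = \iota_Y(Tx)(\mu) .
\]
% Hence
\[
T^{**} \circ \iota_X = \iota_Y \circ T ,
\]
% and Theorem 3, applied twice, gives \|T^{**}\|_{op} = \|T^*\|_{op} = \|T\|_{op}.
```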

— Variants of the Hahn-Banach theorem (optional) —

The Hahn-Banach theorem has a number of essentially equivalent variants, which also are of interest for the geometry of normed vector spaces.

Exercise 18. (generalised Hahn-Banach theorem)  Let Y be a subspace of a real or complex vector space X, let \rho: X \to {\Bbb R} be a sublinear functional on X (thus \rho(cx) = c \rho(x) for all non-negative c and all x \in X, and \rho(x+y) \leq \rho(x)+\rho(y) for all x, y \in X), and let \lambda: Y \to {\Bbb R} be a linear functional on Y such that \lambda(y) \leq \rho(y) for all y \in Y.  Show that \lambda can be extended to a linear functional \tilde \lambda on X such that \tilde \lambda(x) \leq \rho(x) for all x \in X.  Show that this statement implies the usual Hahn-Banach theorem.  (Hint: adapt the proof of the Hahn-Banach theorem.)  \diamond

Call a subset A of a real vector space V algebraically open if the sets \{t: x+tv \in A \} are open in {\Bbb R} for all x,v \in V; note that every open set in a normed vector space is algebraically open.
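To see why open sets are algebraically open, one can note that the set in question is the preimage of A under a continuous affine map.  The following is a sketch of this routine verification, with the name \phi introduced only for this illustration:

```latex
\[
\phi: {\Bbb R} \to V, \qquad \phi(t) := x + tv, \qquad
\|\phi(t) - \phi(s)\|_V = |t-s| \, \|v\|_V ,
\]
\[
\{ t \in {\Bbb R}: x + tv \in A \} = \phi^{-1}(A) .
\]
```

Since \phi is continuous, \phi^{-1}(A) is open in {\Bbb R} whenever A is open in V.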

Theorem 4. (Geometric Hahn-Banach theorem) Let A, B be convex subsets of a real vector space V, with A algebraically open.  Then the following are equivalent:

  1. A and B are disjoint.
  2. There exists a linear functional \lambda: V \to {\Bbb R} and a constant c such that \lambda < c on A, and \lambda \geq c on B. (Equivalently, there is a hyperplane separating A and B, with A avoiding the hyperplane entirely.)

If A and B are convex cones (i.e. tx \in A whenever x \in A and t > 0, and similarly for B), we may take c=0.
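A simple planar example may help to fix ideas.  The sets below are chosen purely for illustration and do not appear elsewhere in these notes:

```latex
\[
V := {\Bbb R}^2, \qquad
A := \{ (x_1,x_2): x_1^2 + x_2^2 < 1 \}, \qquad
B := \{ (x_1,x_2): x_1 \geq 1 \} .
\]
% A is an open disk (hence convex and algebraically open), B is a closed half-plane (hence convex),
% and the two sets are disjoint.  Taking
\[
\lambda(x_1,x_2) := x_1, \qquad c := 1 ,
\]
% we have \lambda(x) \leq (x_1^2+x_2^2)^{1/2} < 1 on A and \lambda(x) \geq 1 on B.
```

The separating hyperplane here is the line \{ x_1 = 1 \}, which touches B but avoids A entirely.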

Remark 7. In finite dimensions, it is not difficult to drop the algebraic openness hypothesis on A as long as one now replaces the condition \lambda < c by \lambda \leq c.  However in infinite dimensions one cannot do this.  Indeed, if we take V = c_c({\Bbb N}), let A be the set of sequences whose last non-zero element is strictly positive, and B = -A consist of those sequences whose last non-zero element is strictly negative, then one can verify that there is no hyperplane separating A from B. \diamond

Proof. Clearly 2 implies 1; now we show that 1 implies 2.  We first handle the case when A and B are convex cones.

Define a good pair to be a pair (A,B) where A and B are disjoint convex cones, with A algebraically open, thus (A,B) is a good pair by hypothesis.  We can order (A,B) \leq (A',B') if A' contains A and B' contains B.  A standard application of Zorn’s lemma reveals that any good pair (A,B) is contained in a maximal good pair, and so without loss of generality we may assume that (A,B) is a maximal good pair.

We can of course assume that neither A nor B is empty.  We now claim that B is the complement of A.  For if not, then there exists v \in V which does not lie in either A or B.  By the maximality of (A,B), the convex cone generated by B \cup \{v\} must intersect A at some point, say w.  By dilating w if necessary we may assume that w lies on a line segment between v and some point b in B.  By using the convexity and disjointness of A and B one can then deduce that for any a \in A, the ray \{a + t(w-b): t > 0 \} is disjoint from B.  Thus one can enlarge A to the convex cone generated by A and w-b, which is still algebraically open and now strictly larger than A (because it contains v), a contradiction.  Thus B is the complement of A.

Let us call a line in V monochromatic if it is entirely contained in A or entirely contained in B.  Note that if a line is not monochromatic, then (because A and B are convex and partition the line, and A is algebraically open), the line splits into an open ray contained in A, and a closed ray contained in B.  From this we can conclude that if a line is monochromatic, then all parallel lines must also be monochromatic, because otherwise we look at the ray in the parallel line which is contained in A and use convexity of both A and B to show that this ray is adjacent to a halfplane contained in B, contradicting algebraic openness.  Now let W be the space of all vectors w for which there exists a monochromatic line in the direction w (including 0).  Then W is easily seen to be a vector space; since A,B are non-empty, W is a proper subspace of V.  On the other hand, if w and w' are not in W, some playing around with the property that A and B are convex sets partitioning V shows that the plane spanned by w and w' contains a monochromatic line, and hence some non-trivial linear combination of w and w' lies in W.  Thus V/W is precisely one-dimensional.  Since every line with direction in W is monochromatic, A and B also have well-defined quotients A/W and B/W in this one-dimensional quotient space, which remain convex (with A/W still algebraically open).  But then it is clear that A/W and B/W are an open and closed ray from the origin in V/W respectively.  It is then a routine matter to construct a linear functional \lambda: V \to {\Bbb R} (with null space W) such that A = \{ \lambda < 0 \} and B = \{ \lambda \geq 0 \}, and the claim follows.

To establish the general case when A, B are not convex cones, we lift to one higher dimension and apply the previous result to the convex cones A', B' \subset {\Bbb R} \times V defined by A' := \{ (t,tx): t > 0, x \in A \}, B' := \{ (t,tx): t > 0, x \in B \}; we leave the verification that this works as an exercise. \Box
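The final lifting step can be partially unpacked as follows.  This sketch assumes that the cone case is applied in {\Bbb R} \times V to the cones A' := \{ (t,tx): t > 0, x \in A \} and B' := \{ (t,tx): t > 0, x \in B \}, and writes \lambda' for the resulting functional; the names a and \mu are introduced only for this illustration.  It does not verify that A', B' are disjoint convex cones with A' algebraically open, which remains part of the exercise:

```latex
% Any linear functional on {\Bbb R} \times V splits as
\[
\lambda'(t, y) = a t + \mu(y), \qquad a := \lambda'(1,0), \qquad \mu(y) := \lambda'(0,y) .
\]
% The cone case gives \lambda' < 0 on A' and \lambda' \geq 0 on B'.  Restricting to t = 1:
\[
a + \mu(x) < 0 \ \hbox{ for } x \in A, \qquad a + \mu(x) \geq 0 \ \hbox{ for } x \in B .
\]
% Thus \lambda := \mu and c := -a satisfy \lambda < c on A and \lambda \geq c on B.
```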

Exercise 19. Use the geometric Hahn-Banach theorem to reprove Exercise 18, thus providing a slightly different proof of the Hahn-Banach theorem.  (It is possible to reverse these implications and deduce the geometric Hahn-Banach theorem from the usual Hahn-Banach theorem, but this is somewhat trickier, requiring one to fashion a norm out of the difference A-B of two convex cones.)  \diamond

Exercise 20. (Algebraic Hahn-Banach theorem) Let V be a vector space over a field F, let W be a subspace of V, and let \lambda: W \to F be a linear map.  Show that there exists a linear map \tilde \lambda: V \to F which extends \lambda. \diamond

Some further discussion of variants of the Hahn-Banach theorem (in the finite-dimensional setting) can be found at this blog post of mine.