A few days ago, I found myself needing to use the Fredholm alternative in functional analysis:

Theorem 1 (Fredholm alternative) Let {X} be a Banach space, let {T: X \rightarrow X} be a compact operator, and let {\lambda \in {\bf C}} be non-zero. Then exactly one of the following statements holds:

  • (Eigenvalue) There is a non-trivial solution {x \in X} to the equation {Tx = \lambda x}.
  • (Bounded resolvent) The operator {T-\lambda} has a bounded inverse {(T-\lambda)^{-1}} on {X}.

Among other things, the Fredholm alternative can be used to establish the spectral theorem for compact operators. A hypothesis such as compactness is necessary; the shift operator {U} on {\ell^2({\bf Z})}, for instance, has no eigenfunctions, but {U-z} is not invertible for any unit complex number {z}. The claim is also false when {\lambda=0}; consider for instance the multiplication operator {Tf(n) := \frac{1}{n} f(n)} on {\ell^2({\bf N})}, which is compact and has no eigenvalue at zero, but is not invertible.
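To spell out the second counterexample (a quick check, writing {e_n} for the standard basis vectors of {\ell^2({\bf N})}, which is notation introduced here for illustration): the operator {T} is injective, since {Tf = 0} forces {f(n) = 0} for every {n}, but

\displaystyle  T e_n = \frac{1}{n} e_n, \quad \hbox{so} \quad \frac{\|T e_n\|}{\|e_n\|} = \frac{1}{n} \rightarrow 0,

so no lower bound of the form {\|Tf\| \geq c \|f\|} with {c>0} can hold, and {T} cannot have a bounded inverse. In fact {T} is not even surjective: the sequence {g(n) := 1/n} lies in {\ell^2({\bf N})}, but the only candidate for a preimage is the constant sequence {f(n) = 1}, which is not square-summable.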

It had been a while since I had studied the spectral theory of compact operators, and I found that I could not immediately reconstruct a proof of the Fredholm alternative from first principles. So I set myself the exercise of doing so. I thought that I had managed to establish the alternative in all cases, but as pointed out in comments, my argument is restricted to the case where the compact operator {T} is approximable, which means that it is the limit of finite rank operators in the uniform topology. Many Banach spaces (and in particular, all Hilbert spaces) have the approximation property that implies (by a result of Grothendieck) that all compact operators on that space are approximable. For instance, if {X} is a Hilbert space, then any compact operator is approximable, because any compact set can be approximated by a finite-dimensional subspace, and in a Hilbert space, the orthogonal projection operator to a subspace is always a contraction. (In more general Banach spaces, finite-dimensional subspaces are still complemented, but the operator norm of the projection can be large.) Unfortunately, there are examples of Banach spaces for which the approximation property fails; the first such examples were discovered by Enflo, and a subsequent paper by Alexander demonstrated the existence of compact operators in certain Banach spaces that are not approximable.

I also found out that this argument was essentially also discovered independently by MacCluer-Hull and by Uuye. Nevertheless, I am recording this argument here, together with two more traditional proofs of the Fredholm alternative (based on the Riesz lemma and a continuity argument respectively).

— 1. First proof (approximable case only) —

In the finite-dimensional case, the Fredholm alternative is an immediate consequence of the rank-nullity theorem, and the finite rank case can be easily deduced from the finite dimensional case by some routine algebraic manipulation. The main difficulty in proving the alternative is to be able to take limits and deduce the approximable case from the finite rank case. The key idea of the proof is to use compactness to establish a lower bound on {T-\lambda I} that is stable enough to allow one to take such limits. There is an additional subtlety (pointed out in comments) that when {X} is not a Hilbert space, it is not necessarily the case that {T} can be approximated by finite rank operators; but a modification of the argument still suffices in this case.

Fix a non-zero {\lambda}. It is clear that {T} cannot have both an eigenvalue and bounded resolvent at {\lambda}, so now suppose that {T} has no eigenvalue at {\lambda}, thus {T-\lambda} is injective. We claim that this implies a lower bound:

Lemma 2 (Lower bound) Let {\lambda \in {\bf C}} be non-zero, and suppose that {T: X \rightarrow X} is a compact operator that has no eigenvalue at {\lambda}. Then there exists {c>0} such that {\|(T-\lambda) x \| \geq c \|x\|} for all {x \in X}.

Proof: By homogeneity, it suffices to establish the claim for unit vectors {x}. Suppose this is not the case; then we can find a sequence of unit vectors {x_n} such that {(T-\lambda) x_n} converges strongly to zero. Since {\lambda x_n} has norm bounded away from zero (here we use the non-zero nature of {\lambda}), we conclude in particular that {y_n := Tx_n} has norm bounded away from zero for sufficiently large {n}. By compactness of {T}, we may (after passing to a subsequence) assume that the {y_n} converge strongly to a limit {y}, which is thus also non-zero.

On the other hand, applying the bounded operator {T} to the strong convergence {(T-\lambda) x_n \rightarrow 0} (and using the fact that {T} commutes with {T-\lambda}) we see that {(T-\lambda) y_n} converges strongly to {0}. Since {y_n} converges strongly to {y}, we conclude that {(T-\lambda) y = 0}, and thus we have an eigenvalue of {T} at {\lambda}, contradiction. \Box
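For the reader who wants the two estimates in this proof written out, here is a short sketch (with {o(1)} denoting a quantity that goes to zero as {n \rightarrow \infty}). By the triangle inequality,

\displaystyle  \|y_n\| = \| \lambda x_n + (T-\lambda) x_n \| \geq |\lambda| \|x_n\| - \|(T-\lambda) x_n\| = |\lambda| - o(1),

while

\displaystyle  \|(T-\lambda) y_n\| = \| T (T-\lambda) x_n \| \leq \|T\|_{op} \|(T-\lambda) x_n\| = o(1).

Passing to the limit along the subsequence then gives {\|y\| \geq |\lambda| > 0} and {(T-\lambda) y = 0}.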

Remark 1 Note that this argument is ineffective in that it provides no explicit value of {c} (and thus no explicit upper bound for the operator norm of the resolvent {(T-\lambda)^{-1}}). This is not surprising, given that the fact that {T} has no eigenvalue at {\lambda} is an open condition rather than a closed one, and so one does not expect bounds that utilise this condition to be uniform. (Indeed, the resolvent needs to blow up as one approaches the spectrum of {T}.)

From the lower bound, we see that to prove the bounded invertibility of {T-\lambda}, it will suffice to establish surjectivity. (Of course, we could have also obtained this reduction by using the open mapping theorem.) In other words, we need to establish that the range {\hbox{Ran}(T-\lambda)} of {T-\lambda} is all of {X}.

Let us first deal with the easy case when {T} has finite rank, so that {\hbox{Ran}(T)} has some finite dimension {n}. This implies that the kernel {Ker(T)} has codimension {n}, and we may thus split {X = Ker(T) + Y} for some {n}-dimensional space {Y}. The operator {T-\lambda} is a non-zero multiple of the identity on {Ker(T)}, and so {\hbox{Ran}(T-\lambda)} already contains {Ker(T)}. On the other hand, the operator {T(T-\lambda)} maps the {n}-dimensional space {Y} to the {n}-dimensional space {\hbox{Ran}(T)} injectively (since {Y} avoids {Ker(T)} and {T-\lambda} is injective), and thus also surjectively (by the rank-nullity theorem). Thus {T(\hbox{Ran}(T-\lambda))} contains {\hbox{Ran}(T)}, and thus (by the short exact sequence {0 \rightarrow Ker(T) \rightarrow X \rightarrow \hbox{Ran}(T) \rightarrow 0}) {\hbox{Ran}(T-\lambda)} is in fact all of {X}, as desired.

Finally, we deal with the case when {T} is approximable, so that {T} is the limit in the operator norm of a sequence of finite rank operators {S_n}. The lower bound in Lemma 2 is stable, and will extend to the operators {S_n} for {n} large enough (after reducing {c} slightly). In particular, {S_n - \lambda} is injective for such {n}, and so by the preceding discussion for the finite rank case, we see that {\hbox{Ran}(S_n-\lambda)} is all of {X}. Using the lower bound for {S_n}, and the convergence of {S_n} to {T} in the operator norm topology, we conclude that {\hbox{Ran}(T-\lambda)} is dense in {X}. On the other hand, we observe that the space {\hbox{Ran}(T-\lambda)} is necessarily closed, for if {(T-\lambda) x_n} converges to a limit {y}, then (by Lemma 2 and the assumption that {X} is Banach) {x_n} will also converge to some limit {x}, and so {y = (T-\lambda) x}. As {\hbox{Ran}(T-\lambda)} is now both dense and closed, it must be all of {X}, and the claim follows.
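The two limiting steps here can be sketched as follows, writing {S_n} for finite rank operators converging to {T} in the operator norm, {c} for the constant from Lemma 2, and taking {n} large enough that {\|T - S_n\|_{op} \leq c/2} (the specific threshold {c/2} is a choice made here for illustration). For the stability of the lower bound, we have

\displaystyle  \|(S_n-\lambda) x\| \geq \|(T-\lambda) x\| - \|(T-S_n) x\| \geq \frac{c}{2} \|x\|.

For density, given {y \in X}, let {u_n} be the solution to {(S_n - \lambda) u_n = y}, so that {\|u_n\| \leq \frac{2}{c} \|y\|}; then

\displaystyle  \|(T-\lambda) u_n - y\| = \|(T-S_n) u_n\| \leq \frac{2}{c} \|T-S_n\|_{op} \|y\| \rightarrow 0,

so that {y} is a limit of elements of {\hbox{Ran}(T-\lambda)}.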

— 2. Second proof —

We now give the standard proof of the Fredholm alternative based on the Riesz lemma:

Lemma 3 (Riesz lemma) If {Y} is a proper closed subspace of a Banach space {X}, and {\epsilon > 0}, then there exists a unit vector {x} whose distance {\hbox{dist}(x,Y)} to {Y} is at least {1-\epsilon}.

Proof: By the Hahn-Banach theorem, one can find a non-trivial bounded linear functional {\phi: X \rightarrow {\bf C}} on {X} which vanishes on {Y}. By definition of the operator norm {\|\phi\|_{op}} of {\phi}, one can find a unit vector {x} such that {|\phi(x)| \geq (1-\epsilon) \|\phi\|_{op}}. The claim follows. \Box
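The final step can be written out as follows (a one-line sketch). Since {\phi} vanishes on {Y}, we have for any {y \in Y} that

\displaystyle  (1-\epsilon) \|\phi\|_{op} \leq |\phi(x)| = |\phi(x-y)| \leq \|\phi\|_{op} \|x-y\|,

and dividing by {\|\phi\|_{op} > 0} and taking the infimum over {y \in Y} gives {\hbox{dist}(x,Y) \geq 1-\epsilon}.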

The strategy here is not to use finite rank approximations (as they are no longer available), but instead to try to contradict the compactness of {T} by exhibiting a bounded set whose image under {T} is not totally bounded.

Let {T: X \rightarrow X} be a compact operator on a Banach space, and let {\lambda} be a non-zero complex number such that {T} has no eigenvalue at {\lambda}. As in the first proof, we have the lower bound from Lemma 2, and we know that {\hbox{Ran}(T-\lambda)} is a closed subspace of {X}; in particular, the map {T-\lambda} is a Banach space isomorphism from {X} to {\hbox{Ran}(T-\lambda)}. Our objective is again to show that {\hbox{Ran}(T-\lambda)} is all of {X}.

Suppose for contradiction that {\hbox{Ran}(T-\lambda)} is a proper closed subspace of {X}. Applying the Banach space isomorphism {T-\lambda} repeatedly, we conclude that for every natural number {m}, the space {V_{m+1} := \hbox{Ran}((T-\lambda)^{m+1})} is a proper closed subspace of {V_{m} := \hbox{Ran}((T-\lambda)^{m})}. From the Riesz lemma, we may thus find unit vectors {x_{m}} in {V_{m}} for {m=0,1,2,\ldots} whose distance to {V_{m+1}} is at least {1/2} (say).

Now suppose that {n > m \geq 0}. By construction, {x_n, (T-\lambda) x_n, (T-\lambda) x_m} all lie in {V_{m+1}}, and thus {T x_n - T x_m \in -\lambda x_m + V_{m+1}}. Since {x_m} lies at a distance at least {1/2} from {V_{m+1}}, we conclude the separation property

\displaystyle  \| T x_n - T x_m \| \geq \frac{|\lambda|}{2}.

But this implies that the sequence {\{ T x_n: n \in {\bf N} \}} is not totally bounded, contradicting the compactness of {T}.
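To make the separation step explicit (a short sketch of the algebra, for {n > m \geq 0}), we can write

\displaystyle  T x_n - T x_m = -\lambda x_m + \left[ \lambda x_n + (T-\lambda) x_n - (T-\lambda) x_m \right],

where the bracketed expression lies in {V_{m+1}}, since {x_n \in V_n \subset V_{m+1}} and {T-\lambda} maps {V_m} into {V_{m+1}}. As {V_{m+1}} is a subspace, the sign in front of {\lambda x_m} does not affect the distance, and we obtain

\displaystyle  \| T x_n - T x_m \| \geq |\lambda| \hbox{dist}(x_m, V_{m+1}) \geq \frac{|\lambda|}{2}.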

— 3. Third proof —

Now we give another textbook proof of the Fredholm alternative, based on Fredholm index theory. The basic idea is to observe that the Fredholm alternative is easy when {\lambda} is large enough (and specifically, when {|\lambda| > \|T\|_{op}}), as one can then invert {T-\lambda} using Neumann series. One can then attempt to continuously perturb {\lambda} from large values to small values, using stability results (such as Lemma 2) to ensure that invertibility does not suddenly get destroyed during this process. Unfortunately, there is an obstruction to this strategy, which is that during the perturbation process, {\lambda} may pass through an eigenvalue of {T}. To get around this, we will need to abandon the hypothesis that {T} has no eigenvalue at {\lambda}, and work in the more general setting in which {\hbox{ker}(T-\lambda)} is allowed to be non-trivial. This leads to a lengthier proof, but one which lays the foundation for much of Fredholm theory (which is more powerful than the Fredholm alternative alone).
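For concreteness, here is the Neumann series in question (a standard computation, recorded here as a sketch). When {|\lambda| > \|T\|_{op}}, the operator {T/\lambda} has operator norm less than one, and

\displaystyle  (T-\lambda)^{-1} = -\frac{1}{\lambda} \left(1 - \frac{T}{\lambda}\right)^{-1} = -\sum_{k=0}^\infty \frac{T^k}{\lambda^{k+1}},

with the series converging absolutely in the operator norm, and with the bound {\|(T-\lambda)^{-1}\|_{op} \leq \frac{1}{|\lambda| - \|T\|_{op}}}.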

Fortunately, we still have analogues of much of the above theory in this setting:

Proposition 4 Let {\lambda \in {\bf C}} be non-zero, and let {T: X \rightarrow X} be a compact operator on a Banach space {X}. Then the following statements hold:

  1. (Finite multiplicity) {\hbox{ker}(T-\lambda)} is finite-dimensional.
  2. (Lower bound) There exists {c>0} such that {\|(T-\lambda)x\| \geq c \hbox{dist}(x, \hbox{ker}(T-\lambda))} for all {x \in X}.
  3. (Closure) {\hbox{Ran}(T-\lambda)} is a closed subspace of {X}.
  4. (Finite comultiplicity) {\hbox{Ran}(T-\lambda)} has finite codimension in {X}.

Proof: We begin with finite multiplicity. Suppose for contradiction that {\hbox{ker}(T-\lambda)} was infinite dimensional, then it must contain an infinite nested sequence {\{0\} = V_0 \subsetneq V_1 \subsetneq V_2 \subsetneq \ldots} of finite-dimensional (and thus closed) subspaces. Applying the Riesz lemma, we may find for each {n=1,2,\ldots}, a unit vector {x_n \in V_n} of distance at least {1/2} from {V_{n-1}}. Since {T x_n = \lambda x_n}, we see that the sequence {\{ T x_n: n=1,2,\ldots\}} is then {|\lambda|/2}-separated and thus not totally bounded, contradicting the compactness of {T}.

The lower bound follows from the argument used to prove Lemma 2 after quotienting out the finite-dimensional space {\hbox{ker}(T-\lambda)}, and the closure assertion follows from the lower bound (again after quotienting out the kernel) as before.

Finally, we establish finite comultiplicity. Suppose for contradiction that the closed subspace {\hbox{Ran}(T-\lambda)} had infinite codimension, then by properties of {T-\lambda} already established, we see that {\hbox{Ran}((T-\lambda)^{m+1})} is closed and has infinite codimension in {\hbox{Ran}((T-\lambda)^{m})} for each {m}. One can then argue as in the second proof to contradict total boundedness as before. \Box

Remark 2 The above arguments also work if {\lambda} is replaced by an invertible linear operator on {X}, or more generally by a Fredholm operator.

We can now define the index {\hbox{ind}(T-\lambda)} to be the dimension of the kernel of {T-\lambda}, minus the codimension of the range. To establish the Fredholm alternative, it suffices to show that {\hbox{ind}(T-\lambda)=0} for all non-zero {\lambda}, as this implies surjectivity of {T-\lambda} whenever there is no eigenvalue. Note that when {\lambda} is sufficiently large, and in particular when {|\lambda| > \|T\|_{op}}, then {T-\lambda} is invertible by Neumann series and so one already has index zero in this case. To finish the proof, it suffices by the discrete nature of the index function (which takes values in the integers) and the connectedness of {{\bf C} \backslash \{0\}} to establish continuity of the index:

Lemma 5 (Continuity of index) Let {T: X \rightarrow X} be a compact operator on a Banach space. Then the function {\lambda \mapsto \hbox{ind}(T-\lambda)} is continuous from {{\bf C} \backslash \{0\}} to {{\bf Z}}.

Proof: Let {\lambda} be non-zero. Our task is to show that

\displaystyle  \hbox{ind} (T-\lambda') = \hbox{ind} (T - \lambda)

for all {\lambda'} sufficiently close to {\lambda}.

In the model case when {T-\lambda} is invertible (and thus has index zero), the claim is easy, because {(T-\lambda') (T-\lambda)^{-1} = 1 + (\lambda-\lambda') (T-\lambda)^{-1}} can be inverted by Neumann series for {\lambda'} close enough to {\lambda}, giving rise to the invertibility of {T-\lambda'}.
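Explicitly (a sketch, valid under the assumption {|\lambda - \lambda'| \|(T-\lambda)^{-1}\|_{op} < 1}), writing {R := (T-\lambda)^{-1}} for brevity (notation introduced here), we have {T - \lambda' = (1 + (\lambda-\lambda') R)(T-\lambda)} and hence

\displaystyle  (T-\lambda')^{-1} = R \sum_{k=0}^\infty (\lambda'-\lambda)^k R^k,

with the series converging in the operator norm.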

Now we handle the general case. As every finite dimensional space is complemented, we can split {X = \hbox{ker}(T-\lambda) + V} for some closed subspace {V} of {X}, and similarly split {X = \hbox{Ran}(T-\lambda) + W} for some finite-dimensional subspace {W} of {X} with dimension {\hbox{codim} \hbox{Ran}(T-\lambda)}.

From the lower bound we see that {T-\lambda} is a Banach space isomorphism from {V} to {\hbox{Ran}(T-\lambda)}. For {\lambda'} close to {\lambda}, we thus see that {(T-\lambda')(V)} is close to {\hbox{Ran}(T-\lambda)}, in the sense that one can map the latter space to the former by a small perturbation of the identity (in the operator norm). Since {W} complements {\hbox{Ran}(T-\lambda)}, it also complements {(T-\lambda')(V)} for {\lambda'} sufficiently close to {\lambda}. (To see this, observe that the composition of the obvious maps

\displaystyle  X \rightarrow W \times \hbox{Ran}(T-\lambda) \rightarrow W \times V \rightarrow W \times (T-\lambda')(V) \rightarrow X

is a small perturbation of the identity map and is thus invertible for {\lambda'} close to {\lambda}.)

Let {\pi: X \rightarrow W} be the projection onto {W} with kernel {(T-\lambda')(V)}. Then {\pi (T-\lambda')} maps the finite-dimensional space {\hbox{ker}(T-\lambda)} to the finite-dimensional space {W}. By the rank-nullity theorem, this map has index equal to {\hbox{dim} \hbox{ker}(T-\lambda) - \hbox{dim}(W) = \hbox{ind}(T-\lambda)}. Gluing this with the Banach space isomorphism {T-\lambda': V \rightarrow (T-\lambda')(V)}, we see that {T-\lambda'} also has index {\hbox{ind}(T-\lambda)}, as desired. \Box
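The gluing step can be unpacked as follows (a sketch; the abbreviations {K := \hbox{ker}(T-\lambda)}, {V' := (T-\lambda')(V)} and {A := \pi (T-\lambda')|_K} are introduced here for convenience). Writing a general element of {X} as {k+v} with {k \in K} and {v \in V}, we have

\displaystyle  (T-\lambda')(k+v) = A k + \left[ (1-\pi)(T-\lambda') k + (T-\lambda') v \right],

with {Ak \in W} and the bracketed term in {V'}. Since {T-\lambda'} maps {V} bijectively onto {V'}, the bracketed term can be made to equal any prescribed element of {V'} by a unique choice of {v}. Thus {\hbox{ker}(T-\lambda')} is isomorphic to {\hbox{ker}(A)}, and {\hbox{Ran}(T-\lambda') = \hbox{Ran}(A) + V'}, which has codimension {\hbox{dim}(W) - \hbox{dim} \hbox{Ran}(A)} in {X = W + V'}. By the rank-nullity theorem applied to {A: K \rightarrow W}, we conclude

\displaystyle  \hbox{ind}(T-\lambda') = \hbox{dim} \hbox{ker}(A) - (\hbox{dim}(W) - \hbox{dim} \hbox{Ran}(A)) = \hbox{dim}(K) - \hbox{dim}(W) = \hbox{ind}(T-\lambda).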

Remark 3 Again, this result extends to more general Fredholm operators, with the result being that the index of a Fredholm operator is stable with respect to continuous deformations in the operator norm topology.