
The classical inverse function theorem reads as follows:

Theorem 1 (${C^1}$ inverse function theorem) Let ${\Omega \subset {\bf R}^n}$ be an open set, and let ${f: \Omega \rightarrow {\bf R}^n}$ be a continuously differentiable function, such that for every ${x_0 \in \Omega}$, the derivative map ${Df(x_0): {\bf R}^n \rightarrow {\bf R}^n}$ is invertible. Then ${f}$ is a local homeomorphism; thus, for every ${x_0 \in \Omega}$, there exists an open neighbourhood ${U}$ of ${x_0}$ and an open neighbourhood ${V}$ of ${f(x_0)}$ such that ${f}$ is a homeomorphism from ${U}$ to ${V}$.

It is also not difficult to show by inverting the Taylor expansion

$\displaystyle f(x) = f(x_0) + Df(x_0)(x-x_0) + o(\|x-x_0\|)$

that at each ${x_0}$, the local inverses ${f^{-1}: V \rightarrow U}$ are also differentiable at ${f(x_0)}$ with derivative

$\displaystyle Df^{-1}(f(x_0)) = Df(x_0)^{-1}. \ \ \ \ \ (1)$
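
For the record, here is one way to carry out the inversion. This is only a sketch, and the shorthand ${y_0 := f(x_0)}$, ${y \in V}$ and ${x := f^{-1}(y) \in U}$ is introduced just for this computation. Applying ${Df(x_0)^{-1}}$ to the Taylor expansion gives

$\displaystyle x - x_0 = Df(x_0)^{-1}(y-y_0) + o(\|x-x_0\|).$

Since ${f^{-1}}$ is continuous, ${x \rightarrow x_0}$ as ${y \rightarrow y_0}$, so for ${y}$ close enough to ${y_0}$ the error term is at most ${\frac{1}{2}\|x-x_0\|}$ in norm, which gives ${\|x-x_0\| \leq 2 \|Df(x_0)^{-1}\| \|y-y_0\|}$. The error term is therefore also ${o(\|y-y_0\|)}$, and so

$\displaystyle f^{-1}(y) = f^{-1}(y_0) + Df(x_0)^{-1}(y-y_0) + o(\|y-y_0\|),$

which is the differentiability of ${f^{-1}}$ at ${f(x_0)}$ with the stated derivative.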

The textbook proof of the inverse function theorem proceeds by an application of the contraction mapping theorem. Indeed, one may normalise ${x_0=f(x_0)=0}$ and ${Df(0)}$ to be the identity map; continuity of ${Df}$ then shows that ${Df(x)}$ is close to the identity for small ${x}$, which may be used (in conjunction with the fundamental theorem of calculus) to make ${x \mapsto x-f(x)+y}$ a contraction on a small ball around the origin for small ${y}$, at which point the contraction mapping theorem readily finishes off the problem.
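
As a concrete illustration of this scheme, the following Python sketch runs the iteration ${x \mapsto x - f(x) + y}$ for a hypothetical map ${f}$ on ${{\bf R}^2}$ chosen so that ${f(0)=0}$ and ${Df(0)}$ is the identity. The particular ${f}$, the tolerance, and the function names are assumptions made for the example; they are not part of the textbook proof.

```python
def f(x):
    """Hypothetical example map with f(0) = 0 and Df(0) = identity."""
    x0, x1 = x
    return (x0 + 0.5 * x1 ** 2, x1 + 0.5 * x0 ** 2)


def solve_by_contraction(y, tol=1e-14, max_iter=200):
    """Solve f(x) = y for small y by iterating T(x) = x - f(x) + y from x = 0.

    For this f, DT(x) = I - Df(x) has entries of size at most max(|x0|, |x1|),
    so T is a contraction on a small ball around the origin.
    """
    x = (0.0, 0.0)
    for _ in range(max_iter):
        fx = f(x)
        new_x = (x[0] - fx[0] + y[0], x[1] - fx[1] + y[1])
        # Stop once successive iterates agree to within the tolerance.
        if max(abs(new_x[0] - x[0]), abs(new_x[1] - x[1])) < tol:
            return new_x
        x = new_x
    raise RuntimeError("iteration did not converge; y may be too large")
```

For instance, `solve_by_contraction((0.1, -0.05))` returns a point ${x}$ near the origin at which `f(x)` agrees with ${(0.1, -0.05)}$ up to rounding error. For large ${y}$ the map need not be a contraction, and the sketch then raises an error rather than returning a solution.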

I recently learned (after I asked this question on Math Overflow) that the hypothesis of continuous differentiability may be relaxed to mere differentiability everywhere:

Theorem 2 (Everywhere differentiable inverse function theorem) Let ${\Omega \subset {\bf R}^n}$ be an open set, and let ${f: \Omega \rightarrow {\bf R}^n}$ be an everywhere differentiable function, such that for every ${x_0 \in \Omega}$, the derivative map ${Df(x_0): {\bf R}^n \rightarrow {\bf R}^n}$ is invertible. Then ${f}$ is a local homeomorphism; thus, for every ${x_0 \in \Omega}$, there exists an open neighbourhood ${U}$ of ${x_0}$ and an open neighbourhood ${V}$ of ${f(x_0)}$ such that ${f}$ is a homeomorphism from ${U}$ to ${V}$.

As before, one can recover the differentiability of the local inverses, with the derivative of the inverse given by the usual formula (1).
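
To see why invertibility of the derivative is demanded at every point rather than only at ${x_0}$, one can look at a standard one-dimensional example (not part of the argument here): the function ${f(x) = x + 2x^2 \sin(1/x)}$ with ${f(0)=0}$. It is differentiable everywhere with ${f'(0)=1}$, but ${f'(x_k) = -1}$ at the points ${x_k = 1/(2\pi k)}$, so ${f}$ is not monotone, and hence not injective, on any neighbourhood of the origin. The following Python sketch checks this numerically; the function names and the step size are illustrative choices.

```python
import math


def f(x):
    """f(x) = x + 2 x^2 sin(1/x), extended by f(0) = 0."""
    return 0.0 if x == 0.0 else x + 2.0 * x * x * math.sin(1.0 / x)


def fprime(x):
    """Derivative of f; the value at 0 comes from the difference quotient."""
    if x == 0.0:
        return 1.0
    return 1.0 + 4.0 * x * math.sin(1.0 / x) - 2.0 * math.cos(1.0 / x)


def decreasing_near(k):
    """Check that f decreases across x_k = 1/(2 pi k), where f'(x_k) = -1."""
    x_k = 1.0 / (2.0 * math.pi * k)
    h = 0.01 * x_k ** 2  # small enough that 1/x moves by only about 0.01
    return f(x_k - h) > f(x_k + h)
```

Calling `decreasing_near(k)` for increasing `k` exhibits points arbitrarily close to the origin across which ${f}$ decreases, even though ${f'(0) = 1}$.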

This result follows implicitly from the more general results of Cernavskii about the structure of finite-to-one open and closed maps; however, the arguments there are somewhat complicated (and subsequent proofs of those results, such as the one by Vaisala, use some powerful tools from algebraic topology, such as dimension theory). There is, however, a more elementary proof of Saint Raymond that was pointed out to me by Julien Melleray. It only uses basic point-set topology (for instance, the concept of a connected component) and the basic topological and geometric structure of Euclidean space (in particular relying primarily on local compactness, local connectedness, and local convexity). I decided to present (an arrangement of) Saint Raymond’s proof here.

To obtain a local homeomorphism near ${x_0}$, there are basically two things to show: local surjectivity near ${x_0}$ (thus, for ${y}$ near ${f(x_0)}$, one can solve ${f(x)=y}$ for some ${x}$ near ${x_0}$) and local injectivity near ${x_0}$ (thus, for distinct ${x_1, x_2}$ near ${x_0}$, ${f(x_1)}$ is not equal to ${f(x_2)}$). Local surjectivity is relatively easy; basically, the standard proof of the inverse function theorem works here, after replacing the contraction mapping theorem (which is no longer available due to the possibly discontinuous nature of ${Df}$) with the Brouwer fixed point theorem (or one could also use degree theory, which is more or less an equivalent approach). The difficulty is local injectivity: one needs to preclude the existence of nearby distinct points ${x_1, x_2}$ with ${f(x_1) = f(x_2) = y}$. Note that, in contrast to the contraction mapping theorem, which provides both existence and uniqueness of fixed points, the Brouwer fixed point theorem only gives existence and not uniqueness.

In one dimension ${n=1}$ one can proceed by using Rolle’s theorem. Indeed, as one traverses the interval from ${x_1}$ to ${x_2}$, one must encounter some intermediate point ${x_*}$ which maximises the quantity ${|f(x)-y|}$, so that this quantity is instantaneously non-increasing both to the left and to the right of ${x_*}$. But, by hypothesis, ${f'(x_*)}$ is non-zero, and this easily leads to a contradiction.
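
One way to spell out the contradiction is as follows (a sketch; the auxiliary function ${g}$ is notation introduced only here). Suppose ${x_1 < x_2}$ with ${f(x_1)=f(x_2)=y}$, and set ${g(x) := (f(x)-y)^2}$, which is differentiable and vanishes at ${x_1}$ and ${x_2}$. If ${g}$ vanished identically on ${[x_1,x_2]}$, then ${f}$ would be constant there and ${f'}$ would vanish, contrary to hypothesis. Otherwise, the maximum of ${g}$ on ${[x_1,x_2]}$ is positive and is attained at an interior point ${x_*}$, where

$\displaystyle 0 = g'(x_*) = 2 (f(x_*)-y) f'(x_*).$

Since ${f(x_*) \neq y}$, this forces ${f'(x_*)=0}$, which is again a contradiction.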

Saint Raymond’s argument for the higher dimensional case proceeds in a broadly similar way. Starting with two nearby points ${x_1, x_2}$ with ${f(x_1)=f(x_2)=y}$, one finds a point ${x_*}$ which “locally extremises” ${\|f(x_*)-y\|}$ in the following sense: ${\|f(x_*)-y\|}$ is equal to some ${r_*>0}$, but ${x_*}$ is adherent to at least two distinct connected components ${U_1, U_2}$ of the set ${U = \{ x: \|f(x)-y\| < r_* \}}$. (This is an oversimplification, as one has to restrict the available points ${x}$ in ${U}$ to a suitably small compact set, but let us ignore this technicality for now.) Note from the non-degenerate nature of ${Df(x_*)}$ that ${x_*}$ was already adherent to ${U}$; the point is that ${x_*}$ “disconnects” ${U}$ in some sense. Very roughly speaking, the way such a critical point ${x_*}$ is found is to look at the sets ${\{ x: \|f(x)-y\| \leq r \}}$ as ${r}$ shrinks from a large initial value down to zero, and one finds the first value of ${r_*}$ below which this set disconnects ${x_1}$ from ${x_2}$. (Morally, one is performing some sort of Morse theory here on the function ${x \mapsto \|f(x)-y\|}$, though this function does not have anywhere near enough regularity for classical Morse theory to apply.)

The point ${x_*}$ is mapped to a point ${f(x_*)}$ on the boundary ${\partial B(y,r_*)}$ of the ball ${B(y,r_*)}$, while the components ${U_1, U_2}$ are mapped to the interior of this ball. By using a continuity argument, one can show (again very roughly speaking) that ${f(U_1)}$ must contain a “hemispherical” neighbourhood ${\{ z \in B(y,r_*): \|z-f(x_*)\| < \kappa \}}$ of ${f(x_*)}$ inside ${B(y,r_*)}$, and similarly for ${f(U_2)}$. But from the differentiability of ${f}$ at ${x_*}$, one can then show that ${U_1}$ and ${U_2}$ overlap near ${x_*}$, giving a contradiction.

The rigorous details of the proof are provided below the fold.