Suppose $F: X \to Y$ is a continuous (but nonlinear) map from one normed vector space $X$ to another $Y$. The continuity means, roughly speaking, that if $x_0, x \in X$ are such that $\|x-x_0\|_X$ is small, then $\|F(x)-F(x_0)\|_Y$ is also small (though the precise notion of “smallness” may depend on $x$ or $x_0$, particularly if $F$ is not known to be uniformly continuous). If $F$ is known to be differentiable (in, say, the Fréchet sense), then we in fact have a linear bound of the form
$$\|F(x)-F(x_0)\|_Y \leq C(x_0) \|x-x_0\|_X$$
for some $C(x_0)$ depending on $x_0$, if $\|x-x_0\|_X$ is small enough; one can of course make $C$ independent of $x_0$ (and drop the smallness condition) if $F$ is known instead to be Lipschitz continuous.
In many applications in analysis, one would like more explicit and quantitative bounds that estimate quantities like $\|F(x)-F(x_0)\|_Y$ in terms of quantities like $\|x-x_0\|_X$. There are a number of ways to do this. First of all, there is of course the trivial estimate arising from the triangle inequality:
$$\|F(x)-F(x_0)\|_Y \leq \|F(x)\|_Y + \|F(x_0)\|_Y. \qquad (1)$$
This estimate is usually not very good when $x$ and $x_0$ are close together. However, when $x$ and $x_0$ are far apart, this estimate can be more or less sharp. For instance, if the magnitude of $F$ varies so much from $x_0$ to $x$ that $\|F(x)\|_Y$ is more than (say) twice $\|F(x_0)\|_Y$, or vice versa, then (1) is sharp up to a multiplicative constant. Also, if $F$ is oscillatory in nature, and the distance between $x$ and $x_0$ exceeds the “wavelength” of the oscillation of $F$ at $x_0$ (or at $x$), then one also typically expects (1) to be close to sharp. Conversely, if $F$ does not vary much in magnitude from $x_0$ to $x$, and the distance between $x$ and $x_0$ is less than the wavelength of any oscillation present in $F$, one expects to be able to improve upon (1).
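When $F$ is relatively simple in form, one can sometimes proceed simply by substituting $x = x_0 + h$. For instance, if $F: R \to R$ is the squaring function $F(x) = x^2$ in a commutative ring $R$, one has
$$F(x_0+h) = (x_0+h)^2 = x_0^2 + 2 x_0 h + h^2$$
and thus
$$F(x_0+h) - F(x_0) = 2 x_0 h + h^2$$
or in terms of the original variables $x, x_0$ one has
$$F(x) - F(x_0) = 2 x_0 (x-x_0) + (x-x_0)^2.$$
If the ring $R$ is not commutative, one has to modify this to
$$F(x) - F(x_0) = x_0 (x-x_0) + (x-x_0) x_0 + (x-x_0)^2.$$
Thus, for instance, if $A, B$ are $n \times n$ matrices and $\|\cdot\|_{\mathrm{op}}$ denotes the operator norm, one sees from the triangle inequality and the sub-multiplicativity $\|AB\|_{\mathrm{op}} \leq \|A\|_{\mathrm{op}} \|B\|_{\mathrm{op}}$ of the operator norm that
$$\|A^2 - B^2\|_{\mathrm{op}} \leq \|A-B\|_{\mathrm{op}} \left( 2 \|B\|_{\mathrm{op}} + \|A-B\|_{\mathrm{op}} \right). \qquad (2)$$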
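If $F(x)$ involves $x$ (or various components of $x$) in several places, one can sometimes get a good estimate by “swapping” $x$ with $x_0$ at each of the places in turn, using a telescoping series. For instance, if we again use the squaring function $F(x) = x^2 = x x$ in a non-commutative ring, we have
$$F(x) - F(x_0) = x x - x_0 x_0 = (x x - x_0 x) + (x_0 x - x_0 x_0) = (x-x_0) x + x_0 (x-x_0)$$
which for instance leads to a slight improvement of (2):
$$\|A^2 - B^2\|_{\mathrm{op}} \leq \|A-B\|_{\mathrm{op}} \left( \|A\|_{\mathrm{op}} + \|B\|_{\mathrm{op}} \right).$$
More generally, for any natural number $n$, one has the identity
$$x^n - x_0^n = (x-x_0) \left( x^{n-1} + x^{n-2} x_0 + \dots + x x_0^{n-2} + x_0^{n-1} \right) \qquad (3)$$
in a commutative ring, while in a non-commutative ring one must modify this to
$$x^n - x_0^n = \sum_{i=0}^{n-1} x_0^i (x-x_0) x^{n-1-i},$$
and for matrices one has
$$\|A^n - B^n\|_{\mathrm{op}} \leq \|A-B\|_{\mathrm{op}} \left( \|A\|_{\mathrm{op}}^{n-1} + \|A\|_{\mathrm{op}}^{n-2} \|B\|_{\mathrm{op}} + \dots + \|B\|_{\mathrm{op}}^{n-1} \right).$$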
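As a sanity check on the non-commutative telescoping identity $x^n - x_0^n = \sum_{i=0}^{n-1} x_0^i (x-x_0) x^{n-1-i}$, here is a minimal Python sketch that verifies it exactly for one pair of non-commuting $2 \times 2$ integer matrices. The helper names (`matmul`, `matpow`, `telescoping_sum`) and the sample matrices are illustrative assumptions and are not taken from the discussion above.

```python
def matmul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def matadd(A, B):
    """Entrywise sum of two matrices."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def matsub(A, B):
    """Entrywise difference of two matrices."""
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def matpow(A, n):
    """n-th power of A by repeated multiplication (n >= 0)."""
    size = len(A)
    result = [[int(i == j) for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = matmul(result, A)
    return result


def telescoping_sum(A, B, n):
    """Return sum_{i=0}^{n-1} B^i (A - B) A^{n-1-i}."""
    size = len(A)
    total = [[0] * size for _ in range(size)]
    difference = matsub(A, B)
    for i in range(n):
        term = matmul(matmul(matpow(B, i), difference), matpow(A, n - 1 - i))
        total = matadd(total, term)
    return total


# Hypothetical sample data: two integer matrices that do not commute.
A = [[1, 2], [3, 4]]
B = [[0, 1], [-1, 2]]
n = 5

lhs = matsub(matpow(A, n), matpow(B, n))   # A^n - B^n
rhs = telescoping_sum(A, B, n)             # swapped one factor at a time
identity_holds = (lhs == rhs)
print(identity_holds)
```

Because the arithmetic is over the integers, the comparison is exact rather than up to rounding error. Taking operator norms term by term in `telescoping_sum` is what gives the matrix bound for $\|A^n - B^n\|_{\mathrm{op}}$.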
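Exercise 1 If $U$ and $V$ are unitary $n \times n$ matrices, show that the commutator $[U,V] := U V U^{-1} V^{-1}$ obeys the inequality
$$\|[U,V] - I\|_{\mathrm{op}} \leq 2 \|U - I\|_{\mathrm{op}} \|V - I\|_{\mathrm{op}}.$$
(Hint: first control $\|UV - VU\|_{\mathrm{op}}$.)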
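Now suppose (for simplicity) that $F: {\bf R}^d \to {\bf R}^{d'}$ is a map between Euclidean spaces. If $F$ is continuously differentiable, then one can use the fundamental theorem of calculus to write
$$F(x) - F(x_0) = \int_0^1 \frac{d}{dt} F(\gamma(t))\, dt$$
where $\gamma: [0,1] \to {\bf R}^d$ is any continuously differentiable path from $x_0$ to $x$. For instance, if one uses the straight line path $\gamma(t) := (1-t) x_0 + t x$, one has
$$F(x) - F(x_0) = \int_0^1 ((x-x_0) \cdot \nabla F)((1-t) x_0 + t x)\, dt.$$
In the one-dimensional case $d=1$, this simplifies to
$$F(x) - F(x_0) = (x-x_0) \int_0^1 F'((1-t) x_0 + t x)\, dt. \qquad (4)$$
Among other things, this immediately implies the factor theorem for $C^k$ functions: if $F$ is a $C^k({\bf R})$ function for some $k \geq 1$ that vanishes at some point $x_0$, then $F(x)$ factors as the product of $x-x_0$ and some $C^{k-1}$ function $G$. Another basic consequence is that if $\nabla F$ is uniformly bounded in magnitude by some constant $C$, then $F$ is Lipschitz continuous with the same constant $C$.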
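Applying (4) to the power function $x \mapsto x^n$, we obtain the identity
$$x^n - x_0^n = n (x-x_0) \int_0^1 ((1-t) x_0 + t x)^{n-1}\, dt \qquad (5)$$
which can be compared with (3). Indeed, for $x_0$ and $x$ close to $1$, one can use logarithms and Taylor expansion to arrive at the approximation $((1-t) x_0 + t x)^{n-1} \approx x_0^{(1-t)(n-1)} x^{t(n-1)}$, so (3) behaves a little like a Riemann sum approximation to (5).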
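Exercise 2 For each $i = 1,\dots,n$, let $X^{(1)}_i$ and $X^{(0)}_i$ be random variables taking values in a measurable space $R_i$, and let $F: R_1 \times \dots \times R_n \to {\bf R}^m$ be a bounded measurable function.
- (i) (Lindeberg exchange identity) Show that
$${\bf E} F(X^{(1)}_1,\dots,X^{(1)}_n) - {\bf E} F(X^{(0)}_1,\dots,X^{(0)}_n) = \sum_{i=1}^n \left[ {\bf E} F(X^{(1)}_1,\dots,X^{(1)}_{i-1}, X^{(1)}_i, X^{(0)}_{i+1},\dots,X^{(0)}_n) - {\bf E} F(X^{(1)}_1,\dots,X^{(1)}_{i-1}, X^{(0)}_i, X^{(0)}_{i+1},\dots,X^{(0)}_n) \right].$$
- (ii) (Knowles-Yin exchange identity) Show that
$${\bf E} F(X^{(1)}_1,\dots,X^{(1)}_n) - {\bf E} F(X^{(0)}_1,\dots,X^{(0)}_n) = \int_0^1 \sum_{i=1}^n \left[ {\bf E} F(X^{(t)}_1,\dots,X^{(t)}_{i-1}, X^{(1)}_i, X^{(t)}_{i+1},\dots,X^{(t)}_n) - {\bf E} F(X^{(t)}_1,\dots,X^{(t)}_{i-1}, X^{(0)}_i, X^{(t)}_{i+1},\dots,X^{(t)}_n) \right] dt,$$
where $X^{(t)}_i := 1_{I_i \leq t} X^{(1)}_i + 1_{I_i > t} X^{(0)}_i$ is a mixture of $X^{(0)}_i$ and $X^{(1)}_i$, with $I_1,\dots,I_n$ uniformly drawn from $[0,1]$ independently of each other and of the $X^{(0)}_1,\dots,X^{(0)}_n, X^{(1)}_1,\dots,X^{(1)}_n$.
- (iii) Discuss the relationship between the identities in parts (i), (ii) with the identities (3), (5).
(The identity in (i) is the starting point for the Lindeberg exchange method in probability theory, discussed for instance in this previous post. The identity in (ii) can also be used in the Lindeberg exchange method; the terms in the right-hand side are slightly more symmetric in the indices $1,\dots,n$, which can be a technical advantage in some applications; see this paper of Knowles and Yin for an instance of this.)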
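Exercise 3 If $F: {\bf R} \to {\bf R}^{d'}$ is continuously $k$ times differentiable, establish Taylor’s theorem with remainder
$$F(x) = \sum_{j=0}^{k-1} \frac{F^{(j)}(x_0)}{j!} (x-x_0)^j + \frac{(x-x_0)^k}{(k-1)!} \int_0^1 (1-t)^{k-1} F^{(k)}((1-t) x_0 + t x)\, dt.$$
If $F^{(k)}$ is bounded, conclude that
$$\left| F(x) - \sum_{j=0}^{k-1} \frac{F^{(j)}(x_0)}{j!} (x-x_0)^j \right| \leq \frac{|x-x_0|^k}{k!} \sup_{y \in {\bf R}} |F^{(k)}(y)|.$$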
For real scalar functions $F: {\bf R}^d \to {\bf R}$, the average value of the continuous real-valued function $t \mapsto ((x-x_0) \cdot \nabla F)((1-t) x_0 + t x)$ must be attained at some point $t$ in the interval $[0,1]$. We thus conclude the mean-value theorem
$$F(x) - F(x_0) = ((x-x_0) \cdot \nabla F)((1-t) x_0 + t x)$$
for some $t \in [0,1]$ (that can depend on $x$, $x_0$, and $F$). This can for instance give a second proof of the fact that continuously differentiable functions $F$ with bounded derivative are Lipschitz continuous. However, it is worth stressing that the mean-value theorem is only available for real scalar functions; it is false for instance for complex scalar functions. A basic counterexample is given by the function $e(x) := e^{2\pi i x}$; there is no $t \in [0,1]$ for which $e(1) - e(0) = e'(t)$. On the other hand, as $e'$ has magnitude $2\pi$, we still know from (4) that $e$ is Lipschitz of constant $2\pi$, and when combined with (1) we obtain the basic bounds
$$|e(x) - e(y)| \leq \min( 2, 2\pi |x-y| )$$
which are already very useful for many applications.
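To make the counterexample concrete, the following small Python sketch takes the function in question to be $e(x) := e^{2\pi i x}$ and checks numerically, at a handful of sample points, both the failure of the mean-value theorem and the bound $|e(x) - e(y)| \leq \min(2, 2\pi|x-y|)$. The function names and the sample points are illustrative assumptions; a finite check of this sort is of course no substitute for the proof.

```python
import cmath
import math


def e(x):
    """The complex exponential e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * math.pi * x)


def e_prime(x):
    """Derivative of e, namely 2*pi*i*e(x); its magnitude is always 2*pi."""
    return 2j * math.pi * e(x)


def bound(x, y):
    """Right-hand side min(2, 2*pi*|x - y|) of the basic bound."""
    return min(2.0, 2 * math.pi * abs(x - y))


# Mean-value theorem fails: e(1) - e(0) vanishes (up to rounding),
# but |e'(t)| = 2*pi for every t, so e'(t) is never zero.
endpoint_difference = abs(e(1.0) - e(0.0))
derivative_sizes = [abs(e_prime(k / 10)) for k in range(11)]

# Hypothetical sample pairs: close together, half a period apart, far apart.
samples = [(0.0, 0.01), (0.0, 0.25), (0.0, 0.5), (0.3, 1.7), (-2.0, 3.14)]
gaps = [bound(x, y) - abs(e(x) - e(y)) for x, y in samples]

print(endpoint_difference)
print(min(derivative_sizes))
print(min(gaps))
```

The pair $(0, 0.5)$ is half a period apart, so the trivial bound $2$ from (1) is attained there, while for nearby points such as $(0, 0.01)$ the Lipschitz bound $2\pi|x-y|$ is the one that is nearly sharp.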
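Exercise 4 Let $A, B$ be $n \times n$ matrices, and let $t$ be a non-negative real.
- (i) Establish the Duhamel formula
$$e^{t(A+B)} = e^{tA} + \int_0^t e^{(t-s)A} B e^{s(A+B)}\, ds = e^{tA} + \int_0^t e^{(t-s)(A+B)} B e^{sA}\, ds$$
where $e^{tA}$ denotes the matrix exponential of $tA$. (Hint: Differentiate $e^{(t-s)A} e^{s(A+B)}$ or $e^{(t-s)(A+B)} e^{sA}$ in $s$.)
- (ii) Establish the iterated Duhamel formula
$$e^{t(A+B)} = e^{tA} + \sum_{j=1}^k \int_{0 \leq t_1 \leq \dots \leq t_j \leq t} e^{(t-t_j)A} B e^{(t_j-t_{j-1})A} B \dots e^{(t_2-t_1)A} B e^{t_1 A}\, dt_1 \dots dt_j + \int_{0 \leq t_1 \leq \dots \leq t_{k+1} \leq t} e^{(t-t_{k+1})(A+B)} B e^{(t_{k+1}-t_k)A} B \dots e^{(t_2-t_1)A} B e^{t_1 A}\, dt_1 \dots dt_{k+1}$$
for any $k \geq 0$.
- (iii) Establish the infinitely iterated Duhamel formula
$$e^{t(A+B)} = e^{tA} + \sum_{j=1}^\infty \int_{0 \leq t_1 \leq \dots \leq t_j \leq t} e^{(t-t_j)A} B e^{(t_j-t_{j-1})A} B \dots e^{(t_2-t_1)A} B e^{t_1 A}\, dt_1 \dots dt_j.$$
- (iv) If $A(t)$ is an $n \times n$ matrix depending in a continuously differentiable fashion on $t$, establish the variation formula
$$\frac{d}{dt} e^{A(t)} = \left( F(\mathrm{Ad}(A(t))) A'(t) \right) e^{A(t)}$$
where $\mathrm{Ad}(A(t))$ is the adjoint representation $\mathrm{Ad}(A)(B) := AB - BA$ applied to $A(t)$, and $F$ is the function $F(z) := \int_0^1 e^{sz}\, ds$ (thus $F(z) = \frac{e^z - 1}{z}$ for non-zero $z$), with $F(\mathrm{Ad}(A(t)))$ defined using functional calculus.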
We remark that further manipulation of (iv) of the above exercise using the fundamental theorem of calculus eventually leads to the Baker-Campbell-Hausdorff-Dynkin formula, as discussed in this previous blog post.
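Exercise 5 Let $A, B$ be positive definite $n \times n$ matrices, and let $Y$ be an $n \times n$ matrix. Show that there is a unique solution $X$ to the Sylvester equation
$$AX + XB = Y$$
which is given by the formula
$$X = \int_0^\infty e^{-tA} Y e^{-tB}\, dt.$$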
In the above examples we had applied the fundamental theorem of calculus along linear curves $t \mapsto (1-t) x_0 + t x$. However, it is sometimes better to use other curves. For instance, the circular arc $t \mapsto \cos(\pi t/2) x_0 + \sin(\pi t/2) x$ can be useful, particularly if $x_0$ and $x$ are “orthogonal” or “independent” in some sense; a good example of this is the proof by Maurey and Pisier of the gaussian concentration inequality, given in Theorem 8 of this previous blog post. In a similar vein, if one wishes to compare a scalar random variable $X$ of mean zero and variance one with a Gaussian random variable $G$ of mean zero and variance one, it can be useful to introduce the intermediate random variables $X^{(t)} := (1-t)^{1/2} X + t^{1/2} G$ (where $X$ and $G$ are independent); note that these variables have mean zero and variance one, and after coupling them together appropriately they evolve by the Ornstein-Uhlenbeck process, which has many useful properties. For instance, one can use these ideas to establish monotonicity formulae for entropy; see e.g. this paper of Courtade for an example of this and further references. More generally, one can exploit curves $t \mapsto \gamma(t)$ that flow according to some geometrically natural ODE or PDE; several examples of this occur famously in Perelman’s proof of the Poincaré conjecture via Ricci flow, discussed for instance in this previous set of lecture notes.
In some cases, it is difficult to compute $F(x) - F(x_0)$ or the derivative $\nabla F$ directly, but one can instead proceed by implicit differentiation, or some variant thereof. Consider for instance the matrix inversion map $F(A) := A^{-1}$ (defined on the open dense subset of $n \times n$ matrices consisting of invertible matrices). If one wants to compute $F(B) - F(A)$ for $B$ close to $A$, one can temporarily write $F(B) - F(A) = E$, thus
$$B^{-1} - A^{-1} = E.$$
Multiplying both sides on the left by $B$ to eliminate the $B^{-1}$ term, and on the right by $A$ to eliminate the $A^{-1}$ term, one obtains
$$A - B = B E A$$
and thus on reversing these steps we arrive at the basic identity
$$B^{-1} - A^{-1} = B^{-1} (A - B) A^{-1}. \qquad (6)$$
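For instance, if $H_0, V$ are $n \times n$ matrices, and we consider the resolvents
$$R_0(z) := (H_0 - z)^{-1}; \quad R_V(z) := (H_0 + V - z)^{-1}$$
then we have the resolvent identity
$$R_V(z) - R_0(z) = - R_V(z) V R_0(z) \qquad (7)$$
as long as $z$ does not lie in the spectrum of $H_0$ or $H_0 + V$ (for instance, if $H_0$, $V$ are self-adjoint then one can take $z$ to be any non-real complex number). One can iterate this identity to obtain
$$R_V(z) = \sum_{j=0}^k (-R_0(z) V)^j R_0(z) + (-R_V(z) V) (-R_0(z) V)^k R_0(z)$$
for any natural number $k$; in particular, if $R_0(z) V$ has operator norm less than one, one has the Neumann series
$$R_V(z) = \sum_{j=0}^\infty (-R_0(z) V)^j R_0(z).$$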
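The resolvent identity and its iterates can be checked on a small example. The sketch below writes $R_0(z) := (H_0 - z)^{-1}$ and $R_V(z) := (H_0 + V - z)^{-1}$ and verifies, in exact rational arithmetic for one $2 \times 2$ example, that $R_V - R_0 = -R_V V R_0$, that the $k$-fold iterated identity $R_V = \sum_{j=0}^k (-R_0 V)^j R_0 + (-R_V V)(-R_0 V)^k R_0$ holds, and that partial sums of the Neumann series $\sum_j (-R_0 V)^j R_0$ approach $R_V$. The matrices, the value of $z$, and all helper names are hypothetical choices made only for illustration.

```python
from fractions import Fraction as Fr


def matmul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def matadd(A, B):
    """Entrywise sum of two matrices."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def scale(c, A):
    """Scalar multiple c * A."""
    return [[c * a for a in row] for row in A]


def inv2(A):
    """Exact inverse of an invertible 2x2 matrix."""
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def resolvent(H, z):
    """(H - z I)^{-1} for a 2x2 matrix H and a scalar z outside its spectrum."""
    return inv2([[H[0][0] - z, H[0][1]], [H[1][0], H[1][1] - z]])


def neumann_partial_sum(R0, V, terms):
    """sum_{j=0}^{terms-1} (-R_0 V)^j R_0."""
    M = scale(-1, matmul(R0, V))
    power = [[Fr(1), Fr(0)], [Fr(0), Fr(1)]]   # M^0
    total = [[Fr(0), Fr(0)], [Fr(0), Fr(0)]]
    for _ in range(terms):
        total = matadd(total, matmul(power, R0))
        power = matmul(power, M)
    return total


def iterated_identity_rhs(R0, RV, V, k):
    """sum_{j=0}^{k} (-R_0 V)^j R_0 + (-R_V V) (-R_0 V)^k R_0."""
    M = scale(-1, matmul(R0, V))
    power = [[Fr(1), Fr(0)], [Fr(0), Fr(1)]]
    for _ in range(k):
        power = matmul(power, M)               # ends as M^k
    remainder = matmul(matmul(scale(-1, matmul(RV, V)), power), R0)
    return matadd(neumann_partial_sum(R0, V, k + 1), remainder)


# Hypothetical sample data, chosen so that R_0(z) V has small norm.
H0 = [[Fr(2), Fr(1)], [Fr(1), Fr(3)]]
V = [[Fr(0), Fr(1, 2)], [Fr(1, 2), Fr(0)]]
z = Fr(-1)

R0 = resolvent(H0, z)
RV = resolvent(matadd(H0, V), z)

# Resolvent identity: R_V - R_0 = -R_V V R_0.
identity_lhs = matadd(RV, scale(-1, R0))
identity_rhs = scale(-1, matmul(matmul(RV, V), R0))

# Truncating the Neumann series after 30 terms should already be very accurate.
approx = neumann_partial_sum(R0, V, 30)
neumann_error = max(abs(float(approx[i][j] - RV[i][j]))
                    for i in range(2) for j in range(2))

print(identity_lhs == identity_rhs)
print(iterated_identity_rhs(R0, RV, V, 3) == RV)
print(neumann_error)
```

The first two checks are exact because `Fraction` arithmetic has no rounding; only the Neumann series comparison is approximate, since it truncates an infinite sum.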
Similarly, if $A(t)$ is a family of invertible matrices that depends in a continuously differentiable fashion on a time variable $t$, then by implicitly differentiating the identity $A(t) A(t)^{-1} = I$ in $t$ using the product rule, we obtain
$$\left( \frac{d}{dt} A(t) \right) A(t)^{-1} + A(t) \frac{d}{dt} A(t)^{-1} = 0$$
and hence
$$\frac{d}{dt} A(t)^{-1} = - A(t)^{-1} \left( \frac{d}{dt} A(t) \right) A(t)^{-1}$$
(this identity may also be easily derived from (6)). One can then use the fundamental theorem of calculus to obtain variants of (6), for instance by using the curve $t \mapsto (1-t) A + t B$ we arrive at
$$B^{-1} - A^{-1} = - \int_0^1 ((1-t) A + t B)^{-1} (B - A) ((1-t) A + t B)^{-1}\, dt$$
assuming that the curve stays entirely within the set of invertible matrices. While this identity may seem more complicated than (6), it is more symmetric, which conveys some advantages. For instance, using this identity it is easy to see that if $A, B$ are positive definite with $A > B$ in the sense of positive definite matrices (that is, $A - B$ is positive definite), then $B^{-1} > A^{-1}$. (Try to prove this using (6) instead!)
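Exercise 6 If $A$ is an invertible $n \times n$ matrix and $u, v$ are $n \times 1$ vectors, establish the Sherman-Morrison formula
$$(A + t u v^T)^{-1} = A^{-1} - \frac{t}{1 + t v^T A^{-1} u} A^{-1} u v^T A^{-1}$$
whenever $t$ is a scalar such that $1 + t v^T A^{-1} u$ is non-zero. (See also this previous blog post for more discussion of these sorts of identities.)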
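Before attempting the exercise, it can be reassuring to test the Sherman-Morrison formula $(A + t u v^T)^{-1} = A^{-1} - \frac{t}{1 + t v^T A^{-1} u} A^{-1} u v^T A^{-1}$ on a small example. The following Python sketch compares the rank-one update against a direct inversion in exact rational arithmetic; the $2 \times 2$ data and the helper names are assumptions made purely for illustration.

```python
from fractions import Fraction as Fr


def inv2(A):
    """Exact inverse of an invertible 2x2 matrix."""
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def dot(u, v):
    """Inner product v^T u of two vectors stored as lists."""
    return sum(ui * vi for ui, vi in zip(u, v))


def matvec(A, x):
    """Matrix-vector product A x."""
    return [dot(row, x) for row in A]


def transpose(A):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


def sherman_morrison(A_inv, u, v, t):
    """Return (A + t u v^T)^{-1} computed from A^{-1} by a rank-one update."""
    Ainv_u = matvec(A_inv, u)                 # column vector A^{-1} u
    vT_Ainv = matvec(transpose(A_inv), v)     # row vector v^T A^{-1}
    denom = 1 + t * dot(v, Ainv_u)            # 1 + t v^T A^{-1} u
    if denom == 0:
        raise ZeroDivisionError("1 + t v^T A^{-1} u vanishes")
    n = len(A_inv)
    return [[A_inv[i][j] - t * Ainv_u[i] * vT_Ainv[j] / denom
             for j in range(n)] for i in range(n)]


# Hypothetical sample data.
A = [[Fr(2), Fr(1)], [Fr(1), Fr(3)]]
u = [Fr(1), Fr(2)]
v = [Fr(1), Fr(1)]
t = Fr(1, 2)

# Direct route: form A + t u v^T and invert it.
perturbed = [[A[i][j] + t * u[i] * v[j] for j in range(2)] for i in range(2)]
direct = inv2(perturbed)

# Rank-one update route.
updated = sherman_morrison(inv2(A), u, v, t)

print(updated == direct)
```

The point of the formula is that `sherman_morrison` only needs $A^{-1}$ and a few matrix-vector products, so it avoids inverting the perturbed matrix from scratch.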
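One can use the Cauchy integral formula to extend these identities to other functions of matrices. For instance, if $F: {\bf C} \to {\bf C}$ is an entire function, and $\gamma$ is a counterclockwise contour that goes around the spectrum of both $H_0$ and $H_0 + V$, then we have
$$F(H_0 + V) = \frac{-1}{2\pi i} \int_\gamma F(z) R_V(z)\, dz$$
and similarly
$$F(H_0) = \frac{-1}{2\pi i} \int_\gamma F(z) R_0(z)\, dz$$
and hence by (7) one has
$$F(H_0 + V) - F(H_0) = \frac{1}{2\pi i} \int_\gamma F(z) R_V(z) V R_0(z)\, dz;$$
similarly, if $H(t)$ depends on $t$ in a continuously differentiable fashion, then
$$\frac{d}{dt} F(H(t)) = \frac{1}{2\pi i} \int_\gamma F(z) (H(t) - z)^{-1} H'(t) (H(t) - z)^{-1}\, dz$$
as long as $\gamma$ goes around the spectrum of $H(t)$.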
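Exercise 7 If $H(t)$ is an $n \times n$ matrix depending continuously differentiably on $t$, and $F: {\bf C} \to {\bf C}$ is an entire function, establish the tracial chain rule
$$\frac{d}{dt} \mathrm{tr}\, F(H(t)) = \mathrm{tr}\left( F'(H(t)) H'(t) \right).$$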
In a similar vein, given that the logarithm function is the antiderivative of the reciprocal, one can express the matrix logarithm $\log A$ of a positive definite matrix by the fundamental theorem of calculus identity
$$\log A = \int_0^\infty (I + sI)^{-1} - (A + sI)^{-1}\, ds$$
(with the constant term $(I + sI)^{-1}$ needed to prevent a logarithmic divergence in the integral). Differentiating, we see that if $A(t)$ is a family of positive definite matrices depending continuously differentiably on $t$, then
$$\frac{d}{dt} \log A(t) = \int_0^\infty (A(t) + sI)^{-1} A'(t) (A(t) + sI)^{-1}\, ds.$$
This can be used for instance to show that $\log$ is a monotone increasing function, in the sense that $\log A > \log B$ whenever $A > B > 0$ in the sense of positive definite matrices. One can of course integrate this formula to obtain some formulae for the difference $\log A - \log B$ of the logarithm of two positive definite matrices $A, B$.
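To compare the square roots $A^{1/2} - B^{1/2}$ of two positive definite matrices $A, B$ is trickier; there are multiple ways to proceed. One approach is to use contour integration as before (but one has to take some care to avoid branch cuts of the square root). Another is to express the square root in terms of exponentials via the formula
$$A^{1/2} = \frac{1}{\Gamma(-1/2)} \int_0^\infty (e^{-tA} - I) t^{-1/2} \frac{dt}{t}$$
where $\Gamma$ is the gamma function; this formula can be verified by first diagonalising $A$ to reduce to the scalar case and using the definition of the Gamma function. Then one has
$$A^{1/2} - B^{1/2} = \frac{1}{\Gamma(-1/2)} \int_0^\infty (e^{-tA} - e^{-tB}) t^{-1/2} \frac{dt}{t}$$
and one can use some of the previous identities to control $e^{-tA} - e^{-tB}$. This is pretty messy though. A third way to proceed is via implicit differentiation. If for instance $A(t)$ is a family of positive definite matrices depending continuously differentiably on $t$, we can differentiate the identity $A(t)^{1/2} A(t)^{1/2} = A(t)$ to obtain
$$A(t)^{1/2} \frac{d}{dt} A(t)^{1/2} + \left( \frac{d}{dt} A(t)^{1/2} \right) A(t)^{1/2} = \frac{d}{dt} A(t).$$
This can for instance be solved using Exercise 5 to obtain
$$\frac{d}{dt} A(t)^{1/2} = \int_0^\infty e^{-s A(t)^{1/2}} A'(t) e^{-s A(t)^{1/2}}\, ds$$
and this can in turn be integrated to obtain a formula for $A^{1/2} - B^{1/2}$. This is again a rather messy formula, but it does at least demonstrate that the square root is a monotone increasing function on positive definite matrices: $A > B > 0$ implies $A^{1/2} > B^{1/2} > 0$.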
Several of the above identities for matrices can be (carefully) extended to operators on Hilbert spaces provided that they are sufficiently well behaved (in particular, if they have a good functional calculus, and if various spectral hypotheses are obeyed). We will not attempt to do so here, however.
Last month, at the joint AMS/MAA meeting in San Diego, I spoke at the AMS “Current Events” Bulletin on the topic “Why are solitons stable?”. This talk was supposed to be a survey of many of the developments on the rigorous stability theory of solitary waves in dispersive wave models (e.g. the Korteweg-de Vries equation and its generalisations, nonlinear Schrödinger equations, etc.), although my actual talk (which was the usual 50 minutes in length) only managed to cover about half of the material I had planned.
More recently, I completed the article that accompanies the talk, and which will be submitted to the Bulletin of the American Mathematical Society. In this paper I describe the key conflict in these wave models between dispersion (the tendency of waves of differing frequency to move at different speeds, thus causing any localised wave to disperse in space over time) and nonlinearity (which can cause any concentrated portion of the wave to self-amplify). Solitons seem to lie at the exact balancing point between these two forces, neither dispersing nor amplifying, but instead simply traveling at a constant velocity or oscillating in phase at a constant rate. In some cases, this balancing point is unstable; remove even a tiny amount of mass from the soliton and it eventually disperses completely into radiation, or one can add a tiny amount and cause the soliton to concentrate into a point and thence exhibit blowup in finite time. In other cases, the balancing point is stable; small perturbations to a soliton may end up changing the amplitude, position, and/or velocity of the soliton slightly, but the bulk of the solution still closely resembles a soliton in size, shape, and behaviour. Stability is sometimes enforced by linear properties, such as dispersive estimates or spectral properties of the linearised dynamics, but is also often enforced by nonlinear properties, such as nonlinear conservation laws, monotonicity formulae, and local propagation estimates for mass and energy (such as those provided by virial identities). The interplay between all these properties can be remarkably subtle, especially in the critical case when a key conserved quantity is scale-invariant (thus leading to an additional degeneracy in the soliton manifold). 
This is particularly evident in the remarkable series of papers by Martel and Merle establishing various stability and blowup properties near the ground state soliton of the critical generalised KdV equation, which I spend some time discussing (without going into too many of the quite numerous technical details). The focus in my paper is primarily on the non-integrable case, in which the techniques are primarily analytic rather than algebraic or geometric.