You are currently browsing the monthly archive for April 2015.
The Euler equations for three-dimensional incompressible inviscid fluid flow are
$$\partial_t u + (u \cdot \nabla) u = -\nabla p \qquad (1)$$
$$\nabla \cdot u = 0$$
where $u: {\bf R} \times {\bf R}^3 \to {\bf R}^3$ is the velocity field, and $p: {\bf R} \times {\bf R}^3 \to {\bf R}$ is the pressure field. For the purposes of this post, we will ignore all issues of decay or regularity of the fields in question, assuming that they are as smooth and rapidly decreasing as needed to justify all the formal calculations here; in particular, we will apply inverse operators such as $(-\Delta)^{-1}$ or $|\nabla|^{-1} := (-\Delta)^{-1/2}$ formally, assuming that these inverses are well defined on the functions they are applied to.
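Meanwhile, the surface quasi-geostrophic (SQG) equation is given by
$$\partial_t \theta + (u \cdot \nabla) \theta = 0 \qquad (2)$$
$$u = (-\partial_y |\nabla|^{-1} \theta, \partial_x |\nabla|^{-1} \theta) \qquad (3)$$
where $\theta: {\bf R} \times {\bf R}^2 \to {\bf R}$ is the active scalar, and $u: {\bf R} \times {\bf R}^2 \to {\bf R}^2$ is the velocity field. The SQG equations are often used as a toy model for the 3D Euler equations, as they share many of the same features (e.g. vortex stretching); see this paper of Constantin, Majda, and Tabak for more discussion (or this previous blog post).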
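I recently found a more direct way to connect the two equations. We first recall that the Euler equations can be placed in vorticity-stream form by focusing on the vorticity $\omega := \nabla \times u$. Indeed, taking the curl of (1), we obtain the vorticity equation
$$\partial_t \omega + (u \cdot \nabla) \omega = (\omega \cdot \nabla) u \qquad (4)$$
while the velocity $u$ can be recovered from the vorticity via the Biot-Savart law
$$u = (-\Delta)^{-1} \nabla \times \omega. \qquad (5)$$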
The system (4), (5) has some features in common with the system (2), (3); in (2) it is a scalar field $\theta$ that is being transported by a divergence-free vector field $u$, which is a linear function of the scalar field as per (3), whereas in (4) it is a vector field $\omega$ that is being transported (in the Lie derivative sense) by a divergence-free vector field $u$, which is a linear function of the vector field as per (5). However, the system (4), (5) is in three dimensions whilst (2), (3) is in two spatial dimensions, the dynamical field is a scalar field $\theta$ for SQG and a vector field $\omega$ for Euler, and the relationship between the velocity field and the dynamical field is given by a zeroth order Fourier multiplier in (3) and a $-1^{\text{th}}$ order operator in (5).
However, we can make the two equations more closely resemble each other as follows. We first consider the generalisation
$$\partial_t \omega + (u \cdot \nabla) \omega = (\omega \cdot \nabla) u \qquad (6)$$
$$u = T (-\Delta)^{-1} \nabla \times \omega \qquad (7)$$
where $T$ is an invertible, self-adjoint, positive-definite zeroth order Fourier multiplier that maps divergence-free vector fields to divergence-free vector fields. The Euler equations then correspond to the case when $T$ is the identity operator. As discussed in this previous blog post (which used
to denote the inverse of the operator denoted here as $T$), this generalised Euler system has many of the same features as the original Euler equation, such as a conserved Hamiltonian
$$\frac{1}{2} \int_{{\bf R}^3} u \cdot T^{-1} u\ dx,$$
the Kelvin circulation theorem, and conservation of helicity
$$\int_{{\bf R}^3} \omega \cdot T^{-1} u\ dx.$$
Also, if we require $\omega$ to be divergence-free at time zero, it remains divergence-free at all later times.
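Let us consider “two-and-a-half-dimensional” solutions to the system (6), (7), in which $u, \omega$ do not depend on the vertical coordinate $z$, thus
$$\omega(t,x,y,z) = \omega(t,x,y)$$
and
$$u(t,x,y,z) = u(t,x,y)$$
but we allow the vertical components $u_z, \omega_z$ to be non-zero. For this to be consistent, we also require $T$ to commute with translations in the $z$ direction. As all derivatives in the $z$ direction now vanish, we can simplify (6) to
$$D_t \omega = (\omega_x \partial_x + \omega_y \partial_y) u \qquad (8)$$
where $D_t$ is the two-dimensional material derivative
$$D_t := \partial_t + u_x \partial_x + u_y \partial_y.$$
Also, the divergence-free nature of $\omega$ then becomes
$$\partial_x \omega_x + \partial_y \omega_y = 0.$$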
In particular, we may (formally, at least) write
for some scalar field , so that (7) becomes
The first two components of (8) become
which rearranges using (9) to
Formally, we may integrate this system to obtain the transport equation
Finally, the last component of (8) is
At this point, we make the following choice for :
where is a real constant and
is the Leray projection onto divergence-free vector fields. One can verify that for large enough
,
is a self-adjoint positive definite zeroth order Fourier multiplier from divergence free vector fields to divergence-free vector fields. With this choice, we see from (10) that
so that (12) simplifies to
This implies (formally at least) that if vanishes at time zero, then it vanishes for all time. Setting
, we then have from (10) that
and from (11) we then recover the SQG system (2), (3). To put it another way, if and
solve the SQG system, then by setting
then solve the modified Euler system (6), (7) with
given by (13).
We have , so the Hamiltonian
for the modified Euler system in this case is formally a scalar multiple of the conserved quantity
. The momentum
for the modified Euler system is formally a scalar multiple of the conserved quantity
, while the vortex stream lines that are preserved by the modified Euler flow become the level sets of the active scalar that are preserved by the SQG flow. On the other hand, the helicity
vanishes, and other conserved quantities for SQG (such as the Hamiltonian
) do not seem to correspond to conserved quantities of the modified Euler system. This is not terribly surprising; a low-dimensional flow may well have a richer family of conservation laws than the higher-dimensional system that it is embedded in.
An extremely large portion of mathematics is concerned with locating solutions to equations such as
$$f(x) = 0$$
or
$$\Phi(x) = x \qquad (1)$$
for $x$ in some suitable domain space (either finite-dimensional or infinite-dimensional), and various maps $f$ or $\Phi$. To solve the fixed point equation (1), the simplest general method available is the fixed point iteration method: one starts with an initial approximate solution $x_0$ to (1), so that $\Phi(x_0) \approx x_0$, and then recursively constructs the sequence $x_1, x_2, x_3, \dots$ by $x_n := \Phi(x_{n-1})$. If $\Phi$ behaves enough like a “contraction”, and the domain is complete, then one can expect the $x_n$ to converge to a limit $x$, which should then be a solution to (1). For instance, if $\Phi: X \to X$ is a map from a metric space $X = (X,d)$ to itself, which is a contraction in the sense that
$$d(\Phi(x), \Phi(y)) \leq (1-\eta) d(x,y)$$
for all $x,y \in X$ and some $\eta > 0$, then with $x_n$ as above we have
$$d(x_{n+1}, x_n) \leq (1-\eta) d(x_n, x_{n-1})$$
for any $n$, and so the distances $d(x_n, x_{n-1})$ between successive elements of the sequence decay at a rate that is at least geometric. This leads to the contraction mapping theorem, which has many important consequences, such as the inverse function theorem and the Picard existence theorem.
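As a small numerical illustration of the fixed point iteration method, here is a minimal Python sketch. The choice of map (the cosine function, which is a contraction on the interval $[0,1]$), the starting point, and the helper name `fixed_point_iterate` are all assumptions made purely for illustration; nothing here is specific to the applications discussed in this post.

```python
import math

def fixed_point_iterate(phi, x0, tol=1e-12, max_iter=1000):
    """Iterate x_n = phi(x_{n-1}), recording the gaps |x_n - x_{n-1}|.

    Stops once a gap drops below tol (or after max_iter steps).
    """
    x = x0
    gaps = []
    for _ in range(max_iter):
        x_next = phi(x)
        gaps.append(abs(x_next - x))
        x = x_next
        if gaps[-1] < tol:
            break
    return x, gaps

# Illustrative choice: cos maps [0, 1] into itself and |cos'| <= sin(1) < 1 there.
x_star, gaps = fixed_point_iterate(math.cos, 1.0)
print(x_star, len(gaps))
```

The recorded gaps shrink by a roughly constant factor at each step, which is the geometric decay described above.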
A slightly more complicated instance of this strategy arises when trying to linearise a complex map $f: U \to {\bf C}$ defined in a neighbourhood $U$ of a fixed point. For simplicity we normalise the fixed point to be the origin, thus $0 \in U$ and $f(0) = 0$. When studying the complex dynamics $f^2 = f \circ f$, $f^3 = f \circ f \circ f$, $\dots$ of such a map, it can be useful to try to conjugate $f$ to another function $g = \psi^{-1} \circ f \circ \psi$, where $\psi$ is a holomorphic function defined and invertible near $0$ with $\psi(0) = 0$, since the dynamics of $g$ will be conjugate to that of $f$. Note that if $f(0) = 0$ and $f'(0) = \lambda$, then from the chain rule any conjugate $g$ of $f$ will also have $g(0) = 0$ and $g'(0) = \lambda$. Thus, the “simplest” function one can hope to conjugate $f$ to is the linear function $z \mapsto \lambda z$. Let us say that $f$ is linearisable (around $0$) if it is conjugate to $z \mapsto \lambda z$ in some neighbourhood of $0$. Equivalently, $f$ is linearisable if there is a solution to the Schröder equation
$$f(\psi(z)) = \psi(\lambda z) \qquad (2)$$
for some $\psi: U' \to {\bf C}$ defined and invertible in a neighbourhood $U'$ of $0$ with $\psi(0) = 0$, and all $z$ sufficiently close to $0$. (The Schröder equation is normalised somewhat differently in the literature, but this form is equivalent to the usual form, at least when $\lambda$ is non-zero.) Note that if $\psi$ solves the above equation, then so does $z \mapsto \psi(cz)$ for any non-zero $c$, so we may normalise $\psi'(0) = 1$ in addition to $\psi(0) = 0$, which also ensures local invertibility from the inverse function theorem. (Note from winding number considerations that $\psi$ cannot be invertible near zero if $\psi'(0)$ vanishes.)
We have the following basic result of Koenigs:
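Theorem 1 (Koenigs’ linearisation theorem) Let $f: U \to {\bf C}$ be a holomorphic function defined near $0$ with $f(0) = 0$ and $f'(0) = \lambda$. If $0 < |\lambda| < 1$ (attracting case) or $1 < |\lambda| < \infty$ (repelling case), then $f$ is linearisable near zero.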
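Proof: Observe that if $f, \psi, \lambda$ solve (2), then $f^{-1}, \psi, \lambda^{-1}$ solve (2) also (in a sufficiently small neighbourhood of zero). Thus we may reduce to the attractive case $0 < |\lambda| < 1$.
Let $r > 0$ be a sufficiently small radius, and let $X$ denote the space of holomorphic functions $\psi: B(0,r) \to {\bf C}$ on the complex disk $B(0,r) := \{ z \in {\bf C}: |z| < r \}$ with $\psi(0) = 0$ and $\psi'(0) = 1$. We can view the Schröder equation (2) as a fixed point equation
$$\psi = \Phi(\psi)$$
where $\Phi$ is the partially defined function on $X$ that maps a function $\psi: B(0,r) \to {\bf C}$ to the function $\Phi(\psi): B(0,r) \to {\bf C}$ defined by
$$\Phi(\psi)(z) := f^{-1}(\psi(\lambda z)),$$
assuming that $f^{-1}$ is well-defined on the range of $z \mapsto \psi(\lambda z)$ (this is why $\Phi$ is only partially defined).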
We can solve this equation by the fixed point iteration method, if is small enough. Namely, we start with
being the identity map, and set
, etc. We equip
with the uniform metric
. Observe that if
, and
is small enough, then
takes values in
, and
are well-defined and lie in
. Also, since
is smooth and has derivative
at
, we have
if ,
and
is sufficiently small depending on
. This is not yet enough to establish the required contraction (thanks to Mario Bonk for pointing this out); but observe that the function
is holomorphic on
and bounded by
on the boundary of this ball (or slightly within this boundary), so by the maximum principle we see that
on all of , and in particular
on . Putting all this together, we see that
since , we thus obtain a contraction on the ball
if
is small enough (and
sufficiently small depending on
). From this (and the completeness of
, which follows from Morera’s theorem) we see that the iteration
converges (exponentially fast) to a limit
which is a fixed point of
, and thus solves Schröder’s equation, as required.
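To see linearisation in the attracting case numerically, here is a rough Python sketch. It does not implement the iteration $\psi \mapsto \Phi(\psi)$ from the proof; instead it uses the classical normalised forward iterates $\phi_n(z) := f^n(z)/\lambda^n$, whose limit $\phi$ formally obeys $\phi(f(z)) = \lambda \phi(z)$ and so plays the role of the inverse of the conjugating map. The map $f(z) = \lambda z + z^2$, the value $\lambda = 1/2$, the sample point, and the helper name `koenigs` are assumptions chosen purely for illustration.

```python
def koenigs(f, lam, z, n=60):
    """Approximate phi(z) = lim_{n -> oo} f^n(z) / lam^n by taking a large fixed n."""
    w = z
    for _ in range(n):
        w = f(w)
    return w / lam**n

lam = 0.5                      # illustrative attracting multiplier, 0 < |lam| < 1
f = lambda z: lam * z + z * z  # illustrative map with f(0) = 0, f'(0) = lam

z = 0.1 + 0.05j                # sample point close to the fixed point
lhs = koenigs(f, lam, f(z))    # phi(f(z))
rhs = lam * koenigs(f, lam, z) # lam * phi(z)
print(abs(lhs - rhs))
```

The two sides agree to within rounding error, which is the linearised form of the dynamics near the fixed point.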
Koenigs’ linearisation theorem leaves open the indifferent case when $|\lambda| = 1$. In the rationally indifferent case when $\lambda^n = 1$ for some natural number $n$, there is an obvious obstruction to linearisability, namely that the $n$-fold iterate $f^n$ must then be the identity map (in particular, linearisation is not possible in this case when $f$ is a non-trivial rational function). An obstruction is also present in some irrationally indifferent cases (where $|\lambda| = 1$ but $\lambda^n \neq 1$ for any natural number $n$), if $\lambda$ is sufficiently close to various roots of unity; the first result of this form is due to Cremer, and the optimal result of this type for quadratic maps was established by Yoccoz. In the other direction, we have the following result of Siegel:
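Theorem 2 (Siegel’s linearisation theorem) Let $f: U \to {\bf C}$ be a holomorphic function defined near $0$ with $f(0) = 0$ and $f'(0) = \lambda$. If $|\lambda| = 1$ and one has the Diophantine condition
$$\frac{1}{|\lambda^n - 1|} \leq C n^C$$
for all natural numbers $n$ and some constant $C > 0$, then $f$ is linearisable at $0$.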
The Diophantine condition can be relaxed to a more general condition involving the rational exponents of the phase of $\lambda$; this was worked out by Brjuno, with the condition matching the one later obtained by Yoccoz. Amusingly, while the set of Diophantine numbers (and hence the set of linearisable $\lambda$) has full measure on the unit circle, the set of non-linearisable $\lambda$ is generic (the complement of countably many nowhere dense sets) due to the above-mentioned work of Cremer, leading to a striking disparity between the measure-theoretic and category notions of “largeness”.
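To get a feel for a Diophantine condition of this type, one can probe the small divisors $|\lambda^n - 1|$ numerically. The Python sketch below computes the smallest value of $n |\lambda^n - 1|$ over a finite range of $n$ for $\lambda = e^{2\pi i \theta}$; the choice of $\theta$ (the golden mean), the exponent $1$ on $n$, the cutoff, and the helper name `min_scaled_divisor` are all assumptions made for illustration, and a finite computation of this kind is of course only suggestive rather than a proof.

```python
import cmath
import math

def min_scaled_divisor(theta, n_max):
    """Return the minimum over 1 <= n <= n_max of n * |lam^n - 1|, lam = exp(2*pi*i*theta)."""
    best = math.inf
    for n in range(1, n_max + 1):
        frac = (n * theta) % 1.0          # reduce the phase mod 1 before exponentiating
        divisor = abs(cmath.exp(2j * math.pi * frac) - 1)
        best = min(best, n * divisor)
    return best

golden = (math.sqrt(5) - 1) / 2           # illustrative badly approximable phase
golden_min = min_scaled_divisor(golden, 10_000)
print(golden_min)
```

For the golden mean the scaled divisors stay bounded away from zero over this range, consistent with a bound of the shape $|\lambda^n - 1| \geq c/n$; for a phase very close to rationals one would instead see the minimum collapse as the cutoff grows.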
Siegel’s theorem does not seem to be provable using a fixed point iteration method. However, it can be established by modifying another basic method to solve equations, namely Newton’s method. Let us first review how this method works to solve the equation $f(x) = 0$ for some smooth function $f: I \to {\bf R}$ defined on an interval $I$. We suppose we have some initial approximant $x_0 \in I$ to this equation, with $f(x_0)$ small but not necessarily zero. To make the analysis more quantitative, let us suppose that the interval $[x_0 - r_0, x_0 + r_0]$ lies in $I$ for some $r_0 > 0$, and we have the estimates
for some and
and all
(the factors of
are present to make
“dimensionless”).
Lemma 3 Under the above hypotheses, we can find
with
such that
In particular, setting
,
, and
, we have
, and
for all
.
The crucial point here is that the new error is roughly the square of the previous error
. This leads to extremely fast (double-exponential) improvement in the error upon iteration, which is more than enough to absorb the exponential losses coming from the
factor.
Proof: If for some absolute constants
then we may simply take
, so we may assume that
for some small
and large
. Using the Newton approximation
we are led to the choice
for . From the hypotheses on
and the smallness hypothesis on
we certainly have
. From Taylor’s theorem with remainder we have
and the claim follows.
We can iterate this procedure; starting with as above, we obtain a sequence of nested intervals
with
, and with
evolving by the recursive equations and estimates
If is sufficiently small depending on
, we see that
converges rapidly to zero (indeed, we can inductively obtain a bound of the form
for some large absolute constant
if
is small enough), and
converges to a limit
which then solves the equation
by the continuity of
.
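The squaring of the error under Newton iteration is easy to observe numerically. The following Python sketch is only an illustration: the function $f(x) = x^2 - 2$, the starting point, and the helper name `newton` are assumptions, and no attempt is made to track the radii or the other parameters appearing in the estimates above.

```python
def newton(f, fprime, x0, steps=6):
    """Run a fixed number of Newton steps x -> x - f(x)/f'(x), returning all iterates."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        xs.append(x - f(x) / fprime(x))
    return xs

f = lambda x: x * x - 2.0   # illustrative smooth function with a root at sqrt(2)
fp = lambda x: 2.0 * x      # its derivative

xs = newton(f, fp, 1.5)
errors = [abs(f(x)) for x in xs]
for x, e in zip(xs, errors):
    print(x, e)
```

For this particular $f$ one has the exact identity $f(x - f(x)/f'(x)) = f(x)^2 / (4x^2)$, so each new error is bounded by the square of the previous one until rounding error takes over.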
As I recently learned from Zhiqiang Li, a similar scheme works to prove Siegel’s theorem, as can be found for instance in this text of Carleson and Gamelin. The key is the following analogue of Lemma 3.
Lemma 4 Let
be a complex number with
and
for all natural numbers
. Let
, and let
be a holomorphic function with
,
, and
for all
and some
. Let
, and set
. Then there exists an injective holomorphic function
and a holomorphic function
such that
for all
, and such that
and
for all
and some
.
Proof: By scaling we may normalise . If
for some constants
, then we can simply take
to be the identity and
, so we may assume that
for some small
and large
.
To motivate the choice of , we write
and
, with
and
viewed as small. We would like to have
, which expands as
As and
are both small, we can heuristically approximate
up to quadratic errors (compare with the Newton approximation
), and arrive at the equation
This equation can be solved by Taylor series; the function vanishes to second order at the origin and thus has a Taylor expansion
and then has a Taylor expansion
We take this as our definition of , define
, and then define
implicitly via (4).
Let us now justify that this choice works. By (3) and the generalised Cauchy integral formula, we have for all
; by the Diophantine assumption on
, we thus have
. In particular,
converges on
, and on the disk
(say) we have the bounds
In particular, as is so small, we see that
maps
injectively to
and
to
, and the inverse
maps
to
. From (3) we see that
maps
to
, and so if we set
to be the function
, then
is a holomorphic function obeying (4). Expanding (4) in terms of
and
as before, and also writing
, we have
for , which by (5) simplifies to
From (6), the fundamental theorem of calculus, and the smallness of we have
and thus
From (3) and the Cauchy integral formula we have on (say)
, and so from (6) and the fundamental theorem of calculus we conclude that
on , and the claim follows.
If we set ,
, and
to be sufficiently small, then (since
vanishes to second order at the origin), the hypotheses of this lemma will be obeyed for some sufficiently small
. Iterating the lemma (and halving
repeatedly), we can then find sequences
, injective holomorphic functions
and holomorphic functions
such that one has the recursive identities and estimates
for all and
. By construction,
decreases to a positive radius
that is a constant multiple of
, while (for
small enough)
converges double-exponentially to zero, so in particular
converges uniformly to
on
. Also, since $\psi_n$ is close enough to the identity, the compositions
are uniformly convergent on
with
and
. From this we have
on , and on taking limits using Morera’s theorem we obtain a holomorphic function
defined near
with
,
, and
obtaining the required linearisation.
Remark 5 The idea of using a Newton-type method to obtain error terms that decay double-exponentially, and can therefore absorb exponential losses in the iteration, also occurs in KAM theory and in Nash-Moser iteration, presumably due to Siegel’s influence on Moser. (I discuss Nash-Moser iteration in this note that I wrote back in 2006.)
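The von Neumann ergodic theorem (the Hilbert space version of the mean ergodic theorem) asserts that if $U: H \to H$ is a unitary operator on a Hilbert space $H$, and $v \in H$ is a vector in that Hilbert space, then one has
$$\lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^N U^n v = \pi_{H^U} v$$
in the strong topology, where $H^U := \{ w \in H: Uw = w \}$ is the $U$-invariant subspace of $H$, and $\pi_{H^U}$ is the orthogonal projection to $H^U$. (See e.g. these previous lecture notes for a proof.) The same proof extends to more general amenable groups: if $G$ is a countable amenable group acting on a Hilbert space $H$ by unitary transformations $T^g: H \to H$ for $g \in G$, and $v \in H$ is a vector in that Hilbert space, then one has
$$\lim_{N \to \infty} {\bf E}_{g \in \Phi_N} T^g v = \pi_{H^G} v$$
for any Følner sequence $\Phi_N$ of $G$, where $H^G := \{ w \in H: T^g w = w \hbox{ for all } g \in G \}$ is the $G$-invariant subspace, and ${\bf E}_{a \in A} f(a) := \frac{1}{|A|} \sum_{a \in A} f(a)$ is the average of $f$ on $A$. Thus one can interpret $\pi_{H^G} v$ as a certain average of elements of the orbit $Gv := \{ T^g v: g \in G \}$ of $v$.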
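The simplest instance of the mean ergodic theorem can be checked numerically. In the Python sketch below, the unitary operator is a diagonal matrix on ${\bf C}^2$ with eigenvalues $1$ and $e^{i}$, so that the invariant subspace is the first coordinate axis; the specific eigenvalues, vector, number of terms, and the helper name `ergodic_average` are illustrative assumptions only.

```python
import cmath

def ergodic_average(eigenvalues, v, N):
    """Compute (1/N) * sum_{n=1}^N U^n v for the diagonal unitary U = diag(eigenvalues)."""
    out = []
    for lam, coord in zip(eigenvalues, v):
        total = sum(lam**n for n in range(1, N + 1))
        out.append(coord * total / N)
    return out

eigenvalues = [1.0, cmath.exp(1j)]  # U fixes the first coordinate and rotates the second
v = [2.0, 3.0]

avg = ergodic_average(eigenvalues, v, 10_000)
print(avg)
```

The first coordinate of the average is unchanged while the second decays like $O(1/N)$, so the averages approach the orthogonal projection of the vector onto the invariant subspace.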
In a previous blog post, I noted a variant of this ergodic theorem (due to Alaoglu and Birkhoff) that holds even when the group is not amenable (or not discrete), using a more abstract notion of averaging:
Theorem 1 (Abstract ergodic theorem) Let $G$ be an arbitrary group acting unitarily on a Hilbert space $H$, and let $v$ be a vector in $H$. Then $\pi_{H^G} v$ is the element in the closed convex hull of $Gv := \{ T^g v: g \in G \}$ of minimal norm, and is also the unique element of $H^G$ in this closed convex hull.
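I recently stumbled upon a different way to think about this theorem, in the additive case $G = (G,+)$ when $G$ is abelian, which has a closer resemblance to the classical mean ergodic theorem. Given an arbitrary additive group $G = (G,+)$ (not necessarily discrete, or countable), let ${\mathcal F}$ denote the collection of finite non-empty multisets in $G$, that is to say, unordered collections $\{a_1,\dots,a_n\}$ of elements $a_1,\dots,a_n$ of $G$, not necessarily distinct, for some positive integer $n$. Given two multisets $A = \{a_1,\dots,a_n\}$, $B = \{b_1,\dots,b_m\}$ in ${\mathcal F}$, we can form the sum set $A + B := \{ a_i + b_j: 1 \leq i \leq n, 1 \leq j \leq m \}$. Note that the sum set $A+B$ can contain multiplicity even when $A, B$ do not; for instance, $\{1,2\} + \{1,2\} = \{2,3,3,4\}$. Given a multiset $A = \{a_1,\dots,a_n\}$ in ${\mathcal F}$, and a function $f: G \to H$ from $G$ to a vector space $H$, we define the average ${\bf E}_{a \in A} f(a)$ as
$${\bf E}_{a \in A} f(a) = \frac{1}{n} \sum_{j=1}^n f(a_j).$$
Note that the multiplicity function of the set $A$ affects the average; for instance, we have ${\bf E}_{a \in \{1,2\}} a = \frac{3}{2}$, but ${\bf E}_{a \in \{1,2,2\}} a = \frac{5}{3}$.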
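The multiset operations above are simple to experiment with. Here is a minimal Python sketch in which multisets of integers are represented as lists (so that repeated entries are kept); the helper names `sum_multiset` and `average` and the particular multisets are assumptions chosen for illustration, and exact rational arithmetic is used so that averages are not affected by rounding.

```python
from fractions import Fraction

def sum_multiset(A, B):
    """Sum set A + B of two finite multisets, keeping multiplicity (returned sorted)."""
    return sorted(a + b for a in A for b in B)

def average(A, f=lambda a: a):
    """Average of f over the multiset A, counting repeated elements each time."""
    return sum(Fraction(f(a)) for a in A) / len(A)

print(sum_multiset([1, 2], [1, 2]))  # the element 3 arises in two different ways
print(average([1, 2]))
print(average([1, 2, 2]))            # the repeated 2 shifts the average upwards
```

Representing multisets as sorted lists makes equality of multisets a plain list comparison, which is convenient when checking whether one multiset is of the form $B + C$ for another.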
We can define a directed set on ${\mathcal F}$ as follows: given two multisets $A, B \in {\mathcal F}$, we write $A \geq B$ if we have $A = B + C$ for some $C \in {\mathcal F}$. Thus for instance we have $\{1,2,2,3\} \geq \{1,2\}$. It is easy to verify that this relation is transitive and reflexive, and is directed because any two elements $A, B$ of ${\mathcal F}$ have a common upper bound, namely $A+B$. (This is where we need $G$ to be abelian.) The notion of convergence along a net now allows us to define the notion of convergence along ${\mathcal F}$; given a family $x_A$ of points in a topological space $X$ indexed by elements $A$ of ${\mathcal F}$, and a point $x$ in $X$, we say that $x_A$ converges to $x$ along ${\mathcal F}$ if, for every open neighbourhood $U$ of $x$ in $X$, one has $x_A \in U$ for sufficiently large $A$, that is to say there exists $B \in {\mathcal F}$ such that $x_A \in U$ for all $A \geq B$. If the topological space $X$ is Hausdorff, then the limit $x$ is unique (if it exists), and we then write
$$x = \lim_{A \to G} x_A.$$
When $x_A$ takes values in the reals, one can also define the limit superior or limit inferior along such nets in the obvious fashion.
We can then give an alternate formulation of the abstract ergodic theorem in the abelian case:
Theorem 2 (Abelian abstract ergodic theorem) Let $G = (G,+)$ be an arbitrary additive group acting unitarily on a Hilbert space $H$, and let $v$ be a vector in $H$. Then we have
$$\lim_{A \to G} {\bf E}_{a \in A} T^a v = \pi_{H^G} v$$
in the strong topology of $H$.
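Proof: Suppose that $A \geq B$, so that $A = B + C$ for some $C \in {\mathcal F}$, then
$${\bf E}_{a \in A} T^a v = {\bf E}_{c \in C} T^c \left( {\bf E}_{b \in B} T^b v \right)$$
so by unitarity and the triangle inequality we have
$$\| {\bf E}_{a \in A} T^a v \|_H \leq \| {\bf E}_{b \in B} T^b v \|_H,$$
thus $\| {\bf E}_{a \in A} T^a v \|_H$ is monotone non-increasing in $A$. Since this quantity is bounded between $0$ and $\|v\|_H$, we conclude that the limit $\lim_{A \to G} \| {\bf E}_{a \in A} T^a v \|_H$ exists. Thus, for any $\varepsilon > 0$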
, we have for sufficiently large
that
for all . In particular, for any
, we have
We can write
and so from the parallelogram law and unitarity we have
for all , and hence by the triangle inequality (averaging
over a finite multiset
)
for any . This shows that
is a Cauchy sequence in
(in the strong topology), and hence (by the completeness of
) tends to a limit. Shifting
by a group element
, we have
and hence is invariant under shifts, and thus lies in
. On the other hand, for any
and
, we have
and thus on taking strong limits
and so is orthogonal to
. Combining these two facts we see that
is equal to
as claimed.
To relate this result to the classical ergodic theorem, we observe
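Lemma 3 Let $G$ be a countable additive group, with a Følner sequence $\Phi_n$, and let $f_g$ be a bounded sequence in a normed vector space indexed by $G$. If $\lim_{A \to G} {\bf E}_{a \in A} f_a$ exists, then $\lim_{n \to \infty} {\bf E}_{a \in \Phi_n} f_a$ exists, and the two limits are equal.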
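Proof: From the Følner property, we see that for any $A \in {\mathcal F}$ and any $\varepsilon > 0$, the averages ${\bf E}_{a \in \Phi_n} f_a$ and ${\bf E}_{a \in A + \Phi_n} f_a$ differ by at most $\varepsilon$ in norm if $n$ is sufficiently large depending on $A$, $\varepsilon$ (and the $f_a$). On the other hand, by the existence of the limit $\lim_{A \to G} {\bf E}_{a \in A} f_a$, the averages ${\bf E}_{a \in A} f_a$ and ${\bf E}_{a \in A + \Phi_n} f_a$ differ by at most $\varepsilon$ in norm if $A$ is sufficiently large depending on $\varepsilon$ (regardless of how large $n$ is). The claim follows.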
It turns out that this approach can also be used as an alternate way to construct the Gowers–Host-Kra seminorms in ergodic theory, which has the feature that it does not explicitly require any amenability on the group (or separability on the underlying measure space), though, as pointed out to me in comments, even uncountable abelian groups are amenable in the sense of possessing an invariant mean, even if they do not have a Følner sequence.
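Given an arbitrary additive group $G$, define a $G$-system $(X, \mu, (T_g)_{g \in G})$ to be a probability space $X = (X, {\mathcal X}, \mu)$ (not necessarily separable or standard Borel), together with a collection $(T_g)_{g \in G}$ of invertible, measure-preserving maps $T_g: X \to X$, such that $T_0$ is the identity and $T_g T_h = T_{g+h}$ (modulo null sets) for all $g, h \in G$. This then gives isomorphisms $T^g: L^p(X) \to L^p(X)$ for $1 \leq p \leq \infty$ by setting $T^g f(x) := f(T_{-g} x)$. From the above abstract ergodic theorem, we see that
$${\bf E}( f | {\mathcal X}^G ) = \lim_{A \to G} {\bf E}_{a \in A} T^a f$$
in the strong topology of $L^2(X)$ for any $f \in L^2(X)$, where ${\mathcal X}^G$ is the collection of measurable sets $E$ that are essentially $G$-invariant in the sense that $T_g E = E$ modulo null sets for all $g \in G$, and ${\bf E}(f | {\mathcal X}^G)$ is the conditional expectation of $f$ with respect to ${\mathcal X}^G$.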
In a similar spirit, we have
Theorem 4 (Convergence of Gowers-Host-Kra seminorms) Let
be a
-system for some additive group
. Let
be a natural number, and for every
, let
, which for simplicity we take to be real-valued. Then the expression
converges, where we write
, and we are using the product directed set on
to define the convergence
. In particular, for
, the limit
converges.
We prove this theorem below the fold. It implies a number of other known descriptions of the Gowers-Host-Kra seminorms , for instance that
for , while from the ergodic theorem we have
This definition also manifestly demonstrates the cube symmetries of the Host-Kra measures on
, defined via duality by requiring that
In a subsequent blog post I hope to present a more detailed study of the norm and its relationship with eigenfunctions and the Kronecker factor, without assuming any amenability on
or any separability or topological structure on
.
Hoi Nguyen, Van Vu, and myself have just uploaded to the arXiv our paper “Random matrices: tail bounds for gaps between eigenvalues”. This is a followup paper to my recent paper with Van in which we showed that random matrices of Wigner type (such as the adjacency matrix of an Erdős-Rényi graph) asymptotically almost surely had simple spectrum. In the current paper, we push the method further to show that the eigenvalues are not only distinct, but are (with high probability) separated from each other by some negative power $n^{-A}$ of $n$. This follows the now standard technique of replacing any appearance of discrete Littlewood-Offord theory (a key ingredient in our previous paper) with its continuous analogue (inverse theorems for small ball probability). For general Wigner-type matrices
(in which the matrix entries are not normalised to have mean zero), we can use the inverse Littlewood-Offord theorem of Nguyen and Vu to obtain (under mild conditions on
) a result of the form
for any and
, if
is sufficiently large depending on
(in a linear fashion), and
is sufficiently large depending on
. The point here is that
can be made arbitrarily large, and also that no continuity or smoothness hypothesis is made on the distribution of the entries. (In the continuous case, one can use the machinery of Wegner estimates to obtain results of this type, as was done in a paper of Erdős, Schlein, and Yau.)
In the mean zero case, it becomes more efficient to use an inverse Littlewood-Offord theorem of Rudelson and Vershynin to obtain (with the normalisation that the entries of have unit variance, so that the eigenvalues of
are
with high probability), giving the bound
for (one also has good results of this type for smaller values of
). This is only optimal in the regime
; we expect to establish some eigenvalue repulsion, improving the RHS to
for real matrices and
for complex matrices, but this appears to be a more difficult task (possibly requiring some quadratic inverse Littlewood-Offord theory, rather than just linear inverse Littlewood-Offord theory). However, we can get some repulsion if one works with larger gaps, getting a result roughly of the form
for any fixed and some absolute constant
(which we can asymptotically make to be
for large
, though it ought to be as large as
), by using a higher-dimensional version of the Rudelson-Vershynin inverse Littlewood-Offord theorem.
In the case of Erdős-Rényi graphs, we don’t have mean zero and the Rudelson-Vershynin Littlewood-Offord theorem isn’t quite applicable, but by working carefully through the approach based on the Nguyen-Vu theorem we can almost recover (1), except for a loss of on the RHS.
As a sample application of the eigenvalue separation results, we can now obtain some information about eigenvectors; for instance, we can show that the components of the eigenvectors all have magnitude at least for some
with high probability. (Eigenvectors become much more stable, and able to be studied in isolation, once their associated eigenvalue is well separated from the other eigenvalues; see this previous blog post for more discussion.)