You are currently browsing the monthly archive for May 2015.
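Here’s a cute identity I discovered by accident recently. Observe that
$\frac{d}{dx} (1+x^2)^{0/2} = 0$
$\frac{d^2}{dx^2} (1+x^2)^{1/2} = \frac{1}{(1+x^2)^{3/2}}$
$\frac{d^3}{dx^3} (1+x^2)^{2/2} = 0$
$\frac{d^4}{dx^4} (1+x^2)^{3/2} = \frac{9}{(1+x^2)^{5/2}}$
and so one can conjecture that one has
$\frac{d^{k+1}}{dx^{k+1}} (1+x^2)^{k/2} = 0$
when $k$ is even, and
$\frac{d^{k+1}}{dx^{k+1}} (1+x^2)^{k/2} = \frac{(1 \times 3 \times \dots \times k)^2}{(1+x^2)^{(k+2)/2}}$
when $k$ is odd. This is obvious in the even case since $(1+x^2)^{k/2}$ is a polynomial of degree $k$, but I struggled for a while with the odd case before finding a slick three-line proof. (I was first trying to prove the weaker statement that $\frac{d^{k+1}}{dx^{k+1}} (1+x^2)^{k/2}$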
was non-negative, but for some strange reason I was only able to establish this by working out the derivative exactly, rather than by using more analytic methods, such as convexity arguments.) I thought other readers might like the challenge (and also I’d like to see some other proofs), so rather than post my own proof immediately, I’ll see if anyone would like to supply their own proofs or thoughts in the comments. Also I am curious to know if this identity is connected to any other existing piece of mathematics.
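For readers who want to experiment before attempting a proof, here is a minimal Python sketch that checks the identity for small $k$ in exact integer arithmetic. It assumes the identity under discussion is $\frac{d^{k+1}}{dx^{k+1}} (1+x^2)^{k/2} = \frac{(1 \times 3 \times \dots \times k)^2}{(1+x^2)^{(k+2)/2}}$ for odd $k$ (and zero for even $k$). It relies on the elementary observation that the $j$-th derivative of $(1+x^2)^{k/2}$ has the form $P_j(x) (1+x^2)^{k/2-j}$ with $P_0 = 1$ and $P_{j+1} = P_j' (1+x^2) + (k-2j) x P_j$. The helper names are purely illustrative, and this is a sanity check rather than a proof.

```python
from math import prod

def add(p, q):
    """Add two polynomials given as coefficient lists (lowest degree first)."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def trim(p):
    """Drop trailing zero coefficients; the zero polynomial becomes []."""
    while p and p[-1] == 0:
        p = p[:-1]
    return p

def derivative_numerator(k):
    """Return P_{k+1}, where the (k+1)-th derivative of (1+x^2)^(k/2)
    equals P_{k+1}(x) * (1+x^2)^(-(k+2)/2)."""
    p = [1]  # P_0 = 1
    for j in range(k + 1):
        dp = [i * c for i, c in enumerate(p)][1:]    # P_j'
        term1 = add(dp, [0, 0] + dp)                 # P_j' * (1 + x^2)
        term2 = [(k - 2 * j) * c for c in [0] + p]   # (k - 2j) * x * P_j
        p = trim(add(term1, term2))
    return p

def conjectured_numerator(k):
    """[] (zero) for even k, [(1*3*...*k)^2] for odd k."""
    return [] if k % 2 == 0 else [prod(range(1, k + 1, 2)) ** 2]

for k in range(8):
    print(k, derivative_numerator(k), derivative_numerator(k) == conjectured_numerator(k))
```

The recursion only manipulates integer coefficient lists, so there is no floating-point error to worry about; the price is that it says nothing about why the higher-order terms cancel.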
I’ve just uploaded to the arXiv my paper “Cancellation for the multilinear Hilbert transform”, submitted to Collectanea Mathematica. This paper uses methods from additive combinatorics (and more specifically, the arithmetic regularity and counting lemmas from this paper of Ben Green and myself) to obtain a slight amount of progress towards the open problem of obtaining bounds for the trilinear and higher Hilbert transforms (as discussed in this previous blog post). For instance, the trilinear Hilbert transform
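$H_3(f_1,f_2,f_3)(x) := p.v. \int_{\bf R} f_1(x+t) f_2(x+2t) f_3(x+3t)\ \frac{dt}{t}$
is not known to be bounded for any $L^{p_1}({\bf R}) \times L^{p_2}({\bf R}) \times L^{p_3}({\bf R})$ to $L^p({\bf R})$, although it is conjectured to be bounded when $1/p = 1/p_1 + 1/p_2 + 1/p_3$ and $1 < p_1,p_2,p_3,p < \infty$. (For $p$ well below $1$, one can use additive combinatorics constructions to demonstrate unboundedness; see this paper of Demeter.) One can approach this problem by considering the truncated trilinear Hilbert transforms
$H_{3,r,R}(f_1,f_2,f_3)(x) := \int_{r \leq |t| \leq R} f_1(x+t) f_2(x+2t) f_3(x+3t)\ \frac{dt}{t}$
for $R > r > 0$. It is not difficult to show that the boundedness of $H_3$ is equivalent to the boundedness of $H_{3,r,R}$ with bounds that are uniform in $R$ and $r$. On the other hand, from Minkowski’s inequality and Hölder’s inequality one can easily obtain the non-uniform bound of $2 \log \frac{R}{r}$ for $H_{3,r,R}$. The main result of this paper is a slight improvement of this trivial bound to $o(\log \frac{R}{r})$ as $R/r \rightarrow \infty$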
. Roughly speaking, the way this gain is established is as follows. First there are some standard time-frequency type reductions to reduce to the task of obtaining some non-trivial cancellation on a single “tree”. Using a “generalised von Neumann theorem”, we show that such cancellation will happen if (a discretised version of) one or more of the functions
(or a dual function
that it is convenient to test against) is small in the Gowers
norm. However, the arithmetic regularity lemma alluded to earlier allows one to represent an arbitrary function
, up to a small error, as the sum of such a “Gowers uniform” function, plus a structured function (or more precisely, an irrational virtual nilsequence). This effectively reduces the problem to that of establishing some cancellation in a single tree in the case when all functions
involved are irrational virtual nilsequences. At this point, the contribution of each component of the tree can be estimated using the “counting lemma” from my paper with Ben. The main term in the asymptotics is a certain integral over a nilmanifold, but because the kernel
in the trilinear Hilbert transform is odd, it turns out that this integral vanishes, giving the required cancellation.
The same argument works for higher order Hilbert transforms (and one can also replace the coefficients in these transforms with other rational constants). However, because the quantitative bounds in the arithmetic regularity and counting lemmas are so poor, it does not seem likely that one can use these methods to remove the logarithmic growth in $R/r$ entirely, and some additional ideas will be needed to resolve the full conjecture.
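To see where the logarithmic growth in the trivial bound comes from, one can simply integrate the absolute value of the kernel. Assuming the truncation restricts to $r \leq |t| \leq R$ with kernel $\frac{dt}{t}$, Minkowski and Hölder bound the operator norm by $\int_{r \leq |t| \leq R} \frac{dt}{|t|} = 2 \log \frac{R}{r}$. The Python sketch below (the function name is hypothetical) confirms this numerically with a midpoint rule; it illustrates only the trivial bound, not the cancellation established in the paper.

```python
import math

def kernel_mass(r, R, steps=200_000):
    """Midpoint-rule approximation of the integral of 1/|t| over r <= |t| <= R."""
    h = (R - r) / steps
    one_side = sum(h / (r + (i + 0.5) * h) for i in range(steps))
    return 2 * one_side  # the regions t > 0 and t < 0 contribute equally

for R in (10.0, 100.0, 1000.0):
    # Compare the numerical mass with 2 log(R/r) for r = 1.
    print(R, kernel_mass(1.0, R), 2 * math.log(R))
```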
I’ve just uploaded to the arXiv my paper “Failure of the pointwise and maximal ergodic theorems for the free group”, submitted to Forum of Mathematics, Sigma. This paper concerns a variant of the pointwise ergodic theorem of Birkhoff, which asserts that if one has a measure-preserving shift map $T: X \rightarrow X$ on a probability space $X = (X,\mu)$, then for any $f \in L^1(X)$, the averages $\frac{1}{N} \sum_{n=1}^N f \circ T^n$ converge pointwise almost everywhere. (In the important case when the shift map $T$ is ergodic, the pointwise limit is simply the mean $\int_X f\ d\mu$ of the original function $f$.)
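As a concrete illustration of the Birkhoff theorem in the ergodic case, here is a small Python sketch. The choice of system is an assumption made purely for illustration: the irrational rotation $x \mapsto x + \alpha \hbox{ mod } 1$ on the unit interval with Lebesgue measure, and the test function $f(y) = y$, whose mean is $1/2$. The averages should drift towards $1/2$ as $N$ grows.

```python
import math

def birkhoff_average(f, x, alpha, N):
    """(1/N) * sum_{n=1}^{N} f(T^n x) for the rotation T x = x + alpha mod 1."""
    total = 0.0
    for n in range(1, N + 1):
        total += f((x + n * alpha) % 1.0)
    return total / N

alpha = math.sqrt(2) - 1   # irrational rotation number, so the rotation is ergodic

def identity(y):
    return y               # the mean of this function over [0, 1) is 1/2

for N in (10, 1000, 100_000):
    print(N, birkhoff_average(identity, 0.3, alpha, N))
```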
The pointwise ergodic theorem can be extended to measure-preserving actions of other amenable groups, if one uses a suitably “tempered” Følner sequence of averages; see this paper of Lindenstrauss for more details. (I also wrote up some notes on that paper here, back in 2006 before I had started this blog.) But the arguments used to handle the amenable case break down completely for non-amenable groups, and in particular for the free non-abelian group $F_2$ on two generators.
Nevo and Stein studied this problem and obtained a number of pointwise ergodic theorems for $F_2$-actions $(T_g)_{g \in F_2}$ on probability spaces $X$. For instance, for the spherical averaging operators
$A_n f := \frac{1}{4 \times 3^{n-1}} \sum_{g \in F_2: |g| = n} f \circ T_g^{-1}$
(where $|g|$ denotes the length of the reduced word that forms $g$), they showed that $A_{2n} f$ converged pointwise almost everywhere provided that $f$ was in $L^p(X)$ for some $p > 1$. (The need to restrict to spheres of even radius can be seen by considering the action of $F_2$ on the two-element set $\{0,1\}$ in which both generators of $F_2$ act by interchanging the elements, in which case $A_n f$ is determined by the parity of $n$.) This result was reproven with a different and simpler proof by Bufetov, who also managed to relax the condition $f \in L^p(X)$ to the weaker condition $f \in L \log L(X)$.
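The parity obstruction in the two-element example is easy to see by brute force. The following Python sketch enumerates the reduced words of length $n$ in the free group on two generators (there are $4 \times 3^{n-1}$ of them for $n \geq 1$) and averages a function on $\{0,1\}$ over the corresponding images of a base point. It assumes the spherical average is normalised by the size of the sphere; the letters `a`, `b`, `A`, `B` (with capitals for inverses) and the function names are illustrative choices.

```python
from fractions import Fraction

INVERSE = {"a": "A", "A": "a", "b": "B", "B": "b"}  # A = a^{-1}, B = b^{-1}

def sphere(n):
    """All reduced words of length n in the free group on a, b."""
    words = [""]
    for _ in range(n):
        # Append any letter that does not cancel the last letter of the word.
        words = [w + g for w in words for g in INVERSE
                 if not w or INVERSE[w[-1]] != g]
    return words

def spherical_average(f, x, n):
    """Average of f over the images of x under all words of length n, for the
    action on {0, 1} in which every generator swaps the two points."""
    words = sphere(n)
    total = sum(f[(x + len(w)) % 2] for w in words)  # each letter flips the point
    return Fraction(total, len(words))

f = {0: 1, 1: 0}  # indicator function of the point 0
for n in range(1, 6):
    print(n, len(sphere(n)), spherical_average(f, 0, n))
```

Since every word of length $n$ flips the base point exactly $n$ times, the average alternates between $f(0)$ and $f(1)$, so the full sequence of spherical averages cannot converge for this $f$.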
The question remained open as to whether the pointwise ergodic theorem for $F_2$-actions held if one only assumed that $f$ was in $L^1(X)$. Nevo and Stein were able to establish this for the Cesàro averages $\frac{1}{N} \sum_{n=1}^N A_n$, but not for $A_n$ itself. About six years ago, Assaf Naor and I tried our hand at this problem, and were able to show an associated maximal inequality on $\ell^1(F_2)$, but due to the non-amenability of $F_2$, this inequality did not transfer to $L^1(X)$ and did not have any direct impact on this question, despite a fair amount of effort on our part to attack it.
Inspired by some recent conversations with Lewis Bowen, I returned to this problem. This time around, I tried to construct a counterexample to the $L^1$ pointwise ergodic theorem – something Assaf and I had not seriously attempted to do (perhaps due to being a bit too enamoured of our $\ell^1(F_2)$ maximal inequality). I knew of an existing counterexample of Ornstein regarding a failure of an $L^1$ ergodic theorem for iterates $P^n$ of a self-adjoint Markov operator – in fact, I had written some notes on this example back in 2007. Upon revisiting my notes, I soon discovered that the Ornstein construction was adaptable to the $F_2$ setting, thus settling the problem in the negative:
Theorem 1 (Failure of $L^1$ pointwise ergodic theorem) There exists a measure-preserving $F_2$-action on a probability space $X$ and a non-negative function $f \in L^1(X)$ such that $\sup_n A_n f(x) = +\infty$ for almost every $x \in X$.
To describe the proof of this theorem, let me first briefly sketch the main ideas of Ornstein’s construction, which gave an example of a self-adjoint Markov operator on a probability space
and a non-negative
such that
for almost every
. By some standard manipulations, it suffices to show that for any given
and
, there exists a self-adjoint Markov operator
on a probability space
and a non-negative
with
, such that
on a set of measure at least
. Actually, it will be convenient to replace the Markov chain
with an ancient Markov chain
– that is to say, a sequence of non-negative functions
for both positive and negative
, such that
for all
. The purpose of requiring the Markov chain to be ancient (that is, to extend infinitely far back in time) is to allow for the Markov chain to be shifted arbitrarily in time, which is key to Ornstein’s construction. (Technically, Ornstein’s original argument only uses functions that go back to a large negative time, rather than being infinitely ancient, but I will gloss over this point for the sake of discussion, as it turns out that the
version of the argument can be run using infinitely ancient chains.)
For any , let
denote the claim that for any
, there exists an ancient Markov chain
with
such that
on a set of measure at least
. Clearly
holds since we can just take
for all
. Our objective is to show that
holds for arbitrarily small
. The heart of Ornstein’s argument is then the implication
for any , which upon iteration quickly gives the desired claim.
Let’s see informally how (1) works. By hypothesis, and ignoring epsilons, we can find an ancient Markov chain on some probability space
of total mass
, such that
attains the value of
or greater almost everywhere. Assuming that the Markov process is irreducible, the
will eventually converge as
to the constant value of
, in particular its final state will essentially stay above
(up to small errors).
Now suppose we duplicate the Markov process by replacing with a double copy
(giving
the uniform probability measure), and using the disjoint sum of the Markov operators on
and
as the propagator, so that there is no interaction between the two components of this new system. Then the functions
form an ancient Markov chain of mass at most
that lives solely in the first half
of this copy, and
attains the value of
or greater on almost all of the first half
, but is zero on the second half. The final state of
will be to stay above
in the first half
, but be zero on the second half.
Now we modify the above example by allowing an infinitesimal amount of interaction between the two halves ,
of the system (I mentally think of
and
as two identical boxes that a particle can bounce around in, and now we wish to connect the boxes by a tiny tube). The precise way in which this interaction is inserted is not terribly important so long as the new Markov process is irreducible. Once one does so, then the ancient Markov chain
in the previous example gets replaced by a slightly different ancient Markov chain
which is more or less identical with
for negative times
, or for bounded positive times
, but for very large values of
the final state is now constant across the entire state space
, and will stay above
on this space.
Finally, we consider an ancient Markov chain which is basically of the form
for some large parameter and for all
(the approximation becomes increasingly inaccurate for
much larger than
, but never mind this for now). This is basically two copies of the original Markov process in separate, barely interacting state spaces
, but with the second copy delayed by a large time delay
, and also attenuated in amplitude by a factor of
. The total mass of this process is now
. Because of the
component of
, we see that
basically attains the value of
or greater on the first half
. On the second half
, we work with times
close to
. If
is large enough,
would have averaged out to about
at such times, but the
component can get as large as
here. Summing (and continuing to ignore various epsilon losses), we see that
can get as large as
on almost all of the second half of
. This concludes the rough sketch of how one establishes the implication (1).
It was observed by Bufetov that the spherical averages for a free group action can be lifted up to become powers
of a Markov operator, basically by randomly assigning a “velocity vector”
to one’s base point
and then applying the Markov process that moves
along that velocity vector (and then randomly changing the velocity vector at each time step, subject to the “reduced word” condition that the velocity never flips from
to
). Thus the spherical average problem has a Markov operator interpretation, which opens the door to adapting the Ornstein construction to the setting of
systems. This turns out to be doable after a certain amount of technical artifice; the main thing is to work with
-measure preserving systems that admit ancient Markov chains that are initially supported in a very small region in the “interior” of the state space, so that one can couple such systems to each other “at the boundary” in the fashion needed to establish the analogue of (1) without disrupting the ancient dynamics of such chains. The initial such system (used to establish the base case
) comes from basically considering the action of
on a (suitably renormalised) “infinitely large ball” in the Cayley graph, after suitably gluing together the boundary of this ball to complete the action. The ancient Markov chain associated to this system starts at the centre of this infinitely large ball at infinite negative time
, and only reaches the boundary of this ball at the time
.
The lonely runner conjecture is the following open problem:
Conjecture 1 Suppose one has $n \geq 1$ runners on the unit circle ${\bf R}/{\bf Z}$, all starting at the origin and moving at different speeds. Then for each runner, there is at least one time $t$ for which that runner is “lonely” in the sense that it is separated by a distance at least $1/n$ from all other runners.
One can normalise the speed of the lonely runner to be zero, at which point the conjecture can be reformulated (after replacing $n$ by $n+1$) as follows:
Conjecture 2 Let $v_1,\dots,v_n$ be non-zero real numbers for some $n \geq 1$. Then there exists a real number $t$ such that the numbers $t v_1,\dots,t v_n$ are all a distance at least $\frac{1}{n+1}$ from the integers, thus
$\|t v_1\|_{{\bf R}/{\bf Z}}, \dots, \|t v_n\|_{{\bf R}/{\bf Z}} \geq \frac{1}{n+1}$
where $\|x\|_{{\bf R}/{\bf Z}}$ denotes the distance of $x$ to the nearest integer.
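This conjecture has been proven for $n \leq 6$, but remains open for larger $n$. The bound $\frac{1}{n+1}$ is optimal, as can be seen by looking at the case $v_i = i$ and applying the Dirichlet approximation theorem. Note that for each non-zero $v_i$, the set $\{ t \in {\bf R}: \|t v_i\|_{{\bf R}/{\bf Z}} \leq \delta \}$ has (Banach) density $2\delta$ for any $0 < \delta < 1/2$, and from this and the union bound we can easily find $t \in {\bf R}$ for which
$\|t v_1\|_{{\bf R}/{\bf Z}}, \dots, \|t v_n\|_{{\bf R}/{\bf Z}} \geq \frac{1}{2n} - \varepsilon$
for any $\varepsilon > 0$, but it has proven to be quite challenging to remove the factor of $2$ to increase $\frac{1}{2n}$ to $\frac{1}{n+1}$. (As far as I know, even improving $\frac{1}{2n}$ to $\frac{1+c}{2n}$ for some absolute constant $c > 0$ and sufficiently large $n$ remains open.)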
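For integer speeds one can compute the best possible “loneliness” exactly, which is a handy way to experiment with the conjecture. The Python sketch below makes the following assumptions. The speeds are positive integers, so that it suffices to consider times $t \in [0,1)$. The quantity of interest is the maximum over $t$ of the minimum distance from $t v_i$ to the nearest integer. Since this minimum is piecewise linear in $t$, its maximum is attained at a time where some $t v_i$ is a half-integer or where two of the distances coincide, that is to say at a rational $t$ with denominator $2v_i$, $v_i + v_j$ or $|v_i - v_j|$. The function names are illustrative only.

```python
from fractions import Fraction
from itertools import combinations

def dist_to_int(x):
    """Distance from the rational x to the nearest integer."""
    r = x % 1
    return min(r, 1 - r)

def max_loneliness(speeds):
    """Exact maximum over t in [0, 1) of min_i ||t v_i|| for positive integer speeds."""
    denominators = {2 * v for v in speeds}
    for v, w in combinations(speeds, 2):
        denominators.add(v + w)
        if v != w:
            denominators.add(abs(v - w))
    best = Fraction(0)
    for q in denominators:
        for a in range(q):
            t = Fraction(a, q)  # candidate breakpoint of the piecewise linear function
            best = max(best, min(dist_to_int(t * v) for v in speeds))
    return best

for speeds in [(1,), (1, 2), (1, 2, 3), (1, 3, 4, 7), (1, 2, 3, 4, 5, 6)]:
    n = len(speeds)
    print(speeds, max_loneliness(speeds), Fraction(1, n + 1))
```

For the extremal speeds $v_i = i$ the computed maximum should coincide with $\frac{1}{n+1}$, in line with the Dirichlet approximation theorem; for other speeds the search can only exhibit a value at least this large in the range of $n$ where the conjecture is known.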
The speeds in the above conjecture are arbitrary non-zero reals, but it has been known for some time that one can reduce without loss of generality to the case when the $v_1,\dots,v_n$ are rationals, or equivalently (by scaling) to the case where they are integers; see e.g. Section 4 of this paper of Bohman, Holzman, and Kleitman.
In this post I would like to remark on a slight refinement of this reduction, in which the speeds are integers of bounded size, where the bound depends on $n$. More precisely:
Proposition 3 In order to prove the lonely runner conjecture, it suffices to do so under the additional assumption that the $v_1,\dots,v_n$ are integers of size at most $n^{Cn^2}$, where $C$ is an (explicitly computable) absolute constant. (More precisely: if this restricted version of the lonely runner conjecture is true for all $n \leq n_0$, then the original version of the conjecture is also true for all $n \leq n_0$.)
In principle, this proposition allows one to verify the lonely runner conjecture for a given $n$ in finite time; however the number of cases to check with this proposition grows faster than exponentially in $n$, and so this is unfortunately not a feasible approach to verifying the lonely runner conjecture for more values of $n$ than currently known.
One of the key tools needed to prove this proposition is the following additive combinatorics result. Recall that a generalised arithmetic progression (or GAP) in the reals
is a set of the form
for some and
; the quantity
is called the rank of the progression. If
, the progression
is said to be
-proper if the sums
with
for
are all distinct. We have
Lemma 4 (Progressions lie inside proper progressions) Let
be a GAP of rank
in the reals, and let
. Then
is contained in a
-proper GAP
of rank at most
, with
Proof: See Theorem 2.1 of this paper of Bilu. (Very similar results can also be found in Theorem 3.40 of my book with Van Vu, or Theorem 1.10 of this paper of mine with Van Vu.)
Now let , and assume inductively that the lonely runner conjecture has been proven for all smaller values of
, as well as for the current value of
in the case that
are integers of size at most
for some sufficiently large
. We will show that the lonely runner conjecture holds in general for this choice of
.
Let $v_1,\dots,v_n$ be non-zero real numbers. Let
be a large absolute constant to be chosen later. From the above lemma applied to the GAP
, one can find a
-proper GAP
of rank at most
containing
such that
in particular if
is large enough depending on
.
We write
for some ,
, and
. We thus have
for
, where
is the linear map
and
are non-zero and lie in the box
.
We now need an elementary lemma that allows us to create a “collision” between two of the via a linear projection, without making any of the
collide with the origin:
Lemma 5 Let
be non-zero vectors that are not all collinear with the origin. Then, after replacing one or more of the
with their negatives
if necessary, there exists a pair
such that
, and such that none of the
is a scalar multiple of
.
Proof: We may assume that , since the
case is vacuous. Applying a generic linear projection to
(which does not affect collinearity, or the property that a given
is a scalar multiple of
), we may then reduce to the case
.
By a rotation and relabeling, we may assume that lies on the negative
-axis; by flipping signs as necessary we may then assume that all of the
lie in the closed right half-plane. As the
are not all collinear with the origin, one of the
lies off of the
-axis; by relabeling, we may assume that
lies off of the
-axis and makes a minimal angle with the
-axis. Then the angle of
with the
-axis is non-zero but smaller than any non-zero angle that any of the
make with this axis, and so none of the
are a scalar multiple of
, and the claim follows.
We now return to the proof of the proposition. If the are all collinear with the origin, then
lie in a one-dimensional arithmetic progression
, and then by rescaling we may take the
to be integers of magnitude at most
, at which point we are done by hypothesis. Thus, we may assume that the
are not all collinear with the origin, and so by the above lemma and relabeling we may assume that
is non-zero, and that none of the
are scalar multiples of
.
with for
; by relabeling we may assume without loss of generality that
is non-zero, and furthermore that
where is a natural number and
have no common factor.
We now define a variant of
by the map
where the are real numbers that are linearly independent over
, whose precise value will not be of importance in our argument. This is a linear map with the property that
, so that
consists of at most
distinct real numbers, which are non-zero since none of the
are scalar multiples of
, and the
are linearly independent over
. As we are assuming inductively that the lonely runner conjecture holds for
, we conclude (after deleting duplicates) that there exists at least one real number
such that
We would like to “approximate” by
to then conclude that there is at least one real number
such that
It turns out that we can do this by a Fourier-analytic argument taking advantage of the -proper nature of
. Firstly, we see from the Dirichlet approximation theorem that one has
for a set of reals of (Banach) density
. Thus, by the triangle inequality, we have
for a set of reals of density
.
Applying a smooth Fourier multiplier of Littlewood-Paley type, one can find a trigonometric polynomial
which takes values in , is
for
, and is no larger than
for
. We then have
where denotes the mean value of a quasiperiodic function
on the reals
. We expand the left-hand side out as
From the genericity of , we see that the constraint
occurs if and only if is a scalar multiple of
, or equivalently (by (1), (2)) an integer multiple of
. Thus
and is the Dirichlet series
By Fourier expansion and writing , we may write (4) as
The support of the implies that
. Because of the
-properness of
, we see (for
large enough) that the equation
and conversely that (7) implies that (6) holds for some with
. From (3) we thus have
In particular, there exists a such that
Since is bounded in magnitude by
, and
is bounded by
, we thus have
for each , which by the size properties of
implies that
for all
, giving the lonely runner conjecture for
.
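Because of Euler’s identity $e^{\pi i} + 1 = 0$, the complex exponential is not injective: $e^{z + 2\pi i k} = e^z$ for any complex $z$ and integer $k$. As such, the complex logarithm $z \mapsto \log z$ is not well-defined as a single-valued function from ${\bf C} \backslash \{0\}$ to ${\bf C}$. However, after making a branch cut, one can create a branch of the logarithm which is single-valued. For instance, after removing the negative real axis $(-\infty,0]$, one has the standard branch $\hbox{Log}: {\bf C} \backslash (-\infty,0] \rightarrow \{ z \in {\bf C}: |\hbox{Im}(z)| < \pi \}$ of the logarithm, with $\hbox{Log}(z)$ defined as the unique choice of the complex logarithm of $z$ whose imaginary part has magnitude strictly less than $\pi$. This particular branch has a number of useful additional properties:
- The standard branch $\hbox{Log}$ is holomorphic on its domain ${\bf C} \backslash (-\infty,0]$.
- One has $\hbox{Log}(\overline{z}) = \overline{\hbox{Log}(z)}$ for all $z$ in the domain ${\bf C} \backslash (-\infty,0]$. In particular, if $z \in {\bf C} \backslash (-\infty,0]$ is real, then $\hbox{Log}(z)$ is real.
- One has $\hbox{Log}(z^{-1}) = -\hbox{Log}(z)$ for all $z$ in the domain ${\bf C} \backslash (-\infty,0]$.
One can then also use the standard branch of the logarithm to create standard branches of other multi-valued functions, for instance creating a standard branch of the square root function. We caution however that the identity $\hbox{Log}(zw) = \hbox{Log}(z) + \hbox{Log}(w)$ can fail for the standard branch (or indeed for any branch of the logarithm).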
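These scalar properties, and the failure of additivity, are easy to observe numerically. The following Python sketch uses the standard library routine `cmath.log`, which returns the principal value of the logarithm; the wrapper `Log` (a name chosen here for illustration) simply rejects inputs on the branch cut $(-\infty,0]$. The test points are arbitrary choices.

```python
import cmath
import math

def Log(z):
    """Standard branch of the logarithm on C minus the cut (-inf, 0]."""
    z = complex(z)
    if z.imag == 0 and z.real <= 0:
        raise ValueError("z lies on the branch cut (-inf, 0]")
    return cmath.log(z)  # principal value of the complex logarithm

z = complex(-1, 1)  # argument 3*pi/4
w = complex(-1, 1)

# Conjugation and inversion are respected by the standard branch:
print(Log(z.conjugate()), Log(z).conjugate())
print(Log(1 / z), -Log(z))

# Additivity fails here: z * w = -2i has argument -pi/2, not 3*pi/2.
print(Log(z * w), Log(z) + Log(w))
```

In this example the two sides of the would-be identity differ by exactly $2\pi i$, which is the usual manifestation of the non-injectivity of the exponential.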
One can extend this standard branch of the logarithm to complex matrices, or (equivalently) to linear transformations
on an
-dimensional complex vector space
, provided that the spectrum of that matrix or transformation avoids the branch cut
. Indeed, from the spectral theorem one can decompose any such
as the direct sum of operators
on the non-trivial generalised eigenspaces
of
, where
ranges in the spectrum of
. For each component
of
, we define
where is the Taylor expansion of
at
; as
is nilpotent, only finitely many terms in this Taylor expansion are required. The logarithm
is then defined as the direct sum of the
.
The matrix standard branch of the logarithm has many pleasant and easily verified properties (often inherited from their scalar counterparts), whenever has no spectrum in
:
- (i) We have
.
- (ii) If
and
have no spectrum in
, then
.
- (iii) If
has spectrum in a closed disk
in
, then
, where
is the Taylor series of
around
(which is absolutely convergent in
).
- (iv)
depends holomorphically on
. (Easily established from (ii), (iii), after covering the spectrum of
by disjoint disks; alternatively, one can use the Cauchy integral representation
for a contour
in the domain enclosing the spectrum of
.) In particular, the standard branch of the matrix logarithm is smooth.
- (v) If
is any invertible linear or antilinear map, then
. In particular, the standard branch of the logarithm commutes with matrix conjugations; and if
is real with respect to a complex conjugation operation on
(that is to say, an antilinear involution), then
is real also.
- (vi) If
denotes the transpose of
(with
the complex dual of
), then
. Similarly, if
denotes the adjoint of
(with
the complex conjugate of
, i.e.
with the conjugated multiplication map
), then
.
- (vii) One has
.
- (viii) If
denotes the spectrum of
, then
.
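Some of these properties can be tested numerically in a simple special case. The Python sketch below makes several simplifying assumptions that are not part of the discussion above: it works only with $2 \times 2$ real matrices whose spectrum lies close to $1$, and it computes the logarithm from the Taylor series $\sum_{n \geq 1} (-1)^{n+1} (T-1)^n/n$ of the standard branch around $1$, in the spirit of property (iii). It then checks that exponentiating recovers the matrix, and that the logarithm of the inverse is the negative of the logarithm. All helper names are illustrative.

```python
I2 = [[1.0, 0.0], [0.0, 1.0]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def matadd(A, B, s=1.0):
    """Return A + s * B."""
    return [[A[i][j] + s * B[i][j] for j in range(2)] for i in range(2)]

def log_series(A, terms=200):
    """Taylor series of Log around 1; assumes the spectrum of A is close to 1."""
    N = matadd(A, I2, -1.0)  # N = A - I
    power = I2
    result = [[0.0, 0.0], [0.0, 0.0]]
    for n in range(1, terms + 1):
        power = matmul(power, N)
        result = matadd(result, power, (-1.0) ** (n + 1) / n)
    return result

def exp_series(X, terms=40):
    """Matrix exponential via its power series."""
    term, result = I2, I2
    for n in range(1, terms + 1):
        term = [[v / n for v in row] for row in matmul(term, X)]
        result = matadd(result, term)
    return result

def inverse(A):
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / det, -A[0][1] / det], [-A[1][0] / det, A[0][0] / det]]

def max_diff(A, B):
    return max(abs(A[i][j] - B[i][j]) for i in range(2) for j in range(2))

A = [[1.2, 0.1], [0.0, 0.9]]  # eigenvalues 1.2 and 0.9, away from the cut
L = log_series(A)
print(max_diff(exp_series(L), A))                        # exp(Log A) versus A
print(max_diff(log_series(inverse(A)), matadd(L, L, -2.0)))  # Log(A^{-1}) versus -Log A
```

For matrices with spectrum far from $1$ one would need the full construction via generalised eigenspaces (or the contour integral) rather than a single power series.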
As a quick application of the standard branch of the matrix logarithm, we have
Proposition 1 Let
be one of the following matrix groups:
,
,
,
,
, or
, where
is a non-degenerate real quadratic form (so
is isomorphic to a (possibly indefinite) orthogonal group
for some
). Then any element
of
whose spectrum avoids
is exponential, that is to say
for some
in the Lie algebra
of
.
Proof: We just prove this for , as the other cases are similar (or a bit simpler). If
, then (viewing
as a complex-linear map on
, and using the complex bilinear form associated to
to identify
with its complex dual
),
is real and
. By the properties (v), (vi), (vii) of the standard branch of the matrix logarithm, we conclude that
is real and
, and so
lies in the Lie algebra
, and the claim now follows from (i).
Exercise 2 Show that
is not exponential in
if
. Thus we see that the branch cut in the above proposition is largely necessary. See this paper of Djokovic for a more complete description of the image of the exponential map in classical groups, as well as this previous blog post for some more discussion of the surjectivity (or lack thereof) of the exponential map in Lie groups.
For a slightly less quick application of the standard branch, we have the following result (recently worked out in the answers to this MathOverflow question):
Proposition 3 Let
be an element of the split orthogonal group
which lies in the connected component of the identity. Then
.
The requirement that lie in the identity component is necessary, as the counterexample
for
shows.
Proof: We think of as a (real) linear transformation on
, and write
for the quadratic form associated to
, so that
. We can split
, where
is the sum of all the generalised eigenspaces corresponding to eigenvalues in
, and
is the sum of all the remaining eigenspaces. Since
and
are real,
are real (i.e. complex-conjugation invariant) also. For
, the restriction
of
to
then lies in
, where
is the restriction of
to
, and
The spectrum of consists of positive reals, as well as complex pairs
(with equal multiplicity), so
. From the preceding proposition we have
for some
; this will be important later.
It remains to show that . If
has spectrum at
then we are done, so we may assume that
has spectrum only at
(being invertible,
has no spectrum at
). We split
, where
correspond to the portions of the spectrum in
,
; these are real,
-invariant spaces. We observe that if
are generalised eigenspaces of
with
, then
are orthogonal with respect to the (complex-bilinear) inner product
associated with
; this is easiest to see first for the actual eigenspaces (since
for all
), and the extension to generalised eigenvectors then follows from a routine induction. From this we see that
is orthogonal to
, and
and
are null spaces, which by the non-degeneracy of
(and hence of the restriction
of
to
) forces
to have the same dimension as
, indeed
now gives an identification of
with
. If we let
be the restrictions of
to
, we thus identify
with
, since
lies in
; in particular
is invertible. Thus
and so it suffices to show that .
At this point we need to use the hypothesis that lies in the identity component of
. This implies (by a continuity argument) that the restriction of
to any maximal-dimensional positive subspace has positive determinant (since such a restriction cannot be singular, as this would mean that
a positive norm vector would map to a non-positive norm vector). Now, as
have equal dimension,
has a balanced signature, so
does also. Since
,
already lies in the identity component of
, and so has positive determinant on any maximal-dimensional positive subspace of
. We conclude that
has positive determinant on any maximal-dimensional positive subspace of
.
We choose a complex basis of , to identify
with
, which has already been identified with
. (In coordinates,
are now both of the form
, and
for
.) Then
becomes a maximal positive subspace of
, and the restriction of
to this subspace is conjugate to
, so that
But since and
is positive definite, so
as required.