You are currently browsing the monthly archive for February 2012.
In the last three notes, we discussed the Bourgain-Gamburd expansion machine and two of its three ingredients, namely quasirandomness and product theorems, leaving only the non-concentration ingredient to discuss. We can summarise the results of the last three notes, in the case of fields of prime order, as the following theorem.
Theorem 1 (Non-concentration implies expansion in
) Let
be a prime, let
, and let
be a symmetric set of elements in
of cardinality
not containing the identity. Write
, and suppose that one has the non-concentration property
for some
and some even integer
. Then
is a two-sided
-expander for some
depending only on
.
Proof: From (1) we see that is not supported in any proper subgroup
of
, which implies that
generates
. The claim now follows from the Bourgain-Gamburd expansion machine (Theorem 2 of Notes 4), the product theorem (Theorem 1 of Notes 5), and quasirandomness (Exercise 8 of Notes 3).
Remark 1 The same argument also works if we replace
by the field
of order
for some bounded
. However, there is a difficulty in the regime when
is unbounded, because the quasirandomness property becomes too weak for the Bourgain-Gamburd expansion machine to be directly applicable. On theother hand, the above type of theorem was generalised to the setting of cyclic groups
with
square-free by Varju, to arbitrary
by Bourgain and Varju, and to more general algebraic groups than
and square-free
by Salehi Golsefidy and Varju. It may be that some modification of the proof techniques in these papers may also be able to handle the field case
with unbounded
.
It thus remains to construct tools that can establish the non-concentration property (1). The situation is particularly simple in , as we have a good understanding of the subgroups of that group. Indeed, from Theorem 14 from Notes 5, we obtain the following corollary to Theorem 1:
Corollary 2 (Non-concentration implies expansion in
) Let
be a prime, and let
be a symmetric set of elements in
of cardinality
not containing the identity. Write
, and suppose that one has the non-concentration property
for some
and some even integer
, where
ranges over all Borel subgroups of
. Then, if
is sufficiently large depending on
,
is a two-sided
-expander for some
depending only on
.
It turns out (2) can be verified in many cases by exploiting the solvable nature of the Borel subgroups . We give two examples of this in these notes. The first result, due to Bourgain and Gamburd (with earlier partial results by Gamburd and by Shalom) generalises Selberg’s expander construction to the case when
generates a thin subgroup of
:
Theorem 3 (Expansion in thin subgroups) Let
be a symmetric subset of
not containing the identity, and suppose that the group
generated by
is not virtually solvable. Then as
ranges over all sufficiently large primes, the Cayley graphs
form a two-sided expander family, where
is the usual projection.
Remark 2 One corollary of Theorem 3 (or of the non-concentration estimate (3) below) is that
generates
for all sufficiently large
, if
is not virtually solvable. This is a special case of a much more general result, known as the strong approximation theorem, although this is certainly not the most direct way to prove such a theorem. Conversely, the strong approximation property is used in generalisations of this result to higher rank groups than
.
Exercise 1 In the converse direction, if
is virtually solvable, show that for sufficiently large
,
fails to generate
. (Hint: use Theorem 14 from Notes 5 to prevent
from having bounded index solvable subgroups.)
Exercise 2 (Lubotzsky’s 1-2-3 problem) Let
.
- (i) Show that
generates a free subgroup of
. (Hint: use a ping-pong argument, as in Exercise 23 of Notes 2.)
- (ii) Show that if
are two distinct elements of the sector
, then there os no element
for which
. (Hint: this is another ping-pong argument.) Conclude that
has infinite index in
. (Contrast this with the situation in which the
coefficients in
are replaced by
or
, in which case
is either all of
, or a finite index subgroup, as demonstrated in Exercise 23 of Notes 2).
- (iii) Show that
for sufficiently large primes
form a two-sided expander family.
Remark 3 Theorem 3 has been generalised to arbitrary linear groups, and with
replaced by
for square-free
; see this paper of Salehi Golsefidy and Varju. In this more general setting, the condition of virtual solvability must be replaced by the condition that the connected component of the Zariski closure of
is perfect. An effective version of Theorem 3 (with completely explicit constants) was recently obtained by Kowalski.
The second example concerns Cayley graphs constructed using random elements of .
Theorem 4 (Random generators expand) Let
be a prime, and let
be two elements of
chosen uniformly at random. Then with probability
,
is a two-sided
-expander for some absolute constant
.
Remark 4 As with Theorem 3, Theorem 4 has also been extended to a number of other groups, such as the Suzuki groups (in this paper of Breuillard, Green, and Tao), and more generally to finite simple groups of Lie type of bounded rank (in forthcoming work of Breuillard, Green, Guralnick, and Tao). There are a number of other constructions of expanding Cayley graphs in such groups (and in other interesting groups, such as the alternating groups) beyond those discussed in these notes; see this recent survey of Lubotzky for further discussion. It has been conjectured by Lubotzky and Weiss that any pair
of (say)
that generates the group, is a two-sided
-expander for an absolute constant
: in the case of
, this has been established for a density one set of primes by Breuillard and Gamburd.
— 1. Expansion in thin subgroups —
We now prove Theorem 3. The first observation is that the expansion property is monotone in the group :
Exercise 3 Let
be symmetric subsets of
not containing the identity, such that
. Suppose that
is a two-sided expander family for sufficiently large primes
. Show that
is also a two-sided expander family.
As a consequence, Theorem 3 follows from the following two statments:
Theorem 5 (Tits alternative) Let
be a group. Then exactly one of the following statements holds:
- (i)
is virtually solvable.
- (ii)
contains a copy of the free group
of two generators as a subgroup.
Theorem 6 (Expansion in free groups) Let
be generators of a free subgroup of
. Then as
ranges over all sufficiently large primes, the Cayley graphs
form a two-sided expander family.
Theorem 5 is a special case of the famous Tits alternative, which among other things allows one to replace by
for any
and any field
of characteristic zero (and fields of positive characteristic are also allowed, if one adds the requirement that
be finitely generated). We will not prove the full Tits alternative here, but instead just give an ad hoc proof of the special case in Theorem 5 in the following exercise.
Exercise 4 Given any matrix
, the singular values are
and
, and we can apply the singular value decomposition to decompose
where
and
are orthonormal bases. (When
, these bases are uniquely determined up to phase rotation.) We let
be the projection of
to the projective complex plane, and similarly define
.
Let
be a subgroup of
. Call a pair
a limit point of
if there exists a sequence
with
and
.
- (i) Show that if
is infinite, then there is at least one limit point.
- (ii) Show that if
is a limit point, then so is
.
- (iii) Show that if there are two limit points
with
, then there exist
that generate a free group. (Hint: Choose
close to
and
close to
, and consider the action of
and
on
, and specifically on small neighbourhoods of
, and set up a ping-pong type situation.)
- (iv) Show that if
is hyperbolic (i.e. it has an eigenvalue greater than 1), with eigenvectors
, then the projectivisations
of
form a limit point. Similarly, if
is regular parabolic (i.e. it has an eigenvalue at 1, but is not the identity) with eigenvector
, show that
is a limit point.
- (v) Show that if
has no free subgroup of two generators, then all hyperbolic and regular parabolic elements of
have a common eigenvector. Conclude that all such elements lie in a solvable subgroup of
.
- (vi) Show that if an element
is neither hyperbolic nor regular parabolic, and is not a multiple of the identity, then
is conjugate to a rotation by
(in particular,
).
- (vii) Establish Theorem 5. (Hint: show that two square roots of
in
cannot multiply to another square root of
.)
Now we prove Theorem 6. Let be a free subgroup of
generated by two generators
. Let
be the probability measure generating a random walk on
, thus
is the corresponding generator on
. By Corollary 2, it thus suffices to show that
for all sufficiently large , some absolute constant
, and some even
(depending on
, of course), where
ranges over Borel subgroups.
As is a homomorphism, one has
and so it suffices to show that
To deal with the supremum here, we will use an argument of Bourgain and Gamburd, taking advantage of the fact that all Borel groups of obey a common group law, the point being that free groups such as
obey such laws only very rarely. More precisely, we use the fact that the Borel groups are solvable of derived length two; in particular we have
for all . Now,
is supported on matrices in
whose coefficients have size
(where we allow the implied constants to depend on the choice of generators
), and so
is supported on matrices in
whose coefficients also have size
. If
is less than a sufficiently small multiple of
, these coefficients are then less than
(say). As such, if
lie in the support of
and their projections
obey the word law (4) in
, then the original matrices
obey the word law (4) in
. (This lifting of identities from the characteristic
setting of
to the characteristic
setting of
is a simple example of the “Lefschetz principle”.)
To summarise, if we let be the set of all elements of
that lie in the support of
, then (4) holds for all
. This severely limits the size of
to only be of polynomial size, rather than exponential size:
Proposition 7 Let
be a subset of the support of
(thus,
consists of words in
of length
) such that the law (4) holds for all
. Then
.
The proof of this proposition is laid out in the exercise below.
Exercise 5 Let
be a free group generated by two generators
. Let
be the set of all words of length at most
in
.
- (i) Show that if
commute, then
lie in the same cyclic group, thus
for some
and
.
- (ii) Show that if
, there are at most
elements of
that commute with
.
- (iii) Show that if
, there are at most
elements
of
with
.
- (iv) Prove Proposition 7.
Now we can conclude the proof of Theorem 3:
Exercise 6 Let
be a free group generated by two generators
.
- (i) Show that
for some absolute constant
. (For much more precise information on
, see this paper of Kesten.)
- (ii) Conclude the proof of Theorem 3.
— 2. Random generators expand —
We now prove Theorem 4. Let be the free group on two formal generators
, and let
be the generator of the random walk. For any word
and any
in a group
, let
be the element of
formed by substituting
for
respectively in the word
; thus
can be viewed as a map
for any group
. Observe that if
is drawn randomly using the distribution
, and
, then
is distributed according to the law
, where
. Applying Corollary 2, it suffices to show that whenever
is a large prime and
are chosen uniformly and independently at random from
, that with probability
, one has
for some absolute constant , where
ranges over all Borel subgroups of
and
is drawn from the law
for some even natural number
.
Let denote the words in
of length at most
. We may use the law (4) to obtain good bound on the supremum in (5) assuming a certain non-degeneracy property of the word evaluations
:
Exercise 7 Let
be a natural number, and suppose that
is such that
for
. Show that
for some absolute constant
, where
is drawn from the law
. (Hint: use (4) and the hypothesis to lift the problem up to
, at which point one can use Proposition 7 and Exercise 6.)
In view of this exercise, it suffices to show that with probability , one has
for all
for some
comparable to a small multiple of
. As
has
elements, it thus suffices by the union bound to show that
for some absolute constant , and any
of length less than
for some sufficiently small absolute constant
.
Let us now fix a non-identity word of length
less than
, and consider
as a function from
to
for an arbitrary field
. We can identify
with the set
. A routine induction then shows that the expression
is then a polynomial in the eight variables
of degree
and coefficients which are integers of size
. Let us then make the additional restriction to the case
, in which case we can write
and
. Then
is now a rational function of
whose numerator is a polynomial of degree
and coefficients of size
, and the denominator is a monomial of
of degree
.
We then specialise this rational function to the field . It is conceivable that when one does so, the rational function collapses to the constant polynomial
, thus
for all
with
. (For instance, this would be the case if
, by Lagrange’s theorem, if it were not for the fact that
is far too large here.) But suppose that this rational function does not collapse to the constant rational function. Applying the Schwarz-Zippel lemma (Exercise 23 from Notes 5), we then see that the set of pairs
with
and
is at most
; adding in the
and
cases, one still obtains a bound of
, which is acceptable since
and
. Thus, the only remaining case to consider is when the rational function
is identically
on
with
.
Now we perform another “Lefschetz principle” maneuvre to change the underlying field. Recall that the denominator of rational function is monomial in
, and the numerator has coefficients of size
. If
is less than
for a sufficiently small
, we conclude in particular (for
large enough) that the coefficients all have magnitude less than
. As such, the only way that this function can be identically
on
is if it is identically
on
for all
with
, and hence for
or
also by taking Zariski closures.
On the other hand, we know that for some choices of , e.g.
,
contains a copy
of the free group on two generators (see e.g. Exercise 23 of Notes 2). As such, it is not possible for any non-identity word
to be identically trivial on
. Thus this case cannot actually occur, completing the proof of (6) and hence of Theorem 4.
Remark 5 We see from the above argument that the existence of subgroups
of an algebraic group with good “independence” properties – such as that of generating a free group – can be useful in studying the expansion properties of that algebraic group, even if the field of interest in the latter is distinct from that of the former. For more complicated algebraic groups than
, in which laws such as (4) are not always available, it turns out to be useful to place further properties on the subgroup
, for instance by requiring that all non-abelian subgroups of that group be Zariski dense (a property which has been called strong density), as this turns out to be useful for preventing random walks from concentrating in proper algebraic subgroups. See this paper of Breuillard, Guralnick, Green and Tao for constructions of strongly dense free subgroups of algebraic groups and further discussion.
It has been a little over two weeks now since the protest site at thecostofknowledge.com was set up to register declarations of non-cooperation with Reed Elsevier in protest of their research publishing practices, inspired by this blog post of Tim Gowers. Awareness of the protest has certainly grown in these two weeks; the number of signatories is now well over four thousand, across a broad array of academic disciplines, and the protest has been covered by many blogs and also the mainstream media (e.g. the Guardian, the Economist, Forbes, etc.), and even by Elsevier stock analysts. (Elsevier itself released an open letter responding to the protest here.) My interpretation of events is that there was a significant amount of latent or otherwise undisclosed dissatisfaction already with the publishing practices of Elsevier (and, to a lesser extent, some other commercial academic publishers), and a desire to support alternatives such as university or society publishers, and the more recent open access journals; and that this protest (and parallel protests, such as the movement to oppose the Research Works Act) served to drive these feelings out into the open.
The statement of the protest itself, though, is rather brief, reflecting the improvised manner in which the site was created. A group of mathematicians including myself therefore decided to write and sign a more detailed explanation of why we supported this protest, giving more background and references to support our position. The 34 signatories are Scott Aaronson, Douglas N. Arnold, Artur Avila, John Baez, Folkmar Bornemann, Danny Calegari, Henry Cohn, Ingrid Daubechies, Jordan Ellenberg, Matthew Emerton, Marie Farge, David Gabai, Timothy Gowers, Ben Green, Martin Grotschel, Michael Harris, Frederic Helein, Rob Kirby, Vincent Lafforgue, Gregory F. Lawler, Randall J. LeVeque, Laszlo Lovasz, Peter J. Olver, Olof Sisask, Richard Taylor, Bernard Teissier, Burt Totaro, Lloyd N. Trefethen, Takashi Tsuboi, Marie-France Vigneras, Wendelin Werner, Amie Wilkinson, Gunter M. Ziegler, and myself. (Note that while Daubechies is current president of the International Mathematical Union, Lovasz is a past president, and Grotschel is the current secretary, they are signing this letter as individuals and not as representatives of the IMU. Similarly for Trefethen and Arnold (current and past president of SIAM).)
Of course, the 34 of us do not presume to speak for the remaining four thousand signatories to the protest, but I hope that our statement is somewhat representative of the position of many of its supporters.
Further discussion of this statement can be found at this blog post of Tim Gowers.
EDIT: I think it is appropriate to quote the following excerpt from our statement:
All mathematicians must decide for themselves whether, or to what extent, they wish to participate in the boycott. Senior mathematicians who have signed the boycott bear some responsibility towards junior colleagues who are forgoing the option of publishing in Elsevier journals, and should do their best to help minimize any negative career consequences.
Whether or not you decide to join the boycott, there are some simple actions that everyone can take, which seem to us to be uncontroversial:
- Make sure that the final versions of all your papers, particularly new ones, are freely available online, ideally both on the arXiv and on your home page.
- If you are submitting a paper and there is a choice between an expensive journal and a cheap (or free) journal of the same standard, then always submit to the cheap one.
In the previous set of notes, we saw that one could derive expansion of Cayley graphs from three ingredients: non-concentration, product theorems, and quasirandomness. Quasirandomness was discussed in Notes 3. In the current set of notes, we discuss product theorems. Roughly speaking, these theorems assert that in certain circumstances, a finite subset of a group
either exhibits expansion (in the sense that
, say, is significantly larger than
), or is somehow “close to” or “trapped” by a genuine group.
Theorem 1 (Product theorem in
) Let
, let
be a finite field, and let
be a finite subset of
. Let
be sufficiently small depending on
. Then at least one of the following statements holds:
- (Expansion) One has
.
- (Close to
) One has
.
- (Trapping)
is contained in a proper subgroup of
.
We will prove this theorem (which was proven first in the cases for fields
of prime order by Helfgott, and then for
and general
by Dinai, and finally to general
and
independently by Pyber-Szabo and by Breuillard-Green-Tao) later in this notes. A more qualitative version of this proposition was also previously obtained by Hrushovski. There are also generalisations of the product theorem of importance to number theory, in which the field
is replaced by a cyclic ring
(with
not necessarily prime); this was achieved first for
and
square-free by Bourgain, Gamburd, and Sarnak, by Varju for general
and
square-free, and finally by this paper of Bourgain and Varju for arbitrary
and
.
Exercise 1 (Diameter bound) Assuming Theorem 1, show that whenever
is a symmetric set of generators of
for some finite field
and some
, then any element of
can be expressed as the product of
elements from
. (Equivalently, if we add the identity element to
, then
for some
.) This is a special case of a conjecture of Babai and Seress, who conjectured that the bound should hold uniformly for all finite simple groups (in particular, the implied constants here should not actually depend on
. The methods used to handle the
case can handle other finite groups of Lie type of bounded rank, but at present we do not have bounds that are independent of the rank. On the other hand, a recent paper of Helfgott and Seress has almost resolved the conjecture for the permutation groups
.
A key tool to establish product theorems is an argument which is sometimes referred to as the pivot argument. To illustrate this argument, let us first discuss a much simpler (and older) theorem, essentially due to Freiman, which has a much weaker conclusion but is valid in any group :
Theorem 2 (Baby product theorem) Let
be a group, and let
be a finite non-empty subset of
. Then one of the following statements hold:
- (Expansion) One has
.
- (Close to a subgroup)
is contained in a left-coset of a group
with
.
To prove this theorem, we suppose that the first conclusion does not hold, thus . Our task is then to place
inside the left-coset of a fairly small group
.
To do this, we take a group element , and consider the intersection
. A priori, the size of this set could range from anywhere from
to
. However, we can use the hypothesis
to obtain an important dichotomy, reminiscent of the classical fact that two cosets
of a subgroup
of
are either identical or disjoint:
Proposition 3 (Dichotomy) If
, then exactly one of the following occurs:
- (Non-involved case)
is empty.
- (Involved case)
.
Proof: Suppose we are not in the pivot case, so that is non-empty. Let
be an element of
, then
and
both lie in
. The sets
and
then both lie in
. As these sets have cardinality
and lie in
, which has cardinality less than
, we conclude from the inclusion-exclusion formula that
But the left-hand side is equal to , and the claim follows.
The above proposition provides a clear separation between two types of elements : the “non-involved” elements, which have nothing to do with
(in the sense that
, and the “involved” elements, which have a lot to do with
(in the sense that
. The key point is that there is a significant “gap” between the non-involved and involved elements; there are no elements that are only “slightly involved”, in that
and
intersect a little but not a lot. It is this gap that will allow us to upgrade approximate structure to exact structure. Namely,
Proposition 4 The set
of involved elements is a finite group, and is equal to
.
Proof: It is clear that the identity element is involved, and that if
is involved then so is
(since
. Now suppose that
are both involved. Then
and
have cardinality greater than
and are both subsets of
, and so have non-empty intersection. In particular,
is non-empty, and so
is non-empty. By Proposition 3, this makes
involved. It is then clear that
is a group.
If , then
is non-empty, and so from Proposition 3
is involved. Conversely, if
is involved, then
. Thus we have
as claimed. In particular,
is finite.
Now we can quickly wrap up the proof of Theorem 2. By construction, for all
,which by double counting shows that
. As
, we see that
is contained in a right coset
of
; setting
, we conclude that
is contained in a left coset
of
.
is a conjugate of
, and so
. If
, then
and
both lie in
and have cardinality
, so must overlap; and so
. Thus
, and so
, and Theorem 2 follows.
Exercise 2 Show that the constant
in Theorem 2 cannot be replaced by any larger constant.
Exercise 3 Let
be a finite non-empty set such that
. Show that
. (Hint: If
, show that
for some
.)
Exercise 4 Let
be a finite non-empty set such that
. Show that there is a finite group
with
and a group element
such that
and
.
Below the fold, we give further examples of the pivot argument in other group-like situations, including Theorem 2 and also the “sum-product theorem” of Bourgain-Katz-Tao and Bourgain-Glibichuk-Konyagin.
Van Vu and I have just uploaded to the arXiv our paper “Random matrices: The Universality phenomenon for Wigner ensembles“. This survey is a longer version (58 pages) of a previous short survey we wrote up a few months ago. The survey focuses on recent progress in understanding the universality phenomenon for Hermitian Wigner ensembles, of which the Gaussian Unitary Ensemble (GUE) is the most well known. The one-sentence summary of this progress is that many of the asymptotic spectral statistics (e.g. correlation functions, eigenvalue gaps, determinants, etc.) that were previously known for GUE matrices, are now known for very large classes of Wigner ensembles as well. There are however a wide variety of results of this type, due to the large number of interesting spectral statistics, the varying hypotheses placed on the ensemble, and the different modes of convergence studied, and it is difficult to isolate a single such result currently as the definitive universality result. (In particular, there is at present a tradeoff between generality of ensemble and strength of convergence; the universality results that are available for the most general classes of ensemble are only presently able to demonstrate a rather weak sense of convergence to the universal distribution (involving an additional averaging in the energy parameter), which limits the applicability of such results to a number of interesting questions in which energy averaging is not permissible, such as the study of the least singular value of a Wigner matrix, or of related quantities such as the condition number or determinant. But it is conceivable that this tradeoff is a temporary phenomenon and may be eliminated by future work in this area; in the case of Hermitian matrices whose entries have the same second moments as that of the GUE ensemble, for instance, the need for energy averaging has already been removed.)
Nevertheless, throughout the family of results that have been obtained recently, there are two main methods which have been fundamental to almost all of the recent progress in extending from special ensembles such as GUE to general ensembles. The first method, developed extensively by Erdos, Schlein, Yau, Yin, and others (and building on an initial breakthrough by Johansson), is the heat flow method, which exploits the rapid convergence to equilibrium of the spectral statistics of matrices undergoing Dyson-type flows towards GUE. (An important aspect to this method is the ability to accelerate the convergence to equilibrium by localising the Hamiltonian, in order to eliminate the slowest modes of the flow; this refinement of the method is known as the “local relaxation flow” method. Unfortunately, the translation mode is not accelerated by this process, which is the principal reason why results obtained by pure heat flow methods still require an energy averaging in the final conclusion; it would of interest to find a way around this difficulty.) The other method, which goes all the way back to Lindeberg in his classical proof of the central limit theorem, and which was introduced to random matrix theory by Chatterjee and then developed for the universality problem by Van Vu and myself, is the swapping method, which is based on the observation that spectral statistics of Wigner matrices tend to be stable if one replaces just one or two entries of the matrix with another distribution, with the stability of the swapping process becoming stronger if one assumes that the old and new entries have many matching moments. The main formalisations of this observation are known as four moment theorems, because they require four matching moments between the entries, although there are some variant three moment theorems and two moment theorems in the literature as well. Our initial four moment theorems were focused on individual eigenvalues (and later also to eigenvectors), but it was later observed by Erdos, Yau, and Yin that simpler four moment theorems could also be established for aggregate spectral statistics, such as the coefficients of the Greens function, and Knowles and Yin also subsequently observed that these latter theorems could be used to recover a four moment theorem for eigenvalues and eigenvectors, giving an alternate approach to proving such theorems.
Interestingly, it seems that the heat flow and swapping methods are complementary to each other; the heat flow methods are good at removing moment hypotheses on the coefficients, while the swapping methods are good at removing regularity hypotheses. To handle general ensembles with minimal moment or regularity hypotheses, it is thus necessary to combine the two methods (though perhaps in the future a third method, or a unification of the two existing methods, might emerge).
Besides the heat flow and swapping methods, there are also a number of other basic tools that are also needed in these results, such as local semicircle laws and eigenvalue rigidity, which are also discussed in the survey. We also survey how universality has been established for wide variety of spectral statistics; the -point correlation functions are the most well known of these statistics, but they do not tell the whole story (particularly if one can only control these functions after an averaging in the energy), and there are a number of other statistics, such as eigenvalue counting functions, determinants, or spectral gaps, for which the above methods can be applied.
In order to prevent the survey from becoming too enormous, we decided to restrict attention to Hermitian matrix ensembles, whose entries off the diagonal are identically distributed, as this is the case in which the strongest results are available. There are several results that are applicable to more general ensembles than these which are briefly mentioned in the survey, but they are not covered in detail.
We plan to submit this survey eventually to the proceedings of a workshop on random matrix theory, and will continue to update the references on the arXiv version until the time comes to actually submit the paper.
Finally, in the survey we issue some errata for previous papers of Van and myself in this area, mostly centering around the three moment theorem (a variant of the more widely used four moment theorem), for which the original proof of Van and myself was incomplete. (Fortunately, as the three moment theorem had many fewer applications than the four moment theorem, and most of the applications that it did have ended up being superseded by subsequent papers, the actual impact of this issue was limited, but still an erratum is in order.)
I’ve just uploaded to the arXiv my paper “Every odd number greater than 1 is the sum of at most five primes“, submitted to Mathematics of Computation. The main result of the paper is as stated in the title, and is in the spirit of (though significantly weaker than) the even Goldbach conjecture (every even natural number is the sum of at most two primes) and odd Goldbach conjecture (every odd natural number greater than 1 is the sum of at most three primes). It also improves on a result of Ramaré that every even natural number is the sum of at most six primes. This result had previously also been established by Kaniecki under the additional assumption of the Riemann hypothesis, so one can view the main result here as an unconditional version of Kaniecki’s result.
The method used is the Hardy-Littlewood circle method, which was for instance also used to prove Vinogradov’s theorem that every sufficiently large odd number is the sum of three primes. Let’s quickly recall how this argument works. It is convenient to use a proxy for the primes, such as the von Mangoldt function , which is mostly supported on the primes. To represent a large number
as the sum of three primes, it suffices to obtain a good lower bound for the sum
By Fourier analysis, one can rewrite this sum as an integral
where
and . To control this integral, one then needs good bounds on
for various values of
. To do this, one first approximates
by a rational
with controlled denominator (using a tool such as the Dirichlet approximation theorem)
. The analysis then broadly bifurcates into the major arc case when
is small, and the minor arc case when
is large. In the major arc case, the problem more or less boils down to understanding sums such as
which in turn is almost equivalent to understanding the prime number theorem in arithmetic progressions modulo . In the minor arc case, the prime number theorem is not strong enough to give good bounds (unless one is using some extremely strong hypotheses, such as the generalised Riemann hypothesis), so instead one uses a rather different method, using truncated versions of divisor sum identities such as
to split
into a collection of linear and bilinear sums that are more tractable to bound, typical examples of which (after using a particularly simple truncated divisor sum identity known as Vaughan’s identity) include the “Type I sum”
and the “Type II sum”
After using tools such as the triangle inequality or Cauchy-Schwarz inequality to eliminate arithmetic functions such as or
, one ends up controlling plain exponential sums such as
, which can be efficiently controlled in the minor arc case.
This argument works well when is extremely large, but starts running into problems for moderate sized
, e.g.
. The first issue is that of logarithmic losses in the minor arc estimates. A typical minor arc estimate takes the shape
when is close to
for some
. This only improves upon the trivial estimate
from the prime number theorem when
. As a consequence, it becomes necessary to obtain an accurate prime number theorem in arithmetic progressions with modulus as large as
. However, with current technology, the error term in such theorems are quite poor (terms such as
for some small
are typical, and there is also a notorious “Siegel zero” problem), and as a consequence, the method is generally only applicable for very large
. For instance, the best explicit result of Vinogradov type known currently is due to Liu and Wang, who established that all odd numbers larger than
are the sum of three odd primes. (However, on the assumption of the GRH, the full odd Goldbach conjecture is known to be true; this is a result of Deshouillers, Effinger, te Riele, and Zinoviev.)
In this paper, we make a number of refinements to the general scheme, each one of which is individually rather modest and not all that novel, but which when added together turn out to be enough to resolve the five primes problem (though many more ideas would still be needed to tackle the three primes problem, and as is well known the circle method is very unlikely to be the route to make progress on the two primes problem). The first refinement, which is only available in the five primes case, is to take advantage of the numerical verification of the even Goldbach conjecture up to some large (we take
, using a verification of Richstein, although there are now much larger values of
– as high as
– for which the conjecture has been verified). As such, instead of trying to represent an odd number
as the sum of five primes, we can represent it as the sum of three odd primes and a natural number between
and
. This effectively brings us back to the three primes problem, but with the significant additional boost that one can essentially restrict the frequency variable
to be of size
. In practice, this eliminates all of the major arcs except for the principal arc around
. This is a significant simplification, in particular avoiding the need to deal with the prime number theorem in arithmetic progressions (and all the attendant theory of L-functions, Siegel zeroes, etc.).
In a similar spirit, by taking advantage of the numerical verification of the Riemann hypothesis up to some height , and using the explicit formula relating the von Mangoldt function with the zeroes of the zeta function, one can safely deal with the principal major arc
. For our specific application, we use the value
, arising from the verification of the Riemann hypothesis of the first
zeroes by van de Lune (unpublished) and Wedeniswki. (Such verifications have since been extended further, the latest being that the first
zeroes lie on the line.)
To make the contribution of the major arc as efficient as possible, we borrow an idea from a paper of Bourgain, and restrict one of the three primes in the three-primes problem to a somewhat shorter range than the other two (of size instead of
, where we take
to be something like
), as this largely eliminates the “Archimedean” losses coming from trying to use Fourier methods to control convolutions on
. In our paper, we set the scale parameter
to be
(basically, anything that is much larger than
but much less than
will work), but we found that an additional gain (which we ended up not using) could be obtained by averaging
over a range of scales, say between
and
. This sort of averaging could be a useful trick in future work on Goldbach-type problems.
It remains to treat the contribution of the “minor arc” . To do this, one needs good
and
type estimates on the exponential sum
. Plancherel’s theorem gives an
estimate which loses a logarithmic factor, but it turns out that on this particular minor arc one can use tools from the theory of the large sieve (such as Montgomery’s uncertainty principle) to eliminate this logarithmic loss almost completely; it turns out that the most efficient way to do this is use an effective upper bound of Siebert on the number of prime pairs
less than
to obtain an
bound that only loses a factor of
(or of
, once one cuts out the major arc).
For estimates, it turns out that existing effective versions of (1) (in particular, the bound given by Chen and Wang) are insufficient, due to the three logarithmic factors of
in the bound. By using a smoothed out version
of the sum
, for some suitable cutoff function
, one can save one factor of a logarithm, obtaining a bound of the form
with effective constants. One can improve the constants further by restricting all summations to odd integers (which barely affects , since
was mostly supported on odd numbers anyway), which in practice reduces the effective constants by a factor of two or so. One can also make further improvements in the constants by using the very sharp large sieve inequality to control the “Type II” sums that arise from Vaughan’s identity, and by using integration by parts to improve the bounds on the “Type I” sums. A final gain can then be extracted by optimising the cutoff parameters
appearing in Vaughan’s identity to minimise the contribution of the Type II sums (which, in practice, are the dominant term). Combining all these improvements, one ends up with bounds of the shape
when is small (say
) and
when is large (say
). (See the paper for more explicit versions of these estimates.) The point here is that the
factors have been partially replaced by smaller logarithmic factors such as
or
. Putting together all of these improvements, one can finally obtain a satisfactory bound on the minor arc. (There are still some terms with a
factor in them, but we use the effective Vinogradov theorem of Liu and Wang to upper bound
by
, which ends up making the remaining terms involving
manageable.)
Recent Comments