You are currently browsing the monthly archive for February 2012.

In the last three notes, we discussed the Bourgain-Gamburd expansion machine and two of its three ingredients, namely quasirandomness and product theorems, leaving only the non-concentration ingredient to discuss. We can summarise the results of the last three notes, in the case of fields of prime order, as the following theorem.

Theorem 1 (Non-concentration implies expansion in )Let be a prime, let , and let be a symmetric set of elements in of cardinality not containing the identity. Write , and suppose that one has the non-concentration propertyfor some and some even integer . Then is a two-sided -expander for some depending only on .

*Proof:* From (1) we see that is not supported in any proper subgroup of , which implies that generates . The claim now follows from the Bourgain-Gamburd expansion machine (Theorem 2 of Notes 4), the product theorem (Theorem 1 of Notes 5), and quasirandomness (Exercise 8 of Notes 3).

Remark 1The same argument also works if we replace by the field of order for some bounded . However, there is a difficulty in the regime when is unbounded, because the quasirandomness property becomes too weak for the Bourgain-Gamburd expansion machine to be directly applicable. On theother hand, the above type of theorem was generalised to the setting of cyclic groups with square-free by Varju, to arbitrary by Bourgain and Varju, and to more general algebraic groups than and square-free by Salehi Golsefidy and Varju. It may be that some modification of the proof techniques in these papers may also be able to handle the field case with unbounded .

It thus remains to construct tools that can establish the non-concentration property (1). The situation is particularly simple in , as we have a good understanding of the subgroups of that group. Indeed, from Theorem 14 from Notes 5, we obtain the following corollary to Theorem 1:

Corollary 2 (Non-concentration implies expansion in )Let be a prime, and let be a symmetric set of elements in of cardinality not containing the identity. Write , and suppose that one has the non-concentration propertyfor some and some even integer , where ranges over all Borel subgroups of . Then, if is sufficiently large depending on , is a two-sided -expander for some depending only on .

It turns out (2) can be verified in many cases by exploiting the solvable nature of the Borel subgroups . We give two examples of this in these notes. The first result, due to Bourgain and Gamburd (with earlier partial results by Gamburd and by Shalom) generalises Selberg’s expander construction to the case when generates a thin subgroup of :

Theorem 3 (Expansion in thin subgroups)Let be a symmetric subset of not containing the identity, and suppose that the group generated by is not virtually solvable. Then as ranges over all sufficiently large primes, the Cayley graphs form a two-sided expander family, where is the usual projection.

Remark 2One corollary of Theorem 3 (or of the non-concentration estimate (3) below) is that generates for all sufficiently large , if is not virtually solvable. This is a special case of a much more general result, known as the strong approximation theorem, although this is certainly not the most direct way to prove such a theorem. Conversely, the strong approximation property is used in generalisations of this result to higher rank groups than .

Exercise 1In the converse direction, if is virtually solvable, show that for sufficiently large , fails to generate . (Hint:use Theorem 14 from Notes 5 to prevent from having bounded index solvable subgroups.)

Exercise 2 (Lubotzsky’s 1-2-3 problem)Let .

- (i) Show that generates a free subgroup of . (
Hint:use a ping-pong argument, as in Exercise 23 of Notes 2.)- (ii) Show that if are two distinct elements of the sector , then there os no element for which . (
Hint:this is another ping-pong argument.) Conclude that has infinite index in . (Contrast this with the situation in which the coefficients in are replaced by or , in which case is either all of , or a finite index subgroup, as demonstrated in Exercise 23 of Notes 2).- (iii) Show that for sufficiently large primes form a two-sided expander family.

Remark 3Theorem 3 has been generalised to arbitrary linear groups, and with replaced by for square-free ; see this paper of Salehi Golsefidy and Varju. In this more general setting, the condition of virtual solvability must be replaced by the condition that the connected component of the Zariski closure of is perfect. An effective version of Theorem 3 (with completely explicit constants) was recently obtained by Kowalski.

The second example concerns Cayley graphs constructed using random elements of .

Theorem 4 (Random generators expand)Let be a prime, and let be two elements of chosen uniformly at random. Then with probability , is a two-sided -expander for some absolute constant .

Remark 4As with Theorem 3, Theorem 4 has also been extended to a number of other groups, such as the Suzuki groups (in this paper of Breuillard, Green, and Tao), and more generally to finite simple groups of Lie type of bounded rank (in forthcoming work of Breuillard, Green, Guralnick, and Tao). There are a number of other constructions of expanding Cayley graphs in such groups (and in other interesting groups, such as the alternating groups) beyond those discussed in these notes; see this recent survey of Lubotzky for further discussion. It has been conjectured by Lubotzky and Weiss thatanypair of (say) that generates the group, is a two-sided -expander for an absolute constant : in the case of , this has been established for a density one set of primes by Breuillard and Gamburd.

** — 1. Expansion in thin subgroups — **

We now prove Theorem 3. The first observation is that the expansion property is monotone in the group :

Exercise 3Let be symmetric subsets of not containing the identity, such that . Suppose that is a two-sided expander family for sufficiently large primes . Show that is also a two-sided expander family.

As a consequence, Theorem 3 follows from the following two statments:

Theorem 5 (Tits alternative)Let be a group. Then exactly one of the following statements holds:

- (i) is virtually solvable.
- (ii) contains a copy of the free group of two generators as a subgroup.

Theorem 6 (Expansion in free groups)Let be generators of a free subgroup of . Then as ranges over all sufficiently large primes, the Cayley graphs form a two-sided expander family.

Theorem 5 is a special case of the famous Tits alternative, which among other things allows one to replace by for any and any field of characteristic zero (and fields of positive characteristic are also allowed, if one adds the requirement that be finitely generated). We will not prove the full Tits alternative here, but instead just give an *ad hoc* proof of the special case in Theorem 5 in the following exercise.

Exercise 4Given any matrix , the singular values are and , and we can apply the singular value decomposition to decomposewhere and are orthonormal bases. (When , these bases are uniquely determined up to phase rotation.) We let be the projection of to the projective complex plane, and similarly define .

Let be a subgroup of . Call a pair a

limit pointof if there exists a sequence with and .

- (i) Show that if is infinite, then there is at least one limit point.
- (ii) Show that if is a limit point, then so is .
- (iii) Show that if there are two limit points with , then there exist that generate a free group. (
Hint:Choose close to and close to , and consider the action of and on , and specifically on small neighbourhoods of , and set up a ping-pong type situation.)- (iv) Show that if is hyperbolic (i.e. it has an eigenvalue greater than 1), with eigenvectors , then the projectivisations of form a limit point. Similarly, if is regular parabolic (i.e. it has an eigenvalue at 1, but is not the identity) with eigenvector , show that is a limit point.
- (v) Show that if has no free subgroup of two generators, then all hyperbolic and regular parabolic elements of have a common eigenvector. Conclude that all such elements lie in a solvable subgroup of .
- (vi) Show that if an element is neither hyperbolic nor regular parabolic, and is not a multiple of the identity, then is conjugate to a rotation by (in particular, ).
- (vii) Establish Theorem 5. (
Hint:show that two square roots of in cannot multiply to another square root of .)

Now we prove Theorem 6. Let be a free subgroup of generated by two generators . Let be the probability measure generating a random walk on , thus is the corresponding generator on . By Corollary 2, it thus suffices to show that

for all sufficiently large , some absolute constant , and some even (depending on , of course), where ranges over Borel subgroups.

As is a homomorphism, one has and so it suffices to show that

To deal with the supremum here, we will use an argument of Bourgain and Gamburd, taking advantage of the fact that all Borel groups of obey a common group law, the point being that free groups such as obey such laws only very rarely. More precisely, we use the fact that the Borel groups are solvable of derived length two; in particular we have

for all . Now, is supported on matrices in whose coefficients have size (where we allow the implied constants to depend on the choice of generators ), and so is supported on matrices in whose coefficients also have size . If is less than a sufficiently small multiple of , these coefficients are then less than (say). As such, if lie in the support of and their projections obey the word law (4) in , then the original matrices obey the word law (4) in . (This lifting of identities from the characteristic setting of to the characteristic setting of is a simple example of the “Lefschetz principle”.)

To summarise, if we let be the set of all elements of that lie in the support of , then (4) holds for all . This severely limits the size of to only be of polynomial size, rather than exponential size:

Proposition 7Let be a subset of the support of (thus, consists of words in of length ) such that the law (4) holds for all . Then .

The proof of this proposition is laid out in the exercise below.

Exercise 5Let be a free group generated by two generators . Let be the set of all words of length at most in .

- (i) Show that if commute, then lie in the same cyclic group, thus for some and .
- (ii) Show that if , there are at most elements of that commute with .
- (iii) Show that if , there are at most elements of with .
- (iv) Prove Proposition 7.

Now we can conclude the proof of Theorem 3:

Exercise 6Let be a free group generated by two generators .

- (i) Show that for some absolute constant . (For much more precise information on , see this paper of Kesten.)
- (ii) Conclude the proof of Theorem 3.

** — 2. Random generators expand — **

We now prove Theorem 4. Let be the free group on two formal generators , and let be the generator of the random walk. For any word and any in a group , let be the element of formed by substituting for respectively in the word ; thus can be viewed as a map for any group . Observe that if is drawn randomly using the distribution , and , then is distributed according to the law , where . Applying Corollary 2, it suffices to show that whenever is a large prime and are chosen uniformly and independently at random from , that with probability , one has

for some absolute constant , where ranges over all Borel subgroups of and is drawn from the law for some even natural number .

Let denote the words in of length at most . We may use the law (4) to obtain good bound on the supremum in (5) assuming a certain non-degeneracy property of the word evaluations :

Exercise 7Let be a natural number, and suppose that is such that for . Show thatfor some absolute constant , where is drawn from the law . (

Hint:use (4) and the hypothesis to lift the problem up to , at which point one can use Proposition 7 and Exercise 6.)

In view of this exercise, it suffices to show that with probability , one has for all for some comparable to a small multiple of . As has elements, it thus suffices by the union bound to show that

for some absolute constant , and any of length less than for some sufficiently small absolute constant .

Let us now fix a non-identity word of length less than , and consider as a function from to for an arbitrary field . We can identify with the set . A routine induction then shows that the expression is then a polynomial in the eight variables of degree and coefficients which are integers of size . Let us then make the additional restriction to the case , in which case we can write and . Then is now a rational function of whose numerator is a polynomial of degree and coefficients of size , and the denominator is a monomial of of degree .

We then specialise this rational function to the field . It is conceivable that when one does so, the rational function collapses to the constant polynomial , thus for all with . (For instance, this would be the case if , by Lagrange’s theorem, if it were not for the fact that is far too large here.) But suppose that this rational function does not collapse to the constant rational function. Applying the Schwarz-Zippel lemma (Exercise 23 from Notes 5), we then see that the set of pairs with and is at most ; adding in the and cases, one still obtains a bound of , which is acceptable since and . Thus, the only remaining case to consider is when the rational function is identically on with .

Now we perform another “Lefschetz principle” maneuvre to change the underlying field. Recall that the denominator of rational function is monomial in , and the numerator has coefficients of size . If is less than for a sufficiently small , we conclude in particular (for large enough) that the coefficients all have magnitude less than . As such, the only way that this function can be identically on is if it is identically on for all with , and hence for or also by taking Zariski closures.

On the other hand, we know that for some choices of , e.g. , contains a copy of the free group on two generators (see e.g. Exercise 23 of Notes 2). As such, it is not possible for any non-identity word to be identically trivial on . Thus this case cannot actually occur, completing the proof of (6) and hence of Theorem 4.

Remark 5We see from the above argument that the existence of subgroups of an algebraic group with good “independence” properties – such as that of generating a free group – can be useful in studying the expansion properties of that algebraic group, even if the field of interest in the latter is distinct from that of the former. For more complicated algebraic groups than , in which laws such as (4) are not always available, it turns out to be useful to place further properties on the subgroup , for instance by requiring that all non-abelian subgroups of that group be Zariski dense (a property which has been calledstrong density), as this turns out to be useful for preventing random walks from concentrating in proper algebraic subgroups. See this paper of Breuillard, Guralnick, Green and Tao for constructions of strongly dense free subgroups of algebraic groups and further discussion.

It has been a little over two weeks now since the protest site at thecostofknowledge.com was set up to register declarations of non-cooperation with Reed Elsevier in protest of their research publishing practices, inspired by this blog post of Tim Gowers. Awareness of the protest has certainly grown in these two weeks; the number of signatories is now well over four thousand, across a broad array of academic disciplines, and the protest has been covered by many blogs and also the mainstream media (e.g. the Guardian, the Economist, Forbes, etc.), and even by Elsevier stock analysts. (Elsevier itself released an open letter responding to the protest here.) My interpretation of events is that there was a significant amount of latent or otherwise undisclosed dissatisfaction already with the publishing practices of Elsevier (and, to a lesser extent, some other commercial academic publishers), and a desire to support alternatives such as university or society publishers, and the more recent open access journals; and that this protest (and parallel protests, such as the movement to oppose the Research Works Act) served to drive these feelings out into the open.

The statement of the protest itself, though, is rather brief, reflecting the improvised manner in which the site was created. A group of mathematicians including myself therefore decided to write and sign a more detailed explanation of why we supported this protest, giving more background and references to support our position. The 34 signatories are Scott Aaronson, Douglas N. Arnold, Artur Avila, John Baez, Folkmar Bornemann, Danny Calegari, Henry Cohn, Ingrid Daubechies, Jordan Ellenberg, Matthew Emerton, Marie Farge, David Gabai, Timothy Gowers, Ben Green, Martin Grotschel, Michael Harris, Frederic Helein, Rob Kirby, Vincent Lafforgue, Gregory F. Lawler, Randall J. LeVeque, Laszlo Lovasz, Peter J. Olver, Olof Sisask, Richard Taylor, Bernard Teissier, Burt Totaro, Lloyd N. Trefethen, Takashi Tsuboi, Marie-France Vigneras, Wendelin Werner, Amie Wilkinson, Gunter M. Ziegler, and myself. (Note that while Daubechies is current president of the International Mathematical Union, Lovasz is a past president, and Grotschel is the current secretary, they are signing this letter as individuals and not as representatives of the IMU. Similarly for Trefethen and Arnold (current and past president of SIAM).)

Of course, the 34 of us do not presume to speak for the remaining four thousand signatories to the protest, but I hope that our statement is somewhat representative of the position of many of its supporters.

Further discussion of this statement can be found at this blog post of Tim Gowers.

EDIT: I think it is appropriate to quote the following excerpt from our statement:

All mathematicians must decide for themselves whether, or to what extent, they wish to participate in the boycott. Senior mathematicians who have signed the boycott bear some responsibility towards junior colleagues who are forgoing the option of publishing in Elsevier journals, and should do their best to help minimize any negative career consequences.

Whether or not you decide to join the boycott, there are some simple actions that everyone can take, which seem to us to be uncontroversial:

- Make sure that the final versions of all your papers, particularly new ones, are freely available online, ideally both on the arXiv and on your home page.
- If you are submitting a paper and there is a choice between an expensive journal and a cheap (or free) journal of the same standard, then always submit to the cheap one.

In the previous set of notes, we saw that one could derive expansion of Cayley graphs from three ingredients: non-concentration, product theorems, and quasirandomness. Quasirandomness was discussed in Notes 3. In the current set of notes, we discuss product theorems. Roughly speaking, these theorems assert that in certain circumstances, a finite subset of a group either exhibits expansion (in the sense that , say, is significantly larger than ), or is somehow “close to” or “trapped” by a genuine group.

Theorem 1 (Product theorem in )Let , let be a finite field, and let be a finite subset of . Let be sufficiently small depending on . Then at least one of the following statements holds:

- (Expansion) One has .
- (Close to ) One has .
- (Trapping) is contained in a proper subgroup of .

We will prove this theorem (which was proven first in the cases for fields of prime order by Helfgott, and then for and general by Dinai, and finally to general and independently by Pyber-Szabo and by Breuillard-Green-Tao) later in this notes. A more qualitative version of this proposition was also previously obtained by Hrushovski. There are also generalisations of the product theorem of importance to number theory, in which the field is replaced by a cyclic ring (with not necessarily prime); this was achieved first for and square-free by Bourgain, Gamburd, and Sarnak, by Varju for general and square-free, and finally by this paper of Bourgain and Varju for arbitrary and .

Exercise 1 (Girth bound)Assuming Theorem 1, show that whenever is a symmetric set of generators of for some finite field and some , then any element of can be expressed as the product of elements from . (Equivalently, if we add the identity element to , then for some .) This is a special case of a conjecture of Babai and Seress, who conjectured that the bound should hold uniformly for all finite simple groups (in particular, the implied constants here should not actually depend on . The methods used to handle the case can handle other finite groups of Lie type of bounded rank, but at present we do not have bounds that are independent of the rank. On the other hand, a recent paper of Helfgott and Seress has almost resolved the conjecture for the permutation groups .

A key tool to establish product theorems is an argument which is sometimes referred to as the *pivot argument*. To illustrate this argument, let us first discuss a much simpler (and older) theorem, essentially due to Freiman, which has a much weaker conclusion but is valid in any group :

Theorem 2 (Baby product theorem)Let be a group, and let be a finite non-empty subset of . Then one of the following statements hold:

- (Expansion) One has .
- (Close to a subgroup) is contained in a left-coset of a group with .

To prove this theorem, we suppose that the first conclusion does not hold, thus . Our task is then to place inside the left-coset of a fairly small group .

To do this, we take a group element , and consider the intersection . *A priori*, the size of this set could range from anywhere from to . However, we can use the hypothesis to obtain an important dichotomy, reminiscent of the classical fact that two cosets of a subgroup of are either identical or disjoint:

Proposition 3 (Dichotomy)If , then exactly one of the following occurs:

- (Non-involved case) is empty.
- (Involved case) .

*Proof:* Suppose we are not in the pivot case, so that is non-empty. Let be an element of , then and both lie in . The sets and then both lie in . As these sets have cardinality and lie in , which has cardinality less than , we conclude from the inclusion-exclusion formula that

But the left-hand side is equal to , and the claim follows.

The above proposition provides a clear separation between two types of elements : the “non-involved” elements, which have nothing to do with (in the sense that , and the “involved” elements, which have a lot to do with (in the sense that . The key point is that there is a significant “gap” between the non-involved and involved elements; there are no elements that are only “slightly involved”, in that and intersect a little but not a lot. It is this gap that will allow us to upgrade approximate structure to exact structure. Namely,

Proposition 4The set of involved elements is a finite group, and is equal to .

*Proof:* It is clear that the identity element is involved, and that if is involved then so is (since . Now suppose that are both involved. Then and have cardinality greater than and are both subsets of , and so have non-empty intersection. In particular, is non-empty, and so is non-empty. By Proposition 3, this makes involved. It is then clear that is a group.

If , then is non-empty, and so from Proposition 3 is involved. Conversely, if is involved, then . Thus we have as claimed. In particular, is finite.

Now we can quickly wrap up the proof of Theorem 2. By construction, for all ,which by double counting shows that . As , we see that is contained in a right coset of ; setting , we conclude that is contained in a left coset of . is a conjugate of , and so . If , then and both lie in and have cardinality , so must overlap; and so . Thus , and so , and Theorem 2 follows.

Exercise 2Show that the constant in Theorem 2 cannot be replaced by any larger constant.

Exercise 3Let be a finite non-empty set such that . Show that . (Hint:If , show that for some .)

Exercise 4Let be a finite non-empty set such that . Show that there is a finite group with and a group element such that and .

Below the fold, we give further examples of the pivot argument in other group-like situations, including Theorem 2 and also the “sum-product theorem” of Bourgain-Katz-Tao and Bourgain-Glibichuk-Konyagin.

Van Vu and I have just uploaded to the arXiv our paper “Random matrices: The Universality phenomenon for Wigner ensembles“. This survey is a longer version (58 pages) of a previous short survey we wrote up a few months ago. The survey focuses on recent progress in understanding the universality phenomenon for Hermitian Wigner ensembles, of which the Gaussian Unitary Ensemble (GUE) is the most well known. The one-sentence summary of this progress is that many of the asymptotic spectral statistics (e.g. correlation functions, eigenvalue gaps, determinants, etc.) that were previously known for GUE matrices, are now known for very large classes of Wigner ensembles as well. There are however a wide variety of results of this type, due to the large number of interesting spectral statistics, the varying hypotheses placed on the ensemble, and the different modes of convergence studied, and it is difficult to isolate a single such result currently as *the* definitive universality result. (In particular, there is at present a tradeoff between generality of ensemble and strength of convergence; the universality results that are available for the most general classes of ensemble are only presently able to demonstrate a rather weak sense of convergence to the universal distribution (involving an additional averaging in the energy parameter), which limits the applicability of such results to a number of interesting questions in which energy averaging is not permissible, such as the study of the least singular value of a Wigner matrix, or of related quantities such as the condition number or determinant. But it is conceivable that this tradeoff is a temporary phenomenon and may be eliminated by future work in this area; in the case of Hermitian matrices whose entries have the same second moments as that of the GUE ensemble, for instance, the need for energy averaging has already been removed.)

Nevertheless, throughout the family of results that have been obtained recently, there are two main methods which have been fundamental to almost all of the recent progress in extending from special ensembles such as GUE to general ensembles. The first method, developed extensively by Erdos, Schlein, Yau, Yin, and others (and building on an initial breakthrough by Johansson), is the *heat flow method*, which exploits the rapid convergence to equilibrium of the spectral statistics of matrices undergoing Dyson-type flows towards GUE. (An important aspect to this method is the ability to accelerate the convergence to equilibrium by localising the Hamiltonian, in order to eliminate the slowest modes of the flow; this refinement of the method is known as the “local relaxation flow” method. Unfortunately, the translation mode is not accelerated by this process, which is the principal reason why results obtained by pure heat flow methods still require an energy averaging in the final conclusion; it would of interest to find a way around this difficulty.) The other method, which goes all the way back to Lindeberg in his classical proof of the central limit theorem, and which was introduced to random matrix theory by Chatterjee and then developed for the universality problem by Van Vu and myself, is the *swapping method*, which is based on the observation that spectral statistics of Wigner matrices tend to be stable if one replaces just one or two entries of the matrix with another distribution, with the stability of the swapping process becoming stronger if one assumes that the old and new entries have many matching moments. The main formalisations of this observation are known as *four moment theorems*, because they require four matching moments between the entries, although there are some variant three moment theorems and two moment theorems in the literature as well. Our initial four moment theorems were focused on individual eigenvalues (and later also to eigenvectors), but it was later observed by Erdos, Yau, and Yin that simpler four moment theorems could also be established for aggregate spectral statistics, such as the coefficients of the Greens function, and Knowles and Yin also subsequently observed that these latter theorems could be used to recover a four moment theorem for eigenvalues and eigenvectors, giving an alternate approach to proving such theorems.

Interestingly, it seems that the heat flow and swapping methods are complementary to each other; the heat flow methods are good at removing moment hypotheses on the coefficients, while the swapping methods are good at removing regularity hypotheses. To handle general ensembles with minimal moment or regularity hypotheses, it is thus necessary to combine the two methods (though perhaps in the future a third method, or a unification of the two existing methods, might emerge).

Besides the heat flow and swapping methods, there are also a number of other basic tools that are also needed in these results, such as local semicircle laws and eigenvalue rigidity, which are also discussed in the survey. We also survey how universality has been established for wide variety of spectral statistics; the -point correlation functions are the most well known of these statistics, but they do not tell the whole story (particularly if one can only control these functions after an averaging in the energy), and there are a number of other statistics, such as eigenvalue counting functions, determinants, or spectral gaps, for which the above methods can be applied.

In order to prevent the survey from becoming too enormous, we decided to restrict attention to Hermitian matrix ensembles, whose entries off the diagonal are identically distributed, as this is the case in which the strongest results are available. There are several results that are applicable to more general ensembles than these which are briefly mentioned in the survey, but they are not covered in detail.

We plan to submit this survey eventually to the proceedings of a workshop on random matrix theory, and will continue to update the references on the arXiv version until the time comes to actually submit the paper.

Finally, in the survey we issue some errata for previous papers of Van and myself in this area, mostly centering around the three moment theorem (a variant of the more widely used four moment theorem), for which the original proof of Van and myself was incomplete. (Fortunately, as the three moment theorem had many fewer applications than the four moment theorem, and most of the applications that it did have ended up being superseded by subsequent papers, the actual impact of this issue was limited, but still an erratum is in order.)

I’ve just uploaded to the arXiv my paper “Every odd number greater than 1 is the sum of at most five primes“, submitted to Mathematics of Computation. The main result of the paper is as stated in the title, and is in the spirit of (though significantly weaker than) the even Goldbach conjecture (every even natural number is the sum of at most two primes) and odd Goldbach conjecture (every odd natural number greater than 1 is the sum of at most three primes). It also improves on a result of Ramaré that every even natural number is the sum of at most six primes. This result had previously also been established by Kaniecki under the additional assumption of the Riemann hypothesis, so one can view the main result here as an unconditional version of Kaniecki’s result.

The method used is the Hardy-Littlewood circle method, which was for instance also used to prove Vinogradov’s theorem that every sufficiently large odd number is the sum of three primes. Let’s quickly recall how this argument works. It is convenient to use a proxy for the primes, such as the von Mangoldt function , which is mostly supported on the primes. To represent a large number as the sum of three primes, it suffices to obtain a good lower bound for the sum

By Fourier analysis, one can rewrite this sum as an integral

where

and . To control this integral, one then needs good bounds on for various values of . To do this, one first approximates by a rational with controlled denominator (using a tool such as the Dirichlet approximation theorem) . The analysis then broadly bifurcates into the *major arc* case when is small, and the *minor arc* case when is large. In the major arc case, the problem more or less boils down to understanding sums such as

which in turn is almost equivalent to understanding the prime number theorem in arithmetic progressions modulo . In the minor arc case, the prime number theorem is not strong enough to give good bounds (unless one is using some extremely strong hypotheses, such as the generalised Riemann hypothesis), so instead one uses a rather different method, using truncated versions of divisor sum identities such as to split into a collection of linear and bilinear sums that are more tractable to bound, typical examples of which (after using a particularly simple truncated divisor sum identity known as Vaughan’s identity) include the “Type I sum”

and the “Type II sum”

After using tools such as the triangle inequality or Cauchy-Schwarz inequality to eliminate arithmetic functions such as or , one ends up controlling plain exponential sums such as , which can be efficiently controlled in the minor arc case.

This argument works well when is extremely large, but starts running into problems for moderate sized , e.g. . The first issue is that of logarithmic losses in the minor arc estimates. A typical minor arc estimate takes the shape

when is close to for some . This only improves upon the trivial estimate from the prime number theorem when . As a consequence, it becomes necessary to obtain an accurate prime number theorem in arithmetic progressions with modulus as large as . However, with current technology, the error term in such theorems are quite poor (terms such as for some small are typical, and there is also a notorious “Siegel zero” problem), and as a consequence, the method is generally only applicable for very large . For instance, the best explicit result of Vinogradov type known currently is due to Liu and Wang, who established that all odd numbers larger than are the sum of three odd primes. (However, on the assumption of the GRH, the full odd Goldbach conjecture is known to be true; this is a result of Deshouillers, Effinger, te Riele, and Zinoviev.)

In this paper, we make a number of refinements to the general scheme, each one of which is individually rather modest and not all that novel, but which when added together turn out to be enough to resolve the five primes problem (though many more ideas would still be needed to tackle the three primes problem, and as is well known the circle method is very unlikely to be the route to make progress on the two primes problem). The first refinement, which is only available in the five primes case, is to take advantage of the numerical verification of the even Goldbach conjecture up to some large (we take , using a verification of Richstein, although there are now much larger values of – as high as – for which the conjecture has been verified). As such, instead of trying to represent an odd number as the sum of five primes, we can represent it as the sum of three odd primes and a natural number between and . This effectively brings us back to the three primes problem, but with the significant additional boost that one can essentially restrict the frequency variable to be of size . In practice, this eliminates all of the major arcs except for the principal arc around . This is a significant simplification, in particular avoiding the need to deal with the prime number theorem in arithmetic progressions (and all the attendant theory of L-functions, Siegel zeroes, etc.).

In a similar spirit, by taking advantage of the numerical verification of the Riemann hypothesis up to some height , and using the explicit formula relating the von Mangoldt function with the zeroes of the zeta function, one can safely deal with the principal major arc . For our specific application, we use the value , arising from the verification of the Riemann hypothesis of the first zeroes by van de Lune (unpublished) and Wedeniswki. (Such verifications have since been extended further, the latest being that the first zeroes lie on the line.)

To make the contribution of the major arc as efficient as possible, we borrow an idea from a paper of Bourgain, and restrict one of the three primes in the three-primes problem to a somewhat shorter range than the other two (of size instead of , where we take to be something like ), as this largely eliminates the “Archimedean” losses coming from trying to use Fourier methods to control convolutions on . In our paper, we set the scale parameter to be (basically, anything that is much larger than but much less than will work), but we found that an additional gain (which we ended up not using) could be obtained by averaging over a range of scales, say between and . This sort of averaging could be a useful trick in future work on Goldbach-type problems.

It remains to treat the contribution of the “minor arc” . To do this, one needs good and type estimates on the exponential sum . Plancherel’s theorem gives an estimate which loses a logarithmic factor, but it turns out that on this particular minor arc one can use tools from the theory of the large sieve (such as Montgomery’s uncertainty principle) to eliminate this logarithmic loss almost completely; it turns out that the most efficient way to do this is use an effective upper bound of Siebert on the number of prime pairs less than to obtain an bound that only loses a factor of (or of , once one cuts out the major arc).

For estimates, it turns out that existing effective versions of (1) (in particular, the bound given by Chen and Wang) are insufficient, due to the three logarithmic factors of in the bound. By using a smoothed out version of the sum , for some suitable cutoff function , one can save one factor of a logarithm, obtaining a bound of the form

with effective constants. One can improve the constants further by restricting all summations to odd integers (which barely affects , since was mostly supported on odd numbers anyway), which in practice reduces the effective constants by a factor of two or so. One can also make further improvements in the constants by using the very sharp *large sieve inequality* to control the “Type II” sums that arise from Vaughan’s identity, and by using integration by parts to improve the bounds on the “Type I” sums. A final gain can then be extracted by optimising the cutoff parameters appearing in Vaughan’s identity to minimise the contribution of the Type II sums (which, in practice, are the dominant term). Combining all these improvements, one ends up with bounds of the shape

when is small (say ) and

when is large (say ). (See the paper for more explicit versions of these estimates.) The point here is that the factors have been partially replaced by smaller logarithmic factors such as or . Putting together all of these improvements, one can finally obtain a satisfactory bound on the minor arc. (There are still some terms with a factor in them, but we use the effective Vinogradov theorem of Liu and Wang to upper bound by , which ends up making the remaining terms involving manageable.)

## Recent Comments