You are currently browsing the category archive for the ‘paper’ category.

Kevin Ford, Ben Green, Sergei Konyagin, James Maynard, and I have just uploaded to the arXiv our paper “Long gaps between primes“. This is a followup work to our two previous papers (discussed in this previous post), in which we had simultaneously shown that the maximal gap

\displaystyle  G(X) := \sup_{p_n, p_{n+1} \leq X} p_{n+1}-p_n

between primes up to {X} exhibited a lower bound of the shape

\displaystyle  G(X) \geq f(X) \log X \frac{\log \log X \log\log\log\log X}{(\log\log\log X)^2} \ \ \ \ \ (1)

for some function {f(X)} that went to infinity as {X \rightarrow \infty}; this improved upon previous work of Rankin and other authors, who established the same bound but with {f(X)} replaced by a constant. (Again, see the previous post for a more detailed discussion.)

In our previous papers, we did not specify a particular growth rate for {f(X)}. In my paper with Kevin, Ben, and Sergei, there was a good reason for this: our argument relied (amongst other things) on the inverse conjecture on the Gowers norms, as well as the Siegel-Walfisz theorem, and the known proofs of both results both have ineffective constants, rendering our growth function {f(X)} similarly ineffective. Maynard’s approach ostensibly also relies on the Siegel-Walfisz theorem, but (as shown in another recent paper of his) can be made quite effective, even when tracking {k}-tuples of fairly large size (about {\log^c x} for some small {c}). If one carefully makes all the bounds in Maynard’s argument quantitative, one eventually ends up with a growth rate {f(X)} of shape

\displaystyle  f(X) \asymp \frac{\log \log \log X}{\log\log\log\log X}, \ \ \ \ \ (2)

thus leading to a bound

\displaystyle  G(X) \gg \log X \frac{\log \log X}{\log\log\log X}

on the gaps between primes for large {X}; this is an unpublished calculation of James’.

In this paper we make a further refinement of this calculation to obtain a growth rate

\displaystyle  f(X) \asymp \log \log \log X \ \ \ \ \ (3)

leading to a bound of the form

\displaystyle  G(X) \geq c \log X \frac{\log \log X \log\log\log\log X}{\log\log\log X} \ \ \ \ \ (4)

for large {X} and some small constant {c}. Furthermore, this appears to be the limit of current technology (in particular, falling short of Cramer’s conjecture that {G(X)} is comparable to {\log^2 X}); in the spirit of Erdös’ original prize on this problem, I would like to offer 10,000 USD for anyone who can show (in a refereed publication, of course) that the constant {c} here can be replaced by an arbitrarily large constant {C}.

The reason for the growth rate (3) is as follows. After following the sieving process discussed in the previous post, the problem comes down to something like the following: can one sieve out all (or almost all) of the primes in {[x,y]} by removing one residue class modulo {p} for all primes {p} in (say) {[x/4,x/2]}? Very roughly speaking, if one can solve this problem with {y = g(x) x}, then one can obtain a growth rate on {f(X)} of the shape {f(X) \sim g(\log X)}. (This is an oversimplification, as one actually has to sieve out a random subset of the primes, rather than all the primes in {[x,y]}, but never mind this detail for now.)

Using the quantitative “dense clusters of primes” machinery of Maynard, one can find lots of {k}-tuples in {[x,y]} which contain at least {\gg \log k} primes, for {k} as large as {\log^c x} or so (so that {\log k} is about {\log\log x}). By considering {k}-tuples in arithmetic progression, this means that one can find lots of residue classes modulo a given prime {p} in {[x/4,x/2]} that capture about {\log\log x} primes. In principle, this means that union of all these residue classes can cover about {\frac{x}{\log x} \log\log x} primes, allowing one to take {g(x)} as large as {\log\log x}, which corresponds to (3). However, there is a catch: the residue classes for different primes {p} may collide with each other, reducing the efficiency of the covering. In our previous papers on the subject, we selected the residue classes randomly, which meant that we had to insert an additional logarithmic safety margin in expected number of times each prime would be shifted out by one of the residue classes, in order to guarantee that we would (with high probability) sift out most of the primes. This additional safety margin is ultimately responsible for the {\log\log\log\log X} loss in (2).

The main innovation of this paper, beyond detailing James’ unpublished calculations, is to use ideas from the literature on efficient hypergraph covering, to avoid the need for a logarithmic safety margin. The hypergraph covering problem, roughly speaking, is to try to cover a set of {n} vertices using as few “edges” from a given hypergraph {H} as possible. If each edge has {m} vertices, then one certainly needs at least {n/m} edges to cover all the vertices, and the question is to see if one can come close to attaining this bound given some reasonable uniform distribution hypotheses on the hypergraph {H}. As before, random methods tend to require something like {\frac{n}{m} \log r} edges before one expects to cover, say {1-1/r} of the vertices.

However, it turns out (under reasonable hypotheses on {H}) to eliminate this logarithmic loss, by using what is now known as the “semi-random method” or the “Rödl nibble”. The idea is to randomly select a small number of edges (a first “nibble”) – small enough that the edges are unlikely to overlap much with each other, thus obtaining maximal efficiency. Then, one pauses to remove all the edges from {H} that intersect edges from this first nibble, so that all remaining edges will not overlap with the existing edges. One then randomly selects another small number of edges (a second “nibble”), and repeats this process until enough nibbles are taken to cover most of the vertices. Remarkably, it turns out that under some reasonable assumptions on the hypergraph {H}, one can maintain control on the uniform distribution of the edges throughout the nibbling process, and obtain an efficient hypergraph covering. This strategy was carried out in detail in an influential paper of Pippenger and Spencer.

In our setup, the vertices are the primes in {[x,y]}, and the edges are the intersection of the primes with various residue classes. (Technically, we have to work with a family of hypergraphs indexed by a prime {p}, rather than a single hypergraph, but let me ignore this minor technical detail.) The semi-random method would in principle eliminate the logarithmic loss and recover the bound (3). However, there is a catch: the analysis of Pippenger and Spencer relies heavily on the assumption that the hypergraph is uniform, that is to say all edges have the same size. In our context, this requirement would mean that each residue class captures exactly the same number of primes, which is not the case; we only control the number of primes in an average sense, but we were unable to obtain any concentration of measure to come close to verifying this hypothesis. And indeed, the semi-random method, when applied naively, does not work well with edges of variable size – the problem is that edges of large size are much more likely to be eliminated after each nibble than edges of small size, since they have many more vertices that could overlap with the previous nibbles. Since the large edges are clearly the more useful ones for the covering problem than small ones, this bias towards eliminating large edges significantly reduces the efficiency of the semi-random method (and also greatly complicates the analysis of that method).

Our solution to this is to iteratively reweight the probability distribution on edges after each nibble to compensate for this bias effect, giving larger edges a greater weight than smaller edges. It turns out that there is a natural way to do this reweighting that allows one to repeat the Pippenger-Spencer analysis in the presence of edges of variable size, and this ultimately allows us to recover the full growth rate (3).

To go beyond (3), one either has to find a lot of residue classes that can capture significantly more than {\log\log x} primes of size {x} (which is the limit of the multidimensional Selberg sieve of Maynard and myself), or else one has to find a very different method to produce large gaps between primes than the Erdös-Rankin method, which is the method used in all previous work on the subject.

It turns out that the arguments in this paper can be combined with the Maier matrix method to also produce chains of consecutive large prime gaps whose size is of the order of (4); three of us (Kevin, James, and myself) will detail this in a future paper. (A similar combination was also recently observed in connection with our earlier result (1) by Pintz, but there are some additional technical wrinkles required to recover the full gain of (3) for the chains of large gaps problem.)

Van Vu and I have just uploaded to the arXiv our paper “Random matrices have simple eigenvalues“. Recall that an {n \times n} Hermitian matrix is said to have simple eigenvalues if all of its {n} eigenvalues are distinct. This is a very typical property of matrices to have: for instance, as discussed in this previous post, in the space of all {n \times n} Hermitian matrices, the space of matrices without all eigenvalues simple has codimension three, and for real symmetric cases this space has codimension two. In particular, given any random matrix ensemble of Hermitian or real symmetric matrices with an absolutely continuous distribution, we conclude that random matrices drawn from this ensemble will almost surely have simple eigenvalues.

For discrete random matrix ensembles, though, the above argument breaks down, even though general universality heuristics predict that the statistics of discrete ensembles should behave similarly to those of continuous ensembles. A model case here is the adjacency matrix {M_n} of an Erdös-Rényi graph – a graph on {n} vertices in which any pair of vertices has an independent probability {p} of being in the graph. For the purposes of this paper one should view {p} as fixed, e.g. {p=1/2}, while {n} is an asymptotic parameter going to infinity. In this context, our main result is the following (answering a question of Babai):

Theorem 1 With probability {1-o(1)}, {M_n} has simple eigenvalues.

Our argument works for more general Wigner-type matrix ensembles, but for sake of illustration we will stick with the Erdös-Renyi case. Previous work on local universality for such matrix models (e.g. the work of Erdos, Knowles, Yau, and Yin) was able to show that any individual eigenvalue gap {\lambda_{i+1}(M)-\lambda_i(M)} did not vanish with probability {1-o(1)} (in fact {1-O(n^{-c})} for some absolute constant {c>0}), but because there are {n} different gaps that one has to simultaneously ensure to be non-zero, this did not give Theorem 1 as one is forced to apply the union bound.

Our argument in fact gives simplicity of the spectrum with probability {1-O(n^{-A})} for any fixed {A}; in a subsequent paper we also show that it gives a quantitative lower bound on the eigenvalue gaps (analogous to how many results on the singularity probability of random matrices can be upgraded to a bound on the least singular value).

The basic idea of argument can be sketched as follows. Suppose that {M_n} has a repeated eigenvalue {\lambda}. We split

\displaystyle  M_n = \begin{pmatrix} M_{n-1} & X \\ X^T & 0 \end{pmatrix}

for a random {n-1 \times n-1} minor {M_{n-1}} and a random sign vector {X}; crucially, {X} and {M_{n-1}} are independent. If {M_n} has a repeated eigenvalue {\lambda}, then by the Cauchy interlacing law, {M_{n-1}} also has an eigenvalue {\lambda}. We now write down the eigenvector equation for {M_n} at {\lambda}:

\displaystyle  \begin{pmatrix} M_{n-1} & X \\ X^T & 0 \end{pmatrix} \begin{pmatrix} v \\ a \end{pmatrix} = \lambda \begin{pmatrix} v \\ a \end{pmatrix}.

Extracting the top {n-1} coefficients, we obtain

\displaystyle  (M_{n-1} - \lambda) v + a X = 0.

If we let {w} be the {\lambda}-eigenvector of {M_{n-1}}, then by taking inner products with {w} we conclude that

\displaystyle  a (w \cdot X) = 0;

we typically expect {a} to be non-zero, in which case we arrive at

\displaystyle  w \cdot X = 0.

In other words, in order for {M_n} to have a repeated eigenvalue, the top right column {X} of {M_n} has to be orthogonal to an eigenvector {w} of the minor {M_{n-1}}. Note that {X} and {w} are going to be independent (once we specify which eigenvector of {M_{n-1}} to take as {w}). On the other hand, thanks to inverse Littlewood-Offord theory (specifically, we use an inverse Littlewood-Offord theorem of Nguyen and Vu), we know that the vector {X} is unlikely to be orthogonal to any given vector {w} independent of {X}, unless the coefficients of {w} are extremely special (specifically, that most of them lie in a generalised arithmetic progression). The main remaining difficulty is then to show that eigenvectors of a random matrix are typically not of this special form, and this relies on a conditioning argument originally used by Komlós to bound the singularity probability of a random sign matrix. (Basically, if an eigenvector has this special form, then one can use a fraction of the rows and columns of the random matrix to determine the eigenvector completely, while still preserving enough randomness in the remaining portion of the matrix so that this vector will in fact not be an eigenvector with high probability.)

I’ve just uploaded to the arXiv my paper “The Elliott-Halberstam conjecture implies the Vinogradov least quadratic nonresidue conjecture“. As the title suggests, this paper links together the Elliott-Halberstam conjecture from sieve theory with the conjecture of Vinogradov concerning the least quadratic nonresidue {n(p)} of a prime {p}. Vinogradov established the bound

\displaystyle  n(p) \ll p^{\frac{1}{2\sqrt{e}}} \log^2 p \ \ \ \ \ (1)

and conjectured that

\displaystyle  n(p) \ll p^\varepsilon \ \ \ \ \ (2)

for any fixed {\varepsilon>0}. Unconditionally, the best result so far (up to logarithmic factors) that holds for all primes {p} is by Burgess, who showed that

\displaystyle  n(p) \ll p^{\frac{1}{4\sqrt{e}}+\varepsilon} \ \ \ \ \ (3)

for any fixed {\varepsilon>0}. See this previous post for a proof of these bounds.

In this paper, we show that the Vinogradov conjecture is a consequence of the Elliott-Halberstam conjecture. Using a variant of the argument, we also show that the “Type II” estimates established by Zhang and numerically improved by the Polymath8a project can be used to improve a little on the Vinogradov bound (1), to

\displaystyle  n(p) \ll p^{(\frac{1}{2}-\frac{1}{68})\frac{1}{\sqrt{e}} + \varepsilon},

although this falls well short of the Burgess bound. However, the method is somewhat different (although in both cases it is the Weil exponential sum bounds that are the source of the gain over (1)) and it is conceivable that a numerically stronger version of the Type II estimates could obtain results that are more competitive with the Burgess bound. At any rate, this demonstrates that the equidistribution estimates introduced by Zhang may have further applications beyond the family of results related to bounded gaps between primes.

The connection between the least quadratic nonresidue problem and Elliott-Halberstam is follows. Suppose for contradiction we can find a prime {q} with {n(q)} unusually large. Letting {\chi} be the quadratic character modulo {q}, this implies that the sums {\sum_{n \leq x} \chi(n)} are also unusually large for a significant range of {x} (e.g. {x < n(q)}), although the sum is also quite small for large {x} (e.g. {x > q}), due to the cancellation present in {\chi}. It turns out (by a sort of “uncertainty principle” for multiplicative functions, as per this paper of Granville and Soundararajan) that these facts force {\sum_{n\leq x} \chi(n) \Lambda(n)} to be unusually large in magnitude for some large {x} (with {q^C \leq x \leq q^{C'}} for two large absolute constants {C,C'}). By the periodicity of {\chi}, this means that

\displaystyle  \sum_{n\leq x} \chi(n) \Lambda(n+q)

must be unusually large also. However, because {n(q)} is large, one can factorise {\chi} as {f * 1} for a fairly sparsely supported function {f = \chi * \mu}. The Elliott-Halberstam conjecture, which controls the distribution of {\Lambda} in arithmetic progressions on the average can then be used to show that {\sum_{n \leq x} (f*1)(n) \Lambda(n+q)} is small, giving the required contradiction.

The implication involving Type II estimates is proven by a variant of the argument. If {n(q)} is large, then a character sum {\sum_{N\leq n \leq 2N} \chi(n)} is unusually large for a certain {N}. By multiplicativity of {\chi}, this shows that {\chi} correlates with {\chi * 1_{[N,2N]}}, and then by periodicity of {\chi}, this shows that {\chi(n)} correlates with {\chi * 1_{[N,2N]}(n+jq)} for various small {q}. By the Cauchy-Schwarz inequality (cf. this previous blog post), this implies that {\chi * 1_{[N,2N]}(n+jq)} correlates with {\chi * 1_{[N,2N]}(n+j'q)} for some distinct {j,j'}. But this can be ruled out by using Type II estimates.

I’ll record here a well-known observation concerning potential counterexamples to any improvement to the Burgess bound, that is to say an infinite sequence of primes {p} with {n(p) = p^{\frac{1}{4\sqrt{e}} + o(1)}}. Suppose we let {a(t)} be the asymptotic mean value of the quadratic character {\chi} at {p^t} and {b(t)} the mean value of {\chi \Lambda}; these quantities are defined precisely in my paper, but roughly speaking one can think of

\displaystyle  a(t) = \lim_{p \rightarrow \infty} \frac{1}{p^t} \sum_{n \leq p^t} \chi(n)


\displaystyle  b(t) = \lim_{p \rightarrow \infty} \frac{1}{p^t} \sum_{n \leq p^t} \chi(n) \Lambda(n).

Thanks to the basic Dirichlet convolution identity {\chi(n) \log(n) = \chi * \chi\Lambda(n)}, one can establish the Wirsing integral equation

\displaystyle  t a(t) = \int_0^t a(u) b(t-u)\ du

for all {t \geq 0}; see my paper for details (actually far sharper claims than this appear in previous work of Wirsing and Granville-Soundararajan). If we have an infinite sequence of counterexamples to any improvement to the Burgess bound, then we have

\displaystyle  a(t)=b(t) = 1 \hbox{ for } t < \frac{1}{4\sqrt{e}}

while from the Burgess exponential sum estimates we have

\displaystyle  a(t) = 0 \hbox{ for } t > \frac{1}{4}.

These two constraints, together with the Wirsing integral equation, end up determining {a} and {b} completely. It turns out that we must have

\displaystyle  b(t) = -1 \hbox{ for } \frac{1}{4\sqrt{e}} \leq t \leq \frac{1}{4}


\displaystyle  a(t) = 1 - 2 \log(4 \sqrt{e} t) \hbox{ for } \frac{1}{4\sqrt{e}} \leq t \leq \frac{1}{4}

and then for {t > \frac{1}{4}}, {b} evolves by the integral equation

\displaystyle  b(t) = \int_{1/4\sqrt{e}}^{1/4} b(t-u) \frac{2du}{u}.

For instance

\displaystyle  b(t) = 1 \hbox{ for } \frac{1}{4} < t \leq \frac{1}{2\sqrt{e}}

and then {b} oscillates in a somewhat strange fashion towards zero as {t \rightarrow \infty}. This very odd behaviour of {\sum_n \chi(n) \Lambda(n)} is surely impossible, but it seems remarkably hard to exclude it without invoking a strong hypothesis, such as GRH or the Elliott-Halberstam conjecture (or weaker versions thereof).

Tamar Ziegler and I have just uploaded to the arXiv our paper “Narrow progressions in the primes“, submitted to the special issue “Analytic Number Theory” in honor of the 60th birthday of Helmut Maier. The results here are vaguely reminiscent of the recent progress on bounded gaps in the primes, but use different methods.

About a decade ago, Ben Green and I showed that the primes contained arbitrarily long arithmetic progressions: given any {k}, one could find a progression {n, n+r, \dots, n+(k-1)r} with {r>0} consisting entirely of primes. In fact we showed the same statement was true if the primes were replaced by any subset of the primes of positive relative density.

A little while later, Tamar Ziegler and I obtained the following generalisation: given any {k} and any polynomials {P_1,\dots,P_k: {\bf Z} \rightarrow {\bf Z}} with {P_1(0)=\dots=P_k(0)}, one could find a “polynomial progression” {n+P_1(r),\dots,n+P_k(r)} with {r>0} consisting entirely of primes. Furthermore, we could make this progression somewhat “narrow” by taking {r = n^{o(1)}} (where {o(1)} denotes a quantity that goes to zero as {n} goes to infinity). Again, the same statement also applies if the primes were replaced by a subset of positive relative density. My previous result with Ben corresponds to the linear case {P_i(r) = (i-1)r}.

In this paper we were able to make the progressions a bit narrower still: given any {k} and any polynomials {P_1,\dots,P_k: {\bf Z} \rightarrow {\bf Z}} with {P_1(0)=\dots=P_k(0)}, one could find a “polynomial progression” {n+P_1(r),\dots,n+P_k(r)} with {r>0} consisting entirely of primes, and such that {r \leq \log^L n}, where {L} depends only on {k} and {P_1,\dots,P_k} (in fact it depends only on {k} and the degrees of {P_1,\dots,P_k}). The result is still true if the primes are replaced by a subset of positive density {\delta}, but unfortunately in our arguments we must then let {L} depend on {\delta}. However, in the linear case {P_i(r) = (i-1)r}, we were able to make {L} independent of {\delta} (although it is still somewhat large, of the order of {k 2^k}).

The polylogarithmic factor is somewhat necessary: using an upper bound sieve, one can easily construct a subset of the primes of density, say, {90\%}, whose arithmetic progressions {n,n+r,\dots,n+(k-1)r} of length {k} all obey the lower bound {r \gg \log^{k-1} n}. On the other hand, the prime tuples conjecture predicts that if one works with the actual primes rather than dense subsets of the primes, then one should have infinitely many length {k} arithmetic progressions of bounded width for any fixed {k}. The {k=2} case of this is precisely the celebrated theorem of Yitang Zhang that was the focus of the recently concluded Polymath8 project here. The higher {k} case is conjecturally true, but appears to be out of reach of known methods. (Using the multidimensional Selberg sieve of Maynard, one can get {m} primes inside an interval of length {O( \exp(O(m)) )}, but this is such a sparse set of primes that one would not expect to find even a progression of length three within such an interval.)

The argument in the previous paper was unable to obtain a polylogarithmic bound on the width of the progressions, due to the reliance on a certain technical “correlation condition” on a certain Selberg sieve weight {\nu}. This correlation condition required one to control arbitrarily long correlations of {\nu}, which was not compatible with a bounded value of {L} (particularly if one wanted to keep {L} independent of {\delta}).

However, thanks to recent advances in this area by Conlon, Fox, and Zhao (who introduced a very nice “densification” technique), it is now possible (in principle, at least) to delete this correlation condition from the arguments. Conlon-Fox-Zhao did this for my original theorem with Ben; and in the current paper we apply the densification method to our previous argument to similarly remove the correlation condition. This method does not fully eliminate the need to control arbitrarily long correlations, but allows most of the factors in such a long correlation to be bounded, rather than merely controlled by an unbounded weight such as {\nu}. This turns out to be significantly easier to control, although in the non-linear case we still unfortunately had to make {L} large compared to {\delta} due to a certain “clearing denominators” step arising from the complicated nature of the Gowers-type uniformity norms that we were using to control polynomial averages. We believe though that this an artefact of our method, and one should be able to prove our theorem with an {L} that is uniform in {\delta}.

Here is a simple instance of the densification trick in action. Suppose that one wishes to establish an estimate of the form

\displaystyle  {\bf E}_n {\bf E}_r f(n) g(n+r) h(n+r^2) = o(1) \ \ \ \ \ (1)

for some real-valued functions {f,g,h} which are bounded in magnitude by a weight function {\nu}, but which are not expected to be bounded; this average will naturally arise when trying to locate the pattern {(n,n+r,n+r^2)} in a set such as the primes. Here I will be vague as to exactly what range the parameters {n,r} are being averaged over. Suppose that the factor {g} (say) has enough uniformity that one can already show a smallness bound

\displaystyle  {\bf E}_n {\bf E}_r F(n) g(n+r) H(n+r^2) = o(1) \ \ \ \ \ (2)

whenever {F, H} are bounded functions. (One should think of {F,H} as being like the indicator functions of “dense” sets, in contrast to {f,h} which are like the normalised indicator functions of “sparse” sets). The bound (2) cannot be directly applied to control (1) because of the unbounded (or “sparse”) nature of {f} and {h}. However one can “densify” {f} and {h} as follows. Since {f} is bounded in magnitude by {\nu}, we can bound the left-hand side of (1) as

\displaystyle  {\bf E}_n \nu(n) | {\bf E}_r g(n+r) h(n+r^2) |.

The weight function {\nu} will be normalised so that {{\bf E}_n \nu(n) = O(1)}, so by the Cauchy-Schwarz inequality it suffices to show that

\displaystyle  {\bf E}_n \nu(n) | {\bf E}_r g(n+r) h(n+r^2) |^2 = o(1).

The left-hand side expands as

\displaystyle  {\bf E}_n {\bf E}_r {\bf E}_s \nu(n) g(n+r) h(n+r^2) g(n+s) h(n+s^2).

Now, it turns out that after an enormous (but finite) number of applications of the Cauchy-Schwarz inequality to steadily eliminate the {g,h} factors, as well as a certain “polynomial forms condition” hypothesis on {\nu}, one can show that

\displaystyle  {\bf E}_n {\bf E}_r {\bf E}_s (\nu-1)(n) g(n+r) h(n+r^2) g(n+s) h(n+s^2) = o(1).

(Because of the polynomial shifts, this requires a method known as “PET induction”, but let me skip over this point here.) In view of this estimate, we now just need to show that

\displaystyle  {\bf E}_n {\bf E}_r {\bf E}_s g(n+r) h(n+r^2) g(n+s) h(n+s^2) = o(1).

Now we can reverse the previous steps. First, we collapse back to

\displaystyle  {\bf E}_n | {\bf E}_r g(n+r) h(n+r^2) |^2 = o(1).

One can bound {|{\bf E}_r g(n+r) h(n+r^2)|} by {{\bf E}_r \nu(n+r) \nu(n+r^2)}, which can be shown to be “bounded on average” in a suitable sense (e.g. bounded {L^4} norm) via the aforementioned polynomial forms condition. Because of this and the Hölder inequality, the above estimate is equivalent to

\displaystyle  {\bf E}_n | {\bf E}_r g(n+r) h(n+r^2) | = o(1).

By setting {F} to be the signum of {{\bf E}_r g(n+r) h(n+r^2)}, this is equivalent to

\displaystyle  {\bf E}_n {\bf E}_r F(n) g(n+r) h(n+r^2) = o(1).

This is halfway between (1) and (2); the sparsely supported function {f} has been replaced by its “densification” {F}, but we have not yet densified {h} to {H}. However, one can shift {n} by {r^2} and repeat the above arguments to achieve a similar densificiation of {h}, at which point one has reduced (1) to (2).

Kevin Ford, Ben Green, Sergei Konyagin, and myself have just posted to the arXiv our preprint “Large gaps between consecutive prime numbers“. This paper concerns the “opposite” problem to that considered by the recently concluded Polymath8 project, which was concerned with very small values of the prime gap {p_{n+1}-p_n}. Here, we wish to consider the largest prime gap {G(X) = p_{n+1}-p_n} that one can find in the interval {[X] = \{1,\dots,X\}} as {X} goes to infinity.

Finding lower bounds on {G(X)} is more or less equivalent to locating long strings of consecutive composite numbers that are not too large compared to the length of the string. A classic (and quite well known) construction here starts with the observation that for any natural number {n}, the consecutive numbers {n!+2, n!+3,\dots,n!+n} are all composite, because each {n!+i}, {i=2,\dots,n} is divisible by some prime {p \leq n}, while being strictly larger than that prime {p}. From this and Stirling’s formula, it is not difficult to obtain the bound

\displaystyle G(X) \gg \frac{\log X}{\log\log X}. \ \ \ \ \ (1)


A more efficient bound comes from the prime number theorem: there are only {(1+o(1)) \frac{X}{\log X}} primes up to {X}, so just from the pigeonhole principle one can locate a string of consecutive composite numbers up to {X} of length at least {(1-o(1)) \log X}, thus

\displaystyle G(X) \gtrsim \log X \ \ \ \ \ (2)


where we use {X \gtrsim Y} or {Y \lesssim X} as shorthand for {X \geq (1-o(1)) Y} or {Y \leq (1+o(1)) X}.

What about upper bounds? The Cramér random model predicts that the primes up to {X} are distributed like a random subset {\{1,\dots,X\}} of density {1/\log X}. Using this model, Cramér arrived at the conjecture

\displaystyle G(X) \ll \log^2 X.

In fact, if one makes the extremely optimistic assumption that the random model perfectly describes the behaviour of the primes, one would arrive at the even more precise prediction

\displaystyle G(X) \sim \log^2 X.

However, it is no longer widely believed that this optimistic version of the conjecture is true, due to some additional irregularities in the primes coming from the basic fact that large primes cannot be divisible by very small primes. Using the Maier matrix method to capture some of this irregularity, Granville was led to the conjecture that

\displaystyle G(X) \gtrsim 2e^{-\gamma} \log^2 X

(note that {2e^{-\gamma} = 1.1229\dots} is slightly larger than {1}). For comparison, the known upper bounds on {G(X)} are quite weak; unconditionally one has {G(X) \ll X^{0.525}} by the work of Baker, Harman, and Pintz, and even on the Riemann hypothesis one only gets down to {G(X) \ll X^{1/2} \log X}, as shown by Cramér (a slight improvement is also possible if one additionally assumes the pair correlation conjecture; see this article of Heath-Brown and the references therein).

This conjecture remains out of reach of current methods. In 1931, Westzynthius managed to improve the bound (2) slightly to

\displaystyle G(X) \gg \frac{\log\log\log X}{\log\log\log\log X} \log X ,

which Erdös in 1935 improved to

\displaystyle G(X) \gg \frac{\log\log X}{(\log\log\log X)^2} \log X

and Rankin in 1938 improved slightly further to

\displaystyle G(X) \gtrsim c \frac{\log\log X (\log\log\log\log X)}{(\log\log\log X)^2} \log X \ \ \ \ \ (3)


with {c=1/3}. Remarkably, this rather strange bound then proved extremely difficult to advance further on; until recently, the only improvements were to the constant {c}, which was raised to {c=\frac{1}{2} e^\gamma} in 1963 by Schönhage, to {c= e^\gamma} in 1963 by Rankin, to {c = 1.31256 e^\gamma} by Maier and Pomerance, and finally to {c = 2e^\gamma} in 1997 by Pintz.

Erdös listed the problem of making {c} arbitrarily large one of his favourite open problems, even offering (“somewhat rashly”, in his words) a cash prize for the solution. Our main result answers this question in the affirmative:

Theorem 1 The bound (3) holds for arbitrarily large {c>0}.

In principle, we thus have a bound of the form

\displaystyle G(X) \geq f(X) \frac{\log\log X (\log\log\log\log X)}{(\log\log\log X)^2} \log X

for some {f(X)} that grows to infinity. Unfortunately, due to various sources of ineffectivity in our methods, we cannot provide any explicit rate of growth on {f(X)} at all.

We decided to announce this result the old-fashioned way, as part of a research lecture; more precisely, Ben Green announced the result in his ICM lecture this Tuesday. (The ICM staff have very efficiently put up video of his talks (and most of the other plenary and prize talks) online; Ben’s talk is here, with the announcement beginning at about 0:48. Note a slight typo in his slides, in that the exponent of {\log\log\log X} in the denominator is {3} instead of {2}.) Ben’s lecture slides may be found here.

By coincidence, an independent proof of this theorem has also been obtained very recently by James Maynard.

I discuss our proof method below the fold.

Read the rest of this entry »

I’ve just uploaded to the arXiv the D.H.J. Polymath paper “Variants of the Selberg sieve, and bounded intervals containing many primes“, which is the second paper to be produced from the Polymath8 project (the first one being discussed here). We’ll refer to this latter paper here as the Polymath8b paper, and the former as the Polymath8a paper. As with Polymath8a, the Polymath8b paper is concerned with the smallest asymptotic prime gap

\displaystyle H_1 := \liminf_{n \rightarrow \infty}(p_{n+1}-p_n),

where {p_n} denotes the {n^{th}} prime, as well as the more general quantities

\displaystyle H_m := \liminf_{n \rightarrow \infty}(p_{n+m}-p_n).

In the breakthrough paper of Goldston, Pintz, and Yildirim, the bound {H_1 \leq 16} was obtained under the strong hypothesis of the Elliott-Halberstam conjecture. An unconditional bound on {H_1}, however, remained elusive until the celebrated work of Zhang last year, who showed that

\displaystyle H_1 \leq 70{,}000{,}000.

The Polymath8a paper then improved this to {H_1 \leq 4{,}680}. After that, Maynard introduced a new multidimensional Selberg sieve argument that gave the substantial improvement

\displaystyle H_1 \leq 600

unconditionally, and {H_1 \leq 12} on the Elliott-Halberstam conjecture; furthermore, bounds on {H_m} for higher {m} were obtained for the first time, and specifically that {H_m \ll m^3 e^{4m}} for all {m \geq 1}, with the improvements {H_2 \leq 600} and {H_m \ll m^3 e^{2m}} on the Elliott-Halberstam conjecture. (I had independently discovered the multidimensional sieve idea, although I did not obtain Maynard’s specific numerical results, and my asymptotic bounds were a bit weaker.)

In Polymath8b, we obtain some further improvements. Unconditionally, we have {H_1 \leq 246} and {H_m \ll m e^{(4 - \frac{28}{157}) m}}, together with some explicit bounds on {H_2,H_3,H_4,H_5}; on the Elliott-Halberstam conjecture we have {H_m \ll m e^{2m}} and some numerical improvements to the {H_2,H_3,H_4,H_5} bounds; and assuming the generalised Elliott-Halberstam conjecture we have the bound {H_1 \leq 6}, which is best possible from sieve-theoretic methods thanks to the parity problem obstruction.

There were a variety of methods used to establish these results. Maynard’s paper obtained a criterion for bounding {H_m} which reduced to finding a good solution to a certain multidimensional variational problem. When the dimension parameter {k} was relatively small (e.g. {k \leq 100}), we were able to obtain good numerical solutions both by continuing the method of Maynard (using a basis of symmetric polynomials), or by using a Krylov iteration scheme. For large {k}, we refined the asymptotics and obtained near-optimal solutions of the variational problem. For the {H_1} bounds, we extended the reach of the multidimensional Selberg sieve (particularly under the assumption of the generalised Elliott-Halberstam conjecture) by allowing the function {F} in the multidimensional variational problem to extend to a larger region of space than was previously admissible, albeit with some tricky new constraints on {F} (and penalties in the variational problem). This required some unusual sieve-theoretic manipulations, notably an “epsilon trick”, ultimately relying on the elementary inequality {(a+b)^2 \geq a^2 + 2ab}, that allowed one to get non-trivial lower bounds for sums such as {\sum_n (a(n)+b(n))^2} even if the sum {\sum_n b(n)^2} had no non-trivial estimates available; and a way to estimate divisor sums such as {\sum_{n\leq x} \sum_{d|n} \lambda_d} even if {d} was permitted to be comparable to or even exceed {x}, by using the fundamental theorem of arithmetic to factorise {n} (after restricting to the case when {n} is almost prime). I hope that these sieve-theoretic tricks will be useful in future work in the subject.

With this paper, the Polymath8 project is almost complete; there is still a little bit of scope to push our methods further and get some modest improvement for instance to the {H_1 \leq 246} bound, but this would require a substantial amount of effort, and it is probably best to instead wait for some new breakthrough in the subject to come along. One final task we are performing is to write up a retrospective article on both the 8a and 8b experiences, an incomplete writeup of which can be found here. If anyone wishes to contribute some commentary on these projects (whether you were an active contributor, an occasional contributor, or a silent “lurker” in the online discussion), please feel free to do so in the comments to this post.

There are multiple purposes to this blog post.

The first purpose is to announce the uploading of the paper “New equidistribution estimates of Zhang type, and bounded gaps between primes” by D.H.J. Polymath, which is the main output of the Polymath8a project on bounded gaps between primes, to the arXiv, and to describe the main results of this paper below the fold.

The second purpose is to roll over the previous thread on all remaining Polymath8a-related matters (e.g. updates on the submission status of the paper) to a fresh thread. (Discussion of the ongoing Polymath8b project is however being kept on a separate thread, to try to reduce confusion.)

The final purpose of this post is to coordinate the writing of a retrospective article on the Polymath8 experience, which has been solicited for the Newsletter of the European Mathematical Society. I suppose that this could encompass both the Polymath8a and Polymath8b projects, even though the second one is still ongoing (but I think we will soon be entering the endgame there). I think there would be two main purposes of such a retrospective article. The first one would be to tell a story about the process of conducting mathematical research, rather than just describe the outcome of such research; this is an important aspect of the subject which is given almost no attention in most mathematical writing, and it would be good to be able to capture some sense of this process while memories are still relatively fresh. The other would be to draw some tentative conclusions with regards to what the strengths and weaknesses of a Polymath project are, and how appropriate such a format would be for other mathematical problems than bounded gaps between primes. In my opinion, the bounded gaps problem had some fairly unique features that made it particularly amenable to a Polymath project, such as (a) a high level of interest amongst the mathematical community in the problem; (b) a very focused objective (“improve {H}!”), which naturally provided an obvious metric to measure progress; (c) the modular nature of the project, which allowed for people to focus on one aspect of the problem only, and still make contributions to the final goal; and (d) a very reasonable level of ambition (for instance, we did not attempt to prove the twin prime conjecture, which in my opinion would make a terrible Polymath project at our current level of mathematical technology). This is not an exhaustive list of helpful features of the problem; I would welcome other diagnoses of the project by other participants.

With these two objectives in mind, I propose a format for the retrospective article consisting of a brief introduction to the polymath concept in general and the polymath8 project in particular, followed by a collection of essentially independent contributions by different participants on their own experiences and thoughts. Finally we could have a conclusion section in which we make some general remarks on the polymath project (such as the remarks above). I’ve started a dropbox subfolder for this article (currently in a very skeletal outline form only), and will begin writing a section on my own experiences; other participants are of course encouraged to add their own sections (it is probably best to create separate files for these, and then input them into the main file retrospective.tex, to reduce edit conflicts. If there are participants who wish to contribute but do not currently have access to the Dropbox folder, please email me and I will try to have you added (or else you can supply your thoughts by email, or in the comments to this post; we may have a section for shorter miscellaneous comments from more casual participants, for people who don’t wish to write a lengthy essay on the subject).

As for deadlines, the EMS Newsletter would like a submitted article by mid-April in order to make the June issue, but in the worst case, it will just be held over until the issue after that.

Read the rest of this entry »

I’ve just uploaded to the arXiv the paper “Finite time blowup for an averaged three-dimensional Navier-Stokes equation“, submitted to J. Amer. Math. Soc.. The main purpose of this paper is to formalise the “supercriticality barrier” for the global regularity problem for the Navier-Stokes equation, which roughly speaking asserts that it is not possible to establish global regularity by any “abstract” approach which only uses upper bound function space estimates on the nonlinear part of the equation, combined with the energy identity. This is done by constructing a modification of the Navier-Stokes equations with a nonlinearity that obeys essentially all of the function space estimates that the true Navier-Stokes nonlinearity does, and which also obeys the energy identity, but for which one can construct solutions that blow up in finite time. Results of this type had been previously established by Montgomery-Smith, Gallagher-Paicu, and Li-Sinai for variants of the Navier-Stokes equation without the energy identity, and by Katz-Pavlovic and by Cheskidov for dyadic analogues of the Navier-Stokes equations in five and higher dimensions that obeyed the energy identity (see also the work of Plechac and Sverak and of Hou and Lei that also suggest blowup for other Navier-Stokes type models obeying the energy identity in five and higher dimensions), but to my knowledge this is the first blowup result for a Navier-Stokes type equation in three dimensions that also obeys the energy identity. Intriguingly, the method of proof in fact hints at a possible route to establishing blowup for the true Navier-Stokes equations, which I am now increasingly inclined to believe is the case (albeit for a very small set of initial data).

To state the results more precisely, recall that the Navier-Stokes equations can be written in the form

\displaystyle  \partial_t u + (u \cdot \nabla) u = \nu \Delta u + \nabla p

for a divergence-free velocity field {u} and a pressure field {p}, where {\nu>0} is the viscosity, which we will normalise to be one. We will work in the non-periodic setting, so the spatial domain is {{\bf R}^3}, and for sake of exposition I will not discuss matters of regularity or decay of the solution (but we will always be working with strong notions of solution here rather than weak ones). Applying the Leray projection {P} to divergence-free vector fields to this equation, we can eliminate the pressure, and obtain an evolution equation

\displaystyle  \partial_t u = \Delta u + B(u,u) \ \ \ \ \ (1)

purely for the velocity field, where {B} is a certain bilinear operator on divergence-free vector fields (specifically, {B(u,v) = -\frac{1}{2} P( (u \cdot \nabla) v + (v \cdot \nabla) u)}. The global regularity problem for Navier-Stokes is then equivalent to the global regularity problem for the evolution equation (1).

An important feature of the bilinear operator {B} appearing in (1) is the cancellation law

\displaystyle  \langle B(u,u), u \rangle = 0

(using the {L^2} inner product on divergence-free vector fields), which leads in particular to the fundamental energy identity

\displaystyle  \frac{1}{2} \int_{{\bf R}^3} |u(T,x)|^2\ dx + \int_0^T \int_{{\bf R}^3} |\nabla u(t,x)|^2\ dx dt = \frac{1}{2} \int_{{\bf R}^3} |u(0,x)|^2\ dx.

This identity (and its consequences) provide essentially the only known a priori bound on solutions to the Navier-Stokes equations from large data and arbitrary times. Unfortunately, as discussed in this previous post, the quantities controlled by the energy identity are supercritical with respect to scaling, which is the fundamental obstacle that has defeated all attempts to solve the global regularity problem for Navier-Stokes without any additional assumptions on the data or solution (e.g. perturbative hypotheses, or a priori control on a critical norm such as the {L^\infty_t L^3_x} norm).

Our main result is then (slightly informally stated) as follows

Theorem 1 There exists an averaged version {\tilde B} of the bilinear operator {B}, of the form

\displaystyle  \tilde B(u,v) := \int_\Omega m_{3,\omega}(D) Rot_{3,\omega}

\displaystyle B( m_{1,\omega}(D) Rot_{1,\omega} u, m_{2,\omega}(D) Rot_{2,\omega} v )\ d\mu(\omega)

for some probability space {(\Omega, \mu)}, some spatial rotation operators {Rot_{i,\omega}} for {i=1,2,3}, and some Fourier multipliers {m_{i,\omega}} of order {0}, for which one still has the cancellation law

\displaystyle  \langle \tilde B(u,u), u \rangle = 0

and for which the averaged Navier-Stokes equation

\displaystyle  \partial_t u = \Delta u + \tilde B(u,u) \ \ \ \ \ (2)

admits solutions that blow up in finite time.

(There are some integrability conditions on the Fourier multipliers {m_{i,\omega}} required in the above theorem in order for the conclusion to be non-trivial, but I am omitting them here for sake of exposition.)

Because spatial rotations and Fourier multipliers of order {0} are bounded on most function spaces, {\tilde B} automatically obeys almost all of the upper bound estimates that {B} does. Thus, this theorem blocks any attempt to prove global regularity for the true Navier-Stokes equations which relies purely on the energy identity and on upper bound estimates for the nonlinearity; one must use some additional structure of the nonlinear operator {B} which is not shared by an averaged version {\tilde B}. Such additional structure certainly exists – for instance, the Navier-Stokes equation has a vorticity formulation involving only differential operators rather than pseudodifferential ones, whereas a general equation of the form (2) does not. However, “abstract” approaches to global regularity generally do not exploit such structure, and thus cannot be used to affirmatively answer the Navier-Stokes problem.

It turns out that the particular averaged bilinear operator {B} that we will use will be a finite linear combination of local cascade operators, which take the form

\displaystyle  C(u,v) := \sum_{n \in {\bf Z}} (1+\epsilon_0)^{5n/2} \langle u, \psi_{1,n} \rangle \langle v, \psi_{2,n} \rangle \psi_{3,n}

where {\epsilon_0>0} is a small parameter, {\psi_1,\psi_2,\psi_3} are Schwartz vector fields whose Fourier transform is supported on an annulus, and {\psi_{i,n}(x) := (1+\epsilon_0)^{3n/2} \psi_i( (1+\epsilon_0)^n x)} is an {L^2}-rescaled version of {\psi_i} (basically a “wavelet” of wavelength about {(1+\epsilon_0)^{-n}} centred at the origin). Such operators were essentially introduced by Katz and Pavlovic as dyadic models for {B}; they have the essentially the same scaling property as {B} (except that one can only scale along powers of {1+\epsilon_0}, rather than over all positive reals), and in fact they can be expressed as an average of {B} in the sense of the above theorem, as can be shown after a somewhat tedious amount of Fourier-analytic symbol manipulations.

If we consider nonlinearities {\tilde B} which are a finite linear combination of local cascade operators, then the equation (2) more or less collapses to a system of ODE in certain “wavelet coefficients” of {u}. The precise ODE that shows up depends on what precise combination of local cascade operators one is using. Katz and Pavlovic essentially considered a single cascade operator together with its “adjoint” (needed to preserve the energy identity), and arrived (more or less) at the system of ODE

\displaystyle  \partial_t X_n = - (1+\epsilon_0)^{2n} X_n + (1+\epsilon_0)^{\frac{5}{2}(n-1)} X_{n-1}^2 - (1+\epsilon_0)^{\frac{5}{2} n} X_n X_{n+1} \ \ \ \ \ (3)

where {X_n: [0,T] \rightarrow {\bf R}} are scalar fields for each integer {n}. (Actually, Katz-Pavlovic worked with a technical variant of this particular equation, but the differences are not so important for this current discussion.) Note that the quadratic terms on the RHS carry a higher exponent of {1+\epsilon_0} than the dissipation term; this reflects the supercritical nature of this evolution (the energy {\frac{1}{2} \sum_n X_n^2} is monotone decreasing in this flow, so the natural size of {X_n} given the control on the energy is {O(1)}). There is a slight technical issue with the dissipation if one wishes to embed (3) into an equation of the form (2), but it is minor and I will not discuss it further here.

In principle, if the {X_n} mode has size comparable to {1} at some time {t_n}, then energy should flow from {X_n} to {X_{n+1}} at a rate comparable to {(1+\epsilon_0)^{\frac{5}{2} n}}, so that by time {t_{n+1} \approx t_n + (1+\epsilon_0)^{-\frac{5}{2} n}} or so, most of the energy of {X_n} should have drained into the {X_{n+1}} mode (with hardly any energy dissipated). Since the series {\sum_{n \geq 1} (1+\epsilon_0)^{-\frac{5}{2} n}} is summable, this suggests finite time blowup for this ODE as the energy races ever more quickly to higher and higher modes. Such a scenario was indeed established by Katz and Pavlovic (and refined by Cheskidov) if the dissipation strength {(1+\epsilon)^{2n}} was weakened somewhat (the exponent {2} has to be lowered to be less than {\frac{5}{3}}). As mentioned above, this is enough to give a version of Theorem 1 in five and higher dimensions.

On the other hand, it was shown a few years ago by Barbato, Morandin, and Romito that (3) in fact admits global smooth solutions (at least in the dyadic case {\epsilon_0=1}, and assuming non-negative initial data). Roughly speaking, the problem is that as energy is being transferred from {X_n} to {X_{n+1}}, energy is also simultaneously being transferred from {X_{n+1}} to {X_{n+2}}, and as such the solution races off to higher modes a bit too prematurely, without absorbing all of the energy from lower modes. This weakens the strength of the blowup to the point where the moderately strong dissipation in (3) is enough to kill the high frequency cascade before a true singularity occurs. Because of this, the original Katz-Pavlovic model cannot quite be used to establish Theorem 1 in three dimensions. (Actually, the original Katz-Pavlovic model had some additional dispersive features which allowed for another proof of global smooth solutions, which is an unpublished result of Nazarov.)

To get around this, I had to “engineer” an ODE system with similar features to (3) (namely, a quadratic nonlinearity, a monotone total energy, and the indicated exponents of {(1+\epsilon_0)} for both the dissipation term and the quadratic terms), but for which the cascade of energy from scale {n} to scale {n+1} was not interrupted by the cascade of energy from scale {n+1} to scale {n+2}. To do this, I needed to insert a delay in the cascade process (so that after energy was dumped into scale {n}, it would take some time before the energy would start to transfer to scale {n+1}), but the process also needed to be abrupt (once the process of energy transfer started, it needed to conclude very quickly, before the delayed transfer for the next scale kicked in). It turned out that one could build a “quadratic circuit” out of some basic “quadratic gates” (analogous to how an electrical circuit could be built out of basic gates such as amplifiers or resistors) that achieved this task, leading to an ODE system essentially of the form

\displaystyle \partial_t X_{1,n} = - (1+\epsilon_0)^{2n} X_{1,n}

\displaystyle  + (1+\epsilon_0)^{5n/2} (- \epsilon^{-2} X_{3,n} X_{4,n} - \epsilon X_{1,n} X_{2,n} - \epsilon^2 \exp(-K^{10}) X_{1,n} X_{3,n}

\displaystyle  + K X_{4,n-1}^2)

\displaystyle  \partial_t X_{2,n} = - (1+\epsilon_0)^{2n} X_{2,n} + (1+\epsilon_0)^{5n/2} (\epsilon X_{1,n}^2 - \epsilon^{-1} K^{10} X_{3,n}^2)

\displaystyle  \partial_t X_{3,n} = - (1+\epsilon_0)^{2n} X_{3,n} + (1+\epsilon_0)^{5n/2} (\epsilon^2 \exp(-K^{10}) X_{1,n}^2

\displaystyle + \epsilon^{-1} K^{10} X_{2,n} X_{3,n} )

\displaystyle  \partial_t X_{4,n} =- (1+\epsilon_0)^{2n} X_{4,n} + (1+\epsilon_0)^{5n/2} (\epsilon^{-2} X_{3,n} X_{1,n}

\displaystyle - (1+\epsilon_0)^{5/2} K X_{4,n} X_{1,n+1})

where {K \geq 1} is a suitable large parameter and {\epsilon > 0} is a suitable small parameter (much smaller than {1/K}). To visualise the dynamics of such a system, I found it useful to describe this system graphically by a “circuit diagram” that is analogous (but not identical) to the circuit diagrams arising in electrical engineering:


The coupling constants here range widely from being very large to very small; in practice, this makes the {X_{2,n}} and {X_{3,n}} modes absorb very little energy, but exert a sizeable influence on the remaining modes. If a lot of energy is suddenly dumped into {X_{1,n}}, what happens next is roughly as follows: for a moderate period of time, nothing much happens other than a trickle of energy into {X_{2,n}}, which in turn causes a rapid exponential growth of {X_{3,n}} (from a very low base). After this delay, {X_{3,n}} suddenly crosses a certain threshold, at which point it causes {X_{1,n}} and {X_{4,n}} to exchange energy back and forth with extreme speed. The energy from {X_{4,n}} then rapidly drains into {X_{1,n+1}}, and the process begins again (with a slight loss in energy due to the dissipation). If one plots the total energy {E_n := \frac{1}{2} ( X_{1,n}^2 + X_{2,n}^2 + X_{3,n}^2 + X_{4,n}^2 )} as a function of time, it looks schematically like this:


As in the previous heuristic discussion, the time between cascades from one frequency scale to the next decay exponentially, leading to blowup at some finite time {T}. (One could describe the dynamics here as being similar to the famous “lighting the beacons” scene in the Lord of the Rings movies, except that (a) as each beacon gets ignited, the previous one is extinguished, as per the energy identity; (b) the time between beacon lightings decrease exponentially; and (c) there is no soundtrack.)

There is a real (but remote) possibility that this sort of construction can be adapted to the true Navier-Stokes equations. The basic blowup mechanism in the averaged equation is that of a von Neumann machine, or more precisely a construct (built within the laws of the inviscid evolution {\partial_t u = \tilde B(u,u)}) that, after some time delay, manages to suddenly create a replica of itself at a finer scale (and to largely erase its original instantiation in the process). In principle, such a von Neumann machine could also be built out of the laws of the inviscid form of the Navier-Stokes equations (i.e. the Euler equations). In physical terms, one would have to build the machine purely out of an ideal fluid (i.e. an inviscid incompressible fluid). If one could somehow create enough “logic gates” out of ideal fluid, one could presumably build a sort of “fluid computer”, at which point the task of building a von Neumann machine appears to reduce to a software engineering exercise rather than a PDE problem (providing that the gates are suitably stable with respect to perturbations, but (as with actual computers) this can presumably be done by converting the analog signals of fluid mechanics into a more error-resistant digital form). The key thing missing in this program (in both senses of the word) to establish blowup for Navier-Stokes is to construct the logic gates within the laws of ideal fluids. (Compare with the situation for cellular automata such as Conway’s “Game of Life“, in which Turing complete computers, universal constructors, and replicators have all been built within the laws of that game.)

I’ve just finished the first draft of my book “Expansion in finite simple groups of Lie type“, which is  based in the lecture notes for my graduate course on this topic that were previously posted on this blog.  It also contains some newer material, such as the notes on Lie algebras and Lie groups that I posted most recently here.

As always, corrections or comments are greatly appreciated (and errata will be collected at this page).

I’ve just uploaded to the arXiv my article “Algebraic combinatorial geometry: the polynomial method in arithmetic combinatorics, incidence combinatorics, and number theory“, submitted to the new journal “EMS surveys in the mathematical sciences“.  This is the first draft of a survey article on the polynomial method – a technique in combinatorics and number theory for controlling a relevant set of points by comparing it with the zero set of a suitably chosen polynomial, and then using tools from algebraic geometry (e.g. Bezout’s theorem) on that zero set. As such, the method combines algebraic geometry with combinatorial geometry, and could be viewed as the philosophy of a combined field which I dub “algebraic combinatorial geometry”.   There is also an important extension of this method when one is working overthe reals, in which methods from algebraic topology (e.g. the ham sandwich theorem and its generalisation to polynomials), and not just algebraic geometry, come into play also.

The polynomial method has been used independently many times in mathematics; for instance, it plays a key role in the proof of Baker’s theorem in transcendence theory, or Stepanov’s method in giving an elementary proof of the Riemann hypothesis for finite fields over curves; in combinatorics, the nullstellenatz of Alon is also another relatively early use of the polynomial method.  More recently, it underlies Dvir’s proof of the Kakeya conjecture over finite fields and Guth and Katz’s near-complete solution to the Erdos distance problem in the plane, and can be used to give a short proof of the Szemeredi-Trotter theorem.  One of the aims of this survey is to try to present all of these disparate applications of the polynomial method in a somewhat unified context; my hope is that there will eventually be a systematic foundation for algebraic combinatorial geometry which naturally contains all of these different instances the polynomial method (and also suggests new instances to explore); but the field is unfortunately not at that stage of maturity yet.

This is something of a first draft, so comments and suggestions are even more welcome than usual.  (For instance, I have already had my attention drawn to some additional uses of the polynomial method in the literature that I was not previously aware of.)


RSS Google+ feed

  • An error has occurred; the feed is probably down. Try again later.

Get every new post delivered to your Inbox.

Join 4,047 other followers