This is the seventh thread for the Polymath8b project to obtain new bounds for the quantity

\displaystyle  H_m := \liminf_{n \rightarrow\infty} (p_{n+m} - p_n),

either for small values of {m} (in particular {m=1,2}) or asymptotically as {m \rightarrow \infty}. The previous thread may be found here. The currently best known bounds on {H_m} can be found at the wiki page.

The current focus is on improving the upper bound on {H_1} under the assumption of the generalised Elliott-Halberstam conjecture (GEH) from {H_1 \leq 8} to {H_1 \leq 6}. Very recently, we have been able to exploit GEH more fully, leading to a promising new expansion of the sieve support region. The problem now reduces to the following:

Problem 1 Does there exist a (not necessarily convex) polytope {R \subset [0,2]^3} with quantities {0 \leq \varepsilon_1,\varepsilon_2,\varepsilon_3 \leq 1}, and a non-trivial square-integrable function {F: {\bf R}^3 \rightarrow {\bf R}} supported on {R} such that

  • {R + R \subset \{ (x,y,z) \in [0,4]^3: \min(x+y,y+z,z+x) \leq 2 \},}
  • {\int_0^\infty F(x,y,z)\ dx = 0} when {y+z \geq 1+\varepsilon_1};
  • {\int_0^\infty F(x,y,z)\ dy = 0} when {x+z \geq 1+\varepsilon_2};
  • {\int_0^\infty F(x,y,z)\ dz = 0} when {x+y \geq 1+\varepsilon_3};

and such that we have the inequality

\displaystyle  \int_{y+z \leq 1-\varepsilon_1} (\int_{\bf R} F(x,y,z)\ dx)^2\ dy dz

\displaystyle + \int_{z+x \leq 1-\varepsilon_2} (\int_{\bf R} F(x,y,z)\ dy)^2\ dz dx

\displaystyle + \int_{x+y \leq 1-\varepsilon_3} (\int_{\bf R} F(x,y,z)\ dz)^2\ dx dy

\displaystyle  > 2 \int_R F(x,y,z)^2\ dx dy dz?

An affirmative answer to this question will imply {H_1 \leq 6} on GEH. We are “within two percent” of this claim; we cannot quite reach {2} yet, but have got as far as {1.962998}. However, we have not yet fully optimised {F} in the above problem. In particular, the simplex

\displaystyle  R = \{ (x,y,z) \in [0,2]^3: x+y+z \leq 3/2 \}

is now available, and should lead to some noticeable improvement in the numerology.
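As a quick sanity check (a sketch, not part of the original argument), one can verify that this simplex obeys the first constraint of Problem 1. If {(x,y,z) \in R+R} then {x,y,z \geq 0} and {x+y+z \leq 3}, and since the three pairwise sums add up to {2(x+y+z)} we have

\displaystyle  3 \min(x+y,y+z,z+x) \leq (x+y)+(y+z)+(z+x) = 2(x+y+z) \leq 6,

so that {\min(x+y,y+z,z+x) \leq 2}, as required.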

There is also a very slim chance that the twin prime conjecture is now provable on GEH. It would require an affirmative solution to the following problem:

Problem 2 Does there exist a (not necessarily convex) polytope {R \subset [0,2]^2} with quantities {0 \leq \varepsilon_1,\varepsilon_2 \leq 1}, and a non-trivial square-integrable function {F: {\bf R}^2 \rightarrow {\bf R}} supported on {R} such that

  • {R + R \subset \{ (x,y) \in [0,4]^2: \min(x,y) \leq 2 \}}

    \displaystyle  = [0,2] \times [0,4] \cup [0,4] \times [0,2],

  • {\int_0^\infty F(x,y)\ dx = 0} when {y \geq 1+\varepsilon_1};
  • {\int_0^\infty F(x,y)\ dy = 0} when {x \geq 1+\varepsilon_2};

and such that we have the inequality

\displaystyle  \int_{y \leq 1-\varepsilon_1} (\int_{\bf R} F(x,y)\ dx)^2\ dy

\displaystyle + \int_{x \leq 1-\varepsilon_2} (\int_{\bf R} F(x,y)\ dy)^2\ dx

\displaystyle  > 2 \int_R F(x,y)^2\ dx dy?

We suspect that the answer to this question is negative, but have not formally ruled it out yet.
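To illustrate how the constraints of Problem 2 interact, here is a hypothetical test case (chosen purely for illustration, and not taken from the discussion above): let {R = [0,1]^2}, let {F} be the indicator function of {[0,1)^2}, and take {\varepsilon_1=\varepsilon_2=0}. Then {R+R = [0,2]^2} obeys the first constraint, the marginals {\int_0^\infty F(x,y)\ dx} and {\int_0^\infty F(x,y)\ dy} vanish when {y \geq 1} and {x \geq 1} respectively, and

\displaystyle  \int_{y \leq 1} (\int_{\bf R} F(x,y)\ dx)^2\ dy + \int_{x \leq 1} (\int_{\bf R} F(x,y)\ dy)^2\ dx = 1 + 1 = 2 = 2 \int_R F(x,y)^2\ dx dy.

Thus this simple choice attains equality but not the strict inequality that the problem demands.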

For the rest of this post, I will justify why positive answers to these sorts of variational problems are sufficient to get bounds on {H_1} (or more generally {H_m}).

— 1. Crude sieve bounds —

Let the notation be as in the Polymath8a paper, thus we have an admissible tuple {(h_1,\dots,h_k)}, a residue class {b\ (W)} with {b+h_i} coprime to {W} for all {i=1,\dots,k}, and an asymptotic parameter {x} going off to infinity. It will be convenient to use the notation

\displaystyle  \log_x y := \frac{\log y}{\log x}.

We let {I} be the interval

\displaystyle  I := \{ n \in [x,2x]: n = b\ (W) \}

and for each fixed smooth compactly supported function {F: [0,+\infty) \rightarrow {\bf R}}, we let {\alpha_F: {\bf Z} \rightarrow {\bf R}} denote the divisor sum

\displaystyle  \alpha_F(n) := \sum_{d|n} \mu(d) F(\log_x d).

We wish to understand the correlation of various products of divisor sums on {I}. For instance, in this previous blog post, the asymptotic

\displaystyle  \sum_{n \in I} \prod_{i=1}^k \alpha_{F_i}(n+h_i) \alpha_{G_i}(n+h_i) = \delta^k |I| ( \prod_{i=1}^k \int F'_i(t) G'_i(t)\ dt + o(1) ) \ \ \ \ \ (1)

was established whenever one has the support condition

\displaystyle  \sum_{i=1}^k S(F_i) + S(G_i) < 1 \ \ \ \ \ (2)

where {S(F) := \sup \{ x: F(x) \neq 0 \}} is the outer edge of the support of {F}, and

\displaystyle  \delta := \frac{W}{\phi(W) \log x}.

We are now interested in understanding the asymptotics when (2) fails. We have a crude pointwise upper bound:

Lemma 3 Let {F: [0,+\infty) \rightarrow {\bf R}} be a fixed smooth compactly supported function. Then for any natural number {n},

\displaystyle  |\alpha_F(n)| \ll \int_0^\infty [ \prod_{p|n} O( \min( t \log_x p, 1 ) ) ] \frac{dt}{1+|t|^A}

for any fixed {A>0}. More generally, for any fixed number {F_1,\ldots,F_j: [0,+\infty) \rightarrow {\bf R}} of fixed smooth compactly supported functions, one has

\displaystyle  |\alpha_{F_1}(n)| \dots |\alpha_{F_j}(n)| \ll \int_0^\infty [ \prod_{p|n} O( \min( t \log_x p, 1 ) ) ] \frac{dt}{1+t^A} \ \ \ \ \ (3)

Proof: We extend {F} smoothly to all of {{\bf R}} as a compactly supported function, and write the Fourier expansion

\displaystyle  F(x) = \int_{\bf R} \hat F(t) e^{-ixt}\ dt

for some rapidly decreasing function {\hat F(t)}. Then

\displaystyle  \alpha_F(n) = \int_{\bf R} \hat F(t) \sum_{d|n} \mu(d) \exp( - i t \log_x d )\ dt

\displaystyle  = \int_{\bf R} \hat F(t) \prod_{p|n} (1 - \exp( - i t \log_x p ))\ dt.

Taking absolute values, we conclude that

\displaystyle  |\alpha_F(n)| \leq \int_{\bf R} |\hat F(t)| \prod_{p|n} |1 - \exp( - i t \log_x p )|\ dt.

Since {|1 - \exp( - i t \log_x p )| \ll \min( |t| \log_x p, 1 )}, the first claim now follows from the rapid decrease of {\hat F}. To prove the second claim, we use the first claim to bound the left-hand side of (3) by

\displaystyle  \int_0^\infty \dots \int_0^\infty [\prod_{p|n} O( \prod_{i=1}^j \min( t_i \log_x p, 1 ) )] \frac{dt_1 \dots dt_j}{\prod_{i=1}^j (1+t_i^A)}.

Since

\displaystyle  \prod_{i=1}^j \min( t_i \log_x p, 1 ) \ll \min( (t_1+\dots+t_j) \log_x p, 1)

and

\displaystyle \prod_{i=1}^j (1+t_i^A) \gg 1 + (t_1+\dots+t_j)^A

the claim follows after a change of variables. \Box
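The factorisation used in this proof can be checked numerically. The following is a minimal Python sketch, assuming {n} is the product of the distinct primes supplied; the function names `divisor_side`, `product_side` and `crude_bound` are hypothetical and chosen only for this illustration. It compares the divisor sum with the Euler-type product, and checks the elementary bound {|1-e^{-i\theta}| \leq \min(|\theta|,2)} that underlies the {O(\min(t \log_x p,1))} factors.

```python
import cmath
import math
from itertools import combinations


def divisor_side(primes, t, x):
    """Sum of mu(d) * exp(-i t log_x d) over the divisors d of n.

    Here n is the product of the distinct primes given, so every divisor
    is squarefree and mu(d) = (-1)^(number of prime factors of d).
    """
    total = 0j
    for r in range(len(primes) + 1):
        for subset in combinations(primes, r):
            log_x_d = sum(math.log(p) for p in subset) / math.log(x)
            total += (-1) ** r * cmath.exp(-1j * t * log_x_d)
    return total


def product_side(primes, t, x):
    """Product over p | n of (1 - exp(-i t log_x p))."""
    out = 1 + 0j
    for p in primes:
        out *= 1 - cmath.exp(-1j * t * math.log(p) / math.log(x))
    return out


def crude_bound(primes, t, x):
    """Product over p | n of min(|t| log_x p, 2), an upper bound for |product_side|."""
    out = 1.0
    for p in primes:
        out *= min(abs(t) * math.log(p) / math.log(x), 2.0)
    return out
```

For instance, calling `divisor_side([2, 3, 5, 7], 3.7, 1000.0)` and `product_side([2, 3, 5, 7], 3.7, 1000.0)` should return the same complex number up to rounding error.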

Lemma 4 For each {i=1,\dots,k}, let {j_i \geq 1} and {m_i \geq 0} be fixed, and let {F_{i,1},\dots,F_{i,j_i}: [0,+\infty) \rightarrow {\bf R}} be fixed smooth compactly supported functions. Then

\displaystyle  \sum_{n \in I} \prod_{i=1}^k (\prod_{l=1}^{j_i} |\alpha_{F_{i,l}}(n+h_i)| \tau(n+h_i)^{m_i}) \ll \delta^k |I|. \ \ \ \ \ (4)

We also have the variant

\displaystyle  \sum_{n \in I} \prod_{i=1}^k (\prod_{l=1}^{j_i} |\alpha_{F_{i,l}}(n+h_i)| \tau(n+h_i)^{m_i}) 1_{p(n+h_s) \leq x^\epsilon} \ll \epsilon \delta^k |I| \ \ \ \ \ (5)

for any {\epsilon>0} and {s=1,\dots,k}, where {p(n)} is the least prime factor of {n}.

The intuition here is that each of the {\alpha_{F_{i,l}}(n+h_i)} is mostly bounded and mostly supported on the {n} for which {n+h_i} is almost prime (so in particular {\tau(n+h_i)} is bounded), which has a density of about {\delta} in {I}.

Proof: From (3) (and bounding {\tau(n)^{m_i} \leq \prod_{p|n} O(1)}), we can bound the left-hand side of (4) by

\displaystyle  \int_0^\infty \dots \int_0^\infty \sum_{n \in I} \prod_{i=1}^k \prod_{p_i|n+h_i} O( \min(t_i \log_x p_i, 1) ) \frac{dt_1 \dots dt_k}{\prod_{i=1}^k (1+t_i^A)}.

Let {c>0} be a small fixed number ({c=1/10k} will do). For each {n+h_i}, we let {p_{i,1} \leq \ldots \leq p_{i,\Omega(n+h_i)}} be the prime factors of {n+h_i} in increasing order (counting multiplicity), and let {p_{i,1} \dots p_{i,r_i}} be the largest product of consecutive prime factors that is bounded by {x^c}. In particular, we see that

\displaystyle  p_{i,1} \dots p_{i,r_i+1} \geq x^c

and hence

\displaystyle  p_{i,r_i+1} \geq x^{c/(r_i+1)}

which in particular implies that {\Omega(n+h_i) = O( r_i + 1 )}. This implies that

\displaystyle  \prod_{p_i|n+h_i} O( \min(t_i \log_x p_i, 1) ) \ll \prod_{j=1}^{r_i} O( \min( t_i \log_x p_{i,j}, 1) ).

Now observe that {n+h_i = p_{i,1} \dots p_{i,r_i} m}, where {m} is {p_{i,r_i+1}}-rough (i.e. no prime factors less than {p_{i,r_i+1}}). In particular, it is {x^{c/(r_i+1)}}-rough. Thus we can bound the left-hand side of (4) by

\displaystyle  \sum_{r_1,\dots,r_k \geq 0} \sum_{p_{i,1} \leq \dots \leq p_{i,r_i} \forall i=1,\dots,k} \int_0^\infty \dots \int_0^\infty

\displaystyle  (\prod_{i=1}^k \prod_{j=1}^{r_i} O(\min( t_i \log_x p_{i,j}, 1)))

\displaystyle  \sum_{n \in I} \prod_{i=1}^k 1_{n+h_i = p_{i,1} \dots p_{i,r_i} m; m \hbox{ is } x^{c/(r_i+1)}\hbox{-rough}}

\displaystyle  \frac{dt_1 \dots dt_k}{\prod_{i=1}^k (1+t_i^A)}.

By using a standard upper bound sieve (and taking {c} small enough), the quantity

\displaystyle  \sum_{n \in I} \prod_{i=1}^k 1_{n+h_i = p_{i,1} \dots p_{i,r_i} m; m \hbox{ is } x^{c/(r_i+1)}\hbox{-rough}}

may be bounded by

\displaystyle  \delta^k \frac{|I|}{\prod_{i=1}^k \prod_{j=1}^{r_i} p_{i,j}} \prod_{i=1}^k O( r_i + 1 ).

Since {O(r_i+1) \leq \prod_{j=1}^{r_i} O(1)}, we can thus bound the left-hand side of (4) by

\displaystyle  \delta^k |I| \sum_{r_1,\dots,r_k \geq 0} \sum_{p_{i,1} \leq \dots \leq p_{i,r_i} \forall i=1,\dots,k} \int_0^\infty \dots \int_0^\infty

\displaystyle  (\prod_{i=1}^k \prod_{j=1}^{r_i} O(\frac{\min( t_i \log_x p_{i,j}, 1)}{p_{i,j}}))

\displaystyle  \frac{dt_1 \dots dt_k}{\prod_{i=1}^k (1+t_i^A)}.

We can bound this by

\displaystyle  \delta^k |I| \sum_{r_1,\dots,r_k \geq 0} \int_0^\infty \dots \int_0^\infty

\displaystyle  (\prod_{i=1}^k \frac{1}{r_i!} O(\sum_{p \leq x^c} \frac{\min( t_i \log_x p, 1)}{p})^{r_i})

\displaystyle  \frac{dt_1 \dots dt_k}{\prod_{i=1}^k (1+t_i^A)}

(strictly speaking one has some additional contribution coming from repeated primes {p_{i,j}=p_{i,j+1}}, but these can be eliminated in a number of ways, e.g. by restricting initially to square-free {n+h_i}). By Mertens’ theorem we have

\displaystyle  \sum_{p \leq x^c} \frac{\min( t_i \log_x p, 1)}{p} = O( 1 + \log(1+t_i) ),

and then by summing the series in {r_i}, we can bound the left-hand side of (4) by

\displaystyle  \delta^k |I| \int_0^\infty \dots \int_0^\infty

\displaystyle  \prod_{i=1}^k \exp( O( 1 + \log(1+t_i) ) )

\displaystyle  \frac{dt_1 \dots dt_k}{\prod_{i=1}^k (1+t_i^A)}

which for {A} large enough is {O(\delta^k |I|)} as required. This proves (4).
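The Mertens-type estimate used above, {\sum_{p \leq x^c} \frac{\min(t \log_x p, 1)}{p} = O(1+\log(1+t))}, can be explored numerically. The following Python sketch is only an illustration for small parameters and proves nothing about the implied constant; the helper names `primes_up_to` and `truncated_prime_sum` are hypothetical.

```python
import math


def primes_up_to(n):
    """Return the list of primes <= n using a sieve of Eratosthenes."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            # Cross out the multiples of i, starting from i*i.
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(n + 1) if sieve[i]]


def truncated_prime_sum(t, x, c):
    """Sum of min(t * log_x p, 1) / p over the primes p <= x^c."""
    log_x = math.log(x)
    return sum(
        min(t * math.log(p) / log_x, 1.0) / p
        for p in primes_up_to(int(x ** c))
    )
```

Comparing `truncated_prime_sum(t, 1e6, 0.5)` with `1 + math.log(1 + t)` for a few values of `t` shows the slow logarithmic growth in {t} that makes the final integral converge once {A} is large.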

The proof of (5) is similar, except that (assuming {\epsilon} small, as we may) {r_s} is forced to be at least {1}, and {\log_x p_{s,1}} is at most {\epsilon}. From this we may effectively extract an additional factor of {\min( \epsilon t_s, 1 )} (times a loss of {O(1+\log(1+t_s))} due to having to reduce {r_s!} to {(r_s-1)!}), which gives rise to the additional gain of {\epsilon}. \Box

— 2. The generalised Elliott-Halberstam conjecture —

We begin by stating the conjecture {GEH[\theta]} more formally, using (a slightly weaker form of) the version from this paper of Bombieri, Friedlander, and Iwaniec. We use the notation from the Polymath8a paper.

Conjecture 5 ({GEH[\theta]}) Let {x^\epsilon \leq N,M \leq x^{1-\epsilon}} for some fixed {\epsilon>0}, be such that {NM \sim x}, and let {\alpha, \beta} be coefficient sequences at scale {N,M}. Then

\displaystyle  \sum_{q \lessapprox x^\theta} \sup_{a \in ({\bf Z}/q{\bf Z})^\times} |\Delta(\alpha * \beta; a\ (q))| \ll x \log^{-A} x

for any fixed {A>0}.

We use GEH to refer to the assertion that {GEH[\theta]} holds for all {0 < \theta < 1}. As shown by Motohashi, a modification of the proof of the Bombieri-Vinogradov theorem shows that {GEH[\theta]} is true for {0 < \theta < 1/2}. (It is possible that some modification of the arguments of Zhang gives some weak version of GEH for some {\theta} slightly above {1/2}, but we will not focus on that topic here.)

For our purposes, we will need to apply GEH to functions supported on products of {r} primes for a fixed {r} (generalising the von Mangoldt function, which is the focus of the Elliott-Halberstam conjecture EH). More precisely, we have

Proposition 6 Assume {GEH[\theta]} holds. Let {r \geq 1} and {\epsilon>0} be fixed, let {\Delta_{r,\epsilon} := \{ (t_1,\dots,t_r) \in [\epsilon,1]^r: t_1 \leq \dots \leq t_r; t_1+\dots+t_r=1\}}, and let {F: \Delta_{r,\epsilon} \rightarrow {\bf R}} be a fixed smooth function. Let {\tilde F: {\bf N} \rightarrow {\bf R}} be the function defined by setting

\displaystyle  \tilde F(n) := F( \log_n p_1, \dots, \log_n p_r)

whenever {n=p_1 \dots p_r} is the product of {r} distinct primes {p_1 < \dots < p_r} with {p_1 \geq x^\epsilon}, and {\tilde F(n)=0} otherwise. Then

\displaystyle  \sum_{q \lessapprox x^\theta} \sup_{a \in ({\bf Z}/q{\bf Z})^\times} |\Delta(\tilde F; a\ (q))| \ll x \log^{-A} x

for any fixed {A>0}.

Remark: it may be possible to get some version of this proposition just from EH using Bombieri’s asymptotic sieve.

Proof: (Sketch) This is a standard partitioning argument (not sure where it appears first, though). We choose a fixed {A'>0} that is sufficiently large depending on {A}. We can decompose the primes from {x^\epsilon} to {x} into {O( \log^{A'+1} x)} intervals {[y, (1+\log^{-A'} x) y]}. This splits {\tilde F} into {O( \log^{rA'+r} x)} pieces, depending on which intervals the {p_i} lie in. The contribution when two primes lie in the same interval, or when the product of the specified intervals touches the boundary of {[x,2x]}, can be shown to be negligible by crude divisor function estimates if {A'} is large enough (a similar argument appears in the Polymath8a paper), basically because there are only {O( \log^{(r-1)A'+r-1} x)} such terms, and each one contributes {O( x \log^{-rA'+O(1)} x)} to the total. For the remaining pieces, one can approximate {\tilde F} by a constant, up to errors which can also be shown to be negligible by crude estimates for {A'} large enough (each term contributes {O( x \log^{-(r+1)A'+O(1)} x)}), and then {\tilde F} can be modeled by a convolution of {r} coefficient sequences at various scales between {x^\epsilon} and {x}, at which point one can use GEH to conclude. \Box

Corollary 7 Assume {GEH[\theta]} holds for some {0 < \theta < 1}. Let {r\geq 1} and {\epsilon>0} be fixed, let {F: \Delta_{r,\epsilon} \rightarrow {\bf R}} be fixed and smooth, and let {\tilde F} be as in the previous proposition. Let {\nu_1: {\bf N} \rightarrow {\bf R}} be a divisor sum of the form

\displaystyle  \nu_1(n) := \sum_{d_2,\ldots,d_k: d_i|n+h_i \hbox{ for } i=2,\dots,k} \lambda_{d_2,\dots,d_k}

where {\lambda_{d_2,\dots,d_k} = O( \tau(d_2\dots d_k)^{O(1)} )} are coefficients supported on the range {d_2 \dots d_k \leq x^\theta}. Then

\displaystyle  \sum_{n \in I} \tilde F(n+h_1) \nu_1(n) = (\frac{1}{|I|} \sum_{n \in I} \tilde F(n+h_1)) (\sum_{n \in I} \nu_1(n))

\displaystyle  + O( x \log^{-A} x)

for any fixed {A>0}.

Similarly for permutations of the {h_1,\dots,h_k}.

Proof: (Sketch) We can rearrange {\sum_{n \in I} \tilde F(n+h_1) \nu_1(n)} as

\displaystyle  \sum_{d_2,\dots,d_k: d_2 \dots d_k \leq x^\theta} \lambda_{d_2,\dots,d_k} (\sum_{n \in I: n = -h_i\ (d_i) \hbox{ for } i=2,\dots,k} \tilde F(n+h_1) ).

Using the previous proposition, and the Chinese remainder theorem, we may approximate {\sum_{n \in I: n = -h_i\ (d_i) \hbox{ for } i=2,\dots,k} \tilde F(n+h_1)} by

\displaystyle  \frac{1}{\phi([d_2,\dots,d_k])} \sum_{n \in I} \tilde F(n+h_1)

plus negligible errors (here we need the crude bounds on {\lambda_{d_2,\dots,d_k}} and some standard bounds on the divisor function), thus

\displaystyle  \sum_{n \in I} \tilde F(n+h_1) \nu_1(n) = (\sum_{n \in I} \tilde F(n+h_1)) (\sum_{d_2,\dots,d_k} \frac{\lambda_{d_2,\dots,d_k}}{\phi([d_2,\dots,d_k])}) + O( x \log^{-A} x).

A similar argument gives

\displaystyle  \sum_{n \in I} \nu_1(n) = |I| (\sum_{d_2,\dots,d_k} \frac{\lambda_{d_2,\dots,d_k}}{\phi([d_2,\dots,d_k])}) + O( x \log^{-A} x),

and the claim follows by combining the two assertions. \Box

Next, from Mertens’ theorem one easily verifies that

\displaystyle  \sum_{n \in I} \tilde F(n+h_1) = \delta |I| (\int_{\Delta_{r,\epsilon}} \frac{F(t_1,\dots,t_r)}{t_1 \dots t_r} + o(1))

where {\delta := \frac{W}{\phi(W)\log x}} is the expected density of primes in {I}, and the measure on {\Delta_{r,\epsilon}} is the one induced from Lebesgue measure on the first {r-1} coordinates {t_1,\dots,t_{r-1}}. (One could improve the {o(1)} term to {O(\log^{-A} x)} here by using the prime number theorem, but it isn’t necessary for our analysis.)

Applying (1), we thus have

Corollary 8 Assume {GEH[\theta]} holds for some {0 < \theta < 1}. Let {r\geq 1} and {\epsilon>0} be fixed, let {F: \Delta_{r,\epsilon} \rightarrow {\bf R}} be fixed and smooth, and let {\tilde F} be as in the previous proposition. For {i=2,\dots,k}, let {F_i,G_i: [0,+\infty) \rightarrow {\bf R}} be smooth compactly supported functions with {\sum_{i=2}^k S(F_i)+S(G_i) < \theta}. Then

\displaystyle  \sum_{n \in I} \tilde F(n+h_1) \prod_{i=2}^k \alpha_{F_i}(n+h_i) \alpha_{G_i}(n+h_i) = \delta^k |I| (X_1 \dots X_k + o(1))

where

\displaystyle  X_1 := \int_{\Delta_{r,\epsilon}} \frac{F(t_1,\dots,t_r)}{t_1\dots t_r}

and

\displaystyle  X_i := \int_0^\infty F'_i(t) G'_i(t)\ dt

for {i=2,\dots,k}.

— 3. Some integration identities —

Lemma 9 Let {f: [0,a] \rightarrow {\bf R}} be a smooth function. Then

\displaystyle  \int_0^a f'(t)^2\ dt = \frac{1}{a} (\partial^{(a)} f(0))^2 + \int_{t+u \leq a; t,u \geq 0}(\partial^{(u)} f'(t))^2 \frac{dt du}{a}

where {\partial^{(u)} f(t) := f(t+u) - f(t)}.

Proof: Making the change of variables {v := u+t}, the integral {\int_{t+u \leq a; t,u \geq 0} (\partial^{(u)} f'(t))^2 \frac{dt du}{a}} can be written as

\displaystyle  \int_{0 \leq t \leq v \leq a} (f'(v)-f'(t))^2 \frac{dt dv}{a}

which by symmetry is equal to

\displaystyle  \frac{1}{2} \int_0^a \int_0^a (f'(v)-f'(t))^2 \frac{dt dv}{a}

which after expanding out the square and using symmetry is equal to

\displaystyle  \int_0^a f'(t)^2\ dt - \frac{1}{a} (\int_0^a f'(t)\ dt)^2

and the claim follows from the fundamental theorem of calculus. \Box
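Lemma 9 is easy to test numerically. The following Python sketch evaluates both sides with a midpoint rule; the function name `lemma9_sides` and the grid size `n` are hypothetical choices made for this illustration, and the caller is assumed to supply {f} together with its derivative.

```python
import math


def lemma9_sides(f, df, a, n=200):
    """Return (lhs, rhs) of Lemma 9 on [0, a], using midpoint quadrature.

    f is the function and df its derivative; n is the number of cells
    used in each one-dimensional midpoint rule.
    """
    h = a / n
    # Left-hand side: integral of f'(t)^2 over [0, a].
    lhs = h * sum(df((i + 0.5) * h) ** 2 for i in range(n))
    # Boundary term: (f(a) - f(0))^2 / a.
    boundary = (f(a) - f(0.0)) ** 2 / a
    # Triangle term: integrate (f'(t+u) - f'(t))^2 over t, u >= 0, t + u <= a.
    triangle = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        w = (a - t) / n  # cell width for the inner integral over [0, a - t]
        inner = w * sum(
            (df(t + (j + 0.5) * w) - df(t)) ** 2 for j in range(n)
        )
        triangle += h * inner
    return lhs, boundary + triangle / a
```

For example, `lemma9_sides(math.sin, math.cos, 2.0)` should return two values that agree up to the quadrature error.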

Iterating this lemma {k} times, we conclude that

\displaystyle  \int_0^a f'(t)^2\ dt = \sum_{i=1}^{k} \int_{u_1+\dots+u_i = a; u_1,\dots,u_i \geq 0}

\displaystyle \frac{(\partial^{(u_1)} \dots \partial^{(u_i)} f(0))^2}{a(a-u_1) \dots (a-u_1-\dots-u_{i-1})}

\displaystyle  + \int_{u_1+\dots+u_k+t \leq a; u_1,\dots,u_k,t \geq 0} (\partial^{(u_1)} \dots \partial^{(u_k)} f'(t))^2

\displaystyle \frac{dt du_1 \dots du_k}{a(a-u_1) \dots (a-u_1-\dots-u_{k-1})}

for any {k \geq 1}, where the first integral is integrated using {du_1 \dots du_{i-1}}. In particular, discarding the final term (which is non-negative) and then letting {k \rightarrow \infty}, we obtain the inequality

\displaystyle  \sum_{i=1}^\infty \int_{u_1+\dots+u_i = a; u_1,\dots,u_i \geq 0} \frac{(\partial^{(u_1)} \dots \partial^{(u_i)} f(0))^2}{a(a-u_1) \dots (a-u_1-\dots-u_{i-1})} \ \ \ \ \ (6)

\displaystyle  \leq \int_0^a f'(t)^2\ dt.

In fact we have equality:

Proposition 10 Let {f: [0,a] \rightarrow {\bf R}} be smooth. Then

\displaystyle  \sum_{i=1}^\infty \int_{u_1+\dots+u_i = a; u_1,\dots,u_i \geq 0} \frac{(\partial^{(u_1)} \dots \partial^{(u_i)} f(0))^2}{a(a-u_1) \dots (a-u_1-\dots-u_{i-1})} \ \ \ \ \ (7)

\displaystyle  = \int_0^a f'(t)^2\ dt.

In particular, by depolarisation we have

\displaystyle  \sum_{i=1}^\infty \int_{u_1+\dots+u_i = a; u_1,\dots,u_i \geq 0} \frac{\partial^{(u_1)} \dots \partial^{(u_i)} f(0) \partial^{(u_1)} \dots \partial^{(u_i)} g(0)}{a(a-u_1) \dots (a-u_1-\dots-u_{i-1})} \ \ \ \ \ (8)

\displaystyle  = \int_0^a f'(t)g'(t)\ dt

for smooth {f,g: [0,a] \rightarrow {\bf R}}.

Proof: Let {\epsilon>0} be a small quantity, and write

\displaystyle  X_0 := \int_0^a f'(t)^2\ dt.

From Lemma 9 we have

\displaystyle  X_0 = Y_1 + Z_1 + X_1

where

\displaystyle  Y_1 := \frac{1}{a} (\partial^{(a)} f(0))^2

\displaystyle  Z_1 := \int_{u_1+t \leq a; 0 \leq u_1 \leq \epsilon; t \geq 0} (\partial^{(u_1)} f'(t))^2 \frac{dt du_1}{a}

\displaystyle  X_1 := \int_{u_1+t \leq a; u_1 \geq \epsilon; t \geq 0} (\partial^{(u_1)} f'(t))^2 \frac{dt du_1}{a}.

From another application of Lemma 9 we have

\displaystyle  X_1 = Y_2 + Z_2 + X_2

where

\displaystyle  Y_2 := \int_{u_1+u_2= a; u_1 \geq \epsilon; u_2 \geq 0} (\partial^{(u_1)} \partial^{(u_2)} f(0))^2 \frac{du_1}{a(a-u_1)}

\displaystyle  Z_2 := \int_{u_1+u_2+t \leq a; u_1 \geq \epsilon; 0 \leq u_2 \leq \epsilon; t \geq 0} (\partial^{(u_1)} \partial^{(u_2)} f'(t))^2 \frac{dt du_1 du_2}{a(a-u_1)}

\displaystyle  X_2 := \int_{u_1+u_2+t \leq a; u_1,u_2 \geq \epsilon; t \geq 0} (\partial^{(u_1)} \partial^{(u_2)} f'(t))^2 \frac{dt du_1 du_2}{a(a-u_1)}.

Iterating this, we see that

\displaystyle  X_0 = Y_1+\dots+Y_k + Z_1+\dots+Z_k + X_k

for any {k \geq 1}, where

\displaystyle  Y_i := \int_{u_1+\dots+u_i =a; u_1,\dots,u_{i-1} \geq \epsilon; u_i \geq 0} (\partial^{(u_1)} \dots \partial^{(u_i)} f(0))^2 \frac{du_1 \dots du_{i-1}}{a(a-u_1)\dots(a-u_1-\dots-u_{i-1})}

\displaystyle  Z_i := \int_{u_1+\dots+u_i+t \leq a; u_1,\dots,u_{i-1} \geq \epsilon; 0 \leq u_i \leq \epsilon; t \geq 0}

\displaystyle (\partial^{(u_1)} \dots \partial^{(u_i)} f'(t))^2 \frac{dt du_1 \dots du_i}{a(a-u_1)\dots(a-u_1-\dots-u_{i-1})}

\displaystyle  X_i := \int_{u_1+\dots+u_i+t \leq a; u_1,\dots,u_i \geq \epsilon; t \geq 0} (\partial^{(u_1)} \dots \partial^{(u_i)} f'(t))^2

\displaystyle \frac{dt du_1 \dots du_i}{a(a-u_1)\dots(a-u_1-\dots-u_{i-1})}.

If {k > a/\epsilon}, then {X_k} vanishes (the constraints {u_1,\dots,u_k \geq \epsilon} and {u_1+\dots+u_k \leq a} are then incompatible), thus

\displaystyle  X_0 = Y_1+\dots+Y_k + Z_1+\dots+Z_k.

For {i \geq 2}, we may rewrite

\displaystyle  Z_i = \int_{u+t \leq a; 0 \leq u\leq \epsilon; t \geq 0} W_i(t,u)\ dt du \ \ \ \ \ (9)

where

\displaystyle  W_i(t,u) := \int_{u_1+\dots+u_{i-1} \leq a-t-u; u_1,\dots,u_{i-1} \geq \epsilon}

\displaystyle  (\partial^{(u_1)} \dots \partial^{(u_{i-1})} f_{t,u}(0))^2\ \frac{du_1 \dots du_{i-1}}{a(a-u_1) \dots (a-u_1-\dots-u_{i-1})}

and {f_{t,u}(s) := \partial^{(u)} f'(s+t)}. By Fubini’s theorem, we have

\displaystyle  W_i(t,u) = \int_0^{a-t-u} [\int_{u_1+\dots+u_{i-1} = b; u_1,\dots,u_{i-1} \geq \epsilon}

\displaystyle \frac{(\partial^{(u_1)} \dots \partial^{(u_{i-1})} f_{t,u}(0))^2}{a(a-u_1) \dots (a-u_1-\dots-u_{i-2})}] \frac{db}{a-b}.

Discarding the constraints {u_1,\dots,u_{i-1} \geq \epsilon} and using {a-b \geq u} and {a \geq b}, we conclude that

\displaystyle  W_i(t,u) \leq \frac{1}{u} \int_0^{a-t-u} [\int_{u_1+\dots+u_{i-1} = b}

\displaystyle  \frac{(\partial^{(u_1)} \dots \partial^{(u_{i-1})} f_{t,u}(0))^2}{b(b-u_1) \dots (b-u_1-\dots-u_{i-2})}]\ db.

Summing over {2 \leq i \leq k} using (6), we see that

\displaystyle  \sum_{i=2}^k W_i(t,u) \leq \frac{1}{u} \int_0^{a-t-u} \int_0^b f_{t,u}'(s)^2\ ds db;

since {f_{t,u}' = O_f(u)} by the smoothness of {f}, we conclude that

\displaystyle  \sum_{i=2}^k W_i(t,u) = O_f(1)

and thus by (9)

\displaystyle  \sum_{i=2}^k Z_i = O_f(\epsilon).

Direct computation also shows that {Z_1 = O_f(\epsilon)}, hence

\displaystyle  X_0 \leq Y_1+\dots+Y_k+O_f(\epsilon)

and thus

\displaystyle  X_0 \leq \sum_{i=1}^\infty Y_i + O_f(\epsilon).

But by the monotone convergence theorem, as {\epsilon \rightarrow 0}, {\sum_{i=1}^\infty Y_i} converges to

\displaystyle  \sum_{i=1}^\infty \int_{u_1+\dots+u_i = a; u_1,\dots,u_i \geq 0} \frac{(\partial^{(u_1)} \dots \partial^{(u_i)} f(0))^2}{a(a-u_1) \dots (a-u_1-\dots-u_{i-1})}.

Thus we can complement (6) with the matching upper bound, giving the claim. \Box
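As a worked check of (7) (an illustrative example that is not part of the original argument), take {f(t) = t^2}. Then {\partial^{(a)} f(0) = a^2} and {\partial^{(u_1)} \partial^{(u_2)} f(0) = (u_1+u_2)^2 - u_1^2 - u_2^2 = 2u_1 u_2}, while all differences of order three or higher vanish, so only the terms {i=1,2} survive. The {i=1} term is {\frac{1}{a} (a^2)^2 = a^3}, and the {i=2} term is

\displaystyle  \int_0^a \frac{(2 u_1 (a-u_1))^2}{a(a-u_1)}\ du_1 = \frac{4}{a} \int_0^a u_1^2 (a-u_1)\ du_1 = \frac{a^3}{3}.

The total is {\frac{4}{3} a^3}, which agrees with

\displaystyle  \int_0^a f'(t)^2\ dt = \int_0^a 4t^2\ dt = \frac{4}{3} a^3.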

We can rewrite the above identity using the following cute identity (which presumably has a name?)

Lemma 11 For any positive reals {t_1,\dots,t_r} with {r \geq 1}, one has

\displaystyle  \frac{1}{t_1 \dots t_r} = \sum_{\sigma \in S_r} \frac{1}{\prod_{i=1}^r \sum_{j=i}^r t_{\sigma(j)}}

where {\sigma} ranges over the permutations of {\{1,\dots,r\}}.

Thus for instance

\displaystyle  \frac{1}{t_1} = \frac{1}{t_1}

\displaystyle  \frac{1}{t_1 t_2} = \frac{1}{(t_1+t_2) t_1} + \frac{1}{(t_1+t_2) t_2}

\displaystyle  \frac{1}{t_1 t_2 t_3} = \frac{1}{(t_1+t_2+t_3)(t_2+t_3)t_3} + \frac{1}{(t_1+t_2+t_3)(t_2+t_3)t_2}

\displaystyle  + \frac{1}{(t_1+t_2+t_3)(t_1+t_3)t_3} + \frac{1}{(t_1+t_2+t_3)(t_1+t_3)t_1}

\displaystyle  + \frac{1}{(t_1+t_2+t_3)(t_1+t_2)t_1} + \frac{1}{(t_1+t_2+t_3)(t_1+t_2)t_2}

and so forth.

Proof: We induct on {r}. The case {r=1} is trivial. If {r>1} and the claim has already been proven for {r-1}, then from the induction hypothesis one has

\displaystyle  \sum_{\sigma \in S_r: \sigma(1)=l} \frac{1}{\prod_{i=1}^r \sum_{j=i}^r t_{\sigma(j)}}

\displaystyle  = \frac{1}{(t_1+\dots+t_r) \prod_{1 \leq i \leq r: i \neq l} t_i}

for each {l=1,\dots,r}. Summing over {l}, we obtain the claim. \Box
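The identity in Lemma 11 can also be confirmed in exact arithmetic for small {r}. Here is a brief Python sketch using rational numbers; the function names `permutation_sum` and `reciprocal_product` are hypothetical and chosen for this illustration.

```python
from fractions import Fraction
from itertools import permutations


def permutation_sum(ts):
    """Right-hand side of Lemma 11: sum over sigma of 1 / prod_i (tail sums)."""
    r = len(ts)
    total = Fraction(0)
    for sigma in permutations(range(r)):
        denom = Fraction(1)
        for i in range(r):
            # Tail sum t_{sigma(i)} + ... + t_{sigma(r-1)} (0-based indices).
            denom *= sum(ts[sigma[j]] for j in range(i, r))
        total += 1 / denom
    return total


def reciprocal_product(ts):
    """Left-hand side of Lemma 11: 1 / (t_1 ... t_r)."""
    out = Fraction(1)
    for t in ts:
        out /= t
    return out
```

Because `Fraction` arithmetic is exact, the two functions should agree exactly (not merely up to rounding) on any list of positive rationals.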

Proposition 12 Let {f,g: [0,a] \rightarrow {\bf R}} be smooth. Then

\displaystyle  \sum_{i=1}^\infty \int_{u_1+\dots+u_i = a; 0 \leq u_1 \leq \dots \leq u_i} \frac{\partial^{(u_1)} \dots \partial^{(u_i)} f(0) \partial^{(u_1)} \dots \partial^{(u_i)} g(0)}{u_1 \dots u_i}

\displaystyle = \int_0^a f'(t) g'(t)\ dt.

Proof: Average (8) over permutations of the {u_1,\dots,u_i} and use Lemma 11. \Box
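As a worked check of Proposition 12 (an illustrative example, not part of the original argument), take {f(t) = g(t) = t^2}. Second differences are {\partial^{(u_1)} \partial^{(u_2)} f(0) = 2 u_1 u_2} and all higher differences vanish, so only {i=1,2} contribute. The {i=1} term (with {u_1=a}) is {(a^2)^2/a = a^3}, and the {i=2} term, parameterised by {0 \leq u_1 \leq a/2} with {u_2 = a-u_1}, is

\displaystyle  \int_0^{a/2} \frac{(2 u_1 (a-u_1))^2}{u_1 (a-u_1)}\ du_1 = 4 \int_0^{a/2} u_1 (a-u_1)\ du_1 = 4 ( \frac{a^3}{8} - \frac{a^3}{24} ) = \frac{a^3}{3}.

The sum is {\frac{4}{3} a^3 = \int_0^a (2t)^2\ dt}, as predicted.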

This gives us the following variant of the preceding asymptotics:

Corollary 13 Assume {GEH[\theta]} holds for some {0 < \theta < 1}. For {i=1,\dots,k}, let {F_i,G_i: [0,+\infty) \rightarrow {\bf R}} be smooth compactly supported functions with {\sum_{i=2}^k S(F_i)+S(G_i) < \theta}. Then

\displaystyle  \sum_{n \in I} \prod_{i=1}^k \alpha_{F_i}(n+h_i) \alpha_{G_i}(n+h_i) = \delta^k |I| (X_1 \dots X_k + o(1)) \ \ \ \ \ (10)

where

\displaystyle  X_i := \int_0^1 F'_i(t) G'_i(t)\ dt

for {i=1,\dots,k}.

Proof: Let {\epsilon>0}. By (5), we have

\displaystyle  \sum_{n \in I} \prod_{i=1}^k \alpha_{F_i}(n+h_i) \alpha_{G_i}(n+h_i) 1_{p(n+h_1) \leq x^\epsilon} = O(\epsilon \delta^k |I| )

so by paying a cost of {O(\epsilon \delta^k |I|)}, we may restrict to {n+h_1} which are {x^\epsilon}-rough, and are thus of the form {p_1 \dots p_r} for some {r \leq 1/\epsilon} and {x^\epsilon < p_1 \leq \dots \leq p_r}. For {n+h_1 = p_1 \dots p_r} (restricting to squarefree integers {n+h_1} to avoid technicalities), we have

\displaystyle  \alpha_{F_1}(n+h_1) = \partial^{(\log_{n+h_1} p_1)} \dots \partial^{(\log_{n+h_1} p_r)} F_1(0)

and similarly for {\alpha_{G_1}(n+h_1)}. Using this and Corollary 8, we may write the left-hand side of (10) as

\displaystyle  \delta^k |I| ( X_2 \dots X_k \sum_{1 \leq r \leq 1/\epsilon} \int_{\Delta_{r,\epsilon}} \partial^{(t_1)} \dots \partial^{(t_r)} F_1(0) \partial^{(t_1)} \dots \partial^{(t_r)} G_1(0) \frac{dt_1 \dots dt_r}{t_1 \dots t_r}

\displaystyle  + O(\epsilon) + o(1) ).

Sending {\epsilon \rightarrow 0} and using dominated convergence and Proposition 12, we obtain the claim. \Box
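The relation between divisor sums and iterated differences used in this proof can be tested directly. In the Python sketch below (with hypothetical helper names `mobius`, `alpha` and `iterated_difference`), both sides are computed with the same base {x} for the logarithm, which is a simplifying assumption; the identity then holds up to a sign {(-1)^r}, which is harmless in the proof because it cancels in the product {\alpha_{F_1} \alpha_{G_1}}.

```python
import math


def mobius(d):
    """Moebius function mu(d), computed by trial division."""
    result, p = 1, 2
    while p * p <= d:
        if d % p == 0:
            d //= p
            if d % p == 0:
                return 0  # p^2 divides the original d
            result = -result
        p += 1
    return -result if d > 1 else result


def alpha(F, n, x):
    """Divisor sum alpha_F(n) = sum over d | n of mu(d) * F(log_x d)."""
    return sum(
        mobius(d) * F(math.log(d) / math.log(x))
        for d in range(1, n + 1)
        if n % d == 0
    )


def iterated_difference(F, steps, t=0.0):
    """Apply the difference operators with the given step sizes to F at t.

    Uses the convention (difference with step u of g)(t) = g(t + u) - g(t).
    """
    if not steps:
        return F(t)
    head, rest = steps[0], steps[1:]
    return iterated_difference(F, rest, t + head) - iterated_difference(F, rest, t)
```

For a squarefree {n = p_1 \dots p_r}, one expects `alpha(F, n, x)` to equal `(-1) ** r * iterated_difference(F, steps)` where `steps` lists the quantities {\log_x p_j}.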

Taking linear combinations, we conclude the usual “denominator” asymptotic

\displaystyle  \sum_{n \in I} (\sum_{d_i|n+h_i \forall i=1,\dots,k} \mu(d_1) \dots \mu(d_k) F(\log_x d_1,\dots,\log_x d_k))^2

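\displaystyle = \delta^k |I| ( X + o(1) )

where

\displaystyle  X = \int_{[0,1]^k} F_{1,\dots,k}(t_1,\dots,t_k)^2\ dt_1\dots dt_k

and {F_{1,\dots,k}} denotes the mixed partial derivative {\frac{\partial^k F}{\partial t_1 \dots \partial t_k}}, whenever {F: [0,+\infty)^k \rightarrow {\bf R}} is supported on a polytope {R} (not necessarily convex) with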

\displaystyle  R+R \subset \bigcup_{i=1}^k \{ (t_1,\dots,t_k): \sum_{j \neq i} t_j < \theta \},

and is a finite linear combination of tensor products {F_1(t_1) \dots F_k(t_k)} of smooth compactly supported functions. Using this as a replacement for the denominator estimate in this previous blog post, we obtain the criteria described above.