Kevin Ford, Dimitris Koukoulopoulos and I have just uploaded to the arXiv our paper “A lower bound on the mean value of the Erdős-Hooley delta function“. This paper complements the recent paper of Dimitris and myself obtaining the upper bound

\displaystyle  \frac{1}{x} \sum_{n \leq x} \Delta(n) \ll (\log\log x)^{11/4}

on the mean value of the Erdős-Hooley delta function

\displaystyle  \Delta(n) := \sup_u \# \{ d|n: e^u < d \leq e^{u+1} \}.
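
As a concrete illustration (a toy brute-force computation of mine, not from the paper; the function names are my own), one can compute {\Delta(n)} for small {n} by noting that the supremum is attained when the left endpoint of the window {(e^u, e^{u+1}]} sits just below a divisor:

```python
import math

def divisors(n):
    """All divisors of n in increasing order (trial division; fine for small n)."""
    small = [d for d in range(1, int(math.isqrt(n)) + 1) if n % d == 0]
    return sorted(set(small + [n // d for d in small]))

def delta(n):
    """Erdos-Hooley Delta(n): the maximum number of divisors of n in a window
    (e^u, e^{u+1}].  The maximum is attained with e^u just below a divisor d,
    so it suffices to count divisors in [d, e*d) for each divisor d."""
    divs = divisors(n)
    return max(sum(1 for dd in divs if d <= dd < math.e * d) for d in divs)

print(delta(12))    # 3: the window [2, 2e) contains the divisors 2, 3, 4
print(delta(64))    # 2: powers of 2 are spread out; each window holds only d and 2d
```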

In this paper we obtain a lower bound

\displaystyle  \frac{1}{x} \sum_{n \leq x} \Delta(n) \gg (\log\log x)^{1+\eta-o(1)}

where {\eta = 0.3533227\dots} is an exponent that arose in previous work of Ford, Green, and Koukoulopoulos, who showed that

\displaystyle  \Delta(n) \gg (\log\log n)^{\eta-o(1)} \ \ \ \ \ (1)

for all {n} outside of a set of density zero. The previous best known lower bound for the mean value was

\displaystyle  \frac{1}{x} \sum_{n \leq x} \Delta(n) \gg \log\log x,

due to Hall and Tenenbaum.

The point is that the main contributions to the mean value of {\Delta(n)} are driven not by “typical” numbers {n} of size about {x}, but rather by numbers that admit a splitting

\displaystyle  n = n' n''

where {n''} is the product of the primes of {n} between some intermediate threshold {1 \leq y \leq x} and {x} and behaves “typically” (so in particular, it has about {\log\log x - \log\log y + O(\sqrt{\log\log x})} prime factors, as per the Hardy-Ramanujan and Erdős-Kac laws), while {n'} is the product of the primes of {n} up to {y} and has double the typical number of prime factors – {2 \log\log y + O(\sqrt{\log\log x})}, rather than {\log\log y + O(\sqrt{\log\log x})} – thus {n'} is the type of number that would make a significant contribution to the mean value of the divisor function {\tau(n')}. Here {y} is such that {\log\log y} is an integer in the range

\displaystyle  \varepsilon\log\log x \leq \log \log y \leq (1-\varepsilon) \log\log x

for some small constant {\varepsilon>0}; there are then basically {\log\log x} different choices of {y}, which give essentially disjoint contributions.
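
To make the splitting concrete (a toy sketch of mine, with hypothetical helper names, not code from the paper), one can peel off the prime factors up to {y} by trial division:

```python
def split_at(n, y):
    """Factor n = n' * n'', where n' carries the prime factors of n up to y
    and n'' carries the prime factors greater than y."""
    n1, n2 = 1, n
    for p in range(2, y + 1):   # composite p are harmless: their primes are gone
        while n2 % p == 0:
            n1, n2 = n1 * p, n2 // p
    return n1, n2

print(split_at(2 * 3 * 5 * 7 * 11, 6))   # (30, 77): primes up to 6 vs. 7 * 11
```

From the easy inequalities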

\displaystyle  \Delta(n) \gg \Delta(n') \Delta(n'') \geq \frac{\tau(n')}{\log n'} \Delta(n'') \ \ \ \ \ (2)

(the latter coming from the pigeonhole principle) and the fact that {\frac{\tau(n')}{\log n'}} has mean about one, one would expect to get the above result provided that one could get a lower bound of the form

\displaystyle  \Delta(n'') \gg (\log \log n'')^{\eta-o(1)} \ \ \ \ \ (3)

for most typical {n''} with prime factors between {y} and {x}. Unfortunately, due to the lack of small prime factors in {n''}, the arguments of Ford, Green, Koukoulopoulos that give (1) for typical {n} do not quite work for the rougher numbers {n''}. However, it turns out that one can get around this problem by replacing (2) by the more efficient inequality

\displaystyle  \Delta(n) \gg \frac{\tau(n')}{\log n'} \Delta^{(\log n')}(n'')

where

\displaystyle  \Delta^{(v)}(n) := \sup_u \# \{ d|n: e^u < d \leq e^{u+v} \}

is an enlarged version of {\Delta(n)} when {v \geq 1}. This inequality is easily proven by applying the pigeonhole principle to the factors of {n} of the form {d' d''}, where {d'} is one of the {\tau(n')} factors of {n'}, and {d''} is one of the {\Delta^{(\log n')}(n'')} factors of {n''} in the optimal interval {(e^u, e^{u+\log n'}]}. The extra room provided by the enlargement of the range {(e^u, e^{u+1}]} to {(e^u, e^{u+\log n'}]} turns out to be sufficient to adapt the Ford-Green-Koukoulopoulos argument to the rough setting. In fact we are able to use the main technical estimate from that paper as a “black box”, namely that if one considers a random subset {A} of {[D^c, D]} for some small {c>0} and sufficiently large {D} with each {n \in [D^c, D]} lying in {A} with an independent probability {1/n}, then with high probability there should be {\gg c^{-1/\eta+o(1)}} subset sums of {A} that attain the same value. (Initially, what “high probability” means is just “close to {1}”, but one can reduce the failure probability significantly as {c \rightarrow 0} by a “tensor power trick” taking advantage of Bennett’s inequality.)
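
Here is a toy numerical check of mine (not from the paper) of the enlarged quantity {\Delta^{(v)}} and of the displayed inequality, in the concrete form {\Delta(n) \geq \tau(n') \Delta^{(\log n')}(n'') / (2 \log n' + 1)} that one can extract from the pigeonhole argument for coprime {n', n''}:

```python
import math

def divisors(n):
    small = [d for d in range(1, int(math.isqrt(n)) + 1) if n % d == 0]
    return sorted(set(small + [n // d for d in small]))

def delta_v(n, v):
    """Delta^{(v)}(n): max number of divisors of n in a window (e^u, e^{u+v}];
    as with v = 1, the maximum occurs with e^u just below a divisor d."""
    divs = divisors(n)
    return max(sum(1 for dd in divs if d <= dd < math.exp(v) * d) for d in divs)

n1, n2 = 12, 5 * 7 * 11 * 13        # coprime smooth part n' and rough part n''
tau1, v = len(divisors(n1)), math.log(n1)
# delta_v(n, 1) is the ordinary Delta(n); 2 log n' + 1 is an explicit pigeonhole constant
assert (2 * v + 1) * delta_v(n1 * n2, 1) >= tau1 * delta_v(n2, v)
```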

Dimitris Koukoulopoulos and I have just uploaded to the arXiv our paper “An upper bound on the mean value of the Erdős-Hooley delta function“. This paper concerns a (still somewhat poorly understood) basic arithmetic function in multiplicative number theory, namely the Erdős-Hooley delta function

\displaystyle  \Delta(n) := \sup_u \Delta(n;u)

where

\displaystyle  \Delta(n;u) := \# \{ d|n: e^u < d \leq e^{u+1} \}.

The function {\Delta} measures the extent to which the divisors of a natural number can be concentrated in a dyadic (or more precisely, {e}-dyadic) interval {(e^u, e^{u+1}]}. From the pigeonhole principle, we have the bounds

\displaystyle  \frac{\tau(n)}{\log n} \ll \Delta(n) \leq \tau(n),

where {\tau(n) := \# \{ d: d|n\}} is the usual divisor function. The statistical behavior of the divisor function is well understood; for instance, if {n} is drawn at random from {1} to {x}, then the mean value of {\tau(n)} is roughly {\log x}, the median is roughly {\log^{\log 2} x}, and (by the Erdős-Kac theorem) {\tau(n)} asymptotically has a log-normal distribution. In particular, a small proportion of highly divisible numbers skews the mean to be significantly higher than the median.
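
One can see this numerically (a quick experiment of mine, not from the paper): a divisor-count sieve up to {10^6} already shows the mean of {\tau} tracking {\log x} while the median stays near {\log^{\log 2} x}:

```python
import math
import statistics

N = 10**6
tau = [0] * (N + 1)
for d in range(1, N + 1):          # sieve: d increments tau at each multiple of d
    for m in range(d, N + 1, d):
        tau[m] += 1

print(statistics.mean(tau[1:]), math.log(N))                   # mean ~ log N
print(statistics.median(tau[1:]), math.log(N) ** math.log(2))  # median ~ (log N)^{log 2}
```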

On the other hand, the statistical behavior of the Erdős-Hooley delta function is significantly less well understood, even conjecturally. Again drawing {n} at random from {1} to {x} for large {x}, the median is known to be somewhere between {(\log\log x)^{0.3533\dots}} and {(\log\log x)^{0.6102\dots}} – a (difficult) recent result of Ford, Green, and Koukoulopoulos (for the lower bound) and La Bretèche and Tenenbaum (for the upper bound). And the mean {\frac{1}{x} \sum_{n \leq x} \Delta(n)} was even less well controlled; the best previous bounds were

\displaystyle  \log \log x \ll \frac{1}{x} \sum_{n \leq x} \Delta(n) \ll \exp( c \sqrt{\log\log x} )

for any {c > \sqrt{2} \log 2}, with the lower bound due to Hall and Tenenbaum, and the upper bound a recent result of La Bretèche and Tenenbaum.

The main result of this paper is an improvement of the upper bound to

\displaystyle  \frac{1}{x} \sum_{n \leq x} \Delta(n) \ll (\log \log x)^{11/4}.

It is still unclear to us exactly what to conjecture regarding the actual order of the mean value.

The reason we looked into this problem was that it was connected to forthcoming work of David Conlon, Jacob Fox, and Huy Pham on the following problem of Erdős: what is the size of the largest subset {A} of {\{1,\dots,N\}} with the property that no non-empty subset of {A} sums to a perfect square? Erdős observed that one can obtain sets of size {\gg N^{1/3}} (basically by considering certain homogeneous arithmetic progressions), and Nguyen and Vu showed an upper bound of {\ll N^{1/3} (\log N)^{O(1)}}. With our mean value bound as input, together with several new arguments, Conlon, Fox, and Pham have been able to improve the upper bound to {\ll N^{1/3} (\log\log N)^{O(1)}}.

Let me now discuss some of the ingredients of the proof. The first few steps are standard. Firstly we may restrict attention to square-free numbers without much difficulty (the point being that if a number {n} factors as {n = d^2 m} with {m} squarefree, then {\Delta(n) \leq \tau(d^2) \Delta(m)}). Next, because a square-free number {n>1} can be uniquely factored as {n = pm} where {p} is a prime and {m} lies in the finite set {{\mathcal S}_{<p}} of squarefree numbers whose prime factors are less than {p}, and {\Delta(n) \leq \tau(p) \Delta(m) = 2 \Delta(m)}, it is not difficult to establish the bound

\displaystyle  \frac{1}{x} \sum_{n \in {\mathcal S}_{<x}} \Delta(n) \ll \sup_{2 \leq y\leq x} \frac{1}{\log y} \sum_{n \in {\mathcal S}_{<y}} \frac{\Delta(n)}{n}.

The upshot of this is that one can replace an ordinary average with a logarithmic average; thus it suffices to show

\displaystyle  \frac{1}{\log x} \sum_{n \in {\mathcal S}_{<x}} \frac{\Delta(n)}{n} \ll (\log \log x)^{11/4}. \ \ \ \ \ (1)

We actually prove a slightly more refined distributional estimate: for any {A \geq 2}, we have a bound

\displaystyle  \Delta(n) \ll A \log^{3/4} A \ \ \ \ \ (2)

outside of an exceptional set {E} which is small in the sense that

\displaystyle  \frac{1}{\log x} \sum_{n \in {\mathcal S}_{<x}: n \in E} \frac{1}{n} \ll \frac{1}{A}. \ \ \ \ \ (3)

It is not difficult to get from this distributional estimate to the logarithmic average estimate (1) (worsening the exponent {3/4} to {3/4+2 = 11/4}).

To get some intuition on the size of {\Delta(n)}, we observe that if {y > 0} and {n_{<y}} is the factor of {n} coming from the prime factors less than {y}, then

\displaystyle  \Delta(n) \geq \Delta(n_{<y}) \gg \frac{\tau(n_{<y})}{\log n_{<y}}. \ \ \ \ \ (4)

On the other hand, standard estimates let one establish that

\displaystyle  \tau(n_{<y}) \ll A \log n_{<y} \ \ \ \ \ (5)

for all {y}, and all {n} outside of an exceptional set that is small in the sense (3); in fact it turns out that one can also get an additional gain in this estimate unless {\log y} is close to {A^{\log 4}}, which turns out to be useful when optimizing the bounds. So we would like to approximately reverse the inequalities in (4) and get from (5) to (2), possibly after throwing away further exceptional sets of size (3).
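
As a toy check of (4) (mine, not from the paper; smooth_part is a hypothetical helper), note that every divisor of {n_{<y}} is a divisor of {n}, so {\Delta(n) \geq \Delta(n_{<y})}, while the divisors of {n_{<y}} fit into about {\log n_{<y}} unit windows:

```python
import math

def divisors(n):
    small = [d for d in range(1, int(math.isqrt(n)) + 1) if n % d == 0]
    return sorted(set(small + [n // d for d in small]))

def delta(n):
    divs = divisors(n)
    return max(sum(1 for dd in divs if d <= dd < math.e * d) for d in divs)

def smooth_part(n, y):
    """n_{<y}: the factor of n coming from the prime factors less than y."""
    m = 1
    for p in range(2, y):           # composite p never divide the remaining n
        while n % p == 0:
            m, n = m * p, n // p
    return m

n, y = 2 * 3 * 5 * 7 * 11 * 101, 12
m = smooth_part(n, y)                                      # 2 * 3 * 5 * 7 * 11 = 2310
assert delta(n) >= delta(m)                                # first inequality in (4)
assert (math.log(m) + 2) * delta(m) >= len(divisors(m))    # explicit form of the second
```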

At this point we perform another standard technique, namely the moment method of controlling the supremum {\Delta(n) = \sup_u \Delta(n;u)} by the moments

\displaystyle  M_q(n) := \int_{{\bf R}} \Delta(n;u)^q\ du

for natural numbers {q}; it is not difficult to establish the bound

\displaystyle  \Delta(n) \ll M_q(n)^{1/q}

and one expects this bound to become essentially sharp once {q \sim \log\log x}. We will be able to show a moment bound

\displaystyle  \sum_{n \in {\mathcal S}_{<x} \backslash E_q} \frac{M_q(n) / \tau(n)}{n} \leq O(q)^q A^{q-2} \log^{3q/4} A

for any {q \geq 2} for some exceptional set {E_q} obeying the smallness condition (3) (actually, for technical reasons we need to improve the right-hand side slightly to close an induction on {q}); this will imply the distributional bound (2) from a standard Markov inequality argument (setting {q \sim \log\log x}).
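
Since {\Delta(n;u)} is a step function of {u}, jumping only at the points {\log d - 1} and {\log d} for divisors {d} of {n}, the moment {M_q(n)} can be computed exactly for small {n}. Here is a toy implementation of mine (not the paper's code), together with the crude but valid form {\Delta(n) \leq 2 M_q(n)^{1/q}} of the moment bound:

```python
import math

def divisors(n):
    small = [d for d in range(1, int(math.isqrt(n)) + 1) if n % d == 0]
    return sorted(set(small + [n // d for d in small]))

def delta(n):
    divs = divisors(n)
    return max(sum(1 for dd in divs if d <= dd < math.e * d) for d in divs)

def moment(n, q):
    """M_q(n) = int Delta(n;u)^q du: divisor d is counted exactly when
    log d - 1 <= u < log d, so Delta(n;.) is constant between breakpoints."""
    divs = divisors(n)
    pts = sorted({t for d in divs for t in (math.log(d) - 1, math.log(d))})
    total = 0.0
    for left, right in zip(pts, pts[1:]):
        mid = (left + right) / 2
        cnt = sum(1 for d in divs if math.log(d) - 1 <= mid < math.log(d))
        total += (right - left) * cnt ** q
    return total

n, q = 720, 4
assert delta(n) <= 2 * moment(n, q) ** (1 / q)
print(moment(n, q) ** (1 / q), delta(n))   # the two are of comparable size
```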

The strategy is then to obtain a good recursive inequality for (averages of) {M_q(n)}. As in the reduction to (1), we factor {n=pm} where {p} is a prime and {m \in {\mathcal S}_{<p}}. One observes the identity

\displaystyle  \Delta(n;u) = \Delta(m;u) + \Delta(m;u-\log p)

for any {u}; taking moments, one obtains the identity

\displaystyle  M_q(n) = \sum_{a+b=q; 0 \leq b \leq q} \binom{q}{a} \int_{\bf R} \Delta(m;u)^a \Delta(m;u-\log p)^b\ du.
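
Both identities are easy to test numerically; here is a toy check of mine (not from the paper) of the pointwise identity, using the fact that the divisors of {n = pm} are the {d} and {pd} with {d | m}, and {pd \in (e^u, e^{u+1}]} exactly when {d \in (e^{u - \log p}, e^{u - \log p + 1}]}:

```python
import math

def divisors(n):
    small = [d for d in range(1, int(math.isqrt(n)) + 1) if n % d == 0]
    return sorted(set(small + [n // d for d in small]))

def Delta_at(n, u):
    """Delta(n;u) = #{d | n : e^u < d <= e^{u+1}}."""
    return sum(1 for d in divisors(n) if math.exp(u) < d <= math.exp(u + 1))

p, m = 7, 30      # p prime, m squarefree with all prime factors below p
for k in range(-20, 80):
    u = k / 10
    assert Delta_at(p * m, u) == Delta_at(m, u) + Delta_at(m, u - math.log(p))
```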

As in previous literature, one can try to average in {p} here and apply Hölder’s inequality. But it is convenient to first use the symmetry of the summand in {a,b} to reduce to the case of relatively small values of {b}:

\displaystyle  M_q(n) \leq 2 \sum_{a+b=q; 0 \leq b \leq q/2} \binom{q}{a} \int_{\bf R} \Delta(m;u)^a \Delta(m;u-\log p)^b\ du.

One can extract out the {b=0} term as

\displaystyle  M_q(n) \leq 2 M_q(m)

\displaystyle + 2 \sum_{a+b=q; 1 \leq b \leq q/2} \binom{q}{a} \int_{\bf R} \Delta(m;u)^a \Delta(m;u-\log p)^b\ du.

It is convenient to eliminate the factor of {2} by dividing out by the divisor function:

\displaystyle  \frac{M_q(n)}{\tau(n)} \leq \frac{M_q(m)}{\tau(m)}

\displaystyle + \frac{1}{\tau(m)} \sum_{a+b=q; 1 \leq b \leq q/2} \binom{q}{a} \int_{\bf R} \Delta(m;u)^a \Delta(m;u-\log p)^b\ du.

This inequality is suitable for iterating and also averaging in {p} and {m}. After some standard manipulations (using the Brun–Titchmarsh and Hölder inequalities), one is able to estimate sums such as

\displaystyle  \sum_{n \in {\mathcal S}_{<x} \backslash E_q} \frac{M_q(n)/\tau(n)}{n} \ \ \ \ \ (6)

in terms of sums such as

\displaystyle  \int_2^{x^2} \sum_{a+b=q; 1 \leq b \leq q/2} \binom{q}{a} \sum_{n \in {\mathcal S}_{<y} \backslash E_q} \frac{M_a(n) M_b(n)}{\tau(n) n} \frac{dy}{\log^2 y}

(assuming a certain monotonicity property of the exceptional set {E_q} that turns out to hold in our application). By an induction hypothesis and a Markov inequality argument, one can get a reasonable pointwise upper bound on {M_b} (after removing another exceptional set), and the net result is that one can basically control the sum (6) in terms of expressions such as

\displaystyle  \sum_{n \in {\mathcal S}_{<x} \backslash E_a} \frac{M_a(n)/\tau(n)}{n}

for various {a < q}. This allows one to estimate these expressions efficiently by induction.
