Let {\tau(n) := \sum_{d|n} 1} be the divisor function. A classical application of the Dirichlet hyperbola method gives the asymptotic

\displaystyle \sum_{n \leq x} \tau(n) \sim x \log x

where {X \sim Y} denotes the estimate {X = (1+o(1))Y} as {x \rightarrow \infty}. Much better error estimates are possible here, but we will not focus on the lower order terms in this discussion. For somewhat idiosyncratic reasons I will interpret this estimate (and the other analytic number theory estimates discussed here) through the probabilistic lens. Namely, if {{\bf n} = {\bf n}_x} is a random number selected uniformly between {1} and {x}, then the above estimate can be written as

\displaystyle {\bf E} \tau( {\bf n} ) \sim \log x, \ \ \ \ \ (1)


that is to say the random variable {\tau({\bf n})} has mean approximately {\log x}. (But, somewhat paradoxically, this is not the median or mode behaviour of this random variable, which instead concentrates near {\log^{\log 2} x}, basically thanks to the Hardy-Ramanujan theorem.)
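Neither fact is needed for the rigorous discussion, but both are easy to see numerically; the following Python snippet (with the arbitrary cutoff {x = 10^5}) computes the divisor function by a sieve and compares the mean of {\tau} with its median:

```python
import math

# Divisor-counting sieve: tau[n] = number of divisors of n, for n <= x.
x = 10**5
tau = [0] * (x + 1)
for d in range(1, x + 1):
    for n in range(d, x + 1, d):
        tau[n] += 1

values = sorted(tau[1:])
mean = sum(values) / x
median = values[x // 2]
print(mean, math.log(x))  # the mean is ~ log x (more precisely log x + 2*gamma - 1)
print(median)             # the median is much smaller, of size ~ (log x)^(log 2)
```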

Now we turn to the pair correlations {\sum_{n \leq x} \tau(n) \tau(n+h)} for a fixed positive integer {h}. There is a classical computation of Ingham that shows that

\displaystyle \sum_{n \leq x} \tau(n) \tau(n+h) \sim \frac{6}{\pi^2} \sigma_{-1}(h) x \log^2 x, \ \ \ \ \ (2)



where

\displaystyle \sigma_{-1}(h) := \sum_{d|h} \frac{1}{d}.

The error term in (2) has been refined by many subsequent authors, as has the uniformity of the estimates in the {h} aspect, as these topics are related to other questions in analytic number theory, such as fourth moment estimates for the Riemann zeta function; but we will not consider these more subtle features of the estimate here. However, we will look at the next term in the asymptotic expansion for (2) below the fold.
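As a quick numerical illustration of (2), the following Python computation (at the arbitrary cutoff {x = 2 \times 10^5} and shift {h=6}) compares both sides; the agreement is only modest at this height because the lower order terms are of relative size {1/\log x}:

```python
import math

x, h = 200_000, 6
tau = [0] * (x + h + 1)
for d in range(1, x + h + 1):          # divisor-counting sieve
    for n in range(d, x + h + 1, d):
        tau[n] += 1

sigma = sum(1.0 / d for d in range(1, h + 1) if h % d == 0)   # sigma_{-1}(6) = 2
lhs = sum(tau[n] * tau[n + h] for n in range(1, x + 1))
rhs = (6 / math.pi**2) * sigma * x * math.log(x)**2
print(lhs / rhs)   # slowly approaches 1 as x grows
```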

Using our probabilistic lens, the estimate (2) can be written as

\displaystyle {\bf E} \tau( {\bf n} ) \tau( {\bf n} + h ) \sim \frac{6}{\pi^2} \sigma_{-1}(h) \log^2 x. \ \ \ \ \ (3)


From (1) (and the asymptotic negligibility of the shift by {h}) we see that the random variables {\tau({\bf n})} and {\tau({\bf n}+h)} both have a mean of {\sim \log x}, so the additional factor of {\frac{6}{\pi^2} \sigma_{-1}(h)} represents some arithmetic coupling between the two random variables.

Ingham’s formula can be established in a number of ways. Firstly, one can expand out {\tau(n) = \sum_{d|n} 1} and use the hyperbola method (splitting into the cases {d \leq \sqrt{x}} and {n/d \leq \sqrt{x}} and removing the overlap). If one does so, one soon arrives at the task of having to estimate sums of the form

\displaystyle \sum_{n \leq x: d|n} \tau(n+h)

for various {d \leq \sqrt{x}}. For {d} much less than {\sqrt{x}} this can be achieved using a further application of the hyperbola method, but for {d} comparable to {\sqrt{x}} things get a bit more complicated, necessitating the use of non-trivial estimates on Kloosterman sums in order to obtain satisfactory control on error terms. A more modern approach proceeds using automorphic form methods, as discussed in this previous post. A third approach, which unfortunately is only heuristic at the current level of technology, is to apply the Hardy-Littlewood circle method (discussed in this previous post) to express (2) in terms of exponential sums {\sum_{n \leq x} \tau(n) e(\alpha n)} for various frequencies {\alpha}. The contribution of “major arc” {\alpha} can be computed after a moderately lengthy calculation which yields the right-hand side of (2) (as well as the correct lower order terms that are currently being suppressed), but there does not appear to be an easy way to show directly that the “minor arc” contributions are of lower order, although the methods discussed previously do indirectly show that this is ultimately the case.

Each of the methods outlined above requires a fair amount of calculation, and it is not obvious while performing them that the factor {\frac{6}{\pi^2} \sigma_{-1}(h)} will emerge at the end. One can at least explain the {\frac{6}{\pi^2}} as a normalisation constant needed to balance the {\sigma_{-1}(h)} factor (at a heuristic level, at least). To see this through our probabilistic lens, introduce an independent copy {{\bf n}'} of {{\bf n}}, then

\displaystyle {\bf E} \tau( {\bf n} ) \tau( {\bf n}' ) = ({\bf E} \tau ({\bf n}))^2 \sim \log^2 x; \ \ \ \ \ (4)


using symmetry to order {{\bf n}' > {\bf n}} (discarding the diagonal case {{\bf n} = {\bf n}'}) and making the change of variables {{\bf n}' = {\bf n}+h}, we see that (4) is heuristically consistent with (3) as long as the asymptotic mean of {\frac{6}{\pi^2} \sigma_{-1}(h)} in {h} is equal to {1}. (This argument is not rigorous because there was an implicit interchange of limits present, but still gives a good heuristic “sanity check” of Ingham’s formula.) Indeed, if {{\bf E}_h} denotes the asymptotic mean in {h}, then we have (heuristically at least)

\displaystyle {\bf E}_h \sigma_{-1}(h) = \sum_d {\bf E}_h \frac{1}{d} 1_{d|h}

\displaystyle = \sum_d \frac{1}{d^2}

\displaystyle = \frac{\pi^2}{6}

and we obtain the desired consistency after multiplying by {\frac{6}{\pi^2}}.
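This mean is easy to test numerically: interchanging summations gives {\sum_{h \leq H} \sigma_{-1}(h) = \sum_d \frac{1}{d} \lfloor H/d \rfloor}, which the following Python snippet evaluates at the arbitrary cutoff {H = 10^5}:

```python
import math

# Interchange summations: sum_{h <= H} sigma_{-1}(h) = sum_d (1/d) * floor(H/d).
H = 10**5
mean = sum((H // d) / d for d in range(1, H + 1)) / H
print(mean, math.pi**2 / 6)   # agreement to within O(log H / H)
```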

This still however does not explain the presence of the {\sigma_{-1}(h)} factor. Intuitively it is reasonable that if {h} has many prime factors, and {{\bf n}} has a lot of factors, then {{\bf n}+h} will have slightly more factors than average, because any common factor to {h} and {{\bf n}} will automatically be acquired by {{\bf n}+h}. But how to quantify this effect?

One heuristic way to proceed is through analysis of local factors. Observe from the fundamental theorem of arithmetic that we can factor

\displaystyle \tau(n) = \prod_p \tau_p(n)

where the product is over all primes {p}, and {\tau_p(n) := \sum_{p^j|n} 1} is the local version of {\tau(n)} at {p} (which in this case is just one plus the {p}-valuation {v_p(n)} of {n}: {\tau_p = 1 + v_p}). Note that all but finitely many of the terms in this product will equal {1}, so the infinite product is well-defined. In a similar fashion, we can factor

\displaystyle \sigma_{-1}(h) = \prod_p \sigma_{-1,p}(h)


where

\displaystyle \sigma_{-1,p}(h) := \sum_{p^j|h} \frac{1}{p^j}

(or in terms of valuations, {\sigma_{-1,p}(h) = (1 - p^{-v_p(h)-1})/(1-p^{-1})}). Heuristically, the Chinese remainder theorem suggests that the various factors {\tau_p({\bf n})} behave like independent random variables, and so the correlation between {\tau({\bf n})} and {\tau({\bf n}+h)} should approximately decouple into the product of correlations between the local factors {\tau_p({\bf n})} and {\tau_p({\bf n}+h)}. And indeed we do have the following local version of Ingham’s asymptotics:

Proposition 1 (Local Ingham asymptotics) For fixed {p} and integer {h}, we have

\displaystyle {\bf E} \tau_p({\bf n}) \sim \frac{p}{p-1}


and

\displaystyle {\bf E} \tau_p({\bf n}) \tau_p({\bf n}+h) \sim (1-\frac{1}{p^2}) \sigma_{-1,p}(h) (\frac{p}{p-1})^2

\displaystyle = \frac{p+1}{p-1} \sigma_{-1,p}(h)

From the Euler formula

\displaystyle \prod_p (1-\frac{1}{p^2}) = \frac{1}{\zeta(2)} = \frac{6}{\pi^2}

we see that

\displaystyle \frac{6}{\pi^2} \sigma_{-1}(h) = \prod_p (1-\frac{1}{p^2}) \sigma_{-1,p}(h)

and so one can “explain” the arithmetic factor {\frac{6}{\pi^2} \sigma_{-1}(h)} in Ingham’s asymptotic as the product of the arithmetic factors {(1-\frac{1}{p^2}) \sigma_{-1,p}(h)} in the (much easier) local Ingham asymptotics. Unfortunately we have the usual “local-global” problem in that we do not know how to rigorously derive the global asymptotic from the local ones; this problem is essentially the same issue as the problem of controlling the minor arc contributions in the circle method, but phrased in “physical space” language rather than “frequency space”.
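One can confirm this factorisation numerically; the Python snippet below (truncating the product at the arbitrary cutoff {10^4}) compares both sides for the sample value {h=12}:

```python
import math

def sigma_minus1(h):
    return sum(1.0 / d for d in range(1, h + 1) if h % d == 0)

def sigma_minus1_p(h, p):
    # local factor at p: sum of p^{-j} over the powers p^j dividing h
    s, pj = 0.0, 1
    while h % pj == 0:
        s += 1.0 / pj
        pj *= p
    return s

# primes up to P by a simple sieve
P = 10**4
sieve = [True] * (P + 1)
sieve[0] = sieve[1] = False
for i in range(2, int(P**0.5) + 1):
    if sieve[i]:
        for j in range(i * i, P + 1, i):
            sieve[j] = False
primes = [p for p in range(2, P + 1) if sieve[p]]

h = 12
lhs = (6 / math.pi**2) * sigma_minus1(h)
rhs = 1.0
for p in primes:
    rhs *= (1 - 1 / p**2) * sigma_minus1_p(h, p)
print(lhs, rhs)   # truncating the product at 10^4 already gives several digits
```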

Remark 2 The relation between the local means {\sim \frac{p}{p-1}} and the global mean {\sim \log x} can also be seen heuristically through the application

\displaystyle \prod_{p \leq x^{1/e^\gamma}} \frac{p}{p-1} \sim \log x

of Mertens’ theorem, where {1/e^\gamma} is Pólya’s magic exponent, which serves as a useful heuristic limiting threshold in situations where the product of local factors is divergent.
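Mertens' theorem itself is easy to test numerically; the following Python snippet compares {\prod_{p \leq y} \frac{p}{p-1}} with {e^\gamma \log y} at the arbitrary cutoff {y = 10^5} (substituting {y = x^{1/e^\gamma}} then converts the right-hand side to {\log x}):

```python
import math

y = 10**5
# sieve primes up to y
is_prime = [True] * (y + 1)
is_prime[0] = is_prime[1] = False
for i in range(2, int(y**0.5) + 1):
    if is_prime[i]:
        for j in range(i * i, y + 1, i):
            is_prime[j] = False

prod = 1.0
for p in range(2, y + 1):
    if is_prime[p]:
        prod *= p / (p - 1)

gamma = 0.5772156649015329
print(prod, math.exp(gamma) * math.log(y))   # Mertens: close agreement
```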

Let us now prove this proposition. One could brute-force the computations by observing that for any fixed {j}, the valuation {v_p({\bf n})} is equal to {j} with probability {\sim \frac{p-1}{p} \frac{1}{p^j}}, and with a little more effort one can also compute the joint distribution of {v_p({\bf n})} and {v_p({\bf n}+h)}, at which point the proposition reduces to the calculation of various variants of the geometric series. I however find it cleaner to proceed in a more recursive fashion (similar to how one can prove the geometric series formula by induction); this will also make visible the vague intuition mentioned previously about how common factors of {{\bf n}} and {h} force {{\bf n}+h} to have a factor also.

It is first convenient to get rid of error terms by observing that in the limit {x \rightarrow \infty}, the random variable {{\bf n} = {\bf n}_x} converges vaguely to a uniform random variable {{\bf n}_\infty} on the profinite integers {\hat {\bf Z}}, or more precisely that the pair {(v_p({\bf n}_x), v_p({\bf n}_x+h))} converges vaguely to {(v_p({\bf n}_\infty), v_p({\bf n}_\infty+h))}. Because of this (and because of the easily verified uniform integrability properties of {\tau_p({\bf n})} and their powers), it suffices to establish the exact formulae

\displaystyle {\bf E} \tau_p({\bf n}_\infty) = \frac{p}{p-1} \ \ \ \ \ (5)



\displaystyle {\bf E} \tau_p({\bf n}_\infty) \tau_p({\bf n}_\infty+h) = (1-\frac{1}{p^2}) \sigma_{-1,p}(h) (\frac{p}{p-1})^2 = \frac{p+1}{p-1} \sigma_{-1,p}(h) \ \ \ \ \ (6)


in the profinite setting (this setting will make it easier to set up the recursion).

We begin with (5). Observe that {{\bf n}_\infty} is coprime to {p} with probability {\frac{p-1}{p}}, in which case {\tau_p({\bf n}_\infty)} is equal to {1}. Conditioning to the complementary probability {\frac{1}{p}} event that {{\bf n}_\infty} is divisible by {p}, we can factor {{\bf n}_\infty = p {\bf n}'_\infty} where {{\bf n}'_\infty} is also uniformly distributed over the profinite integers, in which event we have {\tau_p( {\bf n}_\infty ) = 1 + \tau_p( {\bf n}'_\infty )}. We arrive at the identity

\displaystyle {\bf E} \tau_p({\bf n}_\infty) = \frac{p-1}{p} + \frac{1}{p} ( 1 + {\bf E} \tau_p( {\bf n}'_\infty ) ).

As {{\bf n}_\infty} and {{\bf n}'_\infty} have the same distribution, the quantities {{\bf E} \tau_p({\bf n}_\infty)} and {{\bf E} \tau_p({\bf n}'_\infty)} are equal, and (5) follows by a brief amount of high-school algebra.

We use a similar method to treat (6). First treat the case when {h} is coprime to {p}. Then we see that with probability {\frac{p-2}{p}}, {{\bf n}_\infty} and {{\bf n}_\infty+h} are simultaneously coprime to {p}, in which case {\tau_p({\bf n}_\infty) = \tau_p({\bf n}_\infty+h) = 1}. Furthermore, with probability {\frac{1}{p}}, {{\bf n}_\infty} is divisible by {p} and {{\bf n}_\infty+h} is not; in which case we can write {{\bf n}_\infty = p {\bf n}'_\infty} as before, with {\tau_p({\bf n}_\infty) = 1 + \tau_p({\bf n}'_\infty)} and {\tau_p({\bf n}_\infty+h)=1}. Finally, in the remaining event with probability {\frac{1}{p}}, {{\bf n}_\infty+h} is divisible by {p} and {{\bf n}_\infty} is not; we can then write {{\bf n}_\infty+h = p {\bf n}'_\infty}, so that {\tau_p({\bf n}_\infty+h) = 1 + \tau_p({\bf n}'_\infty)} and {\tau_p({\bf n}_\infty) = 1}. Putting all this together, we obtain

\displaystyle {\bf E} \tau_p({\bf n}_\infty) \tau_p({\bf n}_\infty+h) = \frac{p-2}{p} + 2 \frac{1}{p} (1 + {\bf E} \tau_p({\bf n}'_\infty))

and the claim (6) in this case follows from (5) and a brief computation (noting that {\sigma_{-1,p}(h)=1} in this case).

Now suppose that {h} is divisible by {p}, thus {h=ph'} for some integer {h'}. Then with probability {\frac{p-1}{p}}, {{\bf n}_\infty} and {{\bf n}_\infty+h} are simultaneously coprime to {p}, in which case {\tau_p({\bf n}_\infty) = \tau_p({\bf n}_\infty+h) = 1}. In the remaining {\frac{1}{p}} event, we can write {{\bf n}_\infty = p {\bf n}'_\infty}, and then {\tau_p({\bf n}_\infty) = 1 + \tau_p({\bf n}'_\infty)} and {\tau_p({\bf n}_\infty+h) = 1 + \tau_p({\bf n}'_\infty+h')}. Putting all this together we have

\displaystyle {\bf E} \tau_p({\bf n}_\infty) \tau_p({\bf n}_\infty+h) = \frac{p-1}{p} + \frac{1}{p} {\bf E} (1+\tau_p({\bf n}'_\infty))(1+\tau_p({\bf n}'_\infty+h'))

which by (5) (and replacing {{\bf n}'_\infty} by {{\bf n}_\infty}) leads to the recursive relation

\displaystyle {\bf E} \tau_p({\bf n}_\infty) \tau_p({\bf n}_\infty+h) = \frac{p+1}{p-1} + \frac{1}{p} {\bf E} \tau_p({\bf n}_\infty) \tau_p({\bf n}_\infty+h')

and (6) then follows by induction on the number of times {p} divides {h}.
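The proposition can be sanity-checked numerically by replacing the profinite average with an average over {1 \leq n \leq N} for a large cutoff {N}; the following Python snippet does this for the sample values {p=3}, {h=9} (so that {\sigma_{-1,p}(h) = 13/9}):

```python
import math

def tau_p(n, p):
    # local divisor function: 1 + v_p(n)
    t = 1
    while n % p == 0:
        n //= p
        t += 1
    return t

p, h, N = 3, 9, 3**12
mean1 = sum(tau_p(n, p) for n in range(1, N + 1)) / N
mean2 = sum(tau_p(n, p) * tau_p(n + h, p) for n in range(1, N + 1)) / N

sigma_p = 1 + 1/3 + 1/9   # sigma_{-1,3}(9) = 13/9
print(mean1, p / (p - 1))                   # both ~ 3/2, as in (5)
print(mean2, (p + 1) / (p - 1) * sigma_p)   # both ~ 26/9, as in (6)
```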

The estimate (2) of Ingham was refined by Estermann, who obtained the more accurate expansion

\displaystyle \sum_{n \leq x} \tau(n) \tau(n+h) = \frac{6}{\pi^2} \sigma_{-1}(h) x \log^2 x + a_1(h) x \log x + a_2(h) x \ \ \ \ \ (7)


\displaystyle + O( x^{11/12+o(1)} )

for certain complicated but explicit coefficients {a_1(h), a_2(h)}. For instance, {a_1(h)} is given by the formula

\displaystyle a_1(h) = (\frac{12}{\pi^2} (2\gamma-1) + 4 a') \sigma_{-1}(h) - \frac{24}{\pi^2} \sigma'_{-1}(h)

where {\gamma} is the Euler-Mascheroni constant,

\displaystyle a' := - \sum_{r=1}^\infty \frac{\mu(r)}{r^2} \log r, \ \ \ \ \ (8)



\displaystyle \sigma'_{-1}(h) := \sum_{d|h} \frac{\log d}{d}.

The formula for {a_2(h)} is similar but even more complicated. The error term {O( x^{11/12+o(1)})} was improved by Heath-Brown to {O( x^{5/6+o(1)})}; it is conjectured (for instance by Conrey and Gonek) that one in fact has square root cancellation {O( x^{1/2+o(1)})} here, but this is well out of reach of current methods.

These lower order terms are traditionally computed either from a Dirichlet series approach (using Perron’s formula) or a circle method approach. It turns out that a refinement of the above heuristics can also predict these lower order terms, thus keeping the calculation purely in physical space as opposed to the “multiplicative frequency space” of the Dirichlet series approach, or the “additive frequency space” of the circle method, although the computations are arguably as messy as the latter computations for the purposes of working out the lower order terms. We illustrate this just for the {a_1(h) x \log x} term below the fold.

— 1. More refined heuristics —

To begin with we revisit the heuristic justification of (2) and modify it a bit, in the spirit of the “modified Cramér models” from this previous blog post. Our arguments here will be even less rigorous than in the preceding section, for instance we shall be quite cavalier with the use of the {\approx} symbol without precisely quantifying error terms.

For this section, it will be convenient to set {{\bf n} = {\bf n}_x} not to be the uniform distribution from {1} to {x} as before, but rather the uniform distribution from {x} to {(1+\delta) x} for some fairly small {\delta>0} (e.g. {\delta = 1/\log^{10} x} will suffice). The reason for this is that we need to start keeping track of the magnitude of {{\bf n}} to a relative accuracy of at least {1/\log x}, and with this new choice of {{\bf n}} the magnitude is simply {x} (up to this accuracy). By “differentiating” asymptotics such as (7) in {x} we see that

\displaystyle \sum_{x < n \leq (1+\delta) x} \tau(n) \tau(n+h) \approx \delta x ( \frac{6}{\pi^2} \sigma_{-1}(h) \log^2 x + a'_1(h) \log x )

where we ignore terms of lower order than the {\log x} term and

\displaystyle a'_1(h) := a_1(h) + 2 \times \frac{6}{\pi^2} \sigma_{-1}(h)

\displaystyle = (\frac{24 \gamma}{\pi^2} + 4 a') \sigma_{-1}(h) - \frac{24}{\pi^2} \sigma'_{-1}(h)

and so our task is now to heuristically justify the statement

\displaystyle {\bf E} \tau({\bf n}_x) \tau({\bf n}_x + h) \approx \frac{6}{\pi^2} \sigma_{-1}(h) \log^2 x + a'_1(h) \log x. \ \ \ \ \ (9)


Let us begin by revisiting the derivation of the simpler asymptotic

\displaystyle {\bf E} \tau({\bf n}_x) \tau({\bf n}_x + h) \approx \frac{6}{\pi^2} \sigma_{-1}(h) \log^2 x \ \ \ \ \ (10)


without passing immediately to local factors, but instead using a modified Cramér type model that will be more amenable to keeping careful track of the magnitude {x}. For any square-free {q}, write

\displaystyle \tau_q(n) := \prod_{p|q} \tau_p(n)


\displaystyle \tau_{(q)}(n) := \prod_{p \not | q} \tau_p(n)

so that we have the factorisation {\tau(n) = \tau_q(n) \tau_{(q)}(n)}. From the Chinese remainder theorem we expect {\tau_q({\bf n}_x)} and {\tau_{(q)}({\bf n}_x)} to behave approximately independently, at least when {q} is fixed and {x} is large.

We can use statistics of {\tau} to deduce statistics of {\tau_{(q)}}. This is traditionally done using the device of Möbius inversion, but we will use a recursive probabilistic approach here similar to that used to prove Proposition 1. Let’s start with understanding {\tau_{(p)}} for some prime {p}. With probability about {\frac{p-1}{p}}, {{\bf n}_x} is coprime to {p}, so that {\tau({\bf n}_x) = \tau_{(p)}({\bf n}_x)}. In the remaining event of probability about {\frac{1}{p}}, we may write {{\bf n}_x = p {\bf n}'_{x/p}} where {{\bf n}'_{x/p}} is uniformly distributed near {x/p}, and then {\tau({\bf n}_x) = \tau({\bf n}'_{x/p}) + \tau_{(p)}({\bf n}'_{x/p})}. This leads to the approximate identity

\displaystyle {\bf E} \tau({\bf n}_x) \approx \frac{p-1}{p} {\bf E}(\tau_{(p)}({\bf n}_x) | ({\bf n}_x,p)=1) \ \ \ \ \ (11)


\displaystyle + \frac{1}{p} {\bf E}( \tau({\bf n}'_{x/p}) + \tau_{(p)}({\bf n}'_{x/p})).

To work with this equation we make two further heuristic assumptions, which turn out to be reasonable at the level of accuracy required for justifying (10), but not for (9). Firstly, we assume that the distribution of {\tau_{(p)}({\bf n}_x)} is largely unchanged after conditioning to the event {({\bf n}_x,p)=1}, which is plausible from the Chinese remainder theorem. Secondly, we assume that the distribution of {\tau({\bf n}'_{x/p})} is largely the same as that of {\tau({\bf n}_x)}, and similarly for {\tau_{(p)}}. With these assumptions, (11) simplifies to

\displaystyle {\bf E} \tau({\bf n}_x) \approx \frac{p-1}{p} {\bf E} \tau_{(p)}({\bf n}_x) + \frac{1}{p} {\bf E}( \tau({\bf n}_x) + \tau_{(p)}({\bf n}_x))

which rearranges as

\displaystyle {\bf E} \tau_{(p)}({\bf n}_x) \approx \frac{p-1}{p} {\bf E} \tau({\bf n}_x) \ \ \ \ \ (12)


(note that this is consistent with Proposition 1 and the presumed independence of {\tau_p({\bf n}_x)} and {\tau_{(p)}({\bf n}_x)}). Iterating this argument gives

\displaystyle {\bf E} \tau_{(q)}({\bf n}_x) \approx (\prod_{p|q} \frac{p-1}{p}) {\bf E} \tau({\bf n}_x)

for any fixed squarefree {q}. Applying (1) we conclude

\displaystyle {\bf E} \tau_{(q)}({\bf n}_x) \approx (\prod_{p|q} \frac{p-1}{p}) \log x. \ \ \ \ \ (13)


Now let {w} be a moderately large number (growing very slowly with {x}, e.g. {w = \log\log\log x}), and let {W := \prod_{p \leq w} p} be the primorial of {w}. We factor

\displaystyle {\bf E} \tau({\bf n}_x) \tau({\bf n}_x+h) = {\bf E} \tau_W({\bf n}_x) \tau_W({\bf n}_x+h) \tau_{(W)}({\bf n}_x) \tau_{(W)}({\bf n}_x+h). \ \ \ \ \ (14)


There is arithmetic coupling between the {\tau_W({\bf n})} and {\tau_W({\bf n}+h)} factors; indeed from Proposition 1 and the Chinese remainder theorem we have

\displaystyle {\bf E} \tau_W({\bf n}_x) \tau_W({\bf n}_x+h) \approx \prod_{p \leq w} (1-\frac{1}{p^2}) \sigma_{-1,p}(h) (\frac{p}{p-1})^2. \ \ \ \ \ (15)


Meanwhile, from (13) we have

\displaystyle {\bf E} \tau_{(W)}({\bf n}_x) \approx (\prod_{p \leq w} \frac{p-1}{p}) \log x


and

\displaystyle {\bf E} \tau_{(W)}({\bf n}_x+h) \approx (\prod_{p \leq w} \frac{p-1}{p}) \log x.

In analogy with the modified Cramér model for the primes, we now assume that the random variables {\tau_W({\bf n}_x) \tau_W({\bf n}_x+h)}, {\tau_{(W)}({\bf n}_x)}, and {\tau_{(W)}({\bf n}_x+h)} behave like independent random variables; this turns out to be sufficiently accurate for predicting (10), but not (9). This leads to the prediction

\displaystyle {\bf E} \tau({\bf n}_x) \tau({\bf n}_x+h)\approx (\prod_{p \leq w} (1-\frac{1}{p^2}) \sigma_{-1,p}(h)) \log^2 x,

and if {w} goes to infinity with {x} then we recover (10).

We now repeat the above analysis but expanding to the next order in {\log x}, which will require keeping better track of the {x} parameter. The Dirichlet hyperbola method actually gives the refinement

\displaystyle \sum_{n \leq x} \tau(n) = x \log x - x + 2 \gamma x + O( \sqrt{x} ),

and hence

\displaystyle \sum_{x < n \leq (1+\delta) x} \tau(n) \approx \delta x ( \log x + 2 \gamma )

so that

\displaystyle {\bf E} \tau({\bf n}_x) \approx \log x + 2 \gamma. \ \ \ \ \ (16)


We previously computed the expectation {\mathop{\bf E} \tau_{(q)}({\bf n}_x)}. Now that we want to be more accurate, it turns out to be better to compute the slightly different conditional expectation {\mathop{\bf E}(\tau_{(q)}({\bf n}_x) | ({\bf n}_x,q)=1)}. Again we begin with {\tau_{(p)}({\bf n}_x)} for {p} prime. From (11) and (16), and by replacing {{\bf n}'_{x/p}} with {{\bf n}_{x/p}}, we have

\displaystyle \log x + 2 \gamma \approx \frac{p-1}{p} {\bf E}(\tau_{(p)}({\bf n}_x) | ({\bf n}_x,p)=1) + \frac{1}{p}( \log \frac{x}{p} + 2 \gamma + {\bf E} \tau_{(p)}({\bf n}_{x/p})). \ \ \ \ \ (17)


Previously we simplified this equation by assuming that the conditioning {({\bf n}_x,p)=1} had negligible impact on {\tau_{(p)}({\bf n}_x)}, and that {\tau_{(p)}({\bf n}_{x/p})} had a similar distribution to {\tau_{(p)}({\bf n}_{x})}. As it turns out, we can no longer afford these simplifying assumptions at this level of accuracy, and must argue more carefully. With probability about {\frac{p-1}{p}}, {{\bf n}_x} is coprime to {p}, and in the remaining event of probability about {\frac{1}{p}}, we have {{\bf n}_x = p {\bf n}'_{x/p}} and {\tau_{(p)}({\bf n}_x) = \tau_{(p)}( {\bf n}'_{x/p} )}. This gives the approximate identity

\displaystyle {\bf E} \tau_{(p)}({\bf n}_x) \approx \frac{p-1}{p} {\bf E}(\tau_{(p)}({\bf n}_x) | ({\bf n}_x,p)=1 ) + \frac{1}{p} {\bf E} \tau_{(p)}({\bf n}_{x/p}). \ \ \ \ \ (18)


If we refine the approximation (12) to give the ansatz

\displaystyle {\bf E}(\tau_{(p)}({\bf n}_x) | ({\bf n}_x,p)=1 ) \approx \frac{p-1}{p} (\log x + 2 \gamma + A)


and

\displaystyle {\bf E} \tau_{(p)}({\bf n}_x) \approx \frac{p-1}{p} (\log x + 2 \gamma + B)

for some constants {A,B} to be worked out momentarily, the above two equations give after some computation (and cancelling out all the {\log x + 2\gamma} terms)

\displaystyle 0 = \frac{(p-1)^2}{p^2} A - \frac{(2p-1)\log p}{p^2} + \frac{p-1}{p^2} B

\displaystyle B = \frac{p-1}{p} A + \frac{1}{p} B - \frac{\log p}{p}

which can be solved as

\displaystyle A = \frac{2 \log p}{p-1}; \quad B = \frac{\log p}{p-1}.
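One can verify this solution by solving the {2 \times 2} linear system numerically; the Python snippet below does so for the sample value {p=5}:

```python
import math

p = 5
L = math.log(p)
# The system from the text, written as a11*A + a12*B = b1, a21*A + a22*B = b2:
# 0 = ((p-1)^2/p^2) A - ((2p-1) log p)/p^2 + ((p-1)/p^2) B
# B = ((p-1)/p) A + B/p - (log p)/p
a11, a12, b1 = (p - 1)**2 / p**2, (p - 1) / p**2, (2*p - 1) * L / p**2
a21, a22, b2 = (p - 1) / p, 1/p - 1, L / p
det = a11 * a22 - a12 * a21
A = (b1 * a22 - a12 * b2) / det   # Cramer's rule
B = (a11 * b2 - b1 * a21) / det
print(A, 2 * L / (p - 1))   # both equal 2 log p / (p-1)
print(B, L / (p - 1))       # both equal log p / (p-1)
```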

Thus we have

\displaystyle {\bf E}(\tau_{(p)}({\bf n}_x) | ({\bf n}_x,p)=1 ) \approx \frac{p-1}{p} (\log x + 2 \gamma + \frac{2 \log p}{p-1}).

Iterating this we arrive at

\displaystyle {\bf E}(\tau_{(q)}({\bf n}_x) | ({\bf n}_x,q)=1 ) \approx (\prod_{p|q} \frac{p-1}{p}) (\log x + 2 \gamma + \sum_{p|q} \frac{2 \log p}{p-1}). \ \ \ \ \ (19)


It is instructive (albeit lengthy) to deduce this asymptotic rigorously for fixed square-free {q} and large {x} by using Möbius inversion and the hyperbola method; one can also proceed a bit more quickly by using Perron’s formula and inspecting the pole of the relevant Dirichlet series at {s=1}. (Not surprisingly, similar calculations appear when computing major arc contributions to (2).)
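For instance, the following Python computation tests (19) for {q=6} with {{\bf n}} drawn uniformly from {1} to {x} rather than from {x} to {(1+\delta)x}; with this sampling the constant {2\gamma} in (19) becomes {2\gamma - 1}, matching the expansion of {\sum_{n \leq x} \tau(n)}:

```python
import math

x, q = 300_000, 6
tau = [0] * (x + 1)
for d in range(1, x + 1):          # divisor-counting sieve
    for n in range(d, x + 1, d):
        tau[n] += 1

# On the event gcd(n, q) = 1 we simply have tau_{(q)}(n) = tau(n).
coprime = [n for n in range(1, x + 1) if math.gcd(n, q) == 1]
emp = sum(tau[n] for n in coprime) / len(coprime)

gamma = 0.5772156649015329
pred = (1/2) * (2/3) * (math.log(x) + 2 * gamma - 1
                        + 2 * (math.log(2) / (2 - 1) + math.log(3) / (3 - 1)))
print(emp, pred)
```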

Now we are ready to predict the value of {{\bf E} \tau({\bf n}_x) \tau({\bf n}_x + h)}. Our starting point is again the expansion (14), except that we now strip out the factors within {\tau_{(W)}} that divide {W}, in order to use (19). More precisely, we adopt the factorisation {n = n_W n_{(W)}} of arbitrary natural numbers {n} by the formulae

\displaystyle n_W := \prod_{p|W} p^{v_p(n)}; \quad n_{(W)} := \prod_{p \not | W} p^{v_p(n)}

and observe that

\displaystyle \tau_{(W)}({\bf n}_x) = \tau_{(W)}( ({\bf n}_x)_{(W)} )

and similarly

\displaystyle \tau_{(W)}({\bf n}_x+h) = \tau_{(W)}( ({\bf n}_x+h)_{(W)} ).

We are now faced with computing the expression

\displaystyle {\bf E} \tau_W({\bf n}_x) \tau_W({\bf n}_x+h) \tau_{(W)}(({\bf n}_x)_{(W)}) \tau_{(W)}(({\bf n}_x+h)_{(W)}). \ \ \ \ \ (20)


Previously we had assumed that the factors {\tau_W({\bf n}_x) \tau_W({\bf n}_x+h)}, {\tau_{(W)}(({\bf n}_x)_{(W)})}, and {\tau_{(W)}(({\bf n}_x+h)_{(W)})} behaved independently. However this is not quite true at the current level of accuracy, because the magnitudes of the natural numbers {({\bf n}_x)_{(W)}} and {({\bf n}_x+h)_{(W)}} are influenced by the size of {({\bf n}_x)_W} and {({\bf n}_x+h)_W} respectively, and these in turn are certainly correlated with {\tau_W({\bf n}_x) \tau_W({\bf n}_x+h)}. Because of the non-trivial dependence on {x} on the right-hand side of (19) at this level of accuracy, the size of {({\bf n}_x)_W} and {({\bf n}_x+h)_W} influences the expected value of {\tau_{(W)}(({\bf n}_x)_{(W)})} and {\tau_{(W)}(({\bf n}_x+h)_{(W)})}, causing a coupling between the three factors of (20). However, if we first condition the magnitude of {({\bf n}_x)_{W}} and {({\bf n}_x+h)_{W}} to be fixed, then we expect the three factors to be conditionally independent (at this level of accuracy). Heuristically, with this conditioning, {({\bf n}_x)_{(W)}} behaves like a random element of size about {x / ({\bf n}_x)_W} that is coprime to {W}, and hence by (19)

\displaystyle {\bf E}(\tau_{(W)}(({\bf n}_x)_{(W)}) | ({\bf n}_x)_{W}, ({\bf n}_x+h)_{W})

\displaystyle \approx (\prod_{p \leq w} \frac{p-1}{p}) (\log \frac{x}{({\bf n}_x)_W} + 2 \gamma + \sum_{p \leq w} \frac{2 \log p}{p-1})

and similarly

\displaystyle {\bf E}(\tau_{(W)}(({\bf n}_x+h)_{(W)}) | ({\bf n}_x)_{W}, ({\bf n}_x+h)_{W})

\displaystyle \approx (\prod_{p \leq w} \frac{p-1}{p}) (\log \frac{x}{({\bf n}_x+h)_W} + 2 \gamma + \sum_{p \leq w} \frac{2 \log p}{p-1}).

We can thus heuristically estimate (20) by

\displaystyle {\bf E} \tau_W({\bf n}_x) \tau_W({\bf n}_x+h)

\displaystyle \times (\prod_{p \leq w} \frac{p-1}{p}) (\log \frac{x}{({\bf n}_x)_W} + 2 \gamma + \sum_{p \leq w} \frac{2 \log p}{p-1})

\displaystyle \times (\prod_{p \leq w} \frac{p-1}{p}) (\log \frac{x}{({\bf n}_x+h)_W} + 2 \gamma + \sum_{p \leq w} \frac{2 \log p}{p-1}).

This can be expanded as

\displaystyle a_0 \log^2 x + a_1 \log x + a_2


where

\displaystyle a_0 := (\prod_{p \leq w} \frac{p-1}{p})^2 {\bf E} \tau_W({\bf n}_x) \tau_W({\bf n}_x+h)

\displaystyle a_1 := 2 (2 \gamma + \sum_{p \leq w} \frac{2 \log p}{p-1}) a_0

\displaystyle - (\prod_{p \leq w} \frac{p-1}{p})^2 {\bf E} \tau_W({\bf n}_x) \tau_W({\bf n}_x+h) \log({\bf n}_x)_W

\displaystyle - (\prod_{p \leq w} \frac{p-1}{p})^2 {\bf E} \tau_W({\bf n}_x) \tau_W({\bf n}_x+h) \log({\bf n}_x+h)_W

and {a_2} is something complicated that we do not compute here. Using (15) we have

\displaystyle a_0 \approx \frac{6}{\pi^2} \sigma_{-1}(h)

which recovers the Ingham estimate (10). To obtain the finer approximation (9) we need to show that

\displaystyle a_1 \approx a'_1(h).

It will suffice to obtain the asymptotic

\displaystyle {\bf E} \tau_W({\bf n}_x) \tau_W({\bf n}_x+h) \log({\bf n}_x)_W \ \ \ \ \ (21)


\displaystyle \approx \prod_{p \leq w} (1-\frac{1}{p^2}) \sigma_{-1,p}(h) (\frac{p}{p-1})^2 \times 2 (\sum_{p \leq w} \frac{\log p}{p-1}- \frac{\pi^2}{6} a')

\displaystyle + \prod_{p \leq w}(1-\frac{1}{p^2}) (\frac{p}{p-1})^2 \times 2 \sigma'_{-1}(h)

together with the analogue of (21) with {\log({\bf n}_x)_W} replaced by {\log({\bf n}_x+h)_W}. (One in fact has the approximation {\sum_{p \leq w} \frac{\log p}{p-1} \approx \log w - \gamma}, which reflects the Laurent approximation {-\frac{\zeta'(s)}{\zeta(s)} \approx \frac{1}{s-1} - \gamma} near {s=1}, but we will not need this.)

We will just prove (21), as the analogue of (21) for {\log({\bf n}_x+h)_W} is proven similarly. As this is a local calculation we may replace {{\bf n}_x} by {{\bf n}_\infty}. Splitting

\displaystyle \log({\bf n}_\infty)_W = \sum_{p \leq w} \log({\bf n}_\infty)_p

we can write the left-hand side of (21) as a sum over primes,

\displaystyle \sum_{p \leq w} {\bf E} \tau_{W/p}({\bf n}_\infty) \tau_{W/p}({\bf n}_\infty+h) \tau_p({\bf n}_\infty) \tau_p({\bf n}_\infty+h) \log({\bf n}_\infty)_p.

The {W/p} and {p} terms decouple, so by Proposition 1 this becomes

\displaystyle \sum_{p \leq w} (\prod_{p' \leq w: p' \neq p} (1-\frac{1}{(p')^2}) \sigma_{-1,p'}(h) (\frac{p'}{p'-1})^2 ) {\bf E} \tau_p({\bf n}_\infty) \tau_p({\bf n}_\infty+h) \log({\bf n}_\infty)_p.

Extracting a common factor of {(\prod_{p' \leq w} (1-\frac{1}{(p')^2}) \sigma_{-1,p'}(h) (\frac{p'}{p'-1})^2 )}, we reduce to showing that

\displaystyle \sum_{p \leq w} (1-\frac{1}{p^2})^{-1} \sigma_{-1,p}(h)^{-1} (\frac{p-1}{p})^2 {\bf E} \tau_p({\bf n}_\infty) \tau_p({\bf n}_\infty+h) \log({\bf n}_\infty)_p

\displaystyle \approx 2 (\sum_{p \leq w} \frac{\log p}{p-1}- \frac{\pi^2}{6} a')

\displaystyle + 2 \frac{\sigma'_{-1}(h)}{\sigma_{-1}(h)}.

The left-hand side is a sum over primes {p}, so we work on making the right-hand side a sum over primes as well. This is achieved by the following two lemmas:

Lemma 3 We have {\frac{\pi^2}{6} a' = \sum_p \frac{\log p}{p^2-1}}.

Proof: While this identity can be proven elementarily, it is fastest to proceed using Dirichlet series. From Taylor expansion and (8) we have

\displaystyle \frac{1}{\zeta(s)} = \sum_n \frac{\mu(n)}{n^s}

\displaystyle \approx \frac{1}{\zeta(2)} + (s-2) a'

\displaystyle = \frac{6}{\pi^2} ( 1 + \frac{\pi^2}{6} a' (s-2) )

for {s} close to {2}. Inverting, we have

\displaystyle \zeta(s) \approx \frac{\pi^2}{6} ( 1 - \frac{\pi^2}{6} a' (s-2) )

and hence

\displaystyle - \frac{\zeta'(2)}{\zeta(2)} = \frac{\pi^2}{6} a'.

The left-hand side is {\sum_n \frac{\Lambda(n)}{n^2} = \sum_p \frac{\log p}{p^2-1}}, giving the claim. \Box
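The identity converges quickly enough to be visible numerically; the following Python snippet sieves the Möbius function and compares truncations of both sides at the arbitrary cutoff {10^5}:

```python
import math

R = 10**5
# Sieve the Moebius function (standard linear sieve).
mu = [1] * (R + 1)
is_comp = [False] * (R + 1)
primes = []
for i in range(2, R + 1):
    if not is_comp[i]:
        primes.append(i)
        mu[i] = -1
    for p in primes:
        if i * p > R:
            break
        is_comp[i * p] = True
        if i % p == 0:
            mu[i * p] = 0
            break
        mu[i * p] = -mu[i]

a_prime = -sum(mu[r] * math.log(r) / r**2 for r in range(2, R + 1))
lhs = (math.pi**2 / 6) * a_prime
rhs = sum(math.log(p) / (p**2 - 1) for p in primes)
print(lhs, rhs)   # both truncations are ~ 0.57 and agree closely
```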

Lemma 4 We have {\frac{\sigma'_{-1}(h)}{\sigma_{-1}(h)} = \sum_p \frac{\sigma'_{-1,p}(h)}{\sigma_{-1,p}(h)}}.

Proof: Again it is fastest to proceed using Dirichlet series. Setting {\sigma_s(h) := \sum_{d|h} d^s} and {\sigma_{s,p}(h) := \sum_{p^k|h} p^{ks}}, we have from the fundamental theorem of arithmetic that

\displaystyle \log \sigma_s(h) = \sum_p \log \sigma_{s,p}(h).

Differentiating this at {s=-1} we obtain the claim. \Box
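As the identity is exact for each {h}, it can also be checked directly in a few lines of Python (here for the sample value {h = 360 = 2^3 \times 3^2 \times 5}; for {p} not dividing {h} the local term vanishes, so only {p | h} contributes):

```python
import math

def divisors(h):
    return [d for d in range(1, h + 1) if h % d == 0]

def sigma_m1(h):
    return sum(1 / d for d in divisors(h))

def sigma_m1_prime(h):
    return sum(math.log(d) / d for d in divisors(h))

def local_ratio(h, p):
    # sigma'_{-1,p}(h) / sigma_{-1,p}(h)
    num = den = 0.0
    pj, j = 1, 0
    while h % pj == 0:
        num += j * math.log(p) / pj
        den += 1 / pj
        j += 1
        pj *= p
    return num / den

h = 360   # = 2^3 * 3^2 * 5
lhs = sigma_m1_prime(h) / sigma_m1(h)
rhs = sum(local_ratio(h, p) for p in (2, 3, 5))
print(lhs, rhs)   # equal up to floating-point rounding
```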

We remark that {\sigma'_{-1}} is an example of a derived multiplicative function, discussed in this previous blog post.

In view of these expansions, we can reduce matters to establishing a purely local estimate at a single prime {p}, namely that

\displaystyle (1-\frac{1}{p^2})^{-1} \sigma_{-1,p}(h)^{-1} (\frac{p-1}{p})^2 {\bf E} \tau_p({\bf n}_\infty) \tau_p({\bf n}_\infty+h) \log({\bf n}_\infty)_p

\displaystyle = 2 (\frac{\log p}{p-1}- \frac{\log p}{p^2-1})

\displaystyle + 2 \frac{\sigma'_{-1,p}(h)}{\sigma_{-1,p}(h)},

which rearranges as

\displaystyle {\bf E} \tau_p({\bf n}_\infty) \tau_p({\bf n}_\infty+h) \log({\bf n}_\infty)_p = \frac{2p \log p}{(p-1)^2} \sigma_{-1,p}(h) + \frac{2(p+1)}{p-1} \sigma'_{-1,p}(h) \ \ \ \ \ (22)


(compare with Proposition 1).
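As with Proposition 1, (22) can be sanity-checked by replacing the profinite average with an average over {1 \leq n \leq N}; the Python snippet below does this for the sample values {p=3}, {h=9}:

```python
import math

def v_p(n, p):
    # p-adic valuation of n
    v = 0
    while n % p == 0:
        n //= p
        v += 1
    return v

p, h, N = 3, 9, 3**12
logp = math.log(p)
emp = 0.0
for n in range(1, N + 1):
    vn = v_p(n, p)
    if vn:   # the summand vanishes unless p divides n
        emp += (1 + vn) * (1 + v_p(n + h, p)) * vn * logp
emp /= N

sigma = 1 + 1/3 + 1/9                  # sigma_{-1,3}(9)
sigma_prime = logp / 3 + 2 * logp / 9  # sigma'_{-1,3}(9)
pred = 2 * p * logp / (p - 1)**2 * sigma + 2 * (p + 1) / (p - 1) * sigma_prime
print(emp, pred)
```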

We need two preliminary estimates to handle lower order terms:

Lemma 5 We have

\displaystyle {\bf E} \tau_p({\bf n}_\infty) \log({\bf n}_\infty)_p = \frac{2p \log p}{(p-1)^2}.

Proof: Once again we use recursion. With probability {\frac{p-1}{p}}, {{\bf n}_\infty} is coprime to {p}, so {\tau_p({\bf n}_\infty) \log({\bf n}_\infty)_p} vanishes. Otherwise, {{\bf n}_\infty = p {\bf n}'_\infty}, {\tau_p({\bf n}_\infty) = 1 + \tau_p({\bf n}'_\infty)}, and {\log ({\bf n}_\infty)_p = \log p + \log ({\bf n}'_\infty)_p}. We conclude that

\displaystyle {\bf E} \tau_p({\bf n}_\infty) \log({\bf n}_\infty)_p = \frac{1}{p} {\bf E} (1+\tau_p({\bf n}_\infty)) (\log p + \log({\bf n}_\infty)_p).

From Proposition 1 we have

\displaystyle {\bf E} \tau_p({\bf n}_\infty) = \frac{p}{p-1}

and from the identity {\log ({\bf n}_\infty)_p = \log p ( \tau_p({\bf n}_\infty) - 1)} we thus have

\displaystyle {\bf E} \log ({\bf n}_\infty)_p = \frac{1}{p-1} \log p \ \ \ \ \ (23)


so that

\displaystyle {\bf E} \tau_p({\bf n}_\infty) \log({\bf n}_\infty)_p = \frac{1}{p} {\bf E} \tau_p({\bf n}_\infty) \log({\bf n}_\infty)_p + \frac{2 \log p}{p-1}

and the claim follows. \Box
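Alternatively, since {v_p({\bf n}_\infty)} equals {j} with probability {\frac{p-1}{p^{j+1}}}, Lemma 5 amounts to summing the series {\sum_{j \geq 0} \frac{p-1}{p^{j+1}} (1+j) j \log p}, which the following Python snippet confirms for the sample value {p=5}:

```python
import math

p = 5
logp = math.log(p)
# v_p(n_infty) = j with probability (p-1)/p^{j+1}; sum the series directly,
# truncating at j = 60 (the tail is utterly negligible)
s = sum((p - 1) / p**(j + 1) * (1 + j) * j * logp for j in range(60))
pred = 2 * p * logp / (p - 1)**2
print(s, pred)   # agree to machine precision
```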

Lemma 6 We have

\displaystyle {\bf E} \tau_p({\bf n}_\infty+h) \log({\bf n}_\infty)_p =\frac{\log p}{p-1} ((p+1)\sigma_{-1,p}(h) - p).

Note that one can view Lemma 5 as the {h=0} (p-adic) limiting case of Lemma 6.

Proof: First suppose that {h} is coprime to {p}. Then {\tau_p({\bf n}_\infty+h)} is equal to {1} whenever {\log({\bf n}_\infty)_p} is non-zero, so the claim follows from (23) in this case (noting that {\sigma_{-1,p}(h)=1}). Now suppose that {h=ph'} for some integer {h'}. With probability {\frac{p-1}{p}}, {{\bf n}_\infty} is coprime to {p}, so {\tau_p({\bf n}_\infty+h) \log({\bf n}_\infty)_p} vanishes; otherwise, {{\bf n}_\infty = p {\bf n}'_\infty}, {\tau_p({\bf n}_\infty+h) = 1 + \tau_p({\bf n}'_\infty+h')}, and {\log ({\bf n}_\infty)_p = \log p + \log ({\bf n}'_\infty)_p}. We conclude that

\displaystyle {\bf E} \tau_p({\bf n}_\infty+h) \log({\bf n}_\infty)_p = \frac{1}{p} {\bf E} (1+\tau_p({\bf n}_\infty+h')) (\log p + \log({\bf n}_\infty)_p).

Arguing as before we conclude that

\displaystyle {\bf E} \tau_p({\bf n}_\infty+h) \log({\bf n}_\infty)_p = \frac{1}{p} {\bf E} \tau_p({\bf n}_\infty+h') \log({\bf n}_\infty)_p + \frac{2 \log p}{p-1}

and the claim follows by induction on the number of times {p} divides {h} (noting that {\sigma_{-1,p}(h) = 1 + \frac{1}{p} \sigma_{-1,p}(h')}). \Box

Now we can prove (22). First suppose {h} is coprime to {p}. In this case, the random variable {\tau_p({\bf n}_\infty) \tau_p({\bf n}_\infty+h) \log({\bf n}_\infty)_p} is only non-zero when {{\bf n}_\infty} is divisible by {p}, which forces {\tau_p({\bf n}_\infty+h)} to equal {1}. By Lemma 5, the left-hand side of (22) is thus equal to {\frac{2p \log p}{(p-1)^2}}, thus proving (22) in this case.

Now suppose that {h} is divisible by {p}. With probability {\frac{p-1}{p}}, {{\bf n}_\infty} is coprime to {p}, so {\tau_p({\bf n}_\infty) \tau_p({\bf n}_\infty+h) \log({\bf n}_\infty)_p} vanishes. Otherwise, we write {{\bf n}_\infty = p {\bf n}'_\infty} and observe that

\displaystyle \tau_p({\bf n}_\infty) = 1 + \tau_p({\bf n}'_\infty)

\displaystyle \tau_p({\bf n}_\infty+h) = 1 + \tau_p({\bf n}'_\infty+h')

\displaystyle \log({\bf n}_\infty)_p = \log p + \log({\bf n}'_\infty)_p

thus the left-hand side of (22) expands as

\displaystyle \frac{1}{p} {\bf E} (1 + \tau_p({\bf n}'_\infty))(1 + \tau_p({\bf n}'_\infty+h'))(\log p + \log({\bf n}'_\infty)_p).

This expands into eight terms that can be computed using Proposition 1, (23), and Lemmas 5 and 6 as

\displaystyle \frac{1}{p} \log p

\displaystyle + \frac{1}{p} \frac{1}{p-1} \log p

\displaystyle + \frac{1}{p} \frac{p}{p-1} \log p

\displaystyle + \frac{1}{p} \frac{\log p}{p-1} ((p+1)\sigma_{-1,p}(h') - p)

\displaystyle + \frac{1}{p} \frac{p}{p-1} \log p

\displaystyle + \frac{1}{p} \frac{2p \log p}{(p-1)^2}

\displaystyle + \frac{1}{p} \frac{p+1}{p-1} \sigma_{-1,p}(h') \log p

\displaystyle + \frac{1}{p} {\bf E} \tau_p({\bf n}_\infty)\tau_p({\bf n}_\infty+h')\log({\bf n}_\infty)_p.

The claim then follows by induction on the number of times {p} divides {h}, together with a tedious computation using the identities

\displaystyle \sigma_{-1,p}(h) = 1 + \frac{1}{p} \sigma_{-1,p}(h')


\displaystyle \sigma'_{-1,p}(h)= \frac{\log p}{p} \sigma_{-1,p}(h') + \frac{1}{p} \sigma'_{-1,p}(h').
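Both identities are exact and can be confirmed mechanically, e.g. with the following Python snippet (for the sample values {p=3}, {h'=12}, so {h=36}):

```python
from fractions import Fraction
import math

def sigma_pair(h, p):
    # returns (sigma_{-1,p}(h), sigma'_{-1,p}(h)); the first component is
    # computed exactly as a Fraction, the second involves log p and is a float
    s = Fraction(0)
    sp = 0.0
    pj, j = 1, 0
    while h % pj == 0:
        s += Fraction(1, pj)
        sp += j * math.log(p) / pj
        j += 1
        pj *= p
    return s, sp

p, hp = 3, 12          # h' = 12, so h = p*h' = 36
h = p * hp
s_h, sp_h = sigma_pair(h, p)
s_hp, sp_hp = sigma_pair(hp, p)

print(s_h == 1 + Fraction(1, p) * s_hp)                      # first identity
print(abs(sp_h - (math.log(p) / p * float(s_hp) + sp_hp / p)))  # second, ~ 0
```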