Let $\tau(n) := \sum_{d|n} 1$ be the divisor function. A classical application of the Dirichlet hyperbola method gives the asymptotic

$$\sum_{n \leq x} \tau(n) \sim x \log x$$

where $X \sim Y$ denotes the estimate $X = (1+o(1)) Y$ as $x \to \infty$. Much better error estimates are possible here, but we will not focus on the lower order terms in this discussion. For somewhat idiosyncratic reasons I will interpret this estimate (and the other analytic number theory estimates discussed here) through the probabilistic lens. Namely, if $\mathbf{n} = \mathbf{n}_x$ is a random number selected uniformly between $1$ and $x$, then the above estimate can be written as

$$\mathbf{E} \tau(\mathbf{n}) \sim \log x, \qquad (1)$$

that is to say the random variable $\tau(\mathbf{n})$ has mean approximately $\log x$. (But, somewhat paradoxically, this is not the median or mode behaviour of this random variable, which instead concentrates near $\log^{\log 2} x$, basically thanks to the Hardy-Ramanujan theorem.)
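As a rough numerical illustration of the mean estimate, here is a minimal Python sketch that tabulates the divisor function with a sieve and compares its average to $\log x$. The helper names `divisor_counts` and `mean_tau` are hypothetical, chosen only for this example; the median is printed without any claimed value, since the concentration near $\log^{\log 2} x$ is only an asymptotic statement.

```python
import math
import statistics

def divisor_counts(x):
    """Return a list tau with tau[n] = number of divisors of n, for 1 <= n <= x."""
    tau = [0] * (x + 1)
    for d in range(1, x + 1):
        # every multiple of d gains d as a divisor
        for multiple in range(d, x + 1, d):
            tau[multiple] += 1
    return tau

def mean_tau(x):
    """Empirical mean of tau(n) for n drawn uniformly from 1..x."""
    tau = divisor_counts(x)
    return sum(tau[1:]) / x

x = 10**5
tau = divisor_counts(x)
print("mean / log x :", mean_tau(x) / math.log(x))
print("median       :", statistics.median(tau[1:]))
print("log^(log 2) x:", math.log(x) ** math.log(2))
```

The ratio of the mean to $\log x$ should be close to, but not exactly, $1$ at this range, reflecting the lower order terms that are being ignored here.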

Now we turn to the pair correlations $\sum_{n \leq x} \tau(n) \tau(n+h)$ for a fixed positive integer $h$. There is a classical computation of Ingham that shows that

$$\sum_{n \leq x} \tau(n) \tau(n+h) \sim \frac{6}{\pi^2} \sigma_{-1}(h) x \log^2 x, \qquad (2)$$

where

$$\sigma_{-1}(h) := \sum_{d|h} \frac{1}{d}.$$

The error term in (2) has been refined by many subsequent authors, as has the uniformity of the estimates in the $h$ aspect, as these topics are related to other questions in analytic number theory, such as fourth moment estimates for the Riemann zeta function; but we will not consider these more subtle features of the estimate here. However, we will look at the next term in the asymptotic expansion for (2) below the fold.

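Using our probabilistic lens, the estimate (2) can be written as

$$\mathbf{E} \tau(\mathbf{n}) \tau(\mathbf{n}+h) \sim \frac{6}{\pi^2} \sigma_{-1}(h) \log^2 x. \qquad (3)$$

From (1) (and the asymptotic negligibility of the shift by $h$) we see that the random variables $\tau(\mathbf{n})$ and $\tau(\mathbf{n}+h)$ both have a mean of $\sim \log x$, so the additional factor of $\frac{6}{\pi^2} \sigma_{-1}(h)$ represents some arithmetic coupling between the two random variables.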
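To get a feel for Ingham's asymptotic, one can compute the pair correlation directly and divide by the predicted main term. The following is a sketch only: the function names are hypothetical, and because the lower order terms are smaller than the main term by only a factor of about $\log x$, the printed ratios should not be expected to be very close to $1$ at the modest value of $x$ used here.

```python
import math

def divisor_counts(x):
    """tau[n] = number of divisors of n, for 1 <= n <= x (simple sieve)."""
    tau = [0] * (x + 1)
    for d in range(1, x + 1):
        for multiple in range(d, x + 1, d):
            tau[multiple] += 1
    return tau

def pair_correlation(x, h):
    """Exact value of sum_{n <= x} tau(n) * tau(n + h)."""
    tau = divisor_counts(x + h)
    return sum(tau[n] * tau[n + h] for n in range(1, x + 1))

def sigma_minus_one(h):
    """sigma_{-1}(h) = sum of 1/d over the divisors d of h."""
    return sum(1 / d for d in range(1, h + 1) if h % d == 0)

x = 10**5
for h in (1, 2, 6):
    main_term = 6 / math.pi**2 * sigma_minus_one(h) * x * math.log(x) ** 2
    print(h, pair_correlation(x, h) / main_term)
```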

Ingham’s formula can be established in a number of ways. Firstly, one can expand out $\tau(n) = \sum_{d|n} 1$ and use the hyperbola method (splitting into the cases $d \leq \sqrt{x}$ and $n/d \leq \sqrt{x}$ and removing the overlap). If one does so, one soon arrives at the task of having to estimate sums of the form

$$\sum_{n \leq x: d|n} \tau(n+h)$$

for various $d \leq \sqrt{x}$. For $d$ much less than $\sqrt{x}$ this can be achieved using a further application of the hyperbola method, but for $d$ comparable to $\sqrt{x}$ things get a bit more complicated, necessitating the use of non-trivial estimates on Kloosterman sums in order to obtain satisfactory control on error terms. A more modern approach proceeds using automorphic form methods, as discussed in this previous post. A third approach, which unfortunately is only heuristic at the current level of technology, is to apply the Hardy-Littlewood circle method (discussed in this previous post) to express (2) in terms of exponential sums $\sum_{n \leq x} \tau(n) e(\alpha n)$ for various frequencies $\alpha$. The contribution of the “major arc” frequencies $\alpha$ can be computed after a moderately lengthy calculation which yields the right-hand side of (2) (as well as the correct lower order terms that are currently being suppressed), but there does not appear to be an easy way to show directly that the “minor arc” contributions are of lower order, although the methods discussed previously do indirectly show that this is ultimately the case.

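Each of the methods outlined above requires a fair amount of calculation, and it is not obvious while performing them that the factor $\frac{6}{\pi^2} \sigma_{-1}(h)$ will emerge at the end. One can at least explain the $\frac{6}{\pi^2}$ as a normalisation constant needed to balance the $\sigma_{-1}(h)$ factor (at a heuristic level, at least). To see this through our probabilistic lens, introduce an independent copy $\mathbf{n}'$ of $\mathbf{n}$; then

$$\mathbf{E} \tau(\mathbf{n}) \tau(\mathbf{n}') = (\mathbf{E} \tau(\mathbf{n}))^2 \sim \log^2 x; \qquad (4)$$

using symmetry to order $\mathbf{n}' > \mathbf{n}$ (discarding the diagonal case $\mathbf{n} = \mathbf{n}'$) and making the change of variables $\mathbf{n}' = \mathbf{n} + h$, we see that (4) is heuristically consistent with (3) as long as the asymptotic mean of $\frac{6}{\pi^2} \sigma_{-1}(h)$ in $h$ is equal to $1$. (This argument is not rigorous because there was an implicit interchange of limits present, but still gives a good heuristic “sanity check” of Ingham’s formula.) Indeed, if $\mathbf{E}_h$ denotes the asymptotic mean in $h$, then we have (heuristically at least)

$$\mathbf{E}_h \sigma_{-1}(h) = \sum_d \mathbf{E}_h \frac{1}{d} 1_{d|h} = \sum_d \frac{1}{d^2} = \frac{\pi^2}{6}$$

and we obtain the desired consistency after multiplying by $\frac{6}{\pi^2}$.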
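The mean value of $\sigma_{-1}(h)$ used in this sanity check is easy to test numerically. The sketch below (with a hypothetical helper name) averages $\sigma_{-1}(h)$ over $h \leq H$ by swapping the order of summation, so that each $d$ contributes $\frac{1}{d} \lfloor H/d \rfloor$, and compares the result with $\pi^2/6$.

```python
import math

def mean_sigma_minus_one(H):
    """Average of sigma_{-1}(h) over 1 <= h <= H.

    Uses sum_{h <= H} sigma_{-1}(h) = sum_{d <= H} (1/d) * floor(H/d),
    which follows from counting the multiples of each d up to H.
    """
    total = sum((H // d) / d for d in range(1, H + 1))
    return total / H

H = 10**5
mean = mean_sigma_minus_one(H)
print("mean of sigma_{-1}:", mean)
print("pi^2 / 6          :", math.pi**2 / 6)
print("normalised mean   :", 6 / math.pi**2 * mean)
```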

This still however does not explain the presence of the $\sigma_{-1}(h)$ factor. Intuitively it is reasonable that if $h$ has many prime factors, and $\mathbf{n}$ has a lot of factors, then $\mathbf{n}+h$ will have slightly more factors than average, because any common factor to $h$ and $\mathbf{n}$ will automatically be acquired by $\mathbf{n}+h$. But how to quantify this effect?

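One heuristic way to proceed is through analysis of local factors. Observe from the fundamental theorem of arithmetic that we can factor

$$\tau(n) = \prod_p \tau_p(n)$$

where the product is over all primes $p$, and $\tau_p(n) := \sum_{p^j | n} 1$ is the local version of $\tau(n)$ at $p$ (which in this case is just one plus the $p$-valuation $v_p(n)$ of $n$: $\tau_p = 1 + v_p$). Note that all but finitely many of the terms in this product will equal $1$, so the infinite product is well-defined. In a similar fashion, we can factor

$$\sigma_{-1}(h) = \prod_p \sigma_{-1,p}(h)$$

where

$$\sigma_{-1,p}(h) := \sum_{p^j | h} \frac{1}{p^j}$$

(or in terms of valuations, $\sigma_{-1,p}(h) = (1 - p^{-v_p(h)-1}) / (1 - p^{-1})$). Heuristically, the Chinese remainder theorem suggests that the various factors $\tau_p(\mathbf{n})$ behave like independent random variables, and so the correlation between $\tau(\mathbf{n})$ and $\tau(\mathbf{n}+h)$ should approximately decouple into the product of correlations between the local factors $\tau_p(\mathbf{n})$ and $\tau_p(\mathbf{n}+h)$. And indeed we do have the following local version of Ingham’s asymptotics:

Proposition 1 (Local Ingham asymptotics) For fixed $p$ and integer $h$, we have

$$\mathbf{E} \tau_p(\mathbf{n}) \sim \frac{p}{p-1}$$

and

$$\mathbf{E} \tau_p(\mathbf{n}) \tau_p(\mathbf{n}+h) \sim \left(1 - \frac{1}{p^2}\right) \sigma_{-1,p}(h) \left(\frac{p}{p-1}\right)^2 = \frac{p+1}{p-1} \sigma_{-1,p}(h).$$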
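Unlike the global asymptotic, the local one converges quickly and is easy to test numerically. The following sketch compares the empirical local correlation over $1 \leq n \leq x$ with the closed form $\frac{p+1}{p-1} \sigma_{-1,p}(h)$; all function names are hypothetical, and $h$ is assumed to be a positive integer.

```python
from fractions import Fraction

def tau_p(n, p):
    """Local divisor function tau_p(n) = 1 + v_p(n), for a positive integer n."""
    count = 1
    while n % p == 0:
        n //= p
        count += 1
    return count

def sigma_minus_one_p(h, p):
    """Local factor sigma_{-1,p}(h) = sum of p^(-j) over j >= 0 with p^j | h (h > 0)."""
    total, power = Fraction(1), 1
    while h % (power * p) == 0:
        power *= p
        total += Fraction(1, power)
    return total

def empirical_local_correlation(x, h, p):
    """Mean of tau_p(n) * tau_p(n + h) for n uniform in 1..x."""
    return sum(tau_p(n, p) * tau_p(n + h, p) for n in range(1, x + 1)) / x

def predicted_local_correlation(h, p):
    """Right-hand side of the local asymptotic: (p+1)/(p-1) * sigma_{-1,p}(h)."""
    return Fraction(p + 1, p - 1) * sigma_minus_one_p(h, p)

x = 10**5
for p, h in [(2, 1), (2, 12), (3, 9), (5, 7)]:
    print(p, h, empirical_local_correlation(x, h, p),
          float(predicted_local_correlation(h, p)))
```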

From the Euler formula

$$\prod_p \left(1 - \frac{1}{p^2}\right) = \frac{1}{\zeta(2)} = \frac{6}{\pi^2}$$

we see that

$$\frac{6}{\pi^2} \sigma_{-1}(h) = \prod_p \left(1 - \frac{1}{p^2}\right) \sigma_{-1,p}(h)$$

and so one can “explain” the arithmetic factor $\frac{6}{\pi^2} \sigma_{-1}(h)$ in Ingham’s asymptotic as the product of the arithmetic factors $(1 - \frac{1}{p^2}) \sigma_{-1,p}(h)$ in the (much easier) local Ingham asymptotics. Unfortunately we have the usual “local-global” problem in that we do not know how to rigorously derive the global asymptotic from the local ones; this problem is essentially the same issue as the problem of controlling the minor arc contributions in the circle method, but phrased in “physical space” language rather than “frequency space”.

Remark 2 The relation between the local means $\sim \frac{p}{p-1}$ and the global mean $\sim \log x$ can also be seen heuristically through the application

$$\prod_{p \leq x^{1/e^\gamma}} \frac{p}{p-1} \sim \log x$$

of Mertens’ theorem, where $1/e^\gamma$ is Pólya’s magic exponent, which serves as a useful heuristic limiting threshold in situations where the product of local factors is divergent.
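The truncated product in this remark can be checked numerically. The sketch below multiplies the local means $\frac{p}{p-1}$ over primes up to $x^{1/e^\gamma}$ and compares the result with $\log x$; the helper names are hypothetical, and the Euler-Mascheroni constant is hard-coded as a floating point value.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

def primes_up_to(y):
    """Sieve of Eratosthenes: list of primes p <= y (assumes y >= 1)."""
    sieve = bytearray([1]) * (y + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(y) + 1):
        if sieve[i]:
            # cross out multiples of i starting from i*i
            sieve[i * i :: i] = bytearray(len(range(i * i, y + 1, i)))
    return [i for i in range(2, y + 1) if sieve[i]]

def truncated_local_product(x):
    """Product of p/(p-1) over primes p <= x^(1/e^gamma)."""
    y = int(x ** math.exp(-EULER_GAMMA))
    product = 1.0
    for p in primes_up_to(y):
        product *= p / (p - 1)
    return product

x = 10**10
print(truncated_local_product(x) / math.log(x))
```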

Let us now prove this proposition. One could brute-force the computations by observing that for any fixed $j$, the valuation $v_p(\mathbf{n})$ is equal to $j$ with probability $\sim \frac{p-1}{p} \frac{1}{p^j}$, and with a little more effort one can also compute the joint distribution of $v_p(\mathbf{n})$ and $v_p(\mathbf{n}+h)$, at which point the proposition reduces to the calculation of various variants of the geometric series. I however find it cleaner to proceed in a more recursive fashion (similar to how one can prove the geometric series formula by induction); this will also make visible the vague intuition mentioned previously about how common factors of $\mathbf{n}$ and $h$ force $\mathbf{n}+h$ to have a factor also.

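It is first convenient to get rid of error terms by observing that in the limit $x \to \infty$, the random variable $\mathbf{n} = \mathbf{n}_x$ converges vaguely to a uniform random variable $\mathbf{n}_\infty$ on the profinite integers $\hat{\mathbf{Z}}$, or more precisely that the pair $(v_p(\mathbf{n}_x), v_p(\mathbf{n}_x + h))$ converges vaguely to $(v_p(\mathbf{n}_\infty), v_p(\mathbf{n}_\infty + h))$. Because of this (and because of the easily verified uniform integrability properties of $\tau_p(\mathbf{n})$ and their powers), it suffices to establish the exact formulae

$$\mathbf{E} \tau_p(\mathbf{n}_\infty) = \frac{p}{p-1} \qquad (5)$$

$$\mathbf{E} \tau_p(\mathbf{n}_\infty) \tau_p(\mathbf{n}_\infty + h) = \left(1 - \frac{1}{p^2}\right) \sigma_{-1,p}(h) \left(\frac{p}{p-1}\right)^2 = \frac{p+1}{p-1} \sigma_{-1,p}(h) \qquad (6)$$

in the profinite setting (this setting will make it easier to set up the recursion).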

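We begin with (5). Observe that $\mathbf{n}_\infty$ is coprime to $p$ with probability $\frac{p-1}{p}$, in which case $\tau_p(\mathbf{n}_\infty)$ is equal to $1$. Conditioning to the complementary probability $\frac{1}{p}$ event that $\mathbf{n}_\infty$ is divisible by $p$, we can factor $\mathbf{n}_\infty = p \mathbf{n}'_\infty$ where $\mathbf{n}'_\infty$ is also uniformly distributed over the profinite integers, in which event we have $\tau_p(\mathbf{n}_\infty) = 1 + \tau_p(\mathbf{n}'_\infty)$. We arrive at the identity

$$\mathbf{E} \tau_p(\mathbf{n}_\infty) = \frac{p-1}{p} + \frac{1}{p} \left(1 + \mathbf{E} \tau_p(\mathbf{n}'_\infty)\right).$$

As $\mathbf{n}_\infty$ and $\mathbf{n}'_\infty$ have the same distribution, the quantities $\mathbf{E} \tau_p(\mathbf{n}_\infty)$ and $\mathbf{E} \tau_p(\mathbf{n}'_\infty)$ are equal, and (5) follows by a brief amount of high-school algebra.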

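We use a similar method to treat (6). First treat the case when $h$ is coprime to $p$. Then we see that with probability $\frac{p-2}{p}$, $\mathbf{n}_\infty$ and $\mathbf{n}_\infty + h$ are simultaneously coprime to $p$, in which case $\tau_p(\mathbf{n}_\infty) = \tau_p(\mathbf{n}_\infty + h) = 1$. Furthermore, with probability $\frac{1}{p}$, $\mathbf{n}_\infty$ is divisible by $p$ and $\mathbf{n}_\infty + h$ is not, in which case we can write $\mathbf{n}_\infty = p \mathbf{n}'_\infty$ as before, with $\tau_p(\mathbf{n}_\infty) = 1 + \tau_p(\mathbf{n}'_\infty)$ and $\tau_p(\mathbf{n}_\infty + h) = 1$. Finally, in the remaining event with probability $\frac{1}{p}$, $\mathbf{n}_\infty + h$ is divisible by $p$ and $\mathbf{n}_\infty$ is not; we can then write $\mathbf{n}_\infty + h = p \mathbf{n}'_\infty$, so that $\tau_p(\mathbf{n}_\infty + h) = 1 + \tau_p(\mathbf{n}'_\infty)$ and $\tau_p(\mathbf{n}_\infty) = 1$. Putting all this together, we obtain

$$\mathbf{E} \tau_p(\mathbf{n}_\infty) \tau_p(\mathbf{n}_\infty + h) = \frac{p-2}{p} + \frac{2}{p} \left(1 + \mathbf{E} \tau_p(\mathbf{n}'_\infty)\right)$$

and the claim (6) in this case follows from (5) and a brief computation (noting that $\sigma_{-1,p}(h) = 1$ in this case).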

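Now suppose that $h$ is divisible by $p$, thus $h = p h'$ for some integer $h'$. Then with probability $\frac{p-1}{p}$, $\mathbf{n}_\infty$ and $\mathbf{n}_\infty + h$ are simultaneously coprime to $p$, in which case $\tau_p(\mathbf{n}_\infty) = \tau_p(\mathbf{n}_\infty + h) = 1$. In the remaining probability $\frac{1}{p}$ event, we can write $\mathbf{n}_\infty = p \mathbf{n}'_\infty$, and then $\tau_p(\mathbf{n}_\infty) = 1 + \tau_p(\mathbf{n}'_\infty)$ and $\tau_p(\mathbf{n}_\infty + h) = 1 + \tau_p(\mathbf{n}'_\infty + h')$. Putting all this together we have

$$\mathbf{E} \tau_p(\mathbf{n}_\infty) \tau_p(\mathbf{n}_\infty + h) = \frac{p-1}{p} + \frac{1}{p} \mathbf{E} \left(1 + \tau_p(\mathbf{n}'_\infty)\right) \left(1 + \tau_p(\mathbf{n}'_\infty + h')\right)$$

which by (5) (and replacing $\mathbf{n}'_\infty$ by $\mathbf{n}_\infty$) leads to the recursive relation

$$\mathbf{E} \tau_p(\mathbf{n}_\infty) \tau_p(\mathbf{n}_\infty + h) = \frac{p+1}{p-1} + \frac{1}{p} \mathbf{E} \tau_p(\mathbf{n}_\infty) \tau_p(\mathbf{n}_\infty + h')$$

and (6) then follows by induction on the number of powers of $p$.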
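The induction can be mirrored in exact rational arithmetic. The sketch below (hypothetical function names, $h$ assumed to be a positive integer) evaluates the local correlation by following the recursion, with the coprime case as the base case, and checks that it agrees with the closed form $\frac{p+1}{p-1} \sigma_{-1,p}(h)$ from (6).

```python
from fractions import Fraction

def sigma_minus_one_p(h, p):
    """Local factor sigma_{-1,p}(h) = sum of p^(-j) over j >= 0 with p^j | h (h > 0)."""
    total, power = Fraction(1), 1
    while h % (power * p) == 0:
        power *= p
        total += Fraction(1, power)
    return total

def local_correlation_by_recursion(h, p):
    """E tau_p(n) tau_p(n + h) on the profinite integers, following the proof."""
    base = Fraction(p + 1, p - 1)
    if h % p != 0:
        # h coprime to p: the value is (p+1)/(p-1), since sigma_{-1,p}(h) = 1
        return base
    # h = p * h': apply the recursive relation with h' = h // p
    return base + local_correlation_by_recursion(h // p, p) / p

def local_correlation_closed_form(h, p):
    """Right-hand side of (6)."""
    return Fraction(p + 1, p - 1) * sigma_minus_one_p(h, p)

for p in (2, 3, 5, 7):
    for h in range(1, 500):
        assert local_correlation_by_recursion(h, p) == local_correlation_closed_form(h, p)
print(local_correlation_by_recursion(12, 2))
```

The agreement is a consequence of the identity $\sigma_{-1,p}(ph') = 1 + \frac{1}{p} \sigma_{-1,p}(h')$, which is the same recursion after dividing through by $\frac{p+1}{p-1}$.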

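The estimate (2) of Ingham was refined by Estermann, who obtained the more accurate expansion

$$\sum_{n \leq x} \tau(n) \tau(n+h) = \frac{6}{\pi^2} \sigma_{-1}(h) x \log^2 x + a_1(h) x \log x + a_2(h) x + O( x^{11/12+o(1)} ) \qquad (7)$$

for certain complicated but explicit coefficients $a_1(h), a_2(h)$. For instance, $a_1(h)$ is given by the formula

$$a_1(h) = \left(\frac{12}{\pi^2} (2\gamma - 1) + 4 a'\right) \sigma_{-1}(h) - \frac{24}{\pi^2} \sigma'_{-1}(h)$$

where $\gamma$ is the Euler-Mascheroni constant,

$$a' := - \sum_{r=1}^\infty \frac{\mu(r)}{r^2} \log r, \qquad (8)$$

and

$$\sigma'_{-1}(h) := \sum_{d|h} \frac{\log d}{d}.$$

The formula for $a_2(h)$ is similar but even more complicated. The error term $O( x^{11/12+o(1)} )$ was improved by Heath-Brown to $O( x^{5/6+o(1)} )$; it is conjectured (for instance by Conrey and Gonek) that one in fact has square root cancellation $O( x^{1/2+o(1)} )$ here, but this is well out of reach of current methods.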

These lower order terms are traditionally computed either from a Dirichlet series approach (using Perron’s formula) or a circle method approach. It turns out that a refinement of the above heuristics can also predict these lower order terms, thus keeping the calculation purely in physical space as opposed to the “multiplicative frequency space” of the Dirichlet series approach, or the “additive frequency space” of the circle method, although the computations are arguably as messy as the latter computations for the purposes of working out the lower order terms. We illustrate this just for the $a_1(h) x \log x$ term below the fold.
