
There has been a lot of recent interest in the abc conjecture, since the release a few weeks ago of the last of a series of papers by Shinichi Mochizuki which, as one of its major applications, claims to establish this conjecture. It is still far too early to judge whether this proof is likely to be correct or not (the argument encompasses several hundred pages, mostly in the area of anabelian geometry, in which very few mathematicians are expert, to the extent that we do not even have a full outline of the proof strategy yet), and I don't have anything substantial to add to the existing discussion around that conjecture. (But, for those who are interested, the Polymath wiki page on the ABC conjecture has collected most of the links to that discussion, and to various background materials.)

In the meantime, though, I thought I might give the standard probabilistic heuristic argument that explains why we expect the ABC conjecture to be true. The underlying heuristic is a common one, used throughout number theory, and it can be summarised as follows:

Heuristic 1 (Probabilistic heuristic) Even though number theory is a deterministic subject (one does not need to roll any dice to factorise a number, or figure out if a number is prime), one expects to get a good asymptotic prediction for the answers to many number-theoretic questions by pretending that various number-theoretic assertions ${E}$ (e.g. that a given number ${n}$ is prime) are probabilistic events (with a probability ${\mathop{\bf P}(E)}$ that can vary between ${0}$ and ${1}$) rather than deterministic events (that are either always true or always false). Furthermore:

• (Basic heuristic) If two or more of these heuristically probabilistic events have no obvious reason to be strongly correlated to each other, then we should expect them to behave as if they were (jointly) independent.
• (Advanced heuristic) If two or more of these heuristically probabilistic events have some obvious correlation between them, but no further correlations are suspected, then we should expect them to behave as if they were conditionally independent, relative to whatever data is causing the correlation.

This is, of course, an extremely vague and completely non-rigorous heuristic, requiring (among other things) a subjective and ad hoc determination of what an “obvious reason” is, but in practice it tends to give remarkably plausible predictions, some fraction of which can in fact be backed up by rigorous argument (although in many cases, the actual argument has almost nothing in common with the probabilistic heuristic). A famous special case of this heuristic is the Cramér random model for the primes, but this is not the only instance of this heuristic.

To give the most precise predictions, one should use the advanced heuristic in Heuristic 1, but this can be somewhat complicated to execute, and so we shall focus instead on the predictions given by the basic heuristic (thus ignoring the presence of some number-theoretic correlations), which tends to give predictions that are quantitatively inaccurate but still reasonably good at the qualitative level.

Here is a basic “corollary” of Heuristic 1:

Heuristic 2 (Heuristic Borel-Cantelli) Suppose one has a sequence ${E_1, E_2, \ldots}$ of number-theoretic statements, which we heuristically interpret as probabilistic events with probabilities ${\mathop{\bf P}(E_1), \mathop{\bf P}(E_2), \ldots}$. Suppose also that we know of no obvious reason for these events to have much of a correlation with each other. Then:

• If ${\sum_{i=1}^\infty \mathop{\bf P}(E_i) < \infty}$, we expect only finitely many of the statements ${E_n}$ to be true. (And if ${\sum_{i=1}^\infty \mathop{\bf P}(E_i)}$ is much smaller than ${1}$, we in fact expect none of the ${E_n}$ to be true.)
• If ${\sum_{i=1}^\infty \mathop{\bf P}(E_i) = \infty}$, we expect infinitely many of the statements ${E_n}$ to be true.

This heuristic is motivated both by the Borel-Cantelli lemma, and by the standard probabilistic computation that if one is given jointly independent, and genuinely probabilistic, events ${E_1, E_2, \ldots}$ with ${\sum_{i=1}^\infty \mathop{\bf P}(E_i) = \infty}$, then one almost surely has an infinite number of the ${E_i}$ occurring.
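As a quick numerical sanity check on this dichotomy, here is a minimal Python simulation (not from the original discussion; the probabilities ${1/i}$ and ${1/i^2}$ are just convenient illustrative choices of a divergent and a convergent series) which generates genuinely independent events and counts how many occur:

```python
import random

def average_count(probs, trials=200, seed=0):
    # Average, over independent trials, of the number of events E_i
    # that occur, where E_i has probability probs[i] and the events
    # are sampled jointly independently.
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        total += sum(rng.random() < p for p in probs)
    return total / trials

N = 10000
divergent = [1.0 / i for i in range(1, N + 1)]        # sum ~ log N + gamma
convergent = [1.0 / i ** 2 for i in range(1, N + 1)]  # sum ~ pi^2/6

print(average_count(divergent))   # grows like log N as N increases
print(average_count(convergent))  # stays bounded as N increases
```

As ${N}$ increases, the first count keeps growing (like ${\log N}$) while the second stays bounded near ${\pi^2/6}$, matching the two cases of the heuristic.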

Before we get to the ABC conjecture, let us give two simpler (and well known) demonstrations of these heuristics in action:

Example 1 (Twin prime conjecture) One can heuristically justify the twin prime conjecture as follows. Using the prime number theorem, one can heuristically assign a probability of ${1/\log n}$ to the event that any given large integer ${n}$ is prime. In particular, the probability that ${n+2}$ is prime will then be ${1/\log(n+2)}$. Making the assumption that there are no strong correlations between these events, we are led to the prediction that the probability that ${n}$ and ${n+2}$ are simultaneously prime is ${\frac{1}{(\log n)(\log(n+2))}}$. Since ${\sum_{n=2}^\infty \frac{1}{(\log n) (\log(n+2))} = \infty}$, the Borel-Cantelli heuristic then predicts that there should be infinitely many twin primes.

Note that the above argument is a bit too naive, because there are some non-trivial correlations between the primality of ${n}$ and the primality of ${n+2}$. Most obviously, if ${n}$ is prime, this greatly increases the probability that ${n}$ is odd, which implies that ${n+2}$ is odd, which then elevates the probability that ${n+2}$ is prime. A bit more subtly, if ${n}$ is prime, then ${n}$ is likely to avoid the residue class ${0 \hbox{ mod } 3}$, which means that ${n+2}$ avoids the residue class ${2 \hbox{ mod } 3}$, which ends up decreasing the probability that ${n+2}$ is prime. However, there is a standard way to correct for these local correlations; see for instance this previous blog post. As it turns out, these local correlations ultimately alter the prediction for the asymptotic density of twin primes by a constant factor (the twin prime constant), but do not affect the qualitative prediction of there being infinitely many twin primes.
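One can test this heuristic against actual prime counts. The following Python sketch (the cutoff ${10^5}$ is an arbitrary choice for illustration) sieves out the twin prime pairs up to ${N}$ and compares their number with the naive prediction ${\sum_n \frac{1}{(\log n)(\log(n+2))}}$; the ratio of the two quantities is expected to be close to the twin prime constant ${\approx 1.32}$ mentioned above:

```python
import math

def sieve(n):
    # Boolean primality table for 0..n via the sieve of Eratosthenes.
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    for p in range(2, int(n ** 0.5) + 1):
        if is_prime[p]:
            for m in range(p * p, n + 1, p):
                is_prime[m] = False
    return is_prime

N = 10 ** 5
is_prime = sieve(N + 2)

# Actual number of twin prime pairs (n, n+2) with n <= N.
actual = sum(1 for n in range(3, N + 1) if is_prime[n] and is_prime[n + 2])

# Naive (uncorrected) heuristic prediction.
naive = sum(1.0 / (math.log(n) * math.log(n + 2)) for n in range(3, N + 1))

print(actual, naive, actual / naive)
```

The discrepancy between the two counts is a constant factor, not a difference in growth rate, which is what the discussion above leads one to expect.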

Example 2 (Fermat’s last theorem) Let us now heuristically count the number of solutions to ${x^n+y^n=z^n}$ for various ${n}$ and natural numbers ${x,y,z}$ (which we can reduce to be coprime if desired). We recast this (in the spirit of the ABC conjecture) as ${a+b=c}$, where ${a,b,c}$ are ${n^{th}}$ powers. The number of ${n^{th}}$ powers up to any given number ${N}$ is about ${N^{1/n}}$, so heuristically any given natural number ${a}$ has a probability about ${a^{1/n - 1}}$ of being an ${n^{th}}$ power. If we make the naive assumption that (in the coprime case at least) there is no strong correlation between the events that ${a}$ is an ${n^{th}}$ power, that ${b}$ is an ${n^{th}}$ power, and that ${a+b}$ is an ${n^{th}}$ power, then for typical ${a,b}$, the probability that ${a,b,a+b}$ are all simultaneously ${n^{th}}$ powers would then be ${a^{1/n-1} b^{1/n-1} (a+b)^{1/n-1}}$. For fixed ${n}$, the total number of solutions to the Fermat equation would then be predicted to be

$\displaystyle \sum_{a=1}^\infty \sum_{b=1}^\infty a^{1/n-1} b^{1/n-1} (a+b)^{1/n-1}.$

(Strictly speaking, we need to restrict to the coprime case, but given that a positive density of pairs of integers are coprime, it should not affect the qualitative conclusion significantly if we now omit this restriction.) It might not be immediately obvious whether this sum converges or diverges, but (as is often the case with these sorts of unsigned sums) one can clarify the situation by dyadic decomposition. Suppose for instance that we consider the portion of the sum where ${c=a+b}$ lies between ${2^k}$ and ${2^{k+1}}$. Then this portion of the sum can be controlled by

$\displaystyle \sum_{a \leq 2^{k+1}} \sum_{b \leq 2^{k+1}} a^{1/n-1} b^{1/n-1} O( ( 2^k )^{1/n - 1} )$

which, since ${\sum_{a \leq N} a^{1/n-1} = O( N^{1/n} )}$, simplifies to

$\displaystyle O( 2^{(3/n - 1)k} ).$

Summing in ${k}$, one thus expects infinitely many solutions for ${n=2}$, only finitely many solutions for ${n>3}$ (indeed, a refinement of this argument shows that one expects only finitely many solutions even if one considers all ${n>3}$ at once), and a borderline prediction of there being a barely infinite number of solutions when ${n=3}$. This is of course a place where a naive application of the probabilistic heuristic breaks down; there is enough arithmetic structure in the equation ${x^3+y^3=z^3}$ that the naive probabilistic prediction ends up being an inaccurate model. Indeed, while this heuristic suggests that a typical homogeneous cubic should have a logarithmic number of integer solutions of a given height ${N}$, it turns out that some homogeneous cubics (namely, those associated to elliptic curves of positive rank) end up with the bulk of these solutions, while other homogeneous cubics (including those associated to elliptic curves of zero rank, including the Fermat curve ${x^3+y^3=z^3}$) only get finitely many solutions. The reasons for this are subtle, but certainly the high degree of arithmetic structure present in an elliptic curve (starting with the elliptic curve group law which allows one to generate new solutions from old ones, and which also can be used to exclude solutions to ${x^3+y^3=z^3}$ via the method of descent) is a major contributing factor.
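The dyadic analysis above is easy to check numerically. The following Python sketch (the function name and the choice of blocks ${k=8,9}$ are mine, for illustration) computes the portion of the heuristic sum with ${c = a+b}$ in a dyadic block ${[2^k, 2^{k+1})}$, and compares the ratio of consecutive blocks with the predicted factor ${2^{3/n-1}}$:

```python
def dyadic_block(n, k):
    # Portion of the heuristic sum
    #   sum_{a,b} a^(1/n-1) b^(1/n-1) (a+b)^(1/n-1)
    # restricted to c = a + b lying in [2^k, 2^(k+1)).
    e = 1.0 / n - 1.0
    total = 0.0
    for c in range(2 ** k, 2 ** (k + 1)):
        ce = c ** e
        total += ce * sum((a ** e) * ((c - a) ** e) for a in range(1, c))
    return total

for n in (2, 3, 4):
    ratio = dyadic_block(n, 9) / dyadic_block(n, 8)
    # The heuristic predicts consecutive blocks to differ by a factor of
    # roughly 2^(3/n - 1): growing for n = 2 (divergence), flat for n = 3
    # (borderline), shrinking for n = 4 (convergence).
    print(n, ratio, 2 ** (3.0 / n - 1.0))
```

The observed ratios exhibit the trichotomy described above: geometric growth of the blocks for ${n=2}$, roughly constant blocks for ${n=3}$, and geometric decay for ${n=4}$.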

Below the fold, we apply similar heuristics to suggest the truth of the ABC conjecture.