There has been a lot of recent interest in the abc conjecture, since the release a few weeks ago of the last of a series of papers by Shinichi Mochizuki which, as one of its major applications, claims to establish this conjecture. It’s still far too early to judge whether this proof is likely to be correct or not (the entire argument encompasses several hundred pages of argument, mostly in the area of anabelian geometry, which very few mathematicians are expert in, to the extent that we still do not even have a full outline of the proof strategy yet), and I don’t have anything substantial to add to the existing discussion around that conjecture. (But, for those that are interested, the Polymath wiki page on the ABC conjecture has collected most of the links to that discussion, and to various background materials.)

In the meantime, though, I thought I might give the standard probabilistic heuristic argument that explains why we expect the ABC conjecture to be true. The underlying heuristic is a common one, used throughout number theory, and it can be summarised as follows:

Heuristic 1 (Probabilistic heuristic)Even though number theory is a deterministic subject (one does not need to roll any dice to factorise a number, or figure out if a number is prime), one expects to get a good asymptotic prediction for the answers to many number-theoretic questions by pretending that various number-theoretic assertions (e.g. that a given number is prime) are probabilistic events (with a probability that can vary between and ) rather than deterministic events (that are either always true or always false). Furthermore:

- (Basic heuristic) If two or more of these heuristically probabilistic events have no obvious reason to be strongly correlated to each other, then we should expect them to behave as if they were (jointly) independent.
- (Advanced heuristic) If two or more of these heuristically probabilistic events have
someobvious correlation between them, but no further correlations are suspected, then we should expect them to behave as if they wereconditionallyindependent, relative to whatever data is causing the correlation.

This is, of course, an extremely vague and completely non-rigorous heuristic, requiring (among other things) a subjective and *ad hoc* determination of what an “obvious reason” is, but in practice it tends to give remarkably plausible predictions, some fraction of which can in fact be backed up by rigorous argument (although in many cases, the actual argument has almost nothing in common with the probabilistic heuristic). A famous special case of this heuristic is the Cramér random model for the primes, but this is not the only such instance for that heuristic.

To give the most precise predictions, one should use the advanced heuristic in Heuristic 1, but this can be somewhat complicated to execute, and so we shall focus instead on the predictions given by the basic heuristic (thus ignoring the presence of some number-theoretic correlations), which tends to give predictions that are quantitatively inaccurate but still reasonably good at the qualitative level.

Here is a basic “corollary” of Heuristic 1:

Heuristic 2 (Heuristic Borel-Cantelli)Suppose one has a sequence of number-theoretic statements, which we heuristically interpet as probabilistic events with probabilities . Suppose also that we know of no obvious reason for these events to have much of a correlation with each other. Then:

- If , we expect only finitely many of the statements to be true. (And if is much smaller than , we in fact expect none of the to be true.)
- If , we expect infinitely many of the statements to be true.

This heuristic is motivated both by the Borel-Cantelli lemma, and by the standard probabilistic computation that if one is given jointly independent, and genuinely probabilistic, events with , then one almost surely has an infinite number of the occuring.

Before we get to the ABC conjecture, let us give two simpler (and well known) demonstrations of these heuristics in action:

Example 1 (Twin prime conjecture)One can heuristically justify the twin prime conjecture as follows. Using the prime number theorem, one can heuristically assign a probability of to the event that any given large integer is prime. In particular, the probability that is prime will then be . Making the assumption that there are no strong correlations between these events, we are led to the prediction that the probability that and aresimultaneouslyprime is . Since , the Borel-Cantelli heuristic then predicts that there should be infinitely many twin primes.Note that the above argument is a bit too naive, because there are some non-trivial correlations between the primality of and the primality of . Most obviously, if is prime, this greatly increases the probability that is odd, which implies that is odd, which then elevates the probability that is prime. A bit more subtly, if is prime, then is likely to avoid the residue class , which means that avoids the residue class , which ends up decreasing the probability that is prime. However, there is a standard way to correct for these local correlations; see for instance in this previous blog post. As it turns out, these local correlations ultimately alter the prediction for the asymptotic density of twin primes by a constant factor (the twin prime constant), but do not affect the qualitative prediction of there being infinitely many twin primes.

Example 2 (Fermat’s last theorem)Let us now heuristically count the number of solutions to for various and natural numbers (which we can reduce to be coprime if desired). We recast this (in the spirit of the ABC conjecture) as , where are powers. The number of powers up to any given number is about , so heuristically any given natural number has a probability about of being an power. If we make the naive assumption that (in the coprime case at least) there is no strong correlation between the events that is an power, is an power, and being an power, then for typical , the probability that are all simultaneously powers would then be . For fixed , the total number of solutions to the Fermat equation would then be predicted to be(Strictly speaking, we need to restrict to the coprime case, but given that a positive density of pairs of integers are coprime, it should not affect the qualitative conclusion significantly if we now omit this restriction.) It might not be immediately obvious as to whether this sum converges or diverges, but (as is often the case with these sorts of unsigned sums) one can clarify the situation by dyadic decomposition. Suppose for instance that we consider the portion of the sum where lies between and . Then this portion of the sum can be controlled by

which simplifies to

Summing in , one thus expects infinitely many solutions for , only finitely many solutions for (indeed, a refinement of this argument shows that one expects only finitely many solutions even if one considers all at once), and a borderline prediction of there being a barely infinite number of solutions when . Here is of course a place where a naive application of the probabilistic heuristic breaks down; there is enough arithmetic structure in the equation that the naive probabilistic prediction ends up being an inaccurate model. Indeed, while this heuristic suggests that a typical homogeneous cubic should have a logarithmic number of integer solutions of a given height , it turns out that some homogeneous cubics (namely, those associated to elliptic curves of positive rank) end up with the bulk of these solutions, while other homogeneous cubics (including those associated to elliptic curves of zero rank, including the Fermat curve ) only get finitely many solutions. The reasons for this are subtle, but certainly the high degree of arithmetic structure present in an elliptic curve (starting with the elliptic curve group law which allows one to generate new solutions from old ones, and which also can be used to exclude solutions to via the method of descent) is a major contributing factor.

Below the fold, we apply similar heuristics to suggest the truth of the ABC conjecture.

** — 1. The ABC conjecture — **

The ABC conjecture asserts that for every , one has

whenever are coprime natural numbers, and is the product of all the primes dividing . Since are coprime, , and an equivalent formulation of the conjecture is as follows: if are real numbers with , then for sufficiently large , there does not exist any solution to the equation with (say), coprime, and

Indeed, one can deduce the former version of the ABC conjecture from the latter by using a finite mesh of triples with spacing equal to small multiple of ; we leave the details to the interested reader.

We can now try to randomly find counterexamples to the abc conjecture as follows:

- Pick a large (say, a power of two).
- Pick coprime squarefree numbers , , .
- Pick numbers with with comparable to .
- Check if .

Steps 1, 2, and 4 are easy to apply probabilistic heuristics to. For Step 3, we need the following lemma, reminiscent of the classical divisor bound:

Lemma 3Let be a square-free integer. Then there are at most integers less than with radical .

*Proof:* Factorise . We are trying to establish the bound , where is the number of numbers less than which are divisible by but by no other primes, that is to say they lie in the set . To do this, it is convenient to work with Dirichlet series, and specifically with the sum

for some parameter to be optimised in later. On the one hand, this expression is at least . On the other hand, we have the factorisation

We crudely bound

for some depending only on , using the trivial bound and the geometric series formula, leading to

On the other hand, since

we see from the prime theorem (which gives a lower bound on the sum of the logarithms of the first primes, which in turn lower bounds ) that

and so

for any fixed ; as can be arbitrarily small, the claim follows.

Now we can run the probabilistic heuristics. After selecting as a large power of two in Step 1, we have choices for in Step 2. By the above lemma, for each , there are choices for that are of magnitude , and assuming (in the case when are coprime) that there is no prior correlation between and , which we think of as being randomly distributed amongst the numbers of size , the probability that should be of order . This leads to a total probability of ; since , this sum is convergent (for ranging over powers of ), so we expect only finitely many counterexamples, thus supporting the ABC conjecture.

Remark 1In the case of , the naive heuristic prediction was incorrect, basically because of the algebraic structure in the Fermat curve (which, among other things, is an elliptic curve and thus enjoys a group law). In the case of the general equation, though, no analogous algebraic structure appears to be present. So there is no obvious correlation here. (Of course, this does not rule out the possibility of a much less obvious correlation; this is one of the reasons why the above arguments are only heuristic, and fall well short of a rigorous proof of anything.)

## 48 comments

Comments feed for this article

18 September, 2012 at 5:01 pm

The probabilistic heuristic justification of the ABC conjecture « Guzman's Mathematics Weblog[…] The probabilistic heuristic justification of the ABC conjecture. Share this:FacebookTwitterRedditStumbleUponTumblrDiggLinkedInPinterestEmailPrintLike this:LikeBe the first to like this. […]

18 September, 2012 at 5:51 pm

Will SawinThe equation x^3=y^3+z^3 is an elliptic curve, as are slight perturbations of it. Some of these curves have finitely many rational points and some have infinitely many. Can we build a probabilistic heuristic that suggests this ambiguity?

18 September, 2012 at 6:11 pm

Terence TaoThis is a very delicate question (essentially that of understanding the rank of a typical elliptic curve), as the objects involved have a very rich algebraic structure, and the crude analytic heuristics given here only give a zeroth order approximation at best; also, deep conjectures such as BSD become relevant. There are a number of surveys on this topic, see e.g. this one by Rubin and Silverberg, or this one by Bektemirov, Mazur, Stein and Watkins.

ADDED LATER: The crude heuristics mentioned in the post, though, do support the fact that a “typical” cubic curves should a logarithmic number of integer points at a given height, or equivalently that a “typical” elliptic curves have a logarithmic number of rational points at a given height. This is consistent with the algebraic heuristics that suggest that the rank of a typical elliptic curve is typically zero or one.

18 September, 2012 at 6:33 pm

Will SawinWell I guess one can say that the group law induces an obvious correlation between the existence of rational points in different regions. Similarly, in the other case where the “actual” algebraic results give (much easier to resolve) ambiguity, which is affine conics, there is also a group law to provide correlation. This “explains” the failure of crude heuristics.

19 September, 2012 at 5:01 am

Wednesday Highlights | Pseudo-Polymath[…] Probability and number theory. […]

19 September, 2012 at 5:01 am

Stones Cry Out - If they keep silent… » Things Heard: e238v3[…] Probability and number theory. […]

19 September, 2012 at 10:31 am

Tao on the ABC Conjecture « Pink Iguana[…] What’s New, The probabilistic heuristic justification of the ABC conjecture, here. There has been a lot of recent interest in the abc conjecture, since the release a few weeks ago […]

19 September, 2012 at 5:36 pm

The abc conjecture and its consequences « Fight with Infinity[…] T.Tao The probabilistic heuristic justification of the ABC conjecture […]

20 September, 2012 at 11:42 pm

William SchlieperI think that there’s a slight error in the Fermat’s Last Theorem derivation for n=3, as the elements of the ring have only two degrees of freedom, not three, as

21 September, 2012 at 8:14 am

Terence TaoUgh, you’re right, that argument is in fact incorrect (even when one takes the factorisation in into account, the probabilistic heuristic still suggests a logarithmic number of solutions). It appears that it is instead the group law for elliptic curves (

~~and for singular cubic curves~~such as ) which is the additional structure that is defeating the probabilistic heuristic (cf. Euler’s proof by descent of FLT for n=3, which can be interpreted in terms of this group law).21 September, 2012 at 6:05 pm

Joe SilvermanAre you using projective coordinates here? If so, then isn’t singular, it’s an elliptic curve. On a singular cubic, the non-singular points form a multiplicative group, so the number of rational points of bounded height grows even more rapidly than on elliptic curves. So it’s not just having a group law, but the kind of group, that determines how badly the heuristic fails. (Sorry for the quibble, this was a very nice post on why ABC should be true.)

[Gah, that was careless of me. Corrected now, thanks – T.]21 September, 2012 at 8:09 am

NonameIf I apply the argument of Example 1 to the case when n, n+2, n+4 are all primes, do I also get the conclusion that there are infinitely many of them? But for any n, at least one of these three numbers must be 0 mod 3, do we get a contradiction?

21 September, 2012 at 8:16 am

Terence TaoYes, this is a failure of the naive Cramer heuristic in which there is no correlation between the primality of numbers such as n, n+2, and n+4, whereas in fact as you point out there are some correlations, in this case coming from the residue classes modulo 3. There is a refined heuristic to take care of this, which is discussed further in a previous blog post ( https://terrytao.wordpress.com/2008/01/07/ams-lecture-structure-and-randomness-in-the-prime-numbers/ ), which was also mentioned at the end of Example 1.

22 September, 2012 at 4:40 am

Américo TavaresCould you please inform how does one read the symbol that appears in the conjecture? I think I understand its meaning but I don’t know

how to read it, even in my native language.

22 September, 2012 at 1:24 pm

RubenMuch Smaller <<

22 September, 2012 at 1:33 pm

Américo TavaresThanks, but how do you read the dependence on ?

23 September, 2012 at 12:51 pm

Américo TavaresI had posted essentially this question on Mathematics Stack Exchange How does one read aloud Vinogradov’s notation and ?

22 September, 2012 at 10:28 am

Envy number theorists! « Noncommutative Analysis[…] learned from some blogs (here, here and here) that the ABC-conjecture might be solved, and also (by extrapolation) that this is a very […]

22 September, 2012 at 10:25 pm

AnonymousTao,

a bit misplaced, I’m sorry, but this may be of interest to you:

Click to access 1208.2473.pdf

Best,

Markus

23 September, 2012 at 2:15 am

DuvallMarkus,

the professor who posed this paper on arxiv also claims to have solved 3 Millenium problems on his web-page. I think the situation is clear now and does not require Terry’s attention.

23 September, 2012 at 10:52 am

Anonymousany given natural number {a} has a probability about {a^{1/n – 1}} of being an {n^{th}} power

—- any natural number no larger than {a} has a probability about {a^{1/n – 1}} of being an {n^{th}} power

23 September, 2012 at 10:54 am

Anonymousany given natural number {a} has a probability about {a^{1/n – 1}} of being an {n^{th}} power

— any given natural number no larger than {a} has a probability about {a^{1/n – 1}} of being an {n^{th}} power

23 September, 2012 at 11:20 am

Terence TaoWell, yes, if you want to stay rigorous and treat the property of being an n^th power as a deterministic event, one has to perform an additional averaging such as the one you suggest. But for the purposes of applying the probabilistic heuristic, it is often simpler to pretend to work with a probabilistic model, where the property of being an n^th power has been replaced by a probabilistic event with a probability consistent with these averaged statistics; see Heuristic 1. While this model is of course inaccurate at the foundational level, the point of the heuristic is that it can still be remarkably accurate at the predictive level, particularly with regards to asymptotic statistics.

23 September, 2012 at 5:03 pm

AnonymousP(for a given n-th power N, a number no larger than N is n-th power)

P(for a given n-th power N, there is a number

asmaller than N is n-th power and N-a is also a N-th power)sum (P(a is n-th power)P(N-a is n-th power)=sum(N^(1/n-1)*N^(1/n-1)=sum(1/(k^(2n-2)) (with N=k^n) so the sum is always smaller than 1 when sum from k=2 and on for any n. When n=2, it is meaningless as we know there are lots of solutions.

This is originated from that P(for a given n-th power N, there is a number

asmaller than N is n-th power and N-a is also a N-th power) is not the same as P(a is n-th power)P(N-a is n-th power)23 September, 2012 at 4:16 pm

Weekly links for September 23 « God plays dice[…] Terry Tao has written a probablistic heuristic justification of the abc conjecture. […]

23 September, 2012 at 10:28 pm

lkafleReblogged this on lava kafle kathmandu nepal.

5 October, 2012 at 3:43 am

Jeff EzearnOne particular intriguing situation that sets FLT for n=3 (as in the pythagorean case n=2) from the higher cases is an old result of Perrin from the late 19th century, that a single solution to that (cubic) fermat curve implies an infinity of other (coprime) solutions.

This reinforces the reason why the heuristic must “turn in” an infinite number of solutions for the cubic case, since (implicitly) it assumes that there’s at least one solution (i.e. some a, b and a+b being simultaneously cubic integers). In that respect (juxtaposed with Perrin’s result), the heuristic is accurate!

5 October, 2012 at 7:08 am

Jeff EzearnAargh, on coming back to read I just realised that I should have said

“… (implicitly) assumes that there’s at least one PROBABLE solution…”

I had written in haste and perhaps I may give further thoughts. One may want to contrast and compare the heuristic for FLT n=3 and n>3 with Perrin’s and Falting’s theorem. i.e.

Theorem (Perrin): If Fermat curve for n=3 has one solution, then there are infinitely many (coprime) solutions

Theorem (after Falting’s): If Fermat curve for n>3 has at least one solution, then there can only be finitely many solutions.

Under these two DEFINITE results, an “accurate” heuristic should run as:

Heuristic (FLT n=3): If FLT n=3, has at least one PROBABLE counterexample (say some a, b and a+b being simultaneously cubic integers with an assumed non-zero probability), then there must be PROBABLY infinitely many other counterexamples.

Heuristic (FLT n>3): If FLT n>3 has at least one PROBABLE counterexample, then there can be PROBABLY only finitely many other counter

7 October, 2012 at 1:30 am

News on the ABC conjecture | MathBlog[…] that I have found two blog posts discussing it, one from Terrence Tao the other from the blog Quomodocumque, and just mentioning Terrence Tao, always seems to put some […]

12 October, 2012 at 7:57 am

MartinNew amazing prime numbers spiral: http://www.youtube.com/watch?v=Dfar_AZub4Y

12 October, 2012 at 3:16 pm

James TomsonBeal Fermat and Pythagora’s Triplets http://www.coolissues.com/mathematics/BealFermatPythagorasTriplets.htm

14 October, 2012 at 5:43 pm

The Chowla conjecture and the Sarnak conjecture « What’s new[…] the amount that an analogous random sum would provide; cf. the probabilistic heuristic discussed in this recent blog post. There are a number of ways to make this heuristic precise. One of these is the following old […]

24 February, 2013 at 2:07 pm

John SalazarWhat’s the current status of Mochizuki’s proof? Does it look like it will fly or have gaps been found that cannot be patched?

12 June, 2013 at 6:08 am

AnonymousFor a list of all known abc triples with quality > 1.4 see:

http://www.math.leidenuniv.nl/~desmit/abc/index.php?set=2

23 April, 2015 at 8:56 am

observerProgress: https://www.maths.nottingham.ac.uk/personal/ibf/symcor.conf.html. What would be the consequences of a person of his pedigree to be wrong?

7 September, 2015 at 1:34 pm

Optimus PrimeWould be interested to see a Terry Tao update post on this theory given this post is from 2012 and the theory is still rumbling on.

29 November, 2015 at 11:21 am

ABC conjecture | fragments[…] A probabilistic heuristic justification for the ABC conjecture can be found at this blog post. […]

8 March, 2016 at 3:25 am

AnonymousIs there any classification of Diophantine equations which (at least in principle) can be solved by an effective version of the ABC conjecture?

16 February, 2017 at 10:50 am

AnonymousOn the Search for a Valid Proof of the famous and important abc Conjecture, please refer to the following link for details:

https://www.researchgate.net/post/What_is_the_proof_of_the_ABC_Conjecture

9 May, 2017 at 7:20 pm

AnonymousIt is possible to formulate the abc conjecture in terms of c alone by defining the following function of c

Where the minimum is over all coprime pairs (a,b) satisfying

Clearly, is well-defined for each .

The abc conjecture is equivalent to the claim that grows faster than any fractional power of . This claim may be refined by making more refined lower bound estimates on the growth rate of .

19 June, 2017 at 3:35 am

AnonymousFor a new approach (based on the theory of Shimura curves) to the abc conjecture and related problems, see Pasten’s paper

Click to access 1705.09251.pdf

9 October, 2018 at 3:01 pm

David ColeFYI: “Searching for a proof of the ABC-conjecture”,

https://www.math10.com/forum/viewtopic.php?f=63&t=1793

15 December, 2020 at 5:23 am

JF22I came across this:

Click to access Explicit%20estimates%20in%20IUTeich.pdf

What’s the thinking about that among the specialists? (especially about the bit on FLT). Google yielded surprisingly little.

11 October, 2021 at 9:19 am

254A, Supplement 4: Probabilistic models and heuristics for the primes (optional) | What's new[…] spheres of application; they are also broadly compatible with the general heuristic (discussed in this previous post) that in the absence of any exploitable structure, asymptotic statistics should default to the most […]

15 January, 2022 at 2:51 am

Luis MartinezI inform you of what appears to be the solution of the constant called the constant of proportionality for the abc conjecture. Once verified, we see that with the defined value the statement of the abc conjecture is always holds .

https://doi.org/10.9734/bpi/ctmcs/v2/9639D

30 December, 2022 at 10:42 am

paulfchristianoYou’ve described the probabilistic heuristic informally, but it seems plausible that there is a formal version. We’ve written an article fleshing out that hope, heavily citing this blog post: https://arxiv.org/abs/2211.06738

If you have time to take a look (or have reactions to the general idea) we’d be very interested in your thoughts.

We are looking for an efficient program E that takes as input a mathematical quantity X and a set of arguments, and then outputs a heuristic estimate E[X] based on those arguments. We don’t expect to define “no obvious reason to be correlated,” instead we just want to make an estimate based on whatever arguments we’ve noticed so far while remaining open to the possibility that our estimate will change later.

It’s hard to precisely define the desiderata for E, but we can evaluate a proposal by seeing how it behaves in concrete cases. For example, E should accept a suitable formalization of the three heuristic arguments given in this post, and should output the expected estimates for the truth of the conjectures and density of solutions. Moreover, those estimates should be pretty stable, even if an adversary is allowed to try to write down arbitrary spurious arguments and give them to E alongside the valid argument.

My personal sense is that this reasoning is systematic enough that it’s fairly likely to have a formalization. And if we did find a formalization, I think it would both be mathematically interesting and have potentially important applications in machine learning.

31 December, 2022 at 7:42 pm

Terence TaoAn interesting direction of inquiry! My guess is that one may have to weaken the desiderata stated in order to actually obtain a program E that resembles human heuristic argument. For instance, the rearrangement invariance/repetition seems highly rational, but actual heuristic reasoning is, I believe “non-commutative” in the sense that the order in which the arguments are presented makes a difference, and repeating an argument may end up increasing its “weight” in the final heuristic. It may end up being more natural to consider non-commutative heuristic models even if they seem irrational (and could be symmetrized by averaging over all permutations); in particular the “respect for proofs” axiom may hinge on where in the ordering the heuristic argument appears in.

2 January, 2023 at 12:19 pm

paulfchristianoInvariance to repetition (and respect for proofs no matter where the proof appears) do seem like very restrictive properties, but unfortunately I suspect they are needed for the ML application. Two reasons for why I’m tentatively optimistic about achieving this even though it’s very strong:

* We don’t yet have any examples of a pair of “incompatible” heuristic arguments where the final estimate depends on what order we consider the arguments or how we weigh them. In every case we’ve found so far, we can compute a combined estimate that incorporates the considerations raised by both arguments. The algorithms for this are not always trivial, e.g. see appendix A.2.3. If any readers can think of candidate examples where merging two arguments is hard (or fundamentally order-dependent) we are very interested in those.

* In cases like the ones in the OP, the heuristic estimates look stable—I suspect there’s a standard of argument validity such that there are no valid arguments that would change our estimate for the asymptotic density of twin primes (and such that if we *did* find a valid argument then it would represent real structure we’d overlooked that should call into question our current estimate for the twin prime density). That kind of stable equilibrium seems order-dependent, so I think there’s at least something order-independent to formalize.