I’ve been focusing my blog posts recently on the mathematics around Hilbert’s fifth problem (is every locally Euclidean group a Lie group?), but today, I will be discussing another of Hilbert’s problems, namely Hilbert’s seventh problem, on the transcendence of powers of two algebraic numbers . (I am not randomly going through Hilbert’s list, by the way; I hope to explain my interest in the seventh problem in a later post.) This problem was famously solved by Gelfond and Schneider in the 1930s:

Theorem 1 (Gelfond-Schneider theorem)Let be algebraic numbers, with and irrational. Then (any of the values of the possibly multi-valued expression) is transcendental.

For sake of simplifying the discussion, let us focus on just one specific consequence of this theorem:

Corollary 2is transcendental.

*Proof:* If not, one could obtain a contradiction to the Gelfond-Schneider theorem by setting and . (Note that is clearly irrational, since for any integers with positive.)

In the 1960s, Alan Baker established a major generalisation of the Gelfond-Schneider theorem known as Baker’s theorem, as part of his work in transcendence theory that later earned him a Fields Medal. Among other things, this theorem provided explicit quantitative bounds on exactly *how* transcendental quantities such as were. In particular, it gave a strong bound on how *irrational* such quantities were (i.e. how easily they were approximable by rationals). Here, in particular, is one special case of Baker’s theorem:

Proposition 3 (Special case of Baker’s theorem)For any integers with positive, one hasfor some absolute (and effectively computable) constants .

This theorem may be compared with (the easily proved) Liouville’s theorem on diophantine approximation, which asserts that if is an irrational algebraic number of degree , then

for all with positive, and some effectively computable , and (the more significantly difficult) Thue-Siegel-Roth theorem, which under the same hypotheses gives the bound

for all , all with positive and an *ineffective* constant . Finally, one should compare these results against Dirichlet’s theorem on Diophantine approximation, which asserts that for any real number one has

for infinitely many with positive. (The reason the Thue-Siegel-Roth theorem is ineffective is because it relies heavily on the dueling conspiracies argument, i.e. playing off multiple “conspiracies” against each other; the other results however only focus on one approximation at a time and thus avoid ineffectivity.)

Proposition 3 easily implies the following separation property between powers of and powers of :

Corollary 4 (Separation between powers of and powers of )For any positive integers one hasfor some effectively computable constants (which may be slightly different from those in Proposition 3).

Indeed, this follows quickly from Proposition 3, the identity

and some elementary estimates.

In particular, the gap between powers of three and powers of two grows exponentially in the exponents . I do not know of any other way to establish this fact other than essentially going through some version of Baker’s argument (which will be given below).

For comparison, by exploiting the trivial (yet fundamental) *integrality gap* – the obvious fact that if an integer is non-zero, then its magnitude is at least – we have the trivial bound

for all positive integers (since, from the fundamental theorem of arithmetic, cannot vanish). Putting this into (1) we obtain a very weak version of Proposition 3, that only gives exponential bounds instead of polynomial ones:

Proposition 5 (Trivial bound)For any integers with positive, one hasfor some absolute (and effectively computable) constant .

The proof of Baker’s theorem (or even of the simpler special case in Proposition 3) is largely elementary (except for some very basic complex analysis), but is quite intricate and lengthy, as a lot of careful book-keeping is necessary in order to get a bound as strong as that in Proposition 3. To illustrate the main ideas, I will prove a bound that is weaker than Proposition 3, but still significantly stronger than Proposition 5, and whose proof already captures many of the key ideas of Baker:

Proposition 6 (Weak special case of Baker’s theorem)For any integers with , one hasfor some absolute constants .

Note that Proposition 3 is equivalent to the assertion that one can take (and effective) in the above proposition.

The proof of Proposition 6 can be made effective (for instance, it is not too difficult to make the close to ); however, in order to simplify the exposition (and in particular, to use some nonstandard analysis terminology to reduce the epsilon management), I will establish Proposition 6 with ineffective constants .

Like many other results in transcendence theory, the proof of Baker’s theorem (and Proposition 6) rely on what we would nowadays call the *polynomial method* – to play off upper and lower bounds on the complexity of polynomials that vanish (or nearly vanish) to high order on a specified set of points. (I have discussed the polynomial method in relation to problems in incidence geometry in several previous blog posts.) In the specific case of Proposition 6, the points in question are of the form

for some large integer . On the one hand, the irrationality of ensures that the curve

is not algebraic, and so it is difficult for a polynomial of controlled complexity to vanish (or nearly vanish) to high order at all the points of ; the trivial bound in Proposition 5 allows one to make this statement more precise. (Here, “complexity” of a polynomial is an informal term referring both to the degree of the polynomial, and the height of the coefficients, which in our application will essentially be integers up to some normalisation factors.) On the other hand, if Proposition 6 failed, then is close to a rational, which by Taylor expansion makes close to an algebraic curve over the rationals (up to some rescaling by factors such as and ) at each point of . This, together with a pigeonholing argument, allows one to find a polynomial of reasonably controlled complexity to (nearly) vanish to high order at every point of .

These observations, by themselves, are not sufficient to get beyond the trivial bound in Proposition 5. However, Baker’s key insight was to exploit the integrality gap to bootstrap the (near) vanishing of on a set to imply near-vanishing of on a larger set with . The point is that if a polynomial of controlled degree and size (nearly) vanishes to higher order on a lot of points on an analytic curve such as , then it will also be fairly small on many other points in as well. (To quantify this statement efficiently, it is convenient to use the tools of complex analysis, which are particularly well suited to understand zeroes (or small values) of polynomials.) But then, thanks to the integrality gap (and the controlled complexity of ), we can amplify “fairly small” to “very small”.

Using this observation and an iteration argument, Baker was able to take a polynomial of controlled complexity that nearly vanished to high order on a relatively small set , and bootstrap that to show near-vanishing on a much larger set . This bootstrap allows one to dramatically bridge the gap between the upper and lower bounds on the complexity of polynomials that nearly vanish to a specified order on a given , and eventually leads to Proposition 6 (and, with much more care and effort, to Proposition 3).

Below the fold, I give the details of this argument. My treatment here is inspired by this expose of Serre, and these lecture notes of Soundararajan (as transcribed by Ian Petrow).

** — 1. Nonstandard formulation — **

The proof of Baker’s theorem requires a lot of “epsilon management” in that one has to carefully choose a lot of parameters such as and in order to make the argument work properly. This is particularly the case if one wants a good value of exponents in the final result, such as the quantity in Proposition 6. To simplify matters, we will abandon all attempts to get good values of constants anywhere, which allows one to retreat to the nonstandard analysis setting where the notation is much cleaner, and much (though not all) of the epsilon management is eliminated. This is a relatively mild use of nonstandard analysis, though, and it is not difficult to turn all the arguments below into standard effective arguments (but at the cost of explicitly tracking all the constants ). See for instance the notes of Soundararajan for such an effective treatment.

We turn to the details. We will assume some basic familiarity with nonstandard analysis, as covered for instance in this previous blog post (but one should be able to follow this argument using only non-rigorous intuition of what terms such as “unbounded” or “infinitesimal” mean).

Let be an unbounded (nonstandard) positive real number. Relative to this , we can define various notions of size:

- A nonstandard number is said to be
*of polynomial size*if one has for some standard . - A nonstandard number is said to be
*of polylogarithmic size*if one has for some standard . - A nonstandard number is said to be
*of quasipolynomial size*if one has for some standard . - A nonstandard number is said to be
*quasiexponentially small*if one has for every standard . - Given two nonstandard numbers with non-negative, we write or if for some standard . We write or if we have for all standard .

As a general rule of thumb, in our analysis all exponents will be of polylogarithmic size, all coefficients will be of quasipolynomial size, and all error terms will be quasiexponentially small.

In this nonstandard analysis setting, there is a clean calculus (analogous to the calculus of the asymptotic notations and ) to manipulate these sorts of quantities without having to explicitly track the constants . For instance:

- The sum, product, or difference of two quantities of a given size (polynomial, polylogarithmic, quasipolynomial, or quasiexponentially small) remains of that given size (i.e. each size range forms a ring).
- If , and is of a given size, then is also of that size.
- If is of quasipolynomial size and is of polylogarithmic size, then is of quasipolynomial size, and (if is a natural number) is also of quasipolynomial size.
- If is quasiexponentially small, and is of quasipolynomial size, then is also quasiexponentially small. (Thus, the quasiexponentially small numbers form an ideal in the ring of quasipolynomial numbers.)
- Any quantity of polylogarithmic size, is of polynomial size; and any quantity of polynomial size, is of quasipolynomial size.

We will refer to these sorts of facts as *asymptotic calculus*, and rely upon them heavily to simplify a lot of computations (particularly regarding error terms).

Proposition 6 is then equivalent to the following assertion:

Proposition 7 (Nonstandard weak special case of Baker)Let be an unbounded nonstandard natural number, and let be a rational of height at most (i.e. ). Then is not quasiexponentially small (relative to , of course).

Let us quickly see why Proposition 7 implies Proposition 6 (the converse is easy and is left to the reader). This is the usual “compactness and contradiction” argument. Suppose for contradiction that Proposition 6 failed. Carefully negating the quantifiers, we may then find a sequence of (standard) rationals with , such that

for all natural numbers . As is irrational, must go to infinity. Taking the ultralimit of the , and setting to be (say) , we contradict Proposition 7.

It remains to prove Proposition 7. We fix the unbounded nonstandard natural number , and assume for contradiction that is quasiexponentially close to a nonstandard rational of height at most . We will write for the assertion that is quasiexponentially small, thus

The objective is to show that (2) leads to a contradiction.

** — 2. The polynomial method — **

Now it is time to introduce the polynomial method. We will be working with the following class of polynomials:

Definition 8Agood polynomialis a nonstandard polynomial of the formof two nonstandard variables of some (nonstandard) degree at most (in each variable), where is a nonstandard natural number of polylogarithmic size, and whose coefficients are (nonstandard) integers of quasipolynomial size. (A technical point: we require the to depend in an internal fashion on the indices , in order for the nonstandard summation here to be well-defined.) Define the

heightof the polynomial to be the maximum magnitude of the coefficients in ; thus, by hypothesis, is of quasipolynomial size.

We have a key definition:

Definition 9Let be two (nonstandard) positive numbers of polylogarithmic size. A good polynomial is said tonearly vanish to orderon if one has

The derivatives in (4) can be easily computed. Indeed, if we expand out the good polynomial out as (3), then the left-hand side of (4) is

Now, from (2) we have

Using the asymptotic calculus (and the hypotheses that are of polylogarithmic size, and the are of quasipolynomial size) we conclude that the left-hand side of (4) is

The quantity (and its reciprocal) is of quasipolynomial size. Thus, the condition (4) is equivalent to the assertion that

for all and ; as the left-hand side is a nonstandard integer, we see from the integrality gap that the condition is in fact equivalent to the *exact* constraint

Using this reformulation of (4), we can now give some upper and lower bounds on the complexity of good polynomials that nearly vanish to a high order on a set . We first give an lower bound, that prevents the degree from being smaller than :

Proposition 10 (Lower bound)Let be a non-trivial good polynomial of degree that nearly vanishes to order at least on . Then .

*Proof:* Suppose for contradiction that . Then from (6) we have

for ; thus there is a non-trivial linear dependence between the (nonstandard) vectors for . But, from the formula for the Vandermonde determinant, this would imply that two of the are equal, which is absurd.

In the converse direction, we can obtain polynomials that vanish to a high order on , but with degree larger than :

Proposition 11 (Upper bound)Let be positive quantities of polylogarithmic size such thatThen there exists a non-trivial good polynomial of degree at most that vanishes to order on . Furthermore, has height at most

*Proof:* We use the pigeonholing argument of Thue and Siegel. Let be an positive quantity of quasipolynomial size to be chosen later, and choose coefficients for that are nonstandard natural numbers between and . There are possible ways to make such a selection. For each such selection, we consider the expressions arising as the left-hand side of (6) with and . These expressions are nonstandard integers whose magnitude is bounded by

which by asymptotic calculus simplifies to be bounded by

The number of possible values of these expressions is thus

By the hypothesis and asymptotic calculus, we can make this quantity less than for some of size

In particular, can be taken to be of polylogarithmic size. Thus, by the pigeonhole principle, one can find two choices for the coefficients which give equal values for the expressions in the left-hand side of (6). Subtracting those two choices we obtain the result.

** — 3. The bootstrap — **

At present, there is no contradiction between the lower bound in Proposition 10 and the upper bound in Proposition 11, because there is plenty of room between the two bounds. To bridge the gap between the bounds, we need a bootstrap argument that uses vanishing on one to imply vanishing (to slightly lower order) on a larger . The key bootstrap in this regard is:

Proposition 12 (Bootstrap)Let be unbounded polylogarithmic quantities, such thatLet be a good polynomial of degree at most and height , that nearly vanishes to order on . Then also vanishes to order on for any .

*Proof:* It is convenient to use complex analysis methods. We consider the entire function

thus by (3)

By hypothesis, we have

for all and . We wish to show that

for and . Clearly we may assume that .

Fix and . To estimate , we consider the contour integral

(oriented anticlockwise), where is to be chosen later, and estimate it in two different ways. Firstly, we have

so for , we have the bound

when , which by the hypotheses and asymptotic calculus, simplifies to

Also, when we have

We conclude the upper bound

for the magnitude of (7). On the other hand, we can evaluate (7) using the residue theorem. The integrand has poles at and at . The simple pole at has residue

Now we consider the poles at . For each such , we see that the first derivatives of are quasiexponentially small at . Thus, by Taylor expansion (and asymptotic calculus), one can express as the sum of a polynomial of degree with quasiexponentially small coefficients, plus an entire function that vanishes to order at . The latter term contributes nothing to the residue at , while from the Cauchy integral formula (applied, for instance, to a circle of radius around ) and asymptotic calculus, we see that the former term contributes a residue is quasiexponentially small. In particular, it is less than . We conclude that

We have

and thus

choosing to be a large standard multiple of and using the hypothesis , we can simplify this to

To improve this bound, we use the integrality gap. Recall that from (5) that

in particular, is quasiexponentially close to a (nonstandard) integer. Since

we have

(say). Using the integrality gap, we conclude that

which implies that nearly vanishes to order on , as required.

Now we can finish the proof of Proposition 7 (and hence Proposition 6). We select quantities of polylogarithmic size obeying the bounds

and

with a gap of a positive power of between each such inequality. For instance, one could take

many other choices are possible (and one can optimise these choices eventually to get a good value of exponent in Proposition 6).

Using Proposition 11, we can find a good polynomial which vanishes to order on , of height , and hence (by the assumptions on ) of height .

Applying Proposition 12, nearly vanishes to order on for any . Iterating this, an easy induction shows that for any standard , nearly vanishes to order on for any . As was chosen to be larger than a positive power of , we conclude that nearly vanishes to order at least on for any of polylogarithmic size. But for large enough, this contradicts Proposition 10.

Remark 1The above argument places a lower bound on quantities such asfor integer . Baker’s theorem, in its full generality, gives a lower bound on quantities such as

for algebraic numbers , which is polynomial in the height of the quantities involved, assuming of course that are multiplicatively independent, and that all quantities are of bounded degree. The proof is more intricate than the one given above, but follows a broadly similar strategy, and the constants are completely effective.

## 17 comments

Comments feed for this article

21 August, 2011 at 7:33 pm

Bo JacobyIs a^b defined for a<0, b irrational?

21 August, 2011 at 9:47 pm

Terence TaoAh, fair point; can be multivalued in general (using the complex logarithm and complex exponential to define when a is not a positive real); but the Gelfond-Schneider theorem will apply to each of the interpretations of . I’ve modified the statement of the theorem slightly to reflect this.

22 August, 2011 at 12:01 am

Bo JacobyDoesn’t the general case (when a is neither =0 nor 1) follow easily from the special case a>0?

22 August, 2011 at 10:13 am

Terence TaoI don’t see an easy argument to establish that, no. For instance, one well-known consequence of the (complex) Gelfond-Schneider theorem is the transcendentality of , which is one of the possible values of . I don’t see how to easily establish this fact just from the positive case of that theorem.

24 August, 2011 at 1:28 pm

Bo JacobySorry, I made a typo. “Doesn’t the general case follow easily from the special case a>1?”. (not a>0. That was the typo).

The case 0<a1, and 1/a is algebraic when a is algebraic.

I accept that the case a1 and {\pi} is irrational, because e is not algebraic. The transcendentality of {(-1)^i} does not follow because i is not irrational. Or is it?

24 August, 2011 at 1:42 pm

Bo JacobyThe comment below is unreadable because I use shorthand for less than and greater than. I cannot edit a comment once it is posted.

The case “0 less than a less than 1″, follow because then 1/a is greater than 1, and 1/a is algebraic when a is algebraic.

I accept that the case: “a less than 0″, does not follow from the case: “a greater than 1″, but I do not understand you example.

The transcendentallity of e^pi does not follow because e is not algebraic.

The transcendentality of (-1)^i does not follow because i is not irrational. Or is it?

24 August, 2011 at 2:46 pm

Terence Taoi is not the ratio of two integers, and is hence irrational.

21 August, 2011 at 10:33 pm

Jeremy HurwitzIn the statement of Theorem 1, the last sentence got truncated. It’s currently ends “(Note that”.

[Corrected, thanks - T.]22 August, 2011 at 10:03 am

AnonymousI am a student majored in mathematics in shanghai maritime university.I regard you as a model to adore and study.(i’m sorry that i am poor at english.O(∩_∩)O).Many people said that you were genius,but i hold the view that 100%success equals 99%hardworking and 1% genius.so i think that you are not alone,there are many mathematics explorers in the world who havn’t been found.

Good Luck!!

23 August, 2011 at 9:43 am

EvgeniyIt is a very interesting post (as much of your others posts are)!

A minor typo (as I see it): aq+bp should be replaced by ap+bq in the formula (5), the formula preceding (5) and two formulae after (5) (in (6), in particular).

[Corrected, thanks - T.]24 August, 2011 at 12:32 pm

Bo JacobyYou wrote: the gap between powers of two {3^p} and powers of three {2^q}. You ment to write: the gap between powers of three {3^p} and powers of two {2^q}.

[Corrected, thanks - T]25 August, 2011 at 9:15 am

The Collatz conjecture, Littlewood-Offord theory, and powers of 2 and 3 « What’s new[...] need a lower bound on . Here, we will use Baker’s theorem (as discussed in this previous post), which among other things gives the lower [...]

20 September, 2011 at 2:48 am

Diophantine sets and the integers | cartesian product[...] Hilbert’s seventh problem, and powers of 2 and 3 (terrytao.wordpress.com) LD_AddCustomAttr("AdOpt", "1"); LD_AddCustomAttr("Origin", "other"); LD_AddCustomAttr("theme_bg", "ffffff"); LD_AddCustomAttr("theme_text", "333333"); LD_AddCustomAttr("theme_link", "0066cc"); LD_AddCustomAttr("theme_border", "f2f7fc"); LD_AddCustomAttr("theme_url", "ff4b33"); LD_AddCustomAttr("LangId", "1"); LD_AddCustomAttr("Tag", "diophantine-equation"); LD_AddCustomAttr("Tag", "hilberts-problems"); LD_AddCustomAttr("Tag", "hilberts-tenth-problem"); LD_AddSlot("LD_ROS_300-WEB"); LD_GetBids(); Rate this: Share this:TwitterPrintFacebookLinkedInEmailDiggRedditStumbleUponLike this:LikeBe the first to like this post. [...]

20 September, 2011 at 8:06 am

More on Hilbert’s tenth problem | cartesian product[...] Hilbert’s seventh problem, and powers of 2 and 3 (terrytao.wordpress.com) [...]

6 October, 2011 at 4:56 pm

mathsetc[...] Hilbert’s seventh problem, and powers of 2 and 3 (terrytao.wordpress.com) [...]

21 August, 2012 at 10:07 am

Eleventh Linkfest[...] Terence Tao: A geometric proof of the impossibility of angle trisection by straightedge and compass, Hilbert’s seventh problem, and powers of 2 and 3 [...]

25 October, 2013 at 7:48 am

Algebraic combinatorial geometry: the polynomial method in arithmetic combinatorics, incidence combinatorics, and number theory | What's new[…] used independently many times in mathematics; for instance, it plays a key role in the proof of Baker’s theorem in transcendence theory, or Stepanov’s method in giving an elementary proof of the Riemann hypothesis for finite […]