Let $\bar{{\bf Q}}$ be the algebraic closure of ${\bf Q}$, that is to say the field of algebraic numbers. We fix an embedding of $\bar{{\bf Q}}$ into ${\bf C}$, giving rise to a complex absolute value $z \mapsto |z|$ for algebraic numbers $z \in \bar{{\bf Q}}$.

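Let $\alpha \in \bar{{\bf Q}}$ be of degree $D > 1$, so that $\alpha$ is irrational. A classical theorem of Liouville gives the quantitative bound

$$ \left|\alpha - \frac{p}{q}\right| \geq c \frac{1}{|q|^D} \qquad (1) $$

for the extent to which $\alpha$ fails to be approximated by rational numbers $p/q$, where $c > 0$ depends on $\alpha, D$ but not on $p, q$. Indeed, if one lets $\alpha = \alpha_1, \alpha_2, \dots, \alpha_D$ be the Galois conjugates of $\alpha$, then the quantity $\prod_{i=1}^D |q \alpha_i - p|$ is a non-zero natural number divided by a constant, and so we have the trivial lower bound

$$ \prod_{i=1}^D |q \alpha_i - p| \geq c $$

from which the bound (1) easily follows. A well known corollary of the bound (1) is that Liouville numbers are automatically transcendental.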
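As a small numerical illustration of the Liouville argument, here is a minimal sketch for the specific choice $\alpha = \sqrt{2}$ (so $D = 2$). The function name `liouville_min`, the search range, and the explicit constant `C_LIOUVILLE` $= 1/(2\sqrt{2}+1)$ are assumptions made for this example only; the constant comes from the identity $|\sqrt{2} - p/q| = |2q^2 - p^2| / (q^2 (\sqrt{2} + p/q))$ together with $|2q^2 - p^2| \geq 1$.

```python
import math

def liouville_min(max_q):
    """Smallest value of q^2 * |sqrt(2) - p/q| over 1 <= q <= max_q and 0 <= p <= 3q."""
    alpha = math.sqrt(2.0)
    best = float("inf")
    for q in range(1, max_q + 1):
        for p in range(0, 3 * q + 1):
            # p^2 - 2 q^2 is a non-zero integer: this is the trivial lower bound.
            assert p * p - 2 * q * q != 0
            best = min(best, q * q * abs(alpha - p / q))
    return best

# Explicit (hypothetical, example-only) constant c for alpha = sqrt(2), D = 2.
C_LIOUVILLE = 1.0 / (2.0 * math.sqrt(2.0) + 1.0)

if __name__ == "__main__":
    print(liouville_min(100), ">=", C_LIOUVILLE)
```

The brute-force search only confirms the bound on a finite range; the proof itself is the one-line norm argument above.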

The famous theorem of Thue, Siegel and Roth improves the bound (1) to

$$ \left|\alpha - \frac{p}{q}\right| \geq c \frac{1}{|q|^{2+\epsilon}} \qquad (2) $$

for any $\epsilon > 0$ and rationals $\frac{p}{q}$, where $c > 0$ depends on $\alpha, \epsilon$ but not on $p, q$. Apart from the $\epsilon$ in the exponent and the constant $c$, this bound is optimal, as can be seen from Dirichlet’s theorem. This theorem is a good example of the *ineffectivity phenomenon* that affects a large portion of modern number theory: the constant $c$ in (2) is known to be positive, but there is no explicit lower bound for it in terms of the coefficients of the polynomial defining $\alpha$ (in contrast to (1), for which an effective bound may be easily established). This is ultimately due to the reliance on the “dueling conspiracy” (or “repulsion phenomenon”) strategy. We do not as yet have a good way to rule out *one* counterexample to (2), in which $\frac{p}{q}$ is far closer to $\alpha$ than $\frac{1}{|q|^{2+\epsilon}}$; however we can rule out *two* such counterexamples, by playing them off of each other.

A powerful strengthening of the Thue-Siegel-Roth theorem is given by the *subspace theorem*, first proven by Schmidt and then generalised further by several authors. To motivate the theorem, first observe that the Thue-Siegel-Roth theorem may be rephrased as a bound of the form

$$ |\alpha p - \beta q| \times |\alpha' p - \beta' q| \geq c (1 + |p| + |q|)^{-\epsilon} \qquad (3) $$

for any algebraic numbers $\alpha, \beta, \alpha', \beta'$ with $(\alpha,\beta)$ and $(\alpha',\beta')$ linearly independent (over the algebraic numbers), and any $(p,q) \in {\bf Z}^2$ and $\epsilon > 0$, with the exception when $\alpha, \beta$ or $\alpha', \beta'$ are rationally dependent (i.e. one is a rational multiple of the other), in which case one has to remove some lines (i.e. subspaces in ${\bf Q}^2$) of rational slope from the space ${\bf Z}^2$ of pairs $(p,q)$ to which the bound (3) does not apply (namely, those lines for which the left-hand side vanishes). Here $c > 0$ can depend on $\alpha, \beta, \alpha', \beta', \epsilon$ but not on $p, q$. More generally, we have

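Theorem 1 (Schmidt subspace theorem) Let $d$ be a natural number. Let $L_1,\dots,L_d: \bar{{\bf Q}}^d \rightarrow \bar{{\bf Q}}$ be linearly independent linear forms. Then for any $\epsilon > 0$, one has the bound

$$ \prod_{i=1}^d |L_i(x)| \geq c \|x\|^{-\epsilon} $$

for all $x \in {\bf Z}^d$, outside of a finite number of proper subspaces of ${\bf Q}^d$, where

$$ \|(x_1,\dots,x_d)\| := \max(|x_1|,\dots,|x_d|) $$

and $c > 0$ depends on $\epsilon$, $d$ and the $L_i$, but is independent of $x$.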

Being a generalisation of the Thue-Siegel-Roth theorem, it is unsurprising that the known proofs of the subspace theorem are also ineffective with regards to the constant $c$. (However, the number of exceptional subspaces may be bounded effectively; cf. the situation with the Skolem-Mahler-Lech theorem, discussed in this previous blog post.) Once again, the lower bound here is basically sharp except for the $\epsilon$ factor and the implied constant: given any $\delta_1,\dots,\delta_d > 0$ with $\delta_1 \dots \delta_d = 1$, a simple volume packing argument (the same one used to prove the Dirichlet approximation theorem) shows that for any sufficiently large $N \geq 1$, one can find integers $x_1,\dots,x_d \in [-N,N]$, not all zero, such that

$$ |L_i(x)| \ll \delta_i $$

for all $i = 1,\dots,d$. Thus one can get $\prod_{i=1}^d |L_i(x)|$ comparable to $1$ in many different ways.

There are important generalisations of the subspace theorem to other number fields than the rationals (and to other valuations than the Archimedean valuation $z \mapsto |z|$); we will develop one such generalisation below.

The subspace theorem is one of many *finiteness theorems* in Diophantine geometry; in this case, it is the number of exceptional subspaces which is finite. It turns out that finiteness theorems are very compatible with the language of nonstandard analysis. (See this previous blog post for a review of the basics of nonstandard analysis, and in particular for the nonstandard interpretation of asymptotic notation such as $O()$ and $o()$.) The reason for this is that a standard set $X$ is finite if and only if it contains no strictly nonstandard elements (that is to say, elements of ${}^*X \backslash X$). This makes for a clean formulation of finiteness theorems in the nonstandard setting. For instance, the standard form of Bezout’s theorem asserts that if $P, Q \in F[x,y]$ are coprime polynomials over some field $F$, then the curves $\{(x,y) \in F^2: P(x,y) = 0\}$ and $\{(x,y) \in F^2: Q(x,y) = 0\}$ intersect in only finitely many points. The nonstandard version of this is then

Theorem 2 (Bezout’s theorem, nonstandard form) Let $P, Q \in F[x,y]$ be standard coprime polynomials. Then there are no strictly nonstandard solutions to $P(x,y) = Q(x,y) = 0$.

Now we reformulate Theorem 1 in nonstandard language. We need a definition:

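Definition 3 (General position) Let $K \subset L$ be nested fields. A point $x = (x_1,\dots,x_d)$ in $L^d$ is said to be in $K$-general position if it is not contained in any hyperplane of $L^d$ definable over $K$, or equivalently if one has

$$ a_1 x_1 + \dots + a_d x_d = 0 \iff a_1 = \dots = a_d = 0 $$

for any $a_1,\dots,a_d \in K$.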

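Theorem 4 (Schmidt subspace theorem, nonstandard version) Let $d$ be a standard natural number. Let $L_1,\dots,L_d: \bar{{\bf Q}}^d \rightarrow \bar{{\bf Q}}$ be linearly independent standard linear forms. Let $x \in {}^*{\bf Z}^d$ be a tuple of nonstandard integers which is in ${\bf Q}$-general position (in particular, this forces $x$ to be strictly nonstandard). Then one has

$$ \prod_{i=1}^d |L_i(x)| \geq \|x\|^{-o(1)}, $$

where we extend $L_i$ from $\bar{{\bf Q}}^d$ to ${}^*\bar{{\bf Q}}^d$ (and also similarly extend $\| \cdot \|$ from ${\bf Z}^d$ to ${}^*{\bf Z}^d$) in the usual fashion.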

Observe that (as is usual when translating to nonstandard analysis) some of the epsilons and quantifiers that are present in the standard version become hidden in the nonstandard framework, being moved inside concepts such as “strictly nonstandard” or “general position”. We remark that as $x$ is in ${\bf Q}$-general position, it is also in $\bar{{\bf Q}}$-general position (as an easy Galois-theoretic argument shows), and the requirement that the $L_1,\dots,L_d$ are linearly independent is thus equivalent to $L_1(x),\dots,L_d(x)$ being $\bar{{\bf Q}}$-linearly independent.

Exercise 1 Verify that Theorem 1 and Theorem 4 are equivalent. (Hint: there are only countably many proper subspaces of ${\bf Q}^d$.)

We will not prove the subspace theorem here, but instead focus on a particular application of the subspace theorem, namely to counting integer points on curves. In this paper of Corvaja and Zannier, the subspace theorem was used to give a new proof of the following basic result of Siegel:

Theorem 5 (Siegel’s theorem on integer points) Let $P \in {\bf Q}[x,y]$ be an irreducible polynomial of two variables, such that the affine plane curve $C := \{(x,y): P(x,y) = 0\}$ either has genus at least one, or has at least three points on the line at infinity, or both. Then $C$ has only finitely many integer points $(x,y) \in {\bf Z}^2$.

This is a finiteness theorem, and as such may be easily converted to a nonstandard form:

Theorem 6 (Siegel’s theorem, nonstandard form) Let $P \in {\bf Q}[x,y]$ be a standard irreducible polynomial of two variables, such that the affine plane curve $C := \{(x,y): P(x,y) = 0\}$ either has genus at least one, or has at least three points on the line at infinity, or both. Then $C$ does not contain any strictly nonstandard integer points $(x_*,y_*) \in {}^*{\bf Z}^2 \backslash {\bf Z}^2$.

Note that Siegel’s theorem can fail for genus zero curves that only meet the line at infinity at just one or two points; the key examples here are the graphs $\{(x,y): y = f(x)\}$ for a polynomial $f \in {\bf Z}[x]$, and the Pell equation curves $\{(x,y): x^2 - dy^2 = 1\}$. Siegel’s theorem can be compared with the more difficult theorem of Faltings, which establishes finiteness of rational points (not just integer points), but now needs the stricter requirement that the curve $C$ has genus at least two (to avoid the additional counterexample of elliptic curves of positive rank, which have infinitely many rational points).

The standard proofs of Siegel’s theorem rely on a combination of the Thue-Siegel-Roth theorem and a number of results on abelian varieties (notably the Mordell-Weil theorem). The Corvaja-Zannier argument rebalances the difficulty of the argument by replacing the Thue-Siegel-Roth theorem by the more powerful subspace theorem (in fact, they need one of the stronger versions of this theorem alluded to earlier), while greatly reducing the reliance on results on abelian varieties. Indeed, for curves with three or more points at infinity, no theory from abelian varieties is needed at all, while for the remaining cases, one mainly needs the existence of the Abel-Jacobi embedding, together with a relatively elementary theorem of Chevalley-Weil which is used in the proof of the Mordell-Weil theorem, but is significantly easier to prove.

The Corvaja-Zannier argument (together with several further applications of the subspace theorem) is presented nicely in this Bourbaki expose of Bilu. To establish the theorem in full generality requires a certain amount of algebraic number theory machinery, such as the theory of valuations on number fields, or of relative discriminants between such number fields. However, the basic ideas can be presented without much of this machinery by focusing on simple special cases of Siegel’s theorem. For instance, we can handle irreducible cubics that meet the line at infinity at exactly three points $[1,\alpha_1,0], [1,\alpha_2,0], [1,\alpha_3,0]$:

Theorem 7 (Siegel’s theorem with three points at infinity) Siegel’s theorem holds when the irreducible polynomial $P(x,y)$ takes the form

$$ P(x,y) = (y - \alpha_1 x)(y - \alpha_2 x)(y - \alpha_3 x) + Q(x,y) $$

for some quadratic polynomial $Q \in {\bf Q}[x,y]$ and some distinct algebraic numbers $\alpha_1, \alpha_2, \alpha_3$.

*Proof:* We use the nonstandard formalism. Suppose for sake of contradiction that we can find a strictly nonstandard integer point $(x_*,y_*) \in {}^*{\bf Z}^2 \backslash {\bf Z}^2$ on a curve $C := \{(x,y): P(x,y) = 0\}$ of the indicated form. As this point is infinitesimally close to the line at infinity, $y_*/x_*$ must be infinitesimally close to one of $\alpha_1, \alpha_2, \alpha_3$; without loss of generality we may assume that $y_*/x_*$ is infinitesimally close to $\alpha_1$.

We now use a version of the polynomial method, to find some polynomials of controlled degree that vanish to high order on the “arm” of the cubic curve $C$ that asymptotes to $[1,\alpha_1,0]$. More precisely, let $D \geq 3$ be a large integer (actually $D = 3$ will already suffice here), and consider the $\bar{{\bf Q}}$-vector space $V$ of polynomials $R(x,y) \in \bar{{\bf Q}}[x,y]$ of degree at most $D$, and of degree at most $2$ in the $y$ variable; this space has dimension $3D$. Also, as one traverses the arm $y/x \rightarrow \alpha_1$ of $C$, any polynomial $R$ in $V$ grows at a rate of at most $D$, that is to say $R$ has a pole of order at most $D$ at the point at infinity $[1,\alpha_1,0]$. By performing Laurent expansions around this point (which is a non-singular point of $C$, as the $\alpha_i$ are assumed to be distinct), we may thus find a basis $R_1,\dots,R_{3D}$ of $V$, with the property that $R_j$ has a pole of order at most $D+1-j$ at $[1,\alpha_1,0]$ for each $j = 1,\dots,3D$.

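From the control of the pole at $[1,\alpha_1,0]$, we have

$$ |R_j(x_*,y_*)| \ll (|x_*| + |y_*|)^{D+1-j} $$

for all $j = 1,\dots,3D$. The exponents here become negative for $j > D+1$, and on multiplying them all together we see that

$$ \prod_{j=1}^{3D} |R_j(x_*,y_*)| \ll (|x_*| + |y_*|)^{3D(D+1) - \frac{3D(3D+1)}{2}}. $$

This exponent is negative for $D$ large enough (or just take $D = 3$). If we expand

$$ R_j(x,y) = \sum_{a+b \leq D; b \leq 2} \alpha_{j,a,b} x^a y^b $$

for some algebraic numbers $\alpha_{j,a,b}$, then we thus have

$$ \prod_{j=1}^{3D} \Big|\sum_{a+b \leq D; b \leq 2} \alpha_{j,a,b} x_*^a y_*^b\Big| \ll (|x_*| + |y_*|)^{-\epsilon} $$

for some standard $\epsilon > 0$. Note that the $3D$-dimensional vectors $(\alpha_{j,a,b})_{a+b \leq D; b \leq 2}$ are linearly independent in $\bar{{\bf Q}}^{3D}$, because the $R_j$ are linearly independent in $V$. Applying the Schmidt subspace theorem in the contrapositive, we conclude that the $3D$-tuple $(x_*^a y_*^b)_{a+b \leq D; b \leq 2} \in {}^*{\bf Z}^{3D}$ is not in ${\bf Q}$-general position. That is to say, one has a non-trivial constraint of the form

$$ \sum_{a+b \leq D; b \leq 2} c_{a,b} x_*^a y_*^b = 0 \qquad (4) $$

for some standard rational coefficients $c_{a,b}$, not all zero. But, as $P$ is irreducible and cubic in $y$, it has no common factor with the standard polynomial $\sum_{a+b \leq D; b \leq 2} c_{a,b} x^a y^b$, so by Bezout’s theorem (Theorem 2) the constraint (4) only has standard solutions, contradicting the strictly nonstandard nature of $(x_*,y_*)$.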
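The bookkeeping in this proof can be checked mechanically. The sketch below assumes the reading of the argument in which the space consists of polynomials of total degree at most $D$ and degree at most $2$ in $y$, and in which the $j$-th basis element has a pole of order at most $D+1-j$; the function names are hypothetical and used only for this illustration.

```python
def monomial_exponents(D):
    """Exponent pairs (a, b) of monomials x^a y^b with a + b <= D and b <= 2."""
    return [(a, b) for b in range(3) for a in range(D - b + 1)]

def total_pole_exponent(D):
    """Sum of the pole bounds D + 1 - j over the basis indices j = 1, ..., dim V."""
    dim = len(monomial_exponents(D))
    return sum(D + 1 - j for j in range(1, dim + 1))

if __name__ == "__main__":
    for D in range(2, 6):
        print(D, len(monomial_exponents(D)), total_pole_exponent(D))
```

Under these assumptions the dimension is $3D$ and the total exponent is $3D(1-D)/2$, which is negative as soon as $D \geq 2$.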

Exercise 2 Rewrite the above argument so that it makes no reference to nonstandard analysis. (In this case, the rewriting is quite straightforward; however, there will be a subsequent argument in which the standard version is significantly messier than the nonstandard counterpart, which is the reason why I am working with the nonstandard formalism in this blog post.)

A similar argument works for higher degree curves that meet the line at infinity in three or more points, though if the curve has singularities at infinity then it becomes convenient to rely on the Riemann-Roch theorem to control the dimension of the analogue of the space $V$. Note that when there are only two or fewer points at infinity, though, one cannot get the negative exponent of $-\epsilon$ needed to usefully apply the subspace theorem. To deal with this case we require some additional tricks. For simplicity we focus on the case of Mordell curves, although it will be convenient to work with more general number fields than the rationals:

Theorem 8 (Siegel’s theorem for Mordell curves) Let $k$ be a non-zero integer. Then there are only finitely many integer solutions $(x,y) \in {\bf Z}^2$ to $y^2 = x^3 + k$. More generally, for any number field $K$, and any nonzero $k \in K$, there are only finitely many algebraic integer solutions $(x,y) \in {\mathcal O}_K^2$ to $y^2 = x^3 + k$, where ${\mathcal O}_K$ is the ring of algebraic integers in $K$.

Again, we will establish the nonstandard version. We need some additional notation:

Definition 9 We define an almost rational integer to be a nonstandard $x \in {}^*{\bf Q}$ such that $Mx \in {}^*{\bf Z}$ for some standard positive integer $M$, and write ${\bf Q}\,{}^*{\bf Z}$ for the ${\bf Q}$-algebra of almost rational integers. If $K$ is a standard number field, we define an almost $K$-integer to be a nonstandard $x \in {}^*K$ such that $Mx \in {}^*{\mathcal O}_K$ for some standard positive integer $M$, and write $K\,{}^*{\bf Z}$ for the $K$-algebra of almost $K$-integers. We define an almost algebraic integer to be a nonstandard algebraic number $x$ such that $Mx$ is a nonstandard algebraic integer for some standard positive integer $M$, and write $\bar{{\bf Q}}\,{}^*{\bf Z}$ for the $\bar{{\bf Q}}$-algebra of almost algebraic integers.

Theorem 10 (Siegel for Mordell, nonstandard version) Let $k$ be a non-zero standard algebraic number. Then the curve $\{(x,y): y^2 = x^3 + k\}$ does not contain any strictly nonstandard almost algebraic integer point.

Another way of phrasing this theorem is that if $x, y$ are strictly nonstandard almost algebraic integers, then $y^2 - x^3$ is either strictly nonstandard or zero.

Exercise 3 Verify that Theorem 8 and Theorem 10 are equivalent.

Due to all the ineffectivity, our proof does not supply any bound on the solutions $x, y$ in terms of $k$, even if one removes all references to nonstandard analysis. It is a conjecture of Hall (a special case of the notorious ABC conjecture) that one has the bound $|x| \ll_\epsilon |k|^{2+\epsilon}$ for all $\epsilon > 0$ (or equivalently $|y| \ll_\epsilon |k|^{3+\epsilon}$), but even the weaker conjecture that $x, y$ are of polynomial size in $k$ is open. (The best known bounds are of exponential nature, and are proven using a version of Baker’s method: see for instance this text of Sprindzuk.)

A direct repetition of the arguments used to prove Theorem 7 will not work here, because the Mordell curve $\{(x,y): y^2 = x^3 + k\}$ only hits the line at infinity at one point, $[0,1,0]$. To get around this we will exploit the fact that the Mordell curve is an elliptic curve and thus has a group law on it. We will then divide all the integer points on this curve by two; as elliptic curves have four 2-torsion points, this will end up placing us in a situation like Theorem 7, with four points at infinity. However, there is an obstruction: it is not obvious that dividing an integer point on the Mordell curve by two will produce another integer point. However, this is essentially true (after enlarging the ring of integers slightly) thanks to a general principle of Chevalley and Weil, which can be worked out explicitly in the case of division by two on Mordell curves by relatively elementary means (relying mostly on unique factorisation of ideals of algebraic integers). We give the details below the fold.

** — 1. Dividing by two on the Mordell curve — **

This section will be elementary (in the sense that the arguments may be phrased in the first-order language of rings), and so it will make no difference whether we are working in the standard or nonstandard formalism. As such, we make no reference to nonstandard analysis here.

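As is well known, any elliptic curve $E$ (after adjoining a point at infinity) has a group law $+$, with $P + Q + R = 0$ whenever $P, Q, R$ are distinct collinear points on $E$, or $2P + R = 0$ if the tangent line at $P$ meets $E$ at another point $R$. In the case of the Mordell curve $E := \{(x,y): y^2 = x^3 + k\}$, the double $2(x,y)$ of a point $(x,y)$ is given by the formulae

$$ 2(x,y) = \left( m^2 - 2x,\ -y - m(m^2 - 3x) \right) $$

where

$$ m := \frac{3x^2}{2y} \qquad (5) $$

as long as $y \neq 0$. (For $y = 0$, the double will be the point at infinity.) Geometrically, $m$ is the slope of the tangent line to $E$ at $(x,y)$.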
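The tangent-line doubling rule can be tried out in exact rational arithmetic. The following is a minimal sketch, assuming the standard chord-and-tangent formula on $y^2 = x^3 + k$ (slope $3x^2/2y$, new abscissa equal to slope squared minus $2x$); the function name `double_point` and the sample curve $y^2 = x^3 - 2$ with the point $(3,5)$ are choices made for this example.

```python
from fractions import Fraction

def double_point(x, y, k):
    """Double (x, y) on y^2 = x^3 + k via the tangent line; requires y != 0."""
    x, y, k = Fraction(x), Fraction(y), Fraction(k)
    if y * y != x ** 3 + k:
        raise ValueError("point is not on the curve")
    if y == 0:
        raise ValueError("2-torsion point: the double is the point at infinity")
    m = 3 * x * x / (2 * y)          # slope of the tangent line at (x, y)
    X = m * m - 2 * x                # x-coordinate of the third intersection point
    Y = -(y + m * (X - x))           # reflect the third intersection in the x-axis
    return X, Y, m

if __name__ == "__main__":
    print(double_point(3, 5, -2))
```

Note how a single doubling already turns the integer point $(3,5)$ into a point with denominators, which is why the reverse operation (halving) needs the near-integrality analysis that follows.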

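Over the complex numbers, the constraint $y \neq 0$ removes three points $(-k^{1/3},0), (-\omega k^{1/3},0), (-\omega^2 k^{1/3},0)$ from $E$, where $\omega := e^{2\pi i/3}$ is the standard cube root of unity. This thrice punctured curve is no longer an affine plane curve, but one can make it affine again by incorporating $m$ as another coordinate, giving rise to the lifted curve

$$ \tilde E := \{ (x,y,m): y^2 = x^3 + k;\ 2my = 3x^2 \} $$

and the doubling map lifts to the polynomial map $\pi: \tilde E \rightarrow E$ defined by

$$ \pi(x,y,m) := \left( m^2 - 2x,\ -y - m(m^2 - 3x) \right). $$

Over the complex numbers, it is well known that the elliptic curve $E$ is isomorphic as a group to the torus $({\bf R}/{\bf Z})^2$, which implies that the map $\pi$ is four-to-one. Now we consider the inversion problem explicitly: given a point $(X,Y) \in E$, find a point $(x,y,m) \in \tilde E$ such that

$$ X = m^2 - 2x $$

and

$$ Y = -y - m(m^2 - 3x). $$

We would also like to avoid using division as much as possible, since in our application $(X,Y)$ will be an integer point, and we would like $(x,y,m)$ to also be “close” to being integers too (in a sense to be made precise later). We can rearrange the above two equations as

$$ x = \frac{m^2 - X}{2} \qquad (6) $$

$$ y = -Y - m(X - x) \qquad (7) $$

and so we see that to find $(x,y,m)$, it suffices to find $m$, and furthermore if $X, Y, m$ are “close” to integers, then $x, y$ are close to integers also (up to a single division by two). One can also check from a routine computation that if (6), (7) and the tangency relation $2my = 3x^2$ of (5) hold, then we have $y^2 = x^3 + k$, so that $(x,y,m)$ indeed lies in $\tilde E$.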

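If one inserts (6), (7) into (5), one arrives at the quartic equation

$$ m^4 - 6Xm^2 - 8Ym - 3X^2 = 0. \qquad (8) $$

One could solve this equation by the general quartic formula, but actually the formula simplifies substantially for this specific quartic equation, and can be obtained as follows. The Mordell equation $Y^2 = X^3 + k$ can be rewritten as

$$ Y^2 = (X + k^{1/3})(X + \omega k^{1/3})(X + \omega^2 k^{1/3}). $$

We can take square roots and then write

$$ Y = u_0 u_1 u_2 $$

where

$$ u_j^2 = X + \omega^j k^{1/3} \hbox{ for } j = 0,1,2. $$

As each $u_j$ is determined up to sign, and $Y$ is fixed, there are exactly four choices for $(u_0,u_1,u_2)$ here (even when $Y$ vanishes). For any such choice, if one sets

$$ m := u_0 + u_1 + u_2 \qquad (9) $$

then on squaring and rearranging we have

$$ m^2 - 3X = 2(u_0 u_1 + u_1 u_2 + u_2 u_0) $$

and on squaring again

$$ (m^2 - 3X)^2 = 4(u_0^2 u_1^2 + u_1^2 u_2^2 + u_2^2 u_0^2) + 8Ym; $$

since

$$ u_0^2 u_1^2 + u_1^2 u_2^2 + u_2^2 u_0^2 = 3X^2 $$

one soon sees that (8) holds. Thus we have found the four solutions $(x,y,m)$ to the equation $\pi(x,y,m) = (X,Y)$ by choosing the square roots $u_j$ of $X + \omega^j k^{1/3}$, and then defining $m$ by (9) and $x, y$ by (6), (7). In particular, if $X, Y, k$ are (rational) integers, then $u_0, u_1, u_2, m$ are algebraic integers, and $x, y$ are algebraic integers divided by two.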
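The halving recipe can be tested numerically. The sketch below is an illustration under stated assumptions: the tangent slope $m$ is taken to be the sum of three square roots $u_j$ of $X + \omega^j k^{1/3}$ whose product is $Y$, and $x = (m^2 - X)/2$, $y = -Y - m(X - x)$. The function name `halve_point`, the floating-point tolerance, and the restriction to $Y \neq 0$ are choices made for this example.

```python
import cmath
from itertools import product

def halve_point(X, Y, k):
    """Return the four triples (x, y, m) whose double on y^2 = x^3 + k is (X, Y).

    Floating-point sketch; assumes Y != 0 so that the sign test below is meaningful.
    """
    omega = cmath.exp(2j * cmath.pi / 3)         # primitive cube root of unity
    c = complex(k) ** (1.0 / 3.0)                # one fixed cube root of k
    roots = [cmath.sqrt(X + omega ** j * c) for j in range(3)]
    lifts = []
    for signs in product((1, -1), repeat=3):
        u = [s * r for s, r in zip(signs, roots)]
        if abs(u[0] * u[1] * u[2] - Y) > 1e-9:   # keep sign choices with u0*u1*u2 = Y
            continue
        m = u[0] + u[1] + u[2]                   # candidate tangent slope
        x = (m * m - X) / 2
        y = -Y - m * (X - x)
        lifts.append((x, y, m))
    return lifts

if __name__ == "__main__":
    for lift in halve_point(1.29, -0.383, -2):
        print(lift)
```

For the point $(129/100, -383/1000)$ on $y^2 = x^3 - 2$, one of the four lifts is $(3, 5, 27/10)$, while the other three differ from it by 2-torsion points, at least two of which are complex.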

** — 2. The Chevalley-Weil principle — **

The *Chevalley-Weil principle* asserts, roughly speaking, that if one has a “finite cover” of one affine variety $V$ by another $W$, with the varieties and covering maps defined over the rationals, then every integer point in $V$ lifts to a “near”-integer point of $W$. (There is also a version for projective varieties, which I will not discuss here; one can also generalise integers and rationals to other number fields.) To make the notion of “finite cover” precise, one needs the notion of an etale morphism; to make the notion of “near”-integer precise, we can use the notion of almost algebraic integer. We will not formalise the principle in full generality here; see for instance this text of Lang, this text of Hindry-Silverman, this text of Serre, or this text of Bombieri and Gubler, for a precise statement.

Here, we will focus on simple special cases of the Chevalley-Weil principle, in which one can work “by hand” using quite classical methods from algebraic number theory. We begin with a very elementary example, which in some sense goes all the way back to Diophantus himself:

Theorem 11 (Baby case of Chevalley-Weil) Let $k$ be a non-zero integer, and let $x, y \in {\bf Z}$ be such that $x(x+k) = y^2$. Then $x, x+k$ are “nearly perfect squares” in the sense that one has $x = a u^2$, $x + k = b v^2$ for some integers $a, b, u, v$ with $a, b$ dividing $k$.

To interpret this result as a special case of the Chevalley-Weil principle, we take $V$ to be the affine plane curve $\{(x,y): x(x+k) = y^2\}$, $W$ to be the affine curve $\{(u,v): v^2 - u^2 = k\}$, with covering map $(u,v) \mapsto (u^2, uv)$; thus $W$ is a double cover of $V$, and the theorem is asserting that integer points in $V$ are covered by near-integer points in $W$ (up to square roots of factors of $k$).

By transfer, the above theorem also applies in the setting where $x, y$ are non-standard integers and $k$ remains standard; in that case, $u, v$ now become nonstandard also, but $a, b$ remain standard. In particular, $\sqrt{x}$ and $\sqrt{x+k}$ become almost algebraic integers.

*Proof:* We can assume that $y$, and hence $x$ and $x+k$, are non-zero, since the $y = 0$ case is easily verified. We use the fundamental theorem of arithmetic to factor $y$ into primes $y = \pm p_1^{e_1} \dots p_n^{e_n}$ for some distinct primes $p_1,\dots,p_n$ and positive natural numbers $e_1,\dots,e_n$, thus

$$ x(x+k) = p_1^{2e_1} \dots p_n^{2e_n}. $$

On the other hand, the greatest common divisor of $x$ and $x+k$ is a divisor of $k$. From this, we see that for each prime $p_j$ not dividing $k$, $p_j$ will divide one of $x, x+k$ exactly $2e_j$ times, and not divide the other, whereas for primes $p_j$ dividing $k$, $p_j$ will either divide each of $x, x+k$ an even number of times, or each an odd number of times. Collecting terms (including the units $\pm 1$ in the prime factorisations of $x, x+k$) to form $a, b, u, v$, the claim follows.
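The conclusion of the baby case can be sanity-checked by brute force. The sketch below assumes the theorem concerns integer solutions of $x(x+k) = y^2$ and takes the "near-square" factors to be the signed squarefree parts of $x$ and $x+k$; the function names and the sample value $k = 24$ are hypothetical choices for this example.

```python
import math

def squarefree_part(n):
    """Return the signed squarefree integer a with n = a * u^2 (n non-zero)."""
    sign = -1 if n < 0 else 1
    n = abs(n)
    a = 1
    p = 2
    while p * p <= n:
        exponent = 0
        while n % p == 0:
            n //= p
            exponent += 1
        if exponent % 2 == 1:
            a *= p
        p += 1
    return sign * a * n  # whatever is left of n is 1 or a single prime

def check_baby_chevalley_weil(k, bound):
    """Check a | k and b | k whenever x(x+k) is a non-zero square with |x| <= bound."""
    hits = 0
    for x in range(-bound, bound + 1):
        prod = x * (x + k)
        if prod <= 0 or math.isqrt(prod) ** 2 != prod:
            continue
        a, b = squarefree_part(x), squarefree_part(x + k)
        assert k % a == 0 and k % b == 0
        hits += 1
    return hits

if __name__ == "__main__":
    print(check_baby_chevalley_weil(24, 100))
```

For $k = 24$ the non-degenerate solutions in this range are $x \in \{1, 3, 8, 25, -25, -27, -32, -49\}$, and in each case the squarefree parts of $x$ and $x+24$ divide $24$.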

Remark 1 One can reformulate the above argument by using the $p$-adic valuations instead of the fundamental theorem of arithmetic. That formulation is more convenient when proving the Chevalley-Weil theorem in full generality.

Now we give a variant of the above theorem which is of relevance for our application. It will be convenient to phrase the variant in the nonstandard setting.

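Theorem 12 (Toddler case of Chevalley-Weil) Let $k$ be a non-zero standard algebraic number, and let $x, y$ be a pair of almost algebraic integers such that $y^2 = x^3 + k$. Then we can write $x + k^{1/3} = u_0^2$, $x + \omega k^{1/3} = u_1^2$, $x + \omega^2 k^{1/3} = u_2^2$, where $u_0, u_1, u_2$ are almost algebraic integers.

This is a case of the Chevalley-Weil principle with $V := \{(x,y): y^2 = x^3 + k\}$, $W := \{(u_0,u_1,u_2): u_0^2 - k^{1/3} = u_1^2 - \omega k^{1/3} = u_2^2 - \omega^2 k^{1/3}\}$, and covering map $(u_0,u_1,u_2) \mapsto (u_0^2 - k^{1/3}, u_0 u_1 u_2)$. The claim is then that any pre-image of an almost algebraic integer point is again an almost algebraic integer point.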

*Proof:* We use essentially the same argument as the previous theorem, but in the language of ideals rather than numbers. The case is easy, so we assume that . By choosing a suitably large standard number field , we may assume that are nonstandard algebraic integers in for some standard . As is well known, the ring of algebraic integers is a Dedekind domain, so that one has unique factorisation of ideals. In particular, the nonstandard principal ideal , which is an ideal of , factorises as a nonstandard product

for some distinct nonstandard prime ideals (not necessarily principal), a nonstandard natural number , and positive nonstandard natural numbers . Since , we conclude in particular that

If (say) and are both divisible by a prime ideal , then is also; but the latter ideal is standard, and so such are standard, and furthermore the number of such is bounded. Similarly for other pairs from . Thus for all other , the divide exactly one of an even number of times, and do not divide the other two ideals. This implies a factorisation of the form

where are nonstandard ideals and are standard ideals. From class field theory, the class group of is finite, which implies that any nonstandard ideal can be expressed as the product of a nonstandard principal ideal and a standard fractional ideal. Thus we may write

where are now standard fractional ideals. However, the ratio of two principal ideals is a principal fractional ideal, and thus , , for some standard . In other words, we have

for some nonstandard units . But from

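Theorem (Siegel’s theorem for the lifted Mordell curve) Let $k$ be a standard non-zero algebraic number, and let $\tilde E$ be the affine curve

$$ \tilde E := \{ (x,y,m): y^2 = x^3 + k;\ 2my = 3x^2 \}. $$

Then $\tilde E$ does not contain any strictly nonstandard almost algebraic integer point $(x_*,y_*,m_*)$.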

Indeed, Theorem 12 and the calculations from the previous section reveal that if the Mordell curve $E$ contains a strictly nonstandard almost algebraic integer point, then the lifted curve $\tilde E$ does also.

The argument from Theorem 7 can be adapted without much difficulty to show that $\tilde E$ does not contain any strictly nonstandard point in ${}^*{\bf Z}^3$. However to work with almost algebraic integers instead of nonstandard integers, we need to extend the subspace theorem to this setting, a topic to which we now turn.

** — 3. The subspace theorem in number fields — **

If $x$ is a nonstandard almost rational integer which is non-zero, then one clearly has $|x| \gg 1$; this is (up to the $o(1)$ exponent) the $d = 1$ case of the subspace theorem, Theorem 4.

Now suppose that $x \in {}^*K$ is a non-zero nonstandard almost algebraic integer for some standard number field $K$. Then it is no longer necessarily the case that $|x| \gg 1$; for instance, this would imply that

$$ |a + b\sqrt{2}| \gg 1 $$

for any nonstandard integers $a, b$ (not both zero), which contradicts the Dirichlet approximation theorem. However, if $K$ is a Galois extension of ${\bf Q}$, and $G := \mathrm{Gal}(K/{\bf Q})$ is the Galois group, then it is still true that

$$ \prod_{\sigma \in G} |\sigma(x)| \gg 1 $$

since $\prod_{\sigma \in G} \sigma(x)$ is a non-zero almost rational integer.
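To see concretely how a single conjugate can be tiny while the product over all conjugates stays bounded below, here is a minimal sketch in the field ${\bf Q}(\sqrt{2})$ (an example field chosen for this illustration). It generates pairs $(p,q)$ with $p^2 - 2q^2 = \pm 1$, so that $p - q\sqrt{2}$ is very small but its product with the conjugate $p + q\sqrt{2}$ has absolute value exactly $1$. The function names are hypothetical.

```python
import math

def pell_convergents(count):
    """First `count` pairs (p, q) with p^2 - 2 q^2 = +-1, starting from (1, 1)."""
    p, q = 1, 1
    pairs = []
    for _ in range(count):
        pairs.append((p, q))
        p, q = p + 2 * q, p + q      # multiply p + q*sqrt(2) by the unit 1 + sqrt(2)
    return pairs

def norm(p, q):
    """Field norm of p - q*sqrt(2): the product of its two Galois conjugates."""
    return p * p - 2 * q * q

if __name__ == "__main__":
    for p, q in pell_convergents(8):
        print(p, q, abs(p - q * math.sqrt(2)), norm(p, q))
```

The printed values of $|p - q\sqrt{2}|$ decay roughly like $1/q$, while the norm column alternates between $-1$ and $1$.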

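In a similar spirit, we claim the following generalisation of the subspace theorem to number fields. If $\alpha$ is an algebraic number of degree $D$, we write

$$ \|\alpha\|_\infty := \max(|\alpha_1|,\dots,|\alpha_D|) $$

for the norm at $\infty$, where $\alpha_1,\dots,\alpha_D$ are the Galois conjugates of $\alpha$, and for $x = (x_1,\dots,x_d)$, we write

$$ \|x\|_\infty := \max(\|x_1\|_\infty,\dots,\|x_d\|_\infty). $$

Clearly this norm is Galois-invariant.

Theorem 13 (Schmidt subspace theorem for number fields) Let $d$ be a standard natural number, and let $K$ be a standard Galois extension of ${\bf Q}$, with Galois group $G$. For each $\sigma \in G$, let $L_{1,\sigma},\dots,L_{d,\sigma}: \bar{{\bf Q}}^d \rightarrow \bar{{\bf Q}}$ be linearly independent standard linear forms. Let $x \in (K\,{}^*{\bf Z})^d$ be in $K$-general position. Then one has

$$ \prod_{\sigma \in G} \prod_{i=1}^d |L_{i,\sigma}(\sigma(x))| \geq \|x\|_\infty^{-o(1)} $$

where $\sigma$ acts on $K\,{}^*{\bf Z}$ (and hence on $(K\,{}^*{\bf Z})^d$) in the obvious fashion.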

Exercise 4 State a logically equivalent standard form of Theorem 13 (analogous to Theorem 1), and prove the logical equivalence.

Theorem 13 (in the logically equivalent standard form) was proven by Schlickewei (who also handled more general valuations than the complex absolute value $z \mapsto |z|$, and also allowed the coefficients of $x$ to be $S$-integers rather than algebraic integers). However, Theorem 13 can also be deduced using Theorem 1 as a black box, as we shall shortly demonstrate.

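Let us see how Theorem 13 implies the theorem on the lifted curve $\tilde E$ stated at the end of Section 2.

*Proof:* (of the lifted-curve theorem, assuming Theorem 13). This is a modification of the argument of Theorem 7. Suppose for contradiction that we can find a non-zero algebraic number $k$ such that the curve

$$ \tilde E := \{ (x,y,m): y^2 = x^3 + k;\ 2my = 3x^2 \} $$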

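contains a strictly nonstandard almost algebraic integer point $(x_*,y_*,m_*)$. We may take the coordinates $x_*, y_*, m_*$ to be almost $K$-integers for some standard number field $K$, which we may take to be Galois by enlarging $K$ as necessary. Let $G := \mathrm{Gal}(K/{\bf Q})$ be the Galois group.

For each $\sigma \in G$, the points $\sigma(x_*,y_*,m_*)$ are either bounded (i.e., $O(1)$) or unbounded. If $\sigma(x_*,y_*,m_*)$ is bounded for all $\sigma$ then the coefficients of $x_*, y_*, m_*$ with respect to a standard ${\bf Q}$-basis of $K$ are all bounded, so that $(x_*,y_*,m_*)$ is standard, a contradiction. Thus $\sigma(x_*,y_*,m_*)$ is unbounded for at least one $\sigma$.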

Let be a large integer to be chosen later (actually will do here), and consider the vector space of polynomials spanned by the monomials

and

This space thus has dimension . We claim that none of the non-zero polynomials in vanish identically on . To see this, we divide into two cases. If contains at least one monomial with a non-trivial power of , then we can write for some , where is a non-zero polynomial in of degree at most and has degree less than in . By inspecting the limits , , of , we see that if vanished on , then the non-zero polynomial of degree at most would have to vanish at all three points , a contradiction. Finally, if has no terms involving , then it is a polynomial in of degree at most in , and thus not divisible by , and the claim now follows from Bezout’s theorem.

We claim that for every , we may find a basis for with the property that

for some standard if is unbounded (which, as previously observed, must hold at least once). If this claim held, then on multiplying these bounds together and applying Theorem 13, we conclude that the tuple

is not in $K$-general position, which implies that $R(x_*,y_*,m_*) = 0$ for some non-zero $R \in V$. But $R$ does not vanish identically on the irreducible curve $\tilde E$, so by Bezout’s theorem $R$ can only vanish at standard points of $\tilde E$, a contradiction.

It remains to prove the claim. To simplify the notation we just handle the case when is the identity , as the other cases are treated similarly. If is bounded, then the claim is trivial (taking to be an arbitrary basis of , e.g. the monomial basis), so assume that is unbounded. From the geometry of , we see that this can only occur if either is unbounded and is infinitesimally close to zero, or else if is unbounded, is infinitesimally close to zero, and is infinitesimally close to one of , , or .

First suppose that is unbounded. Then as , one can perform a Puiseux series expansion of any polynomial in powers of , with the highest power being . As such, we may find a basis of for which

for , which gives (10) for large enough.

Now suppose that it is which is unbounded. Then as , , and converges to one of , one can perform a Laurent series expansion of any polynomial in in powers of , with the highest power being . Thus we may find a basis of for which

for , and again (10) follows for large enough. This concludes the proof of the theorem on the lifted curve $\tilde E$.

** — 4. Amplifying the subspace theorem — **

Now we show how Theorem 4 may be amplified to prove Theorem 13. Our arguments will use very little number theory, relying primarily on linear algebra and the nonstandard analysis framework. (As always, one can reformulate these arguments in the standard setting, but the linear algebra arguments become quite complicated, requiring the use of many rank reduction arguments, as per Section 2 of this previous blog post.)

Let $K, d, x$ be as in Theorem 13. The space $K^d$ is a $d[K:{\bf Q}]$-dimensional vector space over ${\bf Q}$, and so we may (non-canonically) identify $K^d$ with ${\bf Q}^{d[K:{\bf Q}]}$, and the $d$-tuples of almost $K$-integers with the $d[K:{\bf Q}]$-tuples of almost rational integers. In particular, $x$ can be viewed as a $d[K:{\bf Q}]$-tuple of almost rational integers. It would be helpful if we knew that this tuple was in ${\bf Q}$-general position, as one could then apply Theorem 4 directly to conclude. However, our hypothesis is only the weaker assertion that $x$ is in $K$-general position. Nevertheless, we may place $x$ in a certain “row echelon form” in terms of ${\bf Q}$-general position parameters, and we will then be able to deduce Theorem 13 from averaging together various applications of Theorem 4, with the precise applications to use being uncovered through linear algebra.

We turn to the details. Recall that if $V$ is an $n$-dimensional vector space over a field $k$, then a flag in $V$ is a nested sequence of spaces

$$ \{0\} = V_0 \subset V_1 \subset \dots \subset V_n = V $$

such that each $V_i$ is a $k$-linear subspace of dimension $i$.

Theorem 14 (Standard row echelon form)Let be a -dimensional field extension of a field , let be a -dimensional -vector space, and let be an -linear subspace of (viewed as an -dimensional vector space over ). Suppose also that the -dimension of is at least for every non-zero -linear functional and some . Then one may find a flagof -subspaces of , as well as natural numbers

such that has -dimension for .

*Proof:* The case is trivial, so assume inductively that and that the claim has already been proven for . Let be a non-zero -linear functional which minimises the -dimension of the space ; if we denote the dimension of this space by , then . We set , and let be the kernel of , then is a -hyperplane in . We claim that the -dimension of is at least for any non-zero , for if this were not the case, one could find an extension of to which annihilated a section of in , so that , contradicting the minimality of . We may thus apply the induction hypothesis to find a flag

and natural numbers

such that has -dimension for . The claim follows.

Corollary 15 Let be a degree extension of , and let be in -general position. Then one can find a -subspace of such that is an element of the module , where is an arbitrary -basis of . Furthermore, is a -generic point of , in the sense that for any -basis of , one has with in -general position. (Equivalently, does not lie in for any proper -subspace of .) Furthermore, one may find linearly independent -linear forms and natural numbers

such that for each , the space

This corollary is a sort of “regularity lemma” for points in ; this is perhaps the component of the argument which is the most difficult to convert into a standard setting.

*Proof:* Let denote the intersection of all the -subspaces of such that . Then is itself a -subspace of with ; one can think of as the -linear analogue of a “Zariski closure” of . If is a basis of , then we have for some ; if is not in -general position, then we have a non-trivial dependence for some , not all zero, and we may use this to place in for some proper -subspace of , a contradiction. Thus is a -generic point in .

Since is in -general position, is non-trivial for any -linear form . From Theorem 14 we may then find a flag

of -subspaces of , as well as natural numbers

such that has -dimension for . If, for each , we let be a -linear form that annihilates but does not annihilate , we obtain the claim.

Next, we need a classical lemma on intersection of flags.

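Lemma 16 (Schubert cell decomposition) Let

$$ \{0\} = V_0 \subset V_1 \subset \dots \subset V_n = V $$

and

$$ \{0\} = W_0 \subset W_1 \subset \dots \subset W_n = V $$

be two flags on the same $n$-dimensional vector space $V$ over a field $k$. Then one can find a basis $v_1,\dots,v_n$ of $V$ and a permutation $\sigma: \{1,\dots,n\} \rightarrow \{1,\dots,n\}$ such that $v_i \in V_i \cap W_{\sigma(i)}$ for all $i = 1,\dots,n$.

*Proof:* For any $1 \leq i \leq n$, the function $j \mapsto \dim(V_i \cap W_j) - \dim(V_{i-1} \cap W_j)$ is non-decreasing in $j$, is equal to $0$ when $j = 0$, and $1$ when $j = n$, thus there exists a unique $\sigma(i)$ such that this function equals $0$ for $j < \sigma(i)$ and $1$ for $j \geq \sigma(i)$. By inspecting the quantities $\dim(V_i \cap W_j) - \dim(V_i \cap W_{j-1})$ for a given $j$, which similarly increase from $0$ to $1$, we see that there is at most one $i$ with $\sigma(i) = j$, thus $\sigma$ is a permutation. If we then choose $v_i$ to be an element of $V_i \cap W_{\sigma(i)}$ that lies outside of $V_{i-1}$, we obtain the claim.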

Let be as in Theorem 13, and let and be as in Corollary 15, with being the degree of . By permuting the , we may assume that is non-decreasing in .

For each , we apply Lemma 16 to the flags

and

(with the span being over , and with the extended from -linear forms to -linear forms) to find a -basis of the -linear forms on and a permutation such that for each , is a -linear combination of , and is also a -linear combination of . From the latter claim, the non-decreasing nature of , and the triangle inequality we have

It will now suffice to show that

We will prove the inequalities

for all ; since the standard quantities are finite and non-increasing in , the claim (12) follows from taking a telescoping product of powers of (13).

To prove (13), we need a classical lemma concerning the canonical embedding of into via the Galois group action.

*Proof:* We give a nonstandard analysis proof. If this were not the case, then we would have coefficients for , not all zero, such that

for all , and hence also for all . Now let be nonstandard real numbers for such that , and such that no two are comparable. From the Dirichlet approximation theorem argument given at the start of this post, we can find a non-zero such that . On the other hand, the norm is a non-zero rational integer and so . Combining the two estimates, we conclude that for all . But as the are incomparable, this forces the to be linearly independent over , a contradiction.
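The step "the norm is a non-zero rational integer" is easy to see numerically in the simplest case. The sketch below is only an illustration under an assumption of ours: it takes the quadratic field generated by the square root of 2, with integers of the form a + b√2, rather than the general field of the proof. The function names are hypothetical.

```python
import math


def norm_sqrt2(a, b):
    """Norm of a + b*sqrt(2): the product of its two Galois conjugates."""
    return a * a - 2 * b * b


def min_abs_norm(bound):
    """Smallest |norm| over non-zero (a, b) with |a|, |b| <= bound."""
    return min(
        abs(norm_sqrt2(a, b))
        for a in range(-bound, bound + 1)
        for b in range(-bound, bound + 1)
        if (a, b) != (0, 0)
    )


smallest_norm = min_abs_norm(50)

# A very good rational approximation to sqrt(2): 99/70.
a, b = 99, 70
small_conjugate = abs(a - b * math.sqrt(2))
large_conjugate = abs(a + b * math.sqrt(2))
conjugate_product = small_conjugate * large_conjugate
```

Since the norm never vanishes on non-zero elements, one conjugate can only be small when the other is correspondingly large, which is the mechanism the proof exploits.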

Corollary 18 If is a -subspace of of dimension , then there exist distinct elements of such that the -linear functions for are -linearly independent on .

*Proof:* Let be a -basis for , which we complete to a -basis for . By Lemma 17, the vectors for span as a -linear space, and are thus -linearly independent. In particular, the matrix has full rank, and so has a minor which is non-singular, giving the claim.
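As a sanity check on the full-rank claim, one can compute the matrix of Galois conjugates of a basis in a small example. This is a minimal sketch under assumptions of ours: the field is taken to be the biquadratic field generated by √2 and √3, with basis 1, √2, √3, √6, and its four automorphisms are the sign changes of √2 and √3. The helpers `det` and `conjugate_matrix` are hypothetical names, and the arithmetic is floating point rather than exact.

```python
import math
from itertools import product


def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting (floats)."""
    m = [row[:] for row in matrix]
    n = len(m)
    result = 1.0
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(m[r][c]))
        if abs(m[pivot][c]) < 1e-12:
            return 0.0
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            result = -result  # a row swap flips the sign
        result *= m[c][c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            m[r] = [x - f * y for x, y in zip(m[r], m[c])]
    return result


def conjugate_matrix():
    """Rows: the images of (1, sqrt2, sqrt3, sqrt6) under each automorphism."""
    r2, r3 = math.sqrt(2), math.sqrt(3)
    rows = []
    for s, t in product((1, -1), repeat=2):
        # The automorphism sends sqrt2 -> s*sqrt2 and sqrt3 -> t*sqrt3.
        rows.append([1.0, s * r2, t * r3, s * t * r2 * r3])
    return rows


embedding_det = det(conjugate_matrix())
```

The determinant has absolute value 96, so the four conjugate vectors are linearly independent, and every square submatrix selection used in the corollary starts from a matrix of full rank.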

Now we can prove (13) for a given . For each , we apply Corollary 18 to the space defined in (11) to find distinct elements of such that the forms for are -linearly independent on ; applying an arbitrary element of , we also see that the forms for are -linearly independent on . We will show that

taking the geometric mean over all , we obtain (13).

It remains to prove (14). By construction of the , we know that the is a linear combination of the tuple

This tuple lies in , where is the image of under the map . By Corollary 15, is a -vector space of dimension , and is a -generic point of . By Theorem 4 (and the subsequent remarks), we will be able to conclude (14) as soon as we show that the expressions for and are -linearly independent. Suppose for contradiction that one has a non-trivial linear dependence

for some , not all zero. Let be the largest value of for which there is a non-zero value of . Each is a -linear combination of , with the coefficient of being non-zero (otherwise the could not be linearly independent in ). We conclude that there is a non-trivial -linear combination of the for , which is equal to a -linear combination of the for and . But as is a -generic point in , this can only happen if the corresponding linear combination of the forms and vanishes on (with the now interpreted as coordinate functions on ). This implies that the forms are -linearly dependent on , giving the required contradiction. This proves (14), and Theorem 13 follows.

** — 5. The flag structure of the magnitude of linear forms — **

Although we will not need this fact here, it is interesting to note that the subspace theorem (Theorem 4) gives rather precise control on the size of various linear combinations of algebraic numbers.

Theorem 19 (Schmidt subspace theorem, flag version) Let be a standard natural number, and let be in -general position. Then one has a standard flag in the space of linear forms , as well as standard real numbers

For instance, if are linearly independent over , then Theorem 19 asserts that

for all outside of a one-dimensional subspace of , and we have

for all non-zero and some . One can use Theorem 13 to obtain a similar description of the magnitude of -linear forms (and their Galois conjugates) of a point in -general position, but we will not do so here.

*Proof:* For each standard , the space

is a -linear subspace of , which is non-decreasing and right-continuous in , and equals all of when . As subspaces of must have an integer dimension between and , we may thus find a complete flag

and extended real numbers

such that whenever and . Selecting to be an element of for and applying Theorem 4, we obtain (15), which in particular implies that the are finite, at which point (16) follows from the definition of the . Finally, by taking the coordinate linear forms we see that for at least one , which forces .

Exercise 5 Try to state and prove a standard version of Theorem 19. (This is remarkably tricky – it takes the form of a “regularity lemma” in the spirit of Section 2 of this blog post – and may help to illustrate the power of the nonstandard formalism.)

## 4 comments

7 July, 2014 at 9:00 am

Francisco Javier Thayer

Re: Bezout. Referring to previous posts on Loeb measure, if then the probability that are relatively prime is . This is elementary (Hardy and Wright) and Loeb measure is merely a measure-theoretic interpretation of this. There are analogous results for square-free integers etc. I wonder if there is some way of doing something similar with polynomials?

15 July, 2014 at 4:19 pm

Colin Rust

As written, I think the proof of the theorem of Liouville in the second paragraph is only literally valid for algebraic integers, not general irrational algebraic numbers (in which case the product is not guaranteed to be a natural number).

[Corrected, thanks – T.]

12 September, 2017 at 2:12 am

a

Is there an analog of the subspace theorem for polynomial forms rather than just linear forms?

29 May, 2020 at 9:54 am

complexsesame

Yes there is, essentially done in Corvaja-Zannier’s paper “On general Thue’s equation”. And there are a lot of further generalizations to projective varieties. But they are mostly limited in that they somehow reduce to the linear case.