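Let $\bar{\mathbf{Q}}$ be the algebraic closure of $\mathbf{Q}$, that is to say the field of algebraic numbers. We fix an embedding of $\bar{\mathbf{Q}}$ into $\mathbf{C}$, giving rise to a complex absolute value $z \mapsto |z|$ for algebraic numbers $z \in \bar{\mathbf{Q}}$.
Let $\alpha \in \bar{\mathbf{Q}}$ be of degree $D > 1$, so that $\alpha$ is irrational. A classical theorem of Liouville gives the quantitative bound
$\displaystyle \left|\alpha - \frac{p}{q}\right| \geq \frac{c}{|q|^D} \ \ \ \ \ (1)$
on how poorly $\alpha$ is approximated by rational numbers $p/q$, where $c > 0$ depends on $\alpha$ but not on $p, q$. Indeed, if one lets $\alpha = \alpha_1, \alpha_2, \dots, \alpha_D$ be the Galois conjugates of $\alpha$, then the quantity $\prod_{i=1}^D |q \alpha_i - p|$ is a non-zero natural number divided by a constant, and so we have the trivial lower bound
$\displaystyle \prod_{i=1}^D |q \alpha_i - p| \geq c_0$
for some $c_0 > 0$ depending only on $\alpha$, from which the bound (1) easily follows. A well known corollary of the bound (1) is that Liouville numbers are automatically transcendental.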
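As a quick numerical sanity check of the Liouville bound, here is a minimal sketch for the specific case $\alpha = \sqrt{2}$ (so $D = 2$). The choice of $\alpha$, the constant $c = 1/(2\sqrt{2}+1)$ (obtained from $|2q^2 - p^2| \geq 1$) and the function name `liouville_margin` are illustrative assumptions, not taken from the text above.

```python
import math

def liouville_margin(q_max):
    """Return (c, worst), where worst is the minimum over 1 <= q <= q_max
    of q^2 * |sqrt(2) - p/q|, with p the numerator nearest to q * sqrt(2)."""
    alpha = math.sqrt(2)
    # |2 q^2 - p^2| >= 1 for integers p, q with q != 0, hence
    # |alpha - p/q| >= 1 / (q^2 * (alpha + p/q)) >= c / q^2 when |alpha - p/q| <= 1.
    c = 1 / (2 * alpha + 1)
    worst = float("inf")
    for q in range(1, q_max + 1):
        p = round(alpha * q)  # any other numerator is farther from alpha * q
        worst = min(worst, q * q * abs(alpha - p / q))
    return c, worst

c, worst = liouville_margin(500)
print(f"c = {c:.4f}, smallest q^2 * |sqrt(2) - p/q| found = {worst:.4f}")
```

The smallest value found stays above $c$, as the bound predicts for exponent $D = 2$.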
The famous theorem of Thue, Siegel and Roth improves the bound (1) to
$\displaystyle \left|\alpha - \frac{p}{q}\right| \geq \frac{c}{|q|^{2+\epsilon}} \ \ \ \ \ (2)$
for any $\epsilon > 0$ and rationals $\frac{p}{q}$, where $c > 0$ depends on $\alpha, \epsilon$ but not on $p, q$. Apart from the $\epsilon$ in the exponent and the constant $c$, this bound is optimal, as can be seen from Dirichlet’s theorem. This theorem is a good example of the ineffectivity phenomenon that affects a large portion of modern number theory: the constant $c$ in (2) is known to be positive, but there is no explicit lower bound for it in terms of the coefficients of the polynomial defining $\alpha$ (in contrast to (1), for which an effective bound may be easily established). This is ultimately due to the reliance on the “dueling conspiracy” (or “repulsion phenomenon”) strategy. We do not as yet have a good way to rule out one counterexample to (2), in which $\frac{p}{q}$ is far closer to $\alpha$ than $\frac{1}{|q|^{2+\epsilon}}$; however we can rule out two such counterexamples, by playing them off of each other.
A powerful strengthening of the Thue-Siegel-Roth theorem is given by the subspace theorem, first proven by Schmidt and then generalised further by several authors. To motivate the theorem, first observe that the Thue-Siegel-Roth theorem may be rephrased as a bound of the form
$\displaystyle |\alpha p - \beta q| \times |\alpha' p - \beta' q| \geq c (1 + |p| + |q|)^{-\epsilon} \ \ \ \ \ (3)$
for any algebraic numbers $\alpha, \beta, \alpha', \beta'$ with $(\alpha,\beta)$ and $(\alpha',\beta')$ linearly independent (over the algebraic numbers), and any $(p,q) \in \mathbf{Z}^2$ and $\epsilon > 0$, with the exception when $\alpha, \beta$ or $\alpha', \beta'$ are rationally dependent (i.e. one is a rational multiple of the other), in which case one has to remove some lines (i.e. subspaces in $\mathbf{Q}^2$) of rational slope from the space $\mathbf{Z}^2$ of pairs $(p,q)$ to which the bound (3) does not apply (namely, those lines for which the left-hand side vanishes). Here $c > 0$ can depend on $\alpha, \beta, \alpha', \beta', \epsilon$ but not on $p, q$. More generally, we have
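Theorem 1 (Schmidt subspace theorem) Let $d$ be a natural number. Let $L_1, \dots, L_d: \bar{\mathbf{Q}}^d \rightarrow \bar{\mathbf{Q}}$ be linearly independent linear forms. Then for any $\epsilon > 0$, one has the bound
$\displaystyle \prod_{i=1}^d |L_i(x)| \geq c (1 + \|x\|)^{-\epsilon}$
for all $x \in \mathbf{Z}^d$, outside of a finite number of proper subspaces of $\mathbf{Q}^d$, where
$\displaystyle \|(x_1,\dots,x_d)\| := \max(|x_1|, \dots, |x_d|)$
and $c > 0$ depends on $\epsilon, d$ and the $L_i$, but is independent of $x$.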
Being a generalisation of the Thue-Siegel-Roth theorem, it is unsurprising that the known proofs of the subspace theorem are also ineffective with regards to the constant $c$. (However, the number of exceptional subspaces may be bounded effectively; cf. the situation with the Skolem-Mahler-Lech theorem, discussed in this previous blog post.) Once again, the lower bound here is basically sharp except for the $\epsilon$ factor and the implied constant: given any $\delta_1, \dots, \delta_d > 0$ with $\delta_1 \dots \delta_d = 1$, a simple volume packing argument (the same one used to prove the Dirichlet approximation theorem) shows that for any sufficiently large $N \geq 1$, one can find integers $x_1, \dots, x_d \in [-N, N]$, not all zero, such that
$\displaystyle |L_i(x)| \ll \delta_i$
for all $i = 1, \dots, d$. Thus one can get $\prod_{i=1}^d |L_i(x)|$ comparable to $1$ in many different ways.
There are important generalisations of the subspace theorem to other number fields than the rationals (and to other valuations than the Archimedean valuation $z \mapsto |z|$); we will develop one such generalisation below.
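The packing argument mentioned above is easiest to see in its simplest instance, the Dirichlet approximation theorem itself. The following sketch uses the pigeonhole principle to find integers $p, q$ with $1 \leq q \leq N$ and $|q\alpha - p| < 1/N$; the function name `dirichlet_approx` and the floating-point treatment of $\alpha$ are assumptions made purely for illustration.

```python
import math

def dirichlet_approx(alpha, N):
    """Find integers (p, q) with 1 <= q <= N and |q*alpha - p| < 1/N.

    Pigeonhole: the N + 1 fractional parts of 0*alpha, ..., N*alpha fall into
    N bins of width 1/N, so two of them share a bin.
    """
    bins = {}
    for q in range(N + 1):
        frac = (q * alpha) % 1.0
        b = min(int(frac * N), N - 1)  # index of the bin containing frac
        if b in bins:
            dq = q - bins[b]           # difference of the two colliding indices
            p = round(dq * alpha)      # nearest integer to dq * alpha
            return p, dq
        bins[b] = q
    raise AssertionError("unreachable: N + 1 values cannot avoid N bins")

p, q = dirichlet_approx(math.sqrt(2), 100)
print(p, q, abs(q * math.sqrt(2) - p))
```

In the language of the paragraph above, this corresponds to $d = 2$ with the forms $q\alpha - p$ and $q$, and the weights $\delta_1 = 1/N$, $\delta_2 = N$.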
The subspace theorem is one of many finiteness theorems in Diophantine geometry; in this case, it is the number of exceptional subspaces which is finite. It turns out that finiteness theorems are very compatible with the language of nonstandard analysis. (See this previous blog post for a review of the basics of nonstandard analysis, and in particular for the nonstandard interpretation of asymptotic notation such as $O()$ and $o()$.) The reason for this is that a standard set $X$ is finite if and only if its nonstandard extension ${}^* X$ contains no strictly nonstandard elements (that is to say, elements of ${}^* X \setminus X$). This makes for a clean formulation of finiteness theorems in the nonstandard setting. For instance, the standard form of Bezout’s theorem asserts that if $P(x,y), Q(x,y)$ are coprime polynomials over some field, then the curves $\{(x,y): P(x,y) = 0\}$ and $\{(x,y): Q(x,y) = 0\}$ intersect in only finitely many points. The nonstandard version of this is then
Theorem 2 (Bezout’s theorem, nonstandard form) Let $P(x,y), Q(x,y)$ be standard coprime polynomials. Then there are no strictly nonstandard solutions to $P(x,y) = Q(x,y) = 0$.
Now we reformulate Theorem 1 in nonstandard language. We need a definition:
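Definition 3 (General position) Let $K \subset L$ be nested fields. A point $x = (x_1, \dots, x_d)$ in $L^d$ is said to be in $K$-general position if it is not contained in any hyperplane of $L^d$ definable over $K$, or equivalently if one has
$\displaystyle a_1 x_1 + \dots + a_d x_d = 0 \iff a_1 = \dots = a_d = 0$
for any $a_1, \dots, a_d \in K$.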
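Theorem 4 (Schmidt subspace theorem, nonstandard version) Let $d$ be a standard natural number. Let $L_1, \dots, L_d: \bar{\mathbf{Q}}^d \rightarrow \bar{\mathbf{Q}}$ be linearly independent standard linear forms. Let $x \in {}^* \mathbf{Z}^d$ be a tuple of nonstandard integers which is in $\mathbf{Q}$-general position (in particular, this forces $x$ to be strictly nonstandard). Then one has
$\displaystyle \prod_{i=1}^d |L_i(x)| \gg \|x\|^{-o(1)},$
where we extend $L_i$ from $\bar{\mathbf{Q}}^d$ to ${}^* \bar{\mathbf{Q}}^d$ (and also similarly extend $\| \cdot \|$ from $\mathbf{Z}^d$ to ${}^* \mathbf{Z}^d$) in the usual fashion.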
Observe that (as is usual when translating to nonstandard analysis) some of the epsilons and quantifiers that are present in the standard version become hidden in the nonstandard framework, being moved inside concepts such as “strictly nonstandard” or “general position”. We remark that as $x$ is in $\mathbf{Q}$-general position, it is also in $\bar{\mathbf{Q}}$-general position (as an easy Galois-theoretic argument shows), and the requirement that the $L_1, \dots, L_d$ are linearly independent is thus equivalent to $L_1(x), \dots, L_d(x)$ being $\bar{\mathbf{Q}}$-linearly independent.
Exercise 1 Verify that Theorem 1 and Theorem 4 are equivalent. (Hint: there are only countably many proper subspaces of $\mathbf{Q}^d$.)
We will not prove the subspace theorem here, but instead focus on a particular application of the subspace theorem, namely to counting integer points on curves. In this paper of Corvaja and Zannier, the subspace theorem was used to give a new proof of the following basic result of Siegel:
Theorem 5 (Siegel’s theorem on integer points) Let $P \in \mathbf{Q}[x,y]$ be an irreducible polynomial of two variables, such that the affine plane curve $C := \{(x,y): P(x,y) = 0\}$ either has genus at least one, or has at least three points on the line at infinity, or both. Then $C$ has only finitely many integer points $(x,y) \in \mathbf{Z}^2$.
This is a finiteness theorem, and as such may be easily converted to a nonstandard form:
Theorem 6 (Siegel’s theorem, nonstandard form) Let $P \in \mathbf{Q}[x,y]$ be a standard irreducible polynomial of two variables, such that the affine plane curve $C := \{(x,y): P(x,y) = 0\}$ either has genus at least one, or has at least three points on the line at infinity, or both. Then $C$ does not contain any strictly nonstandard integer points $(x_*, y_*) \in {}^* \mathbf{Z}^2 \setminus \mathbf{Z}^2$.
Note that Siegel’s theorem can fail for genus zero curves that meet the line at infinity at only one or two points; the key examples here are the graphs $\{(x,y): y = f(x)\}$ for a polynomial $f \in \mathbf{Z}[x]$, and the Pell equation curves $\{(x,y): x^2 - dy^2 = 1\}$. Siegel’s theorem can be compared with the more difficult theorem of Faltings, which establishes finiteness of rational points (not just integer points), but now needs the stricter requirement that the curve $C$ has genus at least two (to avoid the additional counterexample of elliptic curves of positive rank, which have infinitely many rational points).
The standard proofs of Siegel’s theorem rely on a combination of the Thue-Siegel-Roth theorem and a number of results on abelian varieties (notably the Mordell-Weil theorem). The Corvaja-Zannier argument rebalances the difficulty of the argument by replacing the Thue-Siegel-Roth theorem by the more powerful subspace theorem (in fact, they need one of the stronger versions of this theorem alluded to earlier), while greatly reducing the reliance on results on abelian varieties. Indeed, for curves with three or more points at infinity, no theory from abelian varieties is needed at all, while for the remaining cases, one mainly needs the existence of the Abel-Jacobi embedding, together with a relatively elementary theorem of Chevalley-Weil which is used in the proof of the Mordell-Weil theorem, but is significantly easier to prove.
The Corvaja-Zannier argument (together with several further applications of the subspace theorem) is presented nicely in this Bourbaki expose of Bilu. To establish the theorem in full generality requires a certain amount of algebraic number theory machinery, such as the theory of valuations on number fields, or of relative discriminants between such number fields. However, the basic ideas can be presented without much of this machinery by focusing on simple special cases of Siegel’s theorem. For instance, we can handle irreducible cubics that meet the line at infinity at exactly three points $[1, \alpha_1, 0], [1, \alpha_2, 0], [1, \alpha_3, 0]$:
Theorem 7 (Siegel’s theorem with three points at infinity) Siegel’s theorem holds when the irreducible polynomial $P(x,y)$ takes the form
$\displaystyle P(x,y) = (y - \alpha_1 x)(y - \alpha_2 x)(y - \alpha_3 x) + Q(x,y)$
for some quadratic polynomial $Q \in \mathbf{Q}[x,y]$ and some distinct algebraic numbers $\alpha_1, \alpha_2, \alpha_3$.
Proof: We use the nonstandard formalism. Suppose for sake of contradiction that we can find a strictly nonstandard integer point $(x_*, y_*) \in {}^* \mathbf{Z}^2 \setminus \mathbf{Z}^2$ on a curve $C := \{(x,y): P(x,y) = 0\}$ of the indicated form. As this point is infinitesimally close to the line at infinity, $y_*/x_*$ must be infinitesimally close to one of $\alpha_1, \alpha_2, \alpha_3$; without loss of generality we may assume that $y_*/x_*$ is infinitesimally close to $\alpha_1$.
We now use a version of the polynomial method, to find some polynomials of controlled degree that vanish to high order on the “arm” of the cubic curve that asymptotes to $[1, \alpha_1, 0]$. More precisely, let $D$ be a large integer (actually $D = 3$ will already suffice here), and consider the $\bar{\mathbf{Q}}$-vector space $V$ of polynomials $R(x,y) \in \bar{\mathbf{Q}}[x,y]$ of degree at most $D$, and of degree at most $2$ in the $y$ variable; this space has dimension $3D$. Also, as one traverses the arm $y/x \rightarrow \alpha_1$ of $C$, any polynomial $R$ in $V$ grows at a rate of at most $D$, that is to say $R$ has a pole of order at most $D$ at the point at infinity $[1, \alpha_1, 0]$. By performing Laurent expansions around this point (which is a non-singular point of $C$, as the $\alpha_i$ are assumed to be distinct), we may thus find a basis $R_1, \dots, R_{3D}$ of $V$, with the property that $R_j$ has a pole of order at most $D + 1 - j$ at $[1, \alpha_1, 0]$ for each $j = 1, \dots, 3D$.
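From the control of the pole at $[1, \alpha_1, 0]$, we have
$\displaystyle |R_j(x_*, y_*)| \ll (|x_*| + |y_*|)^{D+1-j}$
for all $j = 1, \dots, 3D$. The exponents here become negative for $j > D + 1$, and on multiplying them all together we see that
$\displaystyle \prod_{j=1}^{3D} |R_j(x_*, y_*)| \ll (|x_*| + |y_*|)^{3D(D+1) - \frac{3D(3D+1)}{2}}.$
This exponent equals $3D(1-D)/2$, which is negative for $D$ large enough (or just take $D = 3$). If we expand
$\displaystyle R_j(x,y) = \sum_{a+b \leq D; b \leq 2} \alpha_{j,a,b} x^a y^b$
for some algebraic numbers $\alpha_{j,a,b}$, then we thus have
$\displaystyle \prod_{j=1}^{3D} \Big| \sum_{a+b \leq D; b \leq 2} \alpha_{j,a,b} x_*^a y_*^b \Big| \ll (|x_*| + |y_*|)^{-\epsilon}$
for some standard $\epsilon > 0$. Note that the $3D$-dimensional vectors $(\alpha_{j,a,b})_{a+b \leq D; b \leq 2}$ are linearly independent in $\bar{\mathbf{Q}}^{3D}$, because the $R_j$ are linearly independent in $V$. Applying the Schmidt subspace theorem in the contrapositive, we conclude that the $3D$-tuple $(x_*^a y_*^b)_{a+b \leq D; b \leq 2}$ is not in $\mathbf{Q}$-general position. That is to say, one has a non-trivial constraint of the form
$\displaystyle \sum_{a+b \leq D; b \leq 2} c_{a,b} x_*^a y_*^b = 0 \ \ \ \ \ (4)$
for some standard rational coefficients $c_{a,b}$, not all zero. But, as $P$ is irreducible and cubic in $y$, it has no common factor with the standard polynomial $\sum_{a+b \leq D; b \leq 2} c_{a,b} x^a y^b$, so by Bezout’s theorem (Theorem 2) the constraint (4) only has standard solutions, contradicting the strictly nonstandard nature of $(x_*, y_*)$.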
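The bookkeeping in this proof (the dimension of the polynomial space and the sign of the total exponent) can be checked mechanically. The sketch below assumes the reading of the argument in which the space consists of polynomials of total degree at most $D$ and degree at most $2$ in $y$, and the $j$-th basis element has a pole of order at most $D + 1 - j$; the function names are hypothetical.

```python
def monomial_exponents(D):
    """Exponent pairs (a, b) of monomials x^a y^b with a + b <= D and b <= 2."""
    return [(a, b) for b in range(3) for a in range(D - b + 1)]

def total_exponent(D):
    """Sum over j = 1..dim of the assumed pole-order bounds D + 1 - j."""
    dim = len(monomial_exponents(D))
    return sum(D + 1 - j for j in range(1, dim + 1))

for D in range(2, 7):
    dim = len(monomial_exponents(D))
    print(D, dim, total_exponent(D))  # the dimension is 3D, the exponent 3D(1-D)/2
```

Under these assumptions the total exponent is already negative for $D = 2$, so taking $D = 3$ leaves some room to spare.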
Exercise 2 Rewrite the above argument so that it makes no reference to nonstandard analysis. (In this case, the rewriting is quite straightforward; however, there will be a subsequent argument in which the standard version is significantly messier than the nonstandard counterpart, which is the reason why I am working with the nonstandard formalism in this blog post.)
A similar argument works for higher degree curves that meet the line at infinity in three or more points, though if the curve has singularities at infinity then it becomes convenient to rely on the Riemann-Roch theorem to control the dimension of the analogue of the space $V$. Note that when there are only two or fewer points at infinity, though, one cannot get the negative exponent needed to usefully apply the subspace theorem. To deal with this case we require some additional tricks. For simplicity we focus on the case of Mordell curves, although it will be convenient to work with more general number fields $K$ than the rationals:
Theorem 8 (Siegel’s theorem for Mordell curves) Let $k$ be a non-zero integer. Then there are only finitely many integer solutions $(x,y) \in \mathbf{Z}^2$ to $y^2 - x^3 = k$. More generally, for any number field $K$, and any nonzero $k \in K$, there are only finitely many algebraic integer solutions $(x,y) \in \mathcal{O}_K^2$ to $y^2 - x^3 = k$, where $\mathcal{O}_K$ is the ring of algebraic integers in $K$.
Again, we will establish the nonstandard version. We need some additional notation:
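Definition 9
We define an almost rational integer to be a nonstandard $x \in {}^* \mathbf{Q}$ such that $Mx \in {}^* \mathbf{Z}$ for some standard positive integer $M$, and write $\mathbf{Q} {}^* \mathbf{Z}$ for the $\mathbf{Q}$-algebra of almost rational integers.
If $K$ is a standard number field, we define an almost $K$-integer to be a nonstandard $x \in {}^* K$ such that $Mx \in {}^* \mathcal{O}_K$ for some standard positive integer $M$, and write $K {}^* \mathcal{O}_K$ for the $K$-algebra of almost $K$-integers.
We define an almost algebraic integer to be a nonstandard $x \in {}^* \bar{\mathbf{Q}}$ such that $Mx$ is a nonstandard algebraic integer for some standard positive integer $M$, and write $\bar{\mathbf{Q}} {}^* \bar{\mathbf{Z}}$ for the $\bar{\mathbf{Q}}$-algebra of almost algebraic integers.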
Theorem 10 (Siegel for Mordell, nonstandard version) Let $k$ be a non-zero standard algebraic number. Then the curve $\{(x,y): y^2 - x^3 = k\}$ does not contain any strictly nonstandard almost algebraic integer point.
Another way of phrasing this theorem is that if $x, y$ are strictly nonstandard almost algebraic integers, then $y^2 - x^3$ is either strictly nonstandard or zero.
Exercise 3 Verify that Theorem 8 and Theorem 10 are equivalent.
Due to all the ineffectivity, our proof does not supply any bound on the solutions $x, y$ in terms of $k$, even if one removes all references to nonstandard analysis. It is a conjecture of Hall (a special case of the notorious ABC conjecture) that one has the bound $|x| \ll_\epsilon |k|^{2+\epsilon}$ for all $\epsilon > 0$ (or equivalently $|y| \ll_\epsilon |k|^{3+\epsilon}$), but even the weaker conjecture that $x, y$ are of polynomial size in $k$ is open. (The best known bounds are of exponential nature, and are proven using a version of Baker’s method: see for instance this text of Sprindzuk.)
A direct repetition of the arguments used to prove Theorem 7 will not work here, because the Mordell curve only hits the line at infinity at one point, $[0,1,0]$. To get around this we will exploit the fact that the Mordell curve is an elliptic curve and thus has a group law on it. We will then divide all the integer points on this curve by two; as elliptic curves have four 2-torsion points, this will end up placing us in a situation like Theorem 7, with four points at infinity. There is an obstruction, however: it is not obvious that dividing an integer point on the Mordell curve by two will produce another integer point. Fortunately, this is essentially true (after enlarging the ring of integers slightly) thanks to a general principle of Chevalley and Weil, which can be worked out explicitly in the case of division by two on Mordell curves by relatively elementary means (relying mostly on unique factorisation of ideals of algebraic integers). We give the details below the fold.
— 1. Dividing by two on the Mordell curve —
This section will be elementary (in the sense that the arguments may be phrased in the first-order language of rings), and so it will make no difference whether we are working in the standard or nonstandard formalism. As such, we make no reference to nonstandard analysis here.
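As is well known, any elliptic curve $E$ (after adjoining a point $O$ at infinity) has a group law $+$, with $P + Q + R = O$ whenever $P, Q, R$ are distinct collinear points on $E$, or $2P + Q = O$ if the tangent line at $P$ meets $E$ at another point $Q$. In the case of the Mordell curve $C := \{(x,y): y^2 = x^3 + k\}$, the double $(X,Y) = 2(x,y)$ of a point $(x,y)$ is given by the formulae
$\displaystyle X = s^2 - 2x; \quad Y = s(x - X) - y; \quad s := \frac{3x^2}{2y}$
as long as $y \neq 0$. (For $y = 0$, the double will be the point at infinity.) Geometrically, $s$ is the slope of the tangent line to $C$ at $(x,y)$.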
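The doubling formulae can be checked in exact rational arithmetic. The sketch below assumes the sign convention $y^2 = x^3 + k$ for the Mordell curve and the standard tangent-line formulae $s = 3x^2/(2y)$, $X = s^2 - 2x$, $Y = s(x - X) - y$; the function name `double_point` is hypothetical.

```python
from fractions import Fraction

def double_point(x, y, k):
    """Double the point (x, y) on y^2 = x^3 + k (requires y != 0).

    Returns (X, Y, s), where s is the slope of the tangent line at (x, y).
    """
    x, y = Fraction(x), Fraction(y)
    assert y * y == x**3 + k, "point is not on the curve"
    s = 3 * x * x / (2 * y)  # slope of the tangent line at (x, y)
    X = s * s - 2 * x        # x-coordinate of the third intersection point
    Y = s * (x - X) - y      # reflect that point in the x-axis
    return X, Y, s

X, Y, s = double_point(3, 5, -2)   # (3, 5) lies on y^2 = x^3 - 2
print(X, Y, s)
print(Y * Y == X**3 - 2)           # the double lies on the same curve
```

Note how doubling an integer point introduces denominators; this is why the inversion problem (halving a point while staying close to the integers) requires some care.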
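Over the complex numbers, the constraint $y \neq 0$ removes three points $(-k^{1/3}, 0), (-\omega k^{1/3}, 0), (-\omega^2 k^{1/3}, 0)$ from $C$, where $\omega := e^{2\pi i/3}$ is the standard cube root of unity. This thrice punctured curve is no longer an affine plane curve, but one can make it affine again by incorporating $s$ as another coordinate, giving rise to the lifted curve
$\displaystyle C' := \{ (x,y,s): y^2 = x^3 + k; \ 2ys = 3x^2 \}$
and the doubling map lifts to the polynomial map $\pi: C' \rightarrow C$ defined by
$\displaystyle \pi(x,y,s) := (s^2 - 2x, s(3x - s^2) - y).$
Over the complex numbers, it is well known that the elliptic curve is isomorphic as a group to the torus $(\mathbf{R}/\mathbf{Z})^2$, which implies that the map $\pi$ is four-to-one. Now we consider the inversion problem explicitly: given a point $(X,Y) \in C$, find a point $(x,y,s)$ such that
$\displaystyle 2ys = 3x^2 \ \ \ \ \ (5)$
and
$\displaystyle X = s^2 - 2x; \quad Y = s(x - X) - y.$
We would also like to avoid using division as much as possible, since in our application $(X,Y)$ will be an integer point, and we would like $x, y, s$ to be “close” to being integers too (in a sense to be made precise later). We can rearrange the above two equations as
$\displaystyle x = \frac{s^2 - X}{2} \ \ \ \ \ (6)$
$\displaystyle y = s(x - X) - Y \ \ \ \ \ (7)$
and so we see that to find $(x,y,s)$, it suffices to find $s$, and furthermore if $X, Y, s$ are “close” to integers, then $x, y$ are close to integers also (up to a single division by two). One can also check from a routine computation that if (6), (7), (5) hold, then we have $y^2 - x^3 = Y^2 - X^3 = k$, so that $(x,y,s)$ indeed lies in $C'$.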
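If one inserts (6), (7) into (5), one arrives at the quartic equation
$\displaystyle s^4 - 6 X s^2 - 8 Y s - 3 X^2 = 0. \ \ \ \ \ (8)$
One could solve this equation by the general quartic formula, but actually the formula simplifies substantially for this specific quartic equation, and can be obtained as follows. The Mordell equation can be rewritten as
$\displaystyle Y^2 = (X + k^{1/3}) (X + \omega k^{1/3}) (X + \omega^2 k^{1/3}).$
We can take square roots and then write
$\displaystyle Y = z_0 z_1 z_2$
where
$\displaystyle z_j := \sqrt{X + \omega^j k^{1/3}}.$
As each $z_j$ is determined up to sign, and $Y$ is fixed, there are exactly four choices for $(z_0, z_1, z_2)$ here (even when $Y$ vanishes). For any such choice, if one sets
$\displaystyle s := z_0 + z_1 + z_2 \ \ \ \ \ (9)$
then on squaring and rearranging we have
$\displaystyle s^2 - 3X = 2 (z_0 z_1 + z_1 z_2 + z_2 z_0)$
and on squaring again
$\displaystyle (s^2 - 3X)^2 = 4 (z_0^2 z_1^2 + z_1^2 z_2^2 + z_2^2 z_0^2) + 8 Y s;$
since
$\displaystyle z_0^2 z_1^2 + z_1^2 z_2^2 + z_2^2 z_0^2 = 3 X^2$
one soon sees that (8) holds. Thus we have found the four solutions to the equation $2(x,y) = (X,Y)$ by choosing the square roots $z_j$ of $X + \omega^j k^{1/3}$, and then defining $s$ by (9) and $x, y$ by (6), (7). In particular, if $X, Y, k$ are (rational) integers, then $z_0, z_1, z_2, s$ are algebraic integers, and $x, y$ are algebraic integers divided by two.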
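The halving recipe can be tested numerically over the complex numbers. The following is a sketch under stated assumptions: the curve is taken to be $y^2 = x^3 + k$, the quartic (8) is taken to be $s^4 - 6Xs^2 - 8Ys - 3X^2 = 0$ (which is what one gets by substituting (6), (7) into $2ys = 3x^2$), and $s$ is taken to be the sum of the three square roots $z_j$ whose product is $Y$. The function name `halve_points` is hypothetical, and the code requires $Y \neq 0$.

```python
import cmath

def halve_points(X, Y, k):
    """Return four triples (x, y, s) with 2(x, y) = (X, Y) on y^2 = x^3 + k.

    Requires Y != 0. Floating-point complex arithmetic, so results are approximate.
    """
    omega = cmath.exp(2j * cmath.pi / 3)
    root = complex(k) ** (1 / 3)                 # one fixed cube root of k
    w = [X + omega**j * root for j in range(3)]  # X^3 + k = w[0] * w[1] * w[2]
    solutions = []
    for sign0 in (1, -1):
        for sign1 in (1, -1):
            z0 = sign0 * cmath.sqrt(w[0])
            z1 = sign1 * cmath.sqrt(w[1])
            z2 = Y / (z0 * z1)       # forces z0*z1*z2 = Y, and then z2^2 = w[2]
            s = z0 + z1 + z2
            x = (s * s - X) / 2      # equation (6)
            y = s * (x - X) - Y      # equation (7)
            solutions.append((x, y, s))
    return solutions

for x, y, s in halve_points(2, 3, 1):   # (2, 3) lies on y^2 = x^3 + 1
    print(abs(y * y - x**3 - 1), abs(2 * y * s - 3 * x * x))
```

Both printed residuals should be at the level of rounding error, confirming that each of the four choices of signs gives a point of the lifted curve lying above $(X,Y)$.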
— 2. The Chevalley-Weil principle —
The Chevalley-Weil principle asserts, roughly speaking, that if one has a “finite cover” of one affine variety by another
, with the varieties and covering maps defined over the rationals, then every integer point in
lifts to a “near”-integer point of
. (There is also a version for projective varieties, which I will not discuss here; one can also generalise integers and rationals to other number fields.) To make the notion of “finite cover” precise, one needs the notion of an etale morphism; to make the notion of “near”-integer precise, we can use the notion of almost algebraic integer. We will not formalise the principle in full generality here; see for instance this text of Lang, this text of Hindry-Silverman, this text of Serre, or this text of Bombieri and Gubler, for a precise statement.
Here, we will focus on simple special cases of the Chevalley-Weil principle, in which one can work “by hand” using quite classical methods from algebraic number theory. We begin with a very elementary example, which in some sense goes all the way back to Diophantus himself:
Theorem 11 (Baby case of Chevalley-Weil) Let $k$ be a non-zero integer, and let $x, y \in \mathbf{Z}$ be such that $y^2 = x(x+k)$. Then $x, x+k$ are “nearly perfect squares” in the sense that one has $x = a u^2$, $x + k = a v^2$ for some integers $a, u, v$ with $a$ dividing $k$.
To interpret this result as a special case of the Chevalley-Weil principle, we take to be the affine plane curve
,
to be the affine curve
, with covering map
; thus
is a double cover of
, and the theorem is asserting that integer points in
are covered by near-integer points in
(up to square roots of factors of
).
By transfer, the above theorem also applies in the setting where are non-standard integers and
remains standard; in that case,
now become nonstandard also, but
remain standard. In particular,
and
become almost algebraic integers.
Proof: We can assume that $y$, and hence $x$ and $x+k$, are non-zero, since the $y = 0$ case is easily verified. We use the fundamental theorem of arithmetic to factor $y$ into primes $y = \pm p_1^{n_1} \dots p_r^{n_r}$ for some distinct primes $p_1, \dots, p_r$ and positive natural numbers $n_1, \dots, n_r$, thus
$\displaystyle x (x+k) = p_1^{2n_1} \dots p_r^{2n_r}.$
On the other hand, the greatest common divisor of $x$ and $x+k$ is a divisor of $k$. From this, we see that for each prime $p_i$ not dividing $k$, $p_i$ will divide one of $x, x+k$ exactly $2n_i$ times, and not divide the other, whereas for primes $p_i$ dividing $k$, $p_i$ will either divide each of $x, x+k$ an even number of times, or an odd number of times. Collecting terms (including the units $\pm 1$ in the prime factorisations of $x, x+k$) to form $a, u, v$, the claim follows.
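A brute-force check makes the statement concrete. The sketch below assumes that the theorem concerns integer solutions of $y^2 = x(x+k)$ and asserts $x = a u^2$, $x + k = a v^2$ with $a$ dividing $k$ (this reading is consistent with the proof's use of the greatest common divisor of the two factors); the helper names are hypothetical.

```python
import math

def is_square(n):
    """True if n is a perfect square (negative numbers are not)."""
    return n >= 0 and math.isqrt(n) ** 2 == n

def near_square_witness(x, k):
    """Return (a, u, v) with x = a*u^2, x + k = a*v^2 and a | k, or None."""
    for d in range(1, abs(k) + 1):
        if k % d:
            continue                      # d is not a divisor of k
        for a in (d, -d):                 # allow the unit -1 as well
            if x % a == 0 and (x + k) % a == 0:
                m, n = x // a, (x + k) // a
                if is_square(m) and is_square(n):
                    return a, math.isqrt(m), math.isqrt(n)
    return None

# Every integer x for which x * (x + k) is a perfect square should have a witness.
for k in (5, 6, -3, 12):
    for x in range(-50, 51):
        if is_square(x * (x + k)):
            print(k, x, near_square_witness(x, k))
```

For example, with $k = 6$ and $x = 2$ one has $x(x+k) = 16$, and the witness is $a = 2$, $u = 1$, $v = 2$.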
Remark 1 One can reformulate the above argument by using the
-adic valuations
instead of the fundamental theorem of arithmetic. That formulation is more convenient when proving the Chevalley-Weil theorem in full generality.
Now we give a variant of the above theorem which is of relevance for our application. It will be convenient to phrase the variant in the nonstandard setting.
Theorem 12 (Toddler case of Chevalley-Weil) Let
be a non-zero standard algebraic number, and let
be a pair of almost algebraic integers such that
. Then we can write
,
,
, where
are almost algebraic integers.
This is a case of the Chevalley-Weil principle with ,
, and covering map
. The claim is then that any pre-image of an almost algebraic integer point is again an almost algebraic integer point.
Proof: We use essentially the same argument as the previous theorem, but in the language of ideals rather than numbers. The case is easy, so we assume that
. By choosing a suitably large standard number field
, we may assume that
are nonstandard algebraic integers in
for some standard
. As is well known, the ring of algebraic integers
is a Dedekind domain, so that one has unique factorisation of ideals. In particular, the nonstandard principal ideal
, which is an ideal of
, factorises as a nonstandard product
for some distinct nonstandard prime ideals (not necessarily principal), a nonstandard natural number
, and positive nonstandard natural numbers
. Since
, we conclude in particular that
If (say) and
are both divisible by a prime ideal
, then
is also; but the latter ideal is standard, and so such
are standard, and furthermore the number of such
is bounded. Similarly for other pairs from
. Thus for all other
, the
divide exactly one of
an even number of times, and do not divide the other two ideals. This implies a factorisation of the form
where are nonstandard ideals and
are standard ideals. From basic algebraic number theory, the class group of $K$ is finite, which implies that any nonstandard ideal can be expressed as the product of a nonstandard principal ideal and a standard fractional ideal. Thus we may write
where are now standard fractional ideals. However, the ratio of two principal ideals is a principal fractional ideal, and thus
,
,
for some standard
. In other words, we have
for some nonstandard units . But from
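Theorem (Lifted Mordell curve) Let $k$ be a standard non-zero algebraic number, and let $C'$ be the affine curve
$\displaystyle C' := \{ (x,y,s): y^2 = x^3 + k; \ 2ys = 3x^2 \}.$
Then $C'$ does not contain any strictly nonstandard almost algebraic integer point $(x,y,s)$.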
Indeed, Theorem 12 and the calculations from the previous section reveal that if the Mordell curve $C$ contains a strictly nonstandard almost algebraic integer point, then the lifted curve $C'$ does also.
The argument from Theorem 7 can be adapted without much difficulty to show that $C'$ does not contain any strictly nonstandard point in ${}^* \mathbf{Z}^3$. However to work with almost algebraic integers instead of nonstandard integers, we need to extend the subspace theorem to this setting, a topic to which we now turn.
— 3. The subspace theorem in number fields —
If $x$ is a nonstandard almost rational integer which is non-zero, then one clearly has $|x| \gg 1$; this is (up to the $o(1)$ exponent) the $d = 1$ case of the subspace theorem, Theorem 4.
Now suppose that is a non-zero nonstandard almost algebraic integer for some standard number field
. Then it is no longer necessarily the case that
; for instance, this would imply that
for any nonstandard integers , which contradicts the Dirichlet approximation theorem. However, if
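$K$ is a Galois extension of $\mathbf{Q}$, and $G := \mathrm{Gal}(K/\mathbf{Q})$ is the Galois group, then it is still true that
$\displaystyle \prod_{\sigma \in G} |\sigma(x)| \gg 1$
since $\prod_{\sigma \in G} \sigma(x)$ is a non-zero almost rational integer.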
In a similar spirit, we claim the following generalisation of the subspace theorem to number fields. If is an algebraic number of degree
, we write
for the norm at , where
are the Galois conjugates of
, and for
, we write
Clearly this norm is Galois-invariant.
Theorem 13 (Schmidt subspace theorem for number fields) Let
be a standard natural number, and let
be a standard Galois extension of
, with Galois group
. For each
, let
be linearly independent standard linear forms. Let
be in
-general position. Then one has
where
acts on
(and hence on
) in the obvious fashion.
Exercise 4 State a logically equivalent standard form of Theorem 13 (analogous to Theorem 1), and prove the logical equivalence.
Theorem 13 (in the logically equivalent standard form) was proven by Schlickewei (who also handled more general valuations than the complex absolute value , and also allowed the coefficients of
to be
-integers rather than algebraic integers). However, Theorem 13 can also be deduced from using Theorem 1 as a black box, as we shall shortly demonstrate.
Let us see how Theorem 13 implies the theorem on the lifted curve $C'$ stated at the end of Section 2.
Proof: (of the lifted curve theorem, assuming Theorem 13). This is a modification of the argument of Theorem 7. Suppose for contradiction that we can find a non-zero standard algebraic number $k$ such that the lifted curve $C'$
contains a strictly nonstandard almost algebraic integer point . We may place
inside
for some number field
, which we may take to be Galois by enlarging
as necessary. Let
be the Galois group.
For each , the points
are either bounded (i.e.,
) or unbounded. If
is bounded for all
then the coefficients of
are all bounded, so that
is standard, a contradiction. Thus
is unbounded for at least one
.
Let be a large integer to be chosen later (actually
will do here), and consider the vector space
of polynomials
spanned by the monomials
and
This space thus has dimension . We claim that none of the non-zero polynomials
in
vanish identically on
. To see this, we divide into two cases. If
contains at least one monomial with a non-trivial power of
, then we can write
for some
, where
is a non-zero polynomial in
of degree at most
and
has degree less than
in
. By inspecting the limits
,
,
of
, we see that if
vanished on
, then the non-zero polynomial
of degree at most
would have to vanish at all three points
, a contradiction. Finally, if
has no terms involving
, then it is a polynomial in
of degree at most
in
, and thus not divisible by
, and the claim now follows from Bezout’s theorem.
We claim that for every , we may find a basis
for
with the property that
for some standard if
is unbounded (which, as previously observed, must hold at least once). If this claim held, then on multiplying these bounds together and applying Theorem 13, we conclude that the tuple
is not in -general position, which implies that
for some non-zero
. But
does not vanish identically on the irreducible curve
, so by Bezout’s theorem
can only vanish at standard points of
, a contradiction.
It remains to prove the claim. To simplify the notation we just handle the case when is the identity
, as the other cases are treated similarly. If
is bounded, then the claim is trivial (taking
to be an arbitrary basis of
, e.g. the monomial basis), so assume that
is unbounded. From the geometry of
, we see that this can only occur if either
is unbounded and
is infinitesimally close to zero, or else if
is unbounded,
is infinitesimally close to zero, and
is infinitesimally close to one of
,
, or
.
First suppose that is unbounded. Then as
, one can perform a Puiseux series expansion of any polynomial
in powers of
, with the highest power being
. As such, we may find a basis
of
for which
for , which gives (10) for
large enough.
Now suppose that it is which is unbounded. Then as
,
, and
converges to one of
, one can perform a Laurent series expansion of any polynomial in
in powers of
, with the highest power being
. Thus we may find a basis
of
for which
for , and again (10) follows for
large enough. This concludes the proof of the lifted curve theorem (assuming Theorem 13).
— 4. Amplifying the subspace theorem —
Now we show how Theorem 4 may be amplified to prove Theorem 13. Our arguments will use very little number theory, relying primarily on linear algebra and the nonstandard analysis framework. (As always, one can reformulate these arguments in the standard setting, but the linear algebra arguments become quite complicated, requiring the use of many rank reduction arguments, as per Section 2 of this previous blog post.)
Let be as in Theorem 13. The space
is a
-dimensional vector space over
, and so we may (non-canonically) identify
with
, and
with
. In particular,
can be viewed as a
-tuple of almost rational integers. It would be helpful if we knew that this tuple was in
-general position, as one could then apply Theorem 4 directly to conclude. However, our hypothesis is only the weaker assertion that
is in
-general position. Nevertheless, we may place
in a certain “row echelon form” in terms of
-general position parameters, and we will then be able to deduce Theorem 13 from averaging together various applications of Theorem 4, with the precise applications to use being uncovered through linear algebra.
We turn to the details. Recall that if is an
-dimensional vector space over a field
, then a flag in
is a nested sequence of spaces
such that each is a
-linear subspace of dimension
.
Theorem 14 (Standard row echelon form) Let
be a
-dimensional field extension of a field
, let
be a
-dimensional
-vector space, and let
be an
-linear subspace of
(viewed as an
-dimensional vector space over
). Suppose also that the
-dimension of
is at least
for every non-zero
-linear functional
and some
. Then one may find a flag
of
-subspaces of
, as well as natural numbers
such that
has
-dimension
for
.
Proof: The case is trivial, so assume inductively that
and that the claim has already been proven for
. Let
be a non-zero
-linear functional which minimises the
-dimension of the space
; if we denote the dimension of this space by
, then
. We set
, and let
be the kernel of
, then
is a
-hyperplane in
. We claim that the
-dimension of
is at least
for any non-zero
, for if this were not the case, one could find an extension
of
to
which annihilated a section of
in
, so that
, contradicting the minimality of
. We may thus apply the induction hypothesis to find a flag
and natural numbers
such that has
-dimension
for
. The claim follows.
Corollary 15 Let
be a degree
extension of
, and let
be in
-general position. Then one can find a
-subspace
of
such that
is an element of the
module
, where
is an arbitrary
-basis of
. Furthermore,
is a
-generic point of
, in the sense that for any
-basis
of
, one has
with
in
-general position. (Equivalently,
does not lie in
for any proper
-subspace
of
.)
Furthermore, one may find linearly independent
-linear forms
and natural numbers
such that for each
, the space
(which is a
-subspace of
) has
-dimension
.
This corollary is a sort of “regularity lemma” for points in ; this is perhaps the component of the argument which is the most difficult to convert into a standard setting.
Proof: Let denote the intersection of all the
-subspaces
of
such that
. Then
is itself a
-subspace of
with
; one can think of
as the
-linear analogue of a “Zariski closure” of
. If
is a basis of
, then we have
for some
; if
is not in
-general position, then we have a non-trivial dependence
for some
, not all zero, and we may use this to place
in
for some proper
-subspace
of
, a contradiction. Thus
is a
-generic point in
.
Since is in
-general position,
is non-trivial for any
-linear form
. From Theorem 14 we may then find a flag
of -subspaces of
, as well as natural numbers
such that has
-dimension
for
. If, for each
, we let
be a
-linear form that annihilates
but does not annihilate
, we obtain the claim.
Next, we need a classical lemma on intersection of flags.
Lemma 16 (Schubert cell decomposition) Let
and
be two flags on the same
-dimensional vector space
over a field
. Then one can find a basis
of
and a permutation
such that
for all
.
Proof: For any , the function
is non-decreasing in
, is equal to
when
, and
when
, thus there exists a unique
such that this function equals
for
and
for
. By inspecting the quantities
for a given
, which similarly increase from
to
, we see that there is at most one
with
, thus
is a permutation. If we then choose
to be an element of
that lies outside of
, we obtain the claim.
Let be as in Theorem 13, and let
and
be as in Corollary 15, with
being the degree of
. By permuting the
, we may assume that
is non-decreasing in
.
For each , we apply Lemma 16 to the flags
and
(with the span being over , and with the
extended from
-linear forms to
-linear forms) we may find a
-basis
of the
-linear forms
on
and a permutation
such that for each
,
is a
-linear combination of
, and is also a
-linear combination of
. From the latter claim, the non-decreasing nature of
, and the triangle inequality we have
It will now suffice to show that
We will prove the inequalities
for all ; since the standard quantities
are finite and non-increasing in
, the claim (12) follows from taking a telescoping product of powers of (13).
To prove (13), we need a classical lemma concerning the canonical embedding of into
via the Galois group action.
Proof: We give a nonstandard analysis proof. If this were not the case, then we would have coefficients for
, not all zero, such that
for all , and hence also for all
. Now let
be nonstandard real numbers for
such that
, and such that no two
are comparable. From the Dirichlet approximation theorem argument given at the start of this post, we can find a non-zero
such that
. On the other hand, the norm
is a non-zero rational integer and so
. Combining the two estimates, we conclude that
for all
. But as the
are incomparable, this forces the
to be linearly independent over
, a contradiction.
Corollary 18 If
is a
-subspace of
of dimension
, then there exist distinct elements
of
such that the
-linear functions
for
are
-linearly independent on
.
Proof: Let be a
-basis for
, which we complete to a
-basis
for
. By Lemma 17, the vectors
for
span
as a
-linear space, and are thus
-linearly independent. In particular, the matrix
has full rank, and so has a minor
which is non-singular, giving the claim.
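As a sanity check on Lemma 17 and Corollary 18, here is a small numerical sketch in one example. Everything specific in it is our assumption rather than part of the argument above: we take the field to be Q(√2, √3) with basis 1, √2, √3, √6, we encode its four automorphisms by sign pairs (s, t), we read Lemma 17 (as it is used in the proof of Corollary 18) as saying that the matrix of conjugates of a basis is non-singular, and we work in floating point rather than exactly. The helper names `embed_row` and `det` are illustrative only.

```python
from itertools import combinations, product
from math import sqrt

# Automorphisms of Q(sqrt2, sqrt3), encoded by signs (s, t):
# sqrt2 -> s*sqrt2 and sqrt3 -> t*sqrt3, hence sqrt6 -> s*t*sqrt6.
AUTOMORPHISMS = list(product((1, -1), repeat=2))


def embed_row(s, t):
    """Images of the basis (1, sqrt2, sqrt3, sqrt6) under the automorphism (s, t)."""
    return [1.0, s * sqrt(2), t * sqrt(3), s * t * sqrt(6)]


def det(m):
    """Determinant by Laplace expansion along the first row (fine for tiny matrices)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for col, entry in enumerate(m[0]):
        minor = [row[:col] + row[col + 1:] for row in m[1:]]
        total += (-1) ** col * entry * det(minor)
    return total


# Full 4x4 matrix of conjugates: it should be non-singular.
full = [embed_row(s, t) for s, t in AUTOMORPHISMS]
full_det = det(full)

# Subspace spanned by 1 and sqrt6 (columns 0 and 3): look for pairs of
# automorphisms whose restrictions to this subspace are linearly independent.
columns = (0, 3)
good_pairs = [
    pair
    for pair in combinations(AUTOMORPHISMS, 2)
    if abs(det([[embed_row(*g)[c] for c in columns] for g in pair])) > 1e-9
]

print(abs(full_det))
print(good_pairs)
```

In this example the full determinant has absolute value 96 (up to rounding), and four of the six pairs of automorphisms work for the two-dimensional subspace. The pair (1, 1), (−1, −1) fails because both automorphisms fix √6, which illustrates that Corollary 18 only asserts the existence of a good choice, not that every choice is good.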
Now we can prove (13) for a given . For each
, we apply Corollary 18 to the space
defined in (11) to find distinct elements
of
such that the forms
for
are
-linearly independent on
; applying an arbitrary element
of
, we also see that the forms
for
are
-linearly independent on
. We will show that
taking the geometric mean over all , we obtain (13).
It remains to prove (14). By construction of the , we know that the
is a linear combination of the tuple
This tuple lies in , where
is the image of
under the map
. By Corollary 15,
is a
-vector space of dimension
, and
is a
-generic point of
. By Theorem 4 (and the subsequent remarks), we will be able to conclude (14) as soon as we show that the
expressions
for
and
are
-linearly independent. Suppose for contradiction that one has a non-trivial linear dependence
for some , not all zero. Let
be the largest value of
for which there is a non-zero value of
. Each
is a
-linear combination of
, with the coefficient of
being non-zero (otherwise the
could not be linearly independent in
). We conclude that there is a non-trivial
-linear combination of the
for
, which is equal to a
-linear combination of the
for
and
. But as
is a
-generic point in
, this can only happen if the corresponding linear combination of the forms
and
vanishes on
(with the
now interpreted as coordinate functions on
). This implies that the forms
are
-linearly dependent on
, giving the required contradiction. This proves (14), and Theorem 13 follows.
— 5. The flag structure of the magnitude of linear forms —
Although we will not need this fact here, it is interesting to note that the subspace theorem (Theorem 4) gives rather precise control on the size of various linear combinations of algebraic numbers.
Theorem 19 (Schmidt subspace theorem, flag version) Let
be a standard natural number, and let
be in
-general position. Then one has a standard flag
in the space
of linear forms
, as well as standard real numbers
whenever
and
.
For instance, if are linearly independent over
, then Theorem 19 asserts that
for all outside of a one-dimensional subspace
of
, and we have
for all non-zero and some
. One can use Theorem 13 to obtain a similar description of the magnitude of
-linear forms (and their Galois conjugates) of a point
in
-general position, but we will not do so here.
Proof: For each standard , the space
is a -linear subspace of
, which is non-decreasing and right-continuous in
, and equals all of
when
. As subspaces of
must have an integer dimension between
and
, we may thus find a complete flag
and extended real numbers
such that whenever
and
. Selecting
to be an element of
for
and applying Theorem 4, we obtain (15), which in particular implies that the
are finite, at which point (16) follows from the definition of the
. Finally, by taking the coordinate linear forms we see that
for at least one
, which forces
.
Exercise 5 Try to state and prove a standard version of Theorem 19. (This is remarkably tricky – it takes the form of a “regularity lemma” in the spirit of Section 2 of this blog post – and may help to illustrate the power of the nonstandard formalism.)
4 comments
7 July, 2014 at 9:00 am
Francisco Javier Thayer
Re: Bezout. Referring to previous posts on Loeb measure, if
then the probability that
are relatively prime is
. This is elementary (Hardy and Wright) and Loeb measure is merely a measure-theoretic interpretation of this. There are analogous results for square-free integers etc. I wonder if there is some way of doing something similar with polynomials?
15 July, 2014 at 4:19 pm
Colin Rust
As written, I think the proof of the theorem of Liouville in the second paragraph is only literally valid for algebraic integers not general irrational algebraic numbers (in which case the product is not guaranteed to be a natural number).
[Corrected, thanks – T.]
12 September, 2017 at 2:12 am
a
Is there an analog of subspace theorem for polynomial forms rather than just linear forms?
29 May, 2020 at 9:54 am
complexsesame
Yes, there is; this was essentially done in Corvaja-Zannier’s paper “On general Thue’s equation”. There are also many further generalizations to projective varieties, but they are mostly limited in that they somehow reduce to the linear case.