Let {\bar{{\bf Q}}} be the algebraic closure of {{\bf Q}}, that is to say the field of algebraic numbers. We fix an embedding of {\bar{{\bf Q}}} into {{\bf C}}, giving rise to a complex absolute value {z \mapsto |z|} for algebraic numbers {z \in \bar{{\bf Q}}}.

Let {\alpha \in \bar{{\bf Q}}} be of degree {D > 1}, so that {\alpha} is irrational. A classical theorem of Liouville gives the quantitative bound

\displaystyle  |\alpha - \frac{p}{q}| \geq c \frac{1}{|q|^D} \ \ \ \ \ (1)

measuring the extent to which {\alpha} fails to be approximated by rational numbers {p/q}, where {c>0} depends on {\alpha,D} but not on {p,q}. Indeed, if one lets {\alpha = \alpha_1, \alpha_2, \dots, \alpha_D} be the Galois conjugates of {\alpha}, then the quantity {\prod_{i=1}^D |q \alpha_i - p|} is a non-zero natural number divided by a constant, and so we have the trivial lower bound

\displaystyle  \prod_{i=1}^D |q \alpha_i - p| \geq c

from which the bound (1) easily follows. A well known corollary of the bound (1) is that Liouville numbers are automatically transcendental.
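As a sanity check on (1) in the simplest case, here is a minimal Python sketch for {\alpha = \sqrt{2}} (so {D=2}). The constant {c = 1/(2\sqrt{2}+1)} is an assumption of this sketch, obtained by running the norm argument above with {|2q^2 - p^2| \geq 1}; it is not a value taken from the text, and the function names are illustrative only.

```python
import math

ALPHA = math.sqrt(2)               # alpha = sqrt(2), degree D = 2
C = 1 / (2 * math.sqrt(2) + 1)     # assumed constant, from |2q^2 - p^2| >= 1


def norm(p: int, q: int) -> int:
    """|q*alpha - p| * |q*alpha' - p| = |2q^2 - p^2|, where alpha' = -sqrt(2)."""
    return abs(2 * q * q - p * p)


def worst_scaled_error(max_q: int) -> float:
    """Minimum of q^2 * |alpha - p/q| over 1 <= q <= max_q, p the nearest integer."""
    worst = math.inf
    for q in range(1, max_q + 1):
        p = round(q * ALPHA)       # best numerator for this denominator
        assert norm(p, q) >= 1     # the "trivial lower bound" on the product
        worst = min(worst, q * q * abs(ALPHA - p / q))
    return worst


if __name__ == "__main__":
    print(worst_scaled_error(2000), ">=", C)
```

The check uses floating point, which is adequate here only because the denominators are small; it illustrates the bound rather than proving it.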

The famous theorem of Thue, Siegel and Roth improves the bound (1) to

\displaystyle  |\alpha - \frac{p}{q}| \geq c \frac{1}{|q|^{2+\epsilon}} \ \ \ \ \ (2)

for any {\epsilon>0} and rationals {\frac{p}{q}}, where {c>0} depends on {\alpha,\epsilon} but not on {p,q}. Apart from the {\epsilon} in the exponent and the value of the constant {c}, this bound is optimal, as can be seen from Dirichlet’s theorem. This theorem is a good example of the ineffectivity phenomenon that affects a large portion of modern number theory: the constant {c} is known to be positive, but there is no explicit lower bound for it in terms of the coefficients of the polynomial defining {\alpha} (in contrast to (1), for which an effective bound may be easily established). This is ultimately due to the reliance on the “dueling conspiracy” (or “repulsion phenomenon”) strategy. We do not as yet have a good way to rule out one counterexample to (2), in which {\frac{p}{q}} is far closer to {\alpha} than {\frac{1}{|q|^{2+\epsilon}}}; however we can rule out two such counterexamples, by playing them off of each other.
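To see concretely why the exponent {2} cannot be lowered, one can look at the continued fraction convergents of {\sqrt{2}}, which realise the Dirichlet bound {|\alpha - \frac{p}{q}| < \frac{1}{q^2}}. The following Python sketch is only an illustration for this one number; the recurrence {(p,q) \mapsto (p+2q, p+q)} and the 60-digit working precision are choices made for this example.

```python
from decimal import Decimal, getcontext

getcontext().prec = 60             # enough digits for the denominators used here
SQRT2 = Decimal(2).sqrt()


def convergents(n: int):
    """First n convergents p/q of sqrt(2): (1,1), (3,2), (7,5), (17,12), ..."""
    p, q = 1, 1
    out = []
    for _ in range(n):
        out.append((p, q))
        p, q = p + 2 * q, p + q    # standard recurrence for sqrt(2)
    return out


def scaled_errors(n: int):
    """q^2 * |sqrt(2) - p/q| for each convergent; these should stay below 1."""
    return [q * q * abs(SQRT2 - Decimal(p) / Decimal(q))
            for p, q in convergents(n)]


if __name__ == "__main__":
    for (p, q), err in zip(convergents(10), scaled_errors(10)):
        print(p, q, err)
```

The scaled errors stay bounded away from both {0} and {1} along this sequence, so no bound of the form {c/|q|^{2-\epsilon}} can hold for {\sqrt{2}}.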

A powerful strengthening of the Thue-Siegel-Roth theorem is given by the subspace theorem, first proven by Schmidt and then generalised further by several authors. To motivate the theorem, first observe that the Thue-Siegel-Roth theorem may be rephrased as a bound of the form

\displaystyle  | \alpha p - \beta q | \times | \alpha' p - \beta' q | \geq c (1 + |p| + |q|)^{-\epsilon} \ \ \ \ \ (3)

for any algebraic numbers {\alpha,\beta,\alpha',\beta'} with {(\alpha,\beta)} and {(\alpha',\beta')} linearly independent (over the algebraic numbers), and any {(p,q) \in {\bf Z}^2} and {\epsilon>0}, with the exception when {\alpha,\beta} or {\alpha',\beta'} are rationally dependent (i.e. one is a rational multiple of the other), in which case one has to remove some lines (i.e. subspaces in {{\bf Q}^2}) of rational slope from the space {{\bf Z}^2} of pairs {(p,q)} to which the bound (3) does not apply (namely, those lines for which the left-hand side vanishes). Here {c>0} can depend on {\alpha,\beta,\alpha',\beta',\epsilon} but not on {p,q}. More generally, we have

Theorem 1 (Schmidt subspace theorem) Let {d} be a natural number. Let {L_1,\dots,L_d: \bar{{\bf Q}}^d \rightarrow \bar{{\bf Q}}} be linearly independent linear forms. Then for any {\epsilon>0}, one has the bound

\displaystyle  \prod_{i=1}^d |L_i(x)| \geq c (1 + \|x\| )^{-\epsilon}

for all {x \in {\bf Z}^d}, outside of a finite number of proper subspaces of {{\bf Q}^d}, where

\displaystyle  \| (x_1,\dots,x_d) \| := \max( |x_1|, \dots, |x_d| )

and {c>0} depends on {\epsilon, d} and the coefficients {\alpha_{i,j}} of the linear forms {L_i}, but is independent of {x}.

Being a generalisation of the Thue-Siegel-Roth theorem, it is unsurprising that the known proofs of the subspace theorem are also ineffective with regard to the constant {c}. (However, the number of exceptional subspaces may be bounded effectively; cf. the situation with the Skolem-Mahler-Lech theorem, discussed in this previous blog post.) Once again, the lower bound here is basically sharp except for the {\epsilon} factor and the constant {c}: given any {\delta_1,\dots,\delta_d > 0} with {\delta_1 \dots \delta_d = 1}, a simple volume packing argument (the same one used to prove the Dirichlet approximation theorem) shows that for any sufficiently large {N \geq 1}, one can find integers {x_1,\dots,x_d \in [-N,N]}, not all zero, such that

\displaystyle  |L_i(x)| \ll \delta_i

for all {i=1,\dots,d}. Thus one can get {\prod_{i=1}^d |L_i(x)|} comparable to {1} in many different ways.

There are important generalisations of the subspace theorem to other number fields than the rationals (and to other valuations than the Archimedean valuation {z \mapsto |z|}); we will develop one such generalisation below.

The subspace theorem is one of many finiteness theorems in Diophantine geometry; in this case, it is the number of exceptional subspaces which is finite. It turns out that finiteness theorems are very compatible with the language of nonstandard analysis. (See this previous blog post for a review of the basics of nonstandard analysis, and in particular for the nonstandard interpretation of asymptotic notation such as {\ll} and {o()}.) The reason for this is that a standard set {X} is finite if and only if it contains no strictly nonstandard elements (that is to say, elements of {{}^* X \backslash X}). This makes for a clean formulation of finiteness theorems in the nonstandard setting. For instance, the standard form of Bezout’s theorem asserts that if {P(x,y), Q(x,y)} are coprime polynomials over some field, then the curves {\{ (x,y): P(x,y) = 0\}} and {\{ (x,y): Q(x,y)=0\}} intersect in only finitely many points. The nonstandard version of this is then

Theorem 2 (Bezout’s theorem, nonstandard form) Let {P(x,y), Q(x,y)} be standard coprime polynomials. Then there are no strictly nonstandard solutions to {P(x,y)=Q(x,y)=0}.

Now we reformulate Theorem 1 in nonstandard language. We need a definition:

Definition 3 (General position) Let {K \subset L} be nested fields. A point {x = (x_1,\dots,x_d)} in {L^d} is said to be in {K}-general position if it is not contained in any hyperplane of {L^d} definable over {K}, or equivalently if one has

\displaystyle  a_1 x_1 + \dots + a_d x_d = 0 \iff a_1=\dots = a_d = 0

for any {a_1,\dots,a_d \in K}.

Theorem 4 (Schmidt subspace theorem, nonstandard version) Let {d} be a standard natural number. Let {L_1,\dots,L_d: \bar{{\bf Q}}^d \rightarrow \bar{{\bf Q}}} be linearly independent standard linear forms. Let {x \in {}^* {\bf Z}^d} be a tuple of nonstandard integers which is in {{\bf Q}}-general position (in particular, this forces {x} to be strictly nonstandard). Then one has

\displaystyle  \prod_{i=1}^d |L_i(x)| \gg \|x\|^{-o(1)},

where we extend {L_i} from {\bar{{\bf Q}}} to {{}^* \bar{{\bf Q}}} (and also similarly extend {\| \|} from {{\bf Z}^d} to {{}^* {\bf Z}^d}) in the usual fashion.

Observe that (as is usual when translating to nonstandard analysis) some of the epsilons and quantifiers that are present in the standard version become hidden in the nonstandard framework, being moved inside concepts such as “strictly nonstandard” or “general position”. We remark that as {x} is in {{\bf Q}}-general position, it is also in {\bar{{\bf Q}}}-general position (as an easy Galois-theoretic argument shows), and the requirement that the {L_1,\dots,L_d} are linearly independent is thus equivalent to {L_1(x),\dots,L_d(x)} being {\bar{{\bf Q}}}-linearly independent.

Exercise 1 Verify that Theorem 1 and Theorem 4 are equivalent. (Hint: there are only countably many proper subspaces of {{\bf Q}^d}.)

We will not prove the subspace theorem here, but instead focus on a particular application of the subspace theorem, namely to counting integer points on curves. In this paper of Corvaja and Zannier, the subspace theorem was used to give a new proof of the following basic result of Siegel:

Theorem 5 (Siegel’s theorem on integer points) Let {P \in {\bf Q}[x,y]} be an irreducible polynomial of two variables, such that the affine plane curve {C := \{ (x,y): P(x,y)=0\}} either has genus at least one, or has at least three points on the line at infinity, or both. Then {C} has only finitely many integer points {(x,y) \in {\bf Z}^2}.

This is a finiteness theorem, and as such may be easily converted to a nonstandard form:

Theorem 6 (Siegel’s theorem, nonstandard form) Let {P \in {\bf Q}[x,y]} be a standard irreducible polynomial of two variables, such that the affine plane curve {C := \{ (x,y): P(x,y)=0\}} either has genus at least one, or has at least three points on the line at infinity, or both. Then {C} does not contain any strictly nonstandard integer points {(x_*,y_*) \in {}^* {\bf Z}^2 \backslash {\bf Z}^2}.

Note that Siegel’s theorem can fail for genus zero curves that meet the line at infinity at only one or two points; the key examples here are the graphs {\{ (x,y): y - f(x) = 0\}} for a polynomial {f \in {\bf Z}[x]}, and the Pell equation curves {\{ (x,y): x^2 - dy^2 = 1 \}}. Siegel’s theorem can be compared with the more difficult theorem of Faltings, which establishes finiteness of rational points (not just integer points), but now needs the stricter requirement that the curve {C} has genus at least two (to avoid the additional counterexample of elliptic curves of positive rank, which have infinitely many rational points).
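The Pell counterexample can be made concrete with a short Python sketch. It assumes a fundamental solution {(x_1,y_1)} is supplied by the caller (for {d=2} one can take {(3,2)}); the function name is illustrative only.

```python
def pell_solutions(d: int, x1: int, y1: int, count: int):
    """Generate `count` solutions of x^2 - d*y^2 = 1 from a given solution."""
    assert x1 * x1 - d * y1 * y1 == 1      # (x1, y1) must itself be a solution
    x, y = x1, y1
    out = []
    for _ in range(count):
        out.append((x, y))
        # Multiply x + y*sqrt(d) by x1 + y1*sqrt(d); the norm stays equal to 1.
        x, y = x1 * x + d * y1 * y, x1 * y + y1 * x
    return out


if __name__ == "__main__":
    for x, y in pell_solutions(2, 3, 2, 6):
        print(x, y, x * x - 2 * y * y)
```

Each step produces a strictly larger integer point on {x^2 - 2y^2 = 1}, so this genus zero curve with two points at infinity has infinitely many integer points.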

The standard proofs of Siegel’s theorem rely on a combination of the Thue-Siegel-Roth theorem and a number of results on abelian varieties (notably the Mordell-Weil theorem). The Corvaja-Zannier argument rebalances the difficulty of the argument by replacing the Thue-Siegel-Roth theorem by the more powerful subspace theorem (in fact, they need one of the stronger versions of this theorem alluded to earlier), while greatly reducing the reliance on results on abelian varieties. Indeed, for curves with three or more points at infinity, no theory from abelian varieties is needed at all, while for the remaining cases, one mainly needs the existence of the Abel-Jacobi embedding, together with a relatively elementary theorem of Chevalley-Weil which is used in the proof of the Mordell-Weil theorem, but is significantly easier to prove.

The Corvaja-Zannier argument (together with several further applications of the subspace theorem) is presented nicely in this Bourbaki expose of Bilu. To establish the theorem in full generality requires a certain amount of algebraic number theory machinery, such as the theory of valuations on number fields, or of relative discriminants between such number fields. However, the basic ideas can be presented without much of this machinery by focusing on simple special cases of Siegel’s theorem. For instance, we can handle irreducible cubics that meet the line at infinity at exactly three points {[1,\alpha_1,0], [1,\alpha_2,0], [1,\alpha_3,0]}:

Theorem 7 (Siegel’s theorem with three points at infinity) Siegel’s theorem holds when the irreducible polynomial {P(x,y)} takes the form

\displaystyle  P(x,y) = (y - \alpha_1 x) (y - \alpha_2 x) (y - \alpha_3 x) + Q(x,y)

for some quadratic polynomial {Q \in {\bf Q}[x,y]} and some distinct algebraic numbers {\alpha_1,\alpha_2,\alpha_3}.

Proof: We use the nonstandard formalism. Suppose for sake of contradiction that we can find a strictly nonstandard integer point {(x_*,y_*) \in {}^* {\bf Z}^2 \backslash {\bf Z}^2} on a curve {C := \{ (x,y): P(x,y)=0\}} of the indicated form. As this point is infinitesimally close to the line at infinity, {y_*/x_*} must be infinitesimally close to one of {\alpha_1,\alpha_2,\alpha_3}; without loss of generality we may assume that {y_*/x_*} is infinitesimally close to {\alpha_1}.

We now use a version of the polynomial method, to find some polynomials of controlled degree that vanish to high order on the “arm” of the cubic curve {C} that asymptotes to {[1,\alpha_1,0]}. More precisely, let {D \geq 3} be a large integer (actually {D=3} will already suffice here), and consider the {\bar{{\bf Q}}}-vector space {V} of polynomials {R(x,y) \in \bar{{\bf Q}}[x,y]} of degree at most {D}, and of degree at most {2} in the {y} variable; this space has dimension {3D}. Also, as one traverses the arm {y/x \rightarrow \alpha_1} of {C}, any polynomial {R} in {V} grows at a rate of at most {D}, that is to say {R} has a pole of order at most {D} at the point at infinity {[1,\alpha_1,0]}. By performing Laurent expansions around this point (which is a non-singular point of {C}, as the {\alpha_i} are assumed to be distinct), we may thus find a basis {R_1, \dots, R_{3D}} of {V}, with the property that {R_j} has a pole of order at most {D+1-j} at {[1,\alpha_1,0]} for each {j=1,\dots,3D}.

From the control of the pole at {[1,\alpha_1,0]}, we have

\displaystyle  |R_j(x_*,y_*)| \ll (|x_*|+|y_*|)^{D+1-j}

for all {j=1,\dots,3D}. The exponents here become negative for {j > D+1}, and on multiplying them all together we see that

\displaystyle  \prod_{j=1}^{3D} |R_j(x_*,y_*)| \ll (|x_*|+|y_*|)^{3D(D+1) - \frac{3D(3D+1)}{2}}.

This exponent is negative for {D} large enough (or just take {D=3}). If we expand

\displaystyle  R_j(x_*,y_*) = \sum_{a+b \leq D; b \leq 2} \alpha_{j,a,b} x_*^a y_*^b

for some algebraic numbers {\alpha_{j,a,b}}, then we thus have

\displaystyle  \prod_{j=1}^{3D} |\sum_{a+b \leq D; b \leq 2} \alpha_{j,a,b} x_*^a y_*^b| \ll (|x_*|+|y_*|)^{-\epsilon}

for some standard {\epsilon>0}. Note that the {3D}-dimensional vectors {(\alpha_{j,a,b})_{a+b \leq D; b \leq 2}} are linearly independent in {{\bf C}^{3D}}, because the {R_j} are linearly independent in {V}. Applying the Schmidt subspace theorem in the contrapositive, we conclude that the {3D}-tuple {( x_*^a y_*^b )_{a+b \leq D; b \leq 2} \in {}^* {\bf Z}^{3D}} is not in {{\bf Q}}-general position. That is to say, one has a non-trivial constraint of the form

\displaystyle  \sum_{a+b \leq D; b \leq 2} c_{a,b} x_*^a y_*^b = 0 \ \ \ \ \ (4)

for some standard rational coefficients {c_{a,b}}, not all zero. But, as {P} is irreducible and cubic in {y}, it has no common factor with the standard polynomial {\sum_{a+b \leq D; b \leq 2} c_{a,b} x^a y^b}, so by Bezout’s theorem (Theorem 2) the constraint (4) only has standard solutions, contradicting the strictly nonstandard nature of {(x_*,y_*)}. \Box
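The two pieces of bookkeeping in the proof of Theorem 7, namely that {V} has dimension {3D} and that the total exponent {3D(D+1) - \frac{3D(3D+1)}{2}} is negative, can be double-checked with a small Python sketch. The function names are hypothetical and the code only counts monomials and sums exponents; it does not construct the basis {R_1,\dots,R_{3D}}.

```python
def monomial_exponents(D: int):
    """Exponent pairs (a, b) with a + b <= D and b <= 2, indexing a basis of V."""
    return [(a, b) for b in range(3) for a in range(D - b + 1)]


def total_pole_exponent(D: int) -> int:
    """Sum of the exponents D + 1 - j over j = 1, ..., 3D."""
    return sum(D + 1 - j for j in range(1, 3 * D + 1))


if __name__ == "__main__":
    for D in range(3, 8):
        print(D, len(monomial_exponents(D)), total_pole_exponent(D))
```

The sum simplifies to {3D(1-D)/2}, which is negative as soon as {D > 1}; for {D=3} it equals {-9}.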

Exercise 2 Rewrite the above argument so that it makes no reference to nonstandard analysis. (In this case, the rewriting is quite straightforward; however, there will be a subsequent argument in which the standard version is significantly messier than the nonstandard counterpart, which is the reason why I am working with the nonstandard formalism in this blog post.)

A similar argument works for higher degree curves that meet the line at infinity in three or more points, though if the curve has singularities at infinity then it becomes convenient to rely on the Riemann-Roch theorem to control the dimension of the analogue of the space {V}. Note that when there are only two or fewer points at infinity, though, one cannot get the negative exponent of {-\epsilon} needed to usefully apply the subspace theorem. To deal with this case we require some additional tricks. For simplicity we focus on the case of Mordell curves, although it will be convenient to work with more general number fields {{\bf Q} \subset K \subset \bar{{\bf Q}}} than the rationals:

Theorem 8 (Siegel’s theorem for Mordell curves) Let {k} be a non-zero integer. Then there are only finitely many integer solutions {(x,y) \in {\bf Z}^2} to {y^2 - x^3 = k}. More generally, for any number field {K}, and any nonzero {k \in K}, there are only finitely many algebraic integer solutions {(x,y) \in {\mathcal O}_K^2} to {y^2-x^3=k}, where {{\mathcal O}_K} is the ring of algebraic integers in {K}.
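For small {k} one can look for integer points on {y^2 - x^3 = k} by brute force. The Python sketch below is only an experiment over a finite window {|x| \leq} `x_bound` (a parameter introduced for this example); a finite search of this kind says nothing about finiteness, which is the content of Theorem 8.

```python
import math


def mordell_points(k: int, x_bound: int):
    """Integer points (x, y) on y^2 - x^3 = k with |x| <= x_bound."""
    points = []
    for x in range(-x_bound, x_bound + 1):
        rhs = x ** 3 + k           # we need y^2 = x^3 + k
        if rhs < 0:
            continue
        y = math.isqrt(rhs)        # exact integer square root
        if y * y == rhs:
            points.append((x, y))
            if y != 0:
                points.append((x, -y))
    return points


if __name__ == "__main__":
    print(mordell_points(-2, 1000))
    print(mordell_points(17, 100))
```

For {k=-2} the search in this window returns only {(3,\pm 5)}, while for {k=17} it already finds several points, including {(43, \pm 282)} and {(52, \pm 375)}.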

Again, we will establish the nonstandard version. We need some additional notation:

Definition 9

  • We define an almost rational integer to be a nonstandard {x \in {}^* {\bf Q}} such that {Mx \in {}^* {\bf Z}} for some standard positive integer {M}, and write {{\bf Q} {}^* {\bf Z}} for the {{\bf Q}}-algebra of almost rational integers.
  • If {K} is a standard number field, we define an almost {K}-integer to be a nonstandard {x \in {}^* K} such that {Mx \in {}^* {\mathcal O}_K} for some standard positive integer {M}, and write {K {}^* {\bf Z} = K {}^* {\mathcal O}_K} for the {K}-algebra of almost {K}-integers.
  • We define an almost algebraic integer to be a nonstandard {x \in {}^* \bar{{\bf Q}}} such that {Mx} is a nonstandard algebraic integer for some standard positive integer {M}, and write {\bar{{\bf Q}} {}^* {\bf Z}} for the {\bar{{\bf Q}}}-algebra of almost algebraic integers.

Theorem 10 (Siegel for Mordell, nonstandard version) Let {k} be a non-zero standard algebraic number. Then the curve {\{ (x,y): y^2 - x^3 = k \}} does not contain any strictly nonstandard almost algebraic integer point.

Another way of phrasing this theorem is that if {x,y} are strictly nonstandard almost algebraic integers, then {y^2-x^3} is either strictly nonstandard or zero.

Exercise 3 Verify that Theorem 8 and Theorem 10 are equivalent.

Due to all the ineffectivity, our proof does not supply any bound on the solutions {x,y} in terms of {k}, even if one removes all references to nonstandard analysis. It is a conjecture of Hall (a special case of the notorious ABC conjecture) that one has the bound {|x| \ll_\epsilon |k|^{2+\epsilon}} for all {\epsilon>0} (or equivalently {|y| \ll_\epsilon |k|^{3+\epsilon}}), but even the weaker conjecture that {x,y} are of polynomial size in {k} is open. (The best known bounds are of exponential nature, and are proven using a version of Baker’s method: see for instance this text of Sprindzuk.)

A direct repetition of the arguments used to prove Theorem 7 will not work here, because the Mordell curve {\{ (x,y): y^2 - x^3 = k \}} only hits the line at infinity at one point, {[0,1,0]}. To get around this we will exploit the fact that the Mordell curve is an elliptic curve and thus has a group law on it. We will then divide all the integer points on this curve by two; as elliptic curves have four 2-torsion points, this will end up placing us in a situation like Theorem 7, with four points at infinity. However, there is an obstruction: it is not obvious that dividing an integer point on the Mordell curve by two will produce another integer point. Fortunately, this is essentially true (after enlarging the ring of integers slightly) thanks to a general principle of Chevalley and Weil, which can be worked out explicitly in the case of division by two on Mordell curves by relatively elementary means (relying mostly on unique factorisation of ideals of algebraic integers). We give the details below the fold.
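The group law on the Mordell curve can be illustrated by the duplication map, which is the inverse of the division by two mentioned above. The following Python sketch uses the usual tangent-line formula for {y^2 = x^3 + k}; it is only an illustration of doubling over the rationals and does not implement the halving step or the Chevalley-Weil argument. The function names are hypothetical.

```python
from fractions import Fraction


def double_point(x: Fraction, y: Fraction):
    """Double a point (x, y) with y != 0 on y^2 = x^3 + k (tangent-line rule)."""
    slope = 3 * x * x / (2 * y)    # tangent slope, from 2y dy = 3x^2 dx
    x2 = slope * slope - 2 * x     # x-coordinate of the third intersection
    y2 = slope * (x - x2) - y      # reflect that intersection in the x-axis
    return x2, y2


def on_curve(x: Fraction, y: Fraction, k: int) -> bool:
    """Check whether (x, y) satisfies y^2 - x^3 = k exactly."""
    return y * y - x ** 3 == k


if __name__ == "__main__":
    P = (Fraction(3), Fraction(5))             # integer point on y^2 - x^3 = -2
    Q = double_point(*P)
    print(Q, on_curve(*Q, -2))
```

Starting from the integer point {(3,5)} on {y^2 - x^3 = -2}, doubling gives the rational point {(\frac{129}{100}, -\frac{383}{1000})}, which lies on the curve but is no longer integral. This is one way to see why integrality needs separate care when moving between a point and its halves.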
