Let {f: {\bf R}^3 \rightarrow {\bf R}} be an irreducible polynomial in three variables. As {{\bf R}} is not algebraically closed, the zero set {Z_{\bf R}(f) = \{ x \in{\bf R}^3: f(x)=0\}} can split into various components of dimension between {0} and {2}. For instance, if {f(x_1,x_2,x_3) = x_1^2+x_2^2}, the zero set {Z_{\bf R}(f)} is a line; more interestingly, if {f(x_1,x_2,x_3) = x_3^2 + x_2^2 - x_2^3}, then {Z_{\bf R}(f)} is the union of a line and a surface (or the product of an acnodal cubic curve with a line). We will assume that the {2}-dimensional component {Z_{{\bf R},2}(f)} is non-empty, thus defining a real surface in {{\bf R}^3}. In particular, this hypothesis implies that {f} is not just irreducible over {{\bf R}}, but is in fact absolutely irreducible (i.e. irreducible over {{\bf C}}), since otherwise one could use the complex factorisation of {f} to contain {Z_{\bf R}(f)} inside the intersection {Z_{\bf C}(g) \cap Z_{\bf C}(\bar{g})} of the complex zero loci of a complex polynomial {g} and its complex conjugate, with {g,\bar{g}} having no common factor, forcing {Z_{\bf R}(f)} to be at most one-dimensional. (For instance, in the case {f(x_1,x_2,x_3)=x_1^2+x_2^2}, one can take {g(z_1,z_2,z_3) = z_1 + i z_2}.) Among other things, this makes {Z_{{\bf R},2}(f)} a Zariski-dense subset of {Z_{\bf C}(f)}, thus any polynomial identity which holds true at every point of {Z_{{\bf R},2}(f)} also holds true on all of {Z_{\bf C}(f)}. This allows us to easily use tools from algebraic geometry in this real setting, even though the reals are not quite algebraically closed.
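
To spell out the second example above (a routine check, included only as a sketch): the equation {x_3^2 + x_2^2 - x_2^3 = 0} can be rearranged as

\displaystyle x_3^2 = x_2^2 (x_2 - 1),

so a real solution has either {x_2 = x_3 = 0} (giving the line {\{(x_1,0,0): x_1 \in {\bf R}\}}) or {x_2 \geq 1} and {x_3 = \pm x_2 \sqrt{x_2-1}} with {x_1} arbitrary (giving a surface). The line stays away from the surface because {x_2 \geq 1} on the latter; it is the product of the acnode of the cubic curve with the {x_1}-axis.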

The surface {Z_{{\bf R},2}(f)} is said to be ruled if, for a Zariski open dense set of points {x \in Z_{{\bf R},2}(f)}, there exists a line {l_x = \{ x+tv_x: t \in {\bf R} \}} through {x} for some non-zero {v_x \in {\bf R}^3} which is completely contained in {Z_{{\bf R},2}(f)}, thus

\displaystyle f(x+tv_x)=0

for all {t \in {\bf R}}. Also, a point {x \in Z_{{\bf R},2}(f)} is said to be a flecnode if there exists a line {l_x = \{ x+tv_x: t \in {\bf R}\}} through {x} for some non-zero {v_x \in {\bf R}^3} which is tangent to {Z_{{\bf R},2}(f)} to third order, in the sense that

\displaystyle f(x+tv_x)=O(t^4)

as {t \rightarrow 0}, or equivalently that

\displaystyle \frac{d^j}{dt^j} f(x+tv_x)|_{t=0} = 0 \ \ \ \ \ (1)


for {j=0,1,2,3}. Clearly, if {Z_{{\bf R},2}(f)} is a ruled surface, then a Zariski open dense set of points on {Z_{{\bf R},2}(f)} are flecnodes. We then have the remarkable theorem (discovered first by Monge, and then later by Cayley and Salmon) asserting the converse:

Theorem 1 (Monge-Cayley-Salmon theorem) Let {f: {\bf R}^3 \rightarrow {\bf R}} be an irreducible polynomial with {Z_{{\bf R},2}(f)} non-empty. Suppose that a Zariski dense set of points in {Z_{{\bf R},2}(f)} are flecnodes. Then {Z_{{\bf R},2}(f)} is a ruled surface.

Among other things, this theorem was used in the celebrated result of Guth and Katz that almost solved the Erdős distance problem in two dimensions, as discussed in this previous blog post. Vanishing to third order is necessary: observe that in a surface of negative curvature, such as the saddle {\{ (x_1,x_2,x_3): x_3 = x_1^2 - x_2^2 \}}, every point on the surface is tangent to second order to a line (the line in the direction for which the second fundamental form vanishes). This surface happens to be ruled, but a generic perturbation of this surface (e.g. {x_3 = x_1^2 - x_2^2 + x_2^4}) will no longer be ruled, although it still has negative curvature near the origin.
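
As a worked check of these two claims (a sketch; the defining polynomials {f = x_3 - x_1^2 + x_2^2} and {\tilde f = x_3 - x_1^2 + x_2^2 - x_2^4} are notation introduced here for the saddle and its perturbation): for any point {x} and direction {v} one has

\displaystyle f(x+tv) = f(x) + t( v_3 - 2x_1 v_1 + 2 x_2 v_2 ) - t^2 ( v_1^2 - v_2^2 ),

so at a point {x} of the saddle, choosing {v_2 = \pm v_1 \neq 0} and {v_3 = 2x_1 v_1 - 2x_2 v_2} makes {f(x+tv)} vanish identically in {t}, and the saddle is ruled by these lines. For the perturbed surface one instead has

\displaystyle \tilde f(x+tv) = \tilde f(x) + t( v_3 - 2x_1 v_1 + 2x_2 v_2 - 4 x_2^3 v_2 ) - t^2 ( v_1^2 - (1-6x_2^2) v_2^2 ) - 4 t^3 x_2 v_2^3 - t^4 v_2^4.

When {6x_2^2 < 1}, the {t} and {t^2} coefficients can still be made to vanish with {v \neq 0}, but this forces {v_2 \neq 0}, so the {t^4} coefficient is non-zero and the line is not contained in the surface. Moreover, the {t^3} coefficient vanishes only when {x_2=0}, so the flecnodes near the origin are confined to a curve and are not Zariski dense.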

The original proof of the Monge-Cayley-Salmon theorem is not easily accessible and not written in modern language. A modern proof of this theorem (together with substantial generalisations, for instance to higher dimensions) is given by Landsberg; the proof uses the machinery of modern algebraic geometry. The purpose of this post is to record an alternate proof of the Monge-Cayley-Salmon theorem based on classical differential geometry (in particular, the notion of torsion of a curve) and basic ODE methods (in particular, Gronwall’s inequality and the Picard existence theorem). The idea is to “integrate” the lines {l_x} indicated by the flecnode to produce smooth curves {\gamma} on the surface {Z_{{\bf R},2}(f)}; one then uses the vanishing (1) and some basic calculus to conclude that these curves have zero torsion and are thus planar curves. Some further manipulation using (1) (now just to second order instead of third) then shows that these curves are in fact straight lines, giving the ruling on the surface.

Update: János Kollár has informed me that the above theorem was essentially known to Monge in 1809; see his recent arXiv note for more details.

I thank Larry Guth and Micha Sharir for conversations leading to this post.

— 1. Proof —

Let {M} denote the set of smooth points of {Z_{{\bf R},2}(f)}; then {M} is a smooth surface that is a Zariski open dense subset of {Z_{{\bf R},2}(f)}, and hence Zariski dense in {Z_{\bf C}(f)}. We consider the projective tangent bundle {PTM} of {M}; this is a smooth three-dimensional manifold, which is a bundle of copies of the projective line {P^1} over {M}, with elements {(x, [v_x])} consisting of a point {x} in {M} and the projective class of a direction {v_x} that is tangent to {M} at {x} and is non-zero. Since {P^1} and {M} are both irreducible varieties, it is easy to see that {PTM} is also an irreducible variety.

Inside {PTM}, we consider the subset {Flec} of points {(x,[v])} which obey the flecnode condition (1) for {j=0,1,2,3}. By hypothesis, the projection of {Flec} to {M} is Zariski dense. On the other hand, {Flec} is clearly an algebraic set. Thus the dimension of {Flec} is at least {2}, and there is at least one component whose projection to {M} is two-dimensional (i.e. is dominant). In particular we can find an irreducible algebraic surface {S} in {Flec} whose projection to {M} is open dense (not just in the Zariski sense, but also in the differential geometry sense). By removing the singular points of {S}, we may assume that {S} is a smooth surface.

We now claim that the projection map {\pi: S \rightarrow M} is generically a local diffeomorphism, thus {D\pi(x,[v_x])} has full rank for a Zariski dense set of points {(x,[v_x])} in {S}. This is a simple consequence of Sard’s theorem, but for our purposes it is also instructive to see an ODE proof: if {D\pi} fails to have full rank generically, then it must have rank one generically or rank zero generically. If it has rank one generically, one can use the Picard existence theorem to locally foliate an open dense subset of {S} by curves {\gamma: (-\epsilon,\epsilon) \rightarrow S} with the property that for each {t \in(-\epsilon,\epsilon)}, the derivative {\gamma'(t)} lies in the kernel of {D\pi(\gamma(t))}, so that if we write {\gamma(t) = (x(t), [v_x(t)])}, then {x'(t) = 0} for all {t}, and so {x(t)} is constant; thus the curves each lie in a single fibre of {\pi}. This locally describes {S} as a one-dimensional smooth family of curves inside the fibre of {\pi}, and so the image {\pi(S)} is locally one-dimensional, contradicting the two-dimensional nature of {\pi(S)}. A similar argument works when {D\pi} has rank zero generically.

Since {\pi: S \rightarrow M} is a local diffeomorphism generically, we may apply the inverse function theorem to conclude that on an open dense subset of {M}, we can locally invert this map, which in particular gives smooth local maps {x \mapsto v_x} from open subsets of {M} to unit tangent vectors {v_x} at {x} such that the flecnode condition (1) is satisfied for all such {x} and {j=0,1,2,3}.

By the Picard existence theorem, we may thus locally foliate {M} by curves {\gamma: (-\epsilon,\epsilon) \rightarrow M} with the property that

\displaystyle \gamma'(t) = v_{\gamma(t)}

for all {t \in (-\epsilon,\epsilon)}; thus {\gamma} has unit speed and is always tangent to a flecnode direction. Thus, by (1) we have

\displaystyle \frac{d^j}{ds^j} f( \gamma(t) + s \gamma'(t) )|_{s=0} = 0

for {j=0,1,2,3}. Expanding this out in coordinates by the chain rule (and using the usual summation conventions), using {\gamma^1,\gamma^2,\gamma^3} to denote the components of {\gamma}, and {f_i = \frac{\partial}{\partial x_i} f(x)} to denote the first partial derivatives of {f} for {i=1,2,3}, {f_{ij} = \frac{\partial^2}{\partial x_i \partial x_j} f(x)} to denote the second partial derivatives, and so forth, we have

\displaystyle f( \gamma(t) ) = 0 \ \ \ \ \ (2)


\displaystyle \gamma'(t)^i f_i( \gamma(t) ) = 0 \ \ \ \ \ (3)


\displaystyle \gamma'(t)^i \gamma'(t)^j f_{ij}( \gamma(t) ) = 0 \ \ \ \ \ (4)



\displaystyle \gamma'(t)^i \gamma'(t)^j \gamma'(t)^k f_{ijk}( \gamma(t) ) = 0. \ \ \ \ \ (5)
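
For reference, the identities (2)-(5) are the coefficients of the Taylor expansion of {f} along the line through {x = \gamma(t)} in the direction {v = \gamma'(t)} (a standard expansion, recorded here as a sketch):

\displaystyle f(x+sv) = f(x) + s v^i f_i(x) + \frac{s^2}{2} v^i v^j f_{ij}(x) + \frac{s^3}{6} v^i v^j v^k f_{ijk}(x) + O(s^4),

so that {\frac{d^j}{ds^j} f(x+sv)|_{s=0} = v^{i_1} \dots v^{i_j} f_{i_1 \dots i_j}(x)} for {j=0,1,2,3}.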


We can obtain further differential equations by differentiating the above equations in {t}. For instance, if we differentiate (3) in {t} we obtain

\displaystyle \gamma''(t)^i f_i(\gamma(t)) + \gamma'(t)^i \gamma'(t)^j f_{ij}(\gamma(t)) = 0

and hence by (4)

\displaystyle \gamma''(t)^i f_i(\gamma(t)) = 0. \ \ \ \ \ (6)


Similarly, if we differentiate (4) in {t} we obtain

\displaystyle 2 \gamma''(t)^i \gamma'(t)^j f_{ij}(\gamma(t)) + \gamma'(t)^i \gamma'(t)^j \gamma'(t)^k f_{ijk}( \gamma(t) ) = 0

and hence by (5)

\displaystyle \gamma''(t)^i \gamma'(t)^j f_{ij}(\gamma(t)) = 0. \ \ \ \ \ (7)
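
In more detail (a sketch of the product rule computation behind the factor of {2}), differentiating (4) in {t} produces three terms,

\displaystyle \gamma''(t)^i \gamma'(t)^j f_{ij}(\gamma(t)) + \gamma'(t)^i \gamma''(t)^j f_{ij}(\gamma(t)) + \gamma'(t)^i \gamma'(t)^j \gamma'(t)^k f_{ijk}(\gamma(t)) = 0,

and the first two terms agree after relabeling {i \leftrightarrow j}, since {f_{ij} = f_{ji}} by the symmetry of mixed partial derivatives. The last term comes from the chain rule {\frac{d}{dt} f_{ij}(\gamma(t)) = \gamma'(t)^k f_{ijk}(\gamma(t))}.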


Finally, if we differentiate (6) in {t} we obtain

\displaystyle \gamma'''(t)^i f_i(\gamma(t)) + \gamma''(t)^i \gamma'(t)^j f_{ij}(\gamma(t)) = 0

and hence by (7)

\displaystyle \gamma'''(t)^i f_i(\gamma(t)) = 0. \ \ \ \ \ (8)


The equations (3), (6), (8) have a simple geometric interpretation: the first three derivatives {\gamma'(t), \gamma''(t), \gamma'''(t)} are all orthogonal to the gradient {\nabla f(\gamma(t))}. Generically, this gradient is non-zero, and we are in three dimensions, so we conclude that {\gamma'(t), \gamma''(t), \gamma'''(t)} are always coplanar. Equivalently, the torsion of the curve {\gamma} vanishes, and hence the curve {\gamma} is necessarily planar (locally, at least). Another way to see this is to start with the identity

\displaystyle \partial_t ( \gamma'(t) \times \gamma''(t) ) = \gamma'(t) \times \gamma'''(t),

where {\times} is the cross product, and conclude that {\partial_t ( \gamma'(t) \times \gamma''(t) )} is a scalar multiple of {\gamma'(t) \times \gamma''(t)} whenever it is non-vanishing, which by Gronwall’s inequality shows that {\gamma'(t) \times \gamma''(t)} has fixed orientation whenever it is non-vanishing.
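
One way to make this last step explicit (a sketch; the notation {N}, {\lambda}, {t_0} is introduced here for illustration): write {N(t) := \gamma'(t) \times \gamma''(t)}. Since {\gamma'(t), \gamma''(t), \gamma'''(t)} all lie in the plane orthogonal to {\nabla f(\gamma(t))}, both {N(t)} and {N'(t) = \gamma'(t) \times \gamma'''(t)} are parallel to {\nabla f(\gamma(t))}. On an interval where {N} is non-vanishing, we thus have

\displaystyle N'(t) = \lambda(t) N(t), \quad \lambda(t) := \frac{N'(t) \cdot N(t)}{|N(t)|^2},

and uniqueness of solutions to this linear ODE (which follows from Gronwall’s inequality) gives

\displaystyle N(t) = \exp\left( \int_{t_0}^t \lambda(s)\ ds \right) N(t_0).

Since {\gamma'(t) \cdot N(t) = 0}, it follows that {\frac{d}{dt} [ (\gamma(t) - \gamma(t_0)) \cdot N(t_0) ] = \gamma'(t) \cdot N(t_0) = 0}, so {\gamma} stays in the plane through {\gamma(t_0)} with normal {N(t_0)}.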

So there is a plane {P} in {{\bf R}^3} in which {\gamma} locally lies. If {f} vanished identically on this plane, then {Z_{{\bf R},2}(f)}, being irreducible, would be just {P} and we would be done, so we may assume that {f} does not vanish identically on {P}, thus {Z_{{\bf R},2}(f) \cap P} is at most one-dimensional. On the other hand, (3), (6) show that {\gamma', \gamma''} are both orthogonal to the gradient of {f} restricted to {P}, which is generically non-zero; as we now only have two dimensions, this implies that {\gamma', \gamma''} are parallel. Thus the curvature of {\gamma} now also vanishes, which implies that {\gamma} is a straight line. Hence we have locally foliated at least a small open neighbourhood in {M} by straight lines, which ensures that {M} is ruled as desired.
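
To spell out the passage from parallel vectors to vanishing curvature (a sketch; the unit normal {n} to {P} and the scalar {c(t)} are notation introduced here): the gradient of {f} restricted to {P} is the tangential component

\displaystyle \nabla f - (\nabla f \cdot n) n,

and since {\gamma', \gamma''} are orthogonal to both {n} and {\nabla f}, they are orthogonal to this vector. Where it is non-zero, the vectors parallel to {P} and orthogonal to it form a line, which is spanned by the unit vector {\gamma'(t)}, so {\gamma''(t) = c(t) \gamma'(t)} for some scalar {c(t)}. But differentiating the unit speed condition {\gamma'(t) \cdot \gamma'(t) = 1} gives {\gamma'(t) \cdot \gamma''(t) = 0}, so {c(t)=0} and {\gamma''(t)=0}; that is, {\gamma(t) = \gamma(t_0) + (t-t_0) \gamma'(t_0)} locally.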