On “compilation errors” in mathematical reading, and how to resolve them

Computers are notorious for interpreting language in an overly literal fashion; a single misplaced parenthesis in an otherwise flawless piece of software code can cause a computer to halt in utter incomprehension halfway through the compilation of that code.

Humans, when reading natural language, tend to be far more robust at this; once one is fluent in, say, English, one can usually deal with a reasonable number of spelling or grammatical errors in a text, particularly when the writing style is clear and organised, and the themes of the text are familiar to the reader.

However, when, as a graduate student, one encounters the task of reading a technical mathematical paper for the first time, it is often the case that one loses much of one’s higher reading skills, reverting instead to a more formal and tedious line-by-line interpretation of the text. As a consequence, a single typo or undefined term in the paper can cause one’s comprehension of the paper to grind to a complete halt, in much the same way that it would to a computer.

In many cases, such “compilation errors” can be resolved simply by reading ahead in the paper. In some cases, just reading the next one or two lines can shed a lot of light on the mysterious term that was just introduced, or the unexplained step in the logic. In other cases, one has to read a fair bit further ahead; if, for instance, the conclusion of Lemma 15 was difficult to understand, one can read ahead to the end of the proof of that Lemma (in which, presumably, the conclusion is obtained), or search ahead to, say, Proposition 23, in which Lemma 15 is invoked, to get more clues as to what Lemma 15 is trying to say. (The use of search functions in, say, a PDF reader, is particularly useful in this regard.)
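If the paper is available as plain text (for instance, its LaTeX source), the “search ahead” step can be automated. The following is only a minimal sketch: the helper name `find_mentions` and the sample text are invented here for illustration, and a PDF reader’s search box does the same job.

```python
import re


def find_mentions(text: str, label: str) -> list[tuple[int, str]]:
    """Return (line number, line) pairs for every line that mentions `label`."""
    # The negative lookahead stops "Lemma 15" from also matching "Lemma 150".
    pattern = re.compile(re.escape(label) + r"(?!\d)")
    return [
        (number, line.strip())
        for number, line in enumerate(text.splitlines(), start=1)
        if pattern.search(line)
    ]


# Hypothetical excerpt of a paper, one statement per line.
paper = """Lemma 15. Let f be as above. Then f is bounded.
Proof. The previous estimates give the claim.
Lemma 150. An unrelated statement.
Proposition 23. By Lemma 15, the operator T is bounded."""

mentions = find_mentions(paper, "Lemma 15")
for number, line in mentions:
    print(number, line)
```

Each hit is a place where the lemma is stated or invoked, and the invocations are usually the most informative about what the lemma is really for.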

It is also good to keep in mind that no author is infallible, and that in some cases, the simplest explanation for incomprehension is that there is a typo in the text. For instance, suppose a paper states that “Since $A$ is true, $B$ is true”, but when one works things out, one cannot quite deduce $B$ from $A$, but instead can only achieve a slightly different conclusion $B'$. A bit later on in the text, the paper states that “Since $B$ is true, then $C$ is true”, but again one has difficulty deducing $C$ from $B$. Here, the most likely diagnosis is that the author actually meant to write $B'$ instead of $B$ in both places.

In a similar spirit, if the paper contains a cryptic comment which you didn’t quite understand, but chose to ignore in order to move on, and then two lines later you find a deduction of a conclusion which you don’t see to be a consequence of the previous statements, then you should go back to the cryptic comment and parse it very carefully, as it is likely to be a description of the missing hypothesis or technique needed to reach the stated conclusion.

Sometimes one has to look for the absence of key words, rather than their presence. Suppose for instance the statement $A$ is asserted in a paper, followed shortly by a statement $B$. You understand how $A$ is deduced, but you see no way to use $A$ to derive $B$. But were there key words such as “thus”, “therefore”, or “consequently” that actually indicated that $A$ was to be used to derive $B$? If not, then what is likely happening here is that $B$ is being derived from some other source than $A$, and a rereading of the text near or immediately preceding $A$ and $B$ with this in mind may then reveal how $B$ is to be established.

Another useful trick is to “project” the paper down to a simpler and shorter paper by restricting attention to a simpler special case, or by adopting some heuristic that allows one to trivialise some technical portions of the paper (or at least make some steps of the paper plausible enough to the reader that one is willing to skip over the details of proof for those steps). For instance, if the paper is dealing with a result in general dimension, one might first specialise the paper to one dimension (even if this means that the main results are no longer new, but consequences of previous literature). Or, if the paper has to analyse both the main term in an expression as well as error terms, one can adopt the heuristic that all error terms are negligible and only focus on the main term (or dually, one can accept that the main term is always going to compute out to the correct answer, and only focus on controlling error terms). If one is aware of a near-counterexample to the main result, specialising the paper to that near-counterexample (or to a hypothetical perturbation of that near-counterexample that is trying to be a genuine counterexample) is often quite instructive. Ideally, one should project away roughly half of the difficulties of the paper, leaving behind a paper which is twice as simple, and thus presumably much easier to understand; once this is done, one can undo the projection, and return to the original paper, which is now already half understood, and again much easier to understand than before one understood the projected paper. (The difficulty of reading a paper usually increases in a super-linear fashion with the complexity of the paper, so factoring the paper into two sub-papers, each with half the complexity, is often an efficient way to proceed.)
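To make the parenthetical remark about super-linear growth concrete, here is a toy calculation. The quadratic model is an assumption adopted purely for the arithmetic, not a claim about how difficulty actually scales. Suppose that reading a paper of complexity $c$ costs effort $D(c) = c^2$. Then reading two sub-papers of complexity $c/2$ costs

$$ 2 D(c/2) = 2 (c/2)^2 = c^2/2 = D(c)/2, $$

so the factorisation halves the total effort. More generally, under the (equally hypothetical) model $D(c) = c^p$ with $p > 1$, one has $2 D(c/2) = 2^{1-p} D(c) < D(c)$, so the saving becomes larger as the growth becomes more super-linear.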

Finally, and perhaps most importantly, reading becomes much easier when one can somehow “get into the author’s head”, and get a sense of what the author is trying to do with each statement or lemma in the paper, rather than focusing purely on the literal statements in the text. A good author will interleave the mathematical text with commentary that is designed to do exactly this, but even without such explicit clues, one can often get a sense of the purpose of each component of the paper by comparing it with similar components in other papers, or by seeing how such a component is used in the rest of the paper. In extreme cases, one may have to go to a large blackboard and diagram all the logical dependencies of a paper (e.g. if Lemma 6 and Lemma 8 are used to prove Theorem 10, one can draw arrows between boxes bearing these names accordingly) to get some sense of what the key steps in the paper are.
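The blackboard diagram can also be kept as a small table on a computer. The sketch below uses Python’s standard `graphlib` module (available from Python 3.9 onwards); the Lemma 6, Lemma 8 and Theorem 10 entries follow the example above, while “Proposition 3” is a made-up extra dependency, added only so that the graph has more than one layer.

```python
from collections import Counter
from graphlib import TopologicalSorter

# Each result maps to the set of earlier results that its proof uses.
depends_on = {
    "Theorem 10": {"Lemma 6", "Lemma 8"},
    "Lemma 6": {"Proposition 3"},  # hypothetical dependency
    "Lemma 8": {"Proposition 3"},  # hypothetical dependency
    "Proposition 3": set(),
}

# An order in which every result appears after everything it relies on.
reading_order = list(TopologicalSorter(depends_on).static_order())

# How many proofs invoke each result; heavily used results are candidates
# for the key steps of the paper.
usage = Counter(dep for deps in depends_on.values() for dep in deps)

print(reading_order)
print(usage.most_common())
```

The relative order of Lemma 6 and Lemma 8 in `reading_order` is not determined, since neither depends on the other; only the constraints recorded in `depends_on` are guaranteed to be respected.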

For some further principles on how to justify a particularly fearsome looking step in a paper, see this MathOverflow answer of mine.

For an analogous technique in the dual problem of writing a paper, see this page.

1 comment


5 April, 2024 at 12:34 am

lackhoa

Can you read math papers without essentially rewriting the whole paper in your own familiar notation? I find that it is impossible for me to comprehend papers if I just *look* at the text without doing anything.
