
Over on the polymath blog, I’ve posted (on behalf of Dinesh Thakur) a new polymath proposal, which is to explain some numerically observed identities involving the irreducible polynomials $P$ in the polynomial ring ${\bf F}_2[t]$ over the finite field of characteristic two, the simplest of which is

$\displaystyle \sum_P \frac{1}{1+P} = 0$

(expanded in terms of Taylor series in $u = 1/t$).  Comments on the problem should be placed in the polymath blog post; if there is enough interest, we can start a formal polymath project on it.
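
For readers who want to experiment, here is a minimal Python sketch of how the first few Taylor coefficients could be checked. It is an illustration under my own assumptions, not the computation behind the proposal: a polynomial in ${\bf F}_2[t]$ is stored as an integer whose bits are its coefficients, and the helper names `poly_divmod`, `is_irreducible` and `truncated_sum` are hypothetical. The quotient of $t^K$ by $1+P$ records the coefficients of $u^0,\dots,u^K$ in $1/(1+P)$ (the coefficient of $u^j$ is the coefficient of $t^{K-j}$ in the quotient), and irreducibles of degree larger than $K$ only contribute to $u^{K+1}$ and beyond.

```python
def poly_divmod(a, b):
    """Divide a by b in F_2[t]; bit i of an int is the coefficient of t^i."""
    q = 0
    db = b.bit_length()
    while a.bit_length() >= db:
        shift = a.bit_length() - db
        q ^= 1 << shift      # record the quotient term t^shift
        a ^= b << shift      # subtraction in F_2[t] is XOR
    return q, a


def is_irreducible(p):
    """Trial division by every polynomial of degree 1..deg(p)/2."""
    d = p.bit_length() - 1
    if d < 1:
        return False
    for f in range(2, 1 << (d // 2 + 1)):
        if poly_divmod(p, f)[1] == 0:
            return False
    return True


def truncated_sum(K):
    """XOR of floor(t^K / (1+P)) over irreducible P of degree <= K.

    The result is 0 exactly when the coefficients of u^0, ..., u^K
    in the sum of 1/(1+P) all vanish.
    """
    total = 0
    for p in range(2, 1 << (K + 1)):
        if is_irreducible(p):
            q, _ = poly_divmod(1 << K, p ^ 1)   # p ^ 1 is the polynomial 1 + P
            total ^= q
    return total


if __name__ == "__main__":
    for K in range(1, 9):
        print(K, truncated_sum(K))
```

Trial division is only practical for small degrees; checking many more coefficients would need a faster irreducibility test.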

Klaus Roth, who made fundamental contributions to analytic number theory, died this Tuesday, aged 90.

I never met or communicated with Roth personally, but was certainly influenced by his work; he wrote relatively few papers, but they tended to have outsized impact. For instance, he was one of the key people (together with Bombieri) to work on simplifying and generalising the large sieve, taking it from the technically formidable original formulation of Linnik and Rényi to the clean and general almost orthogonality principle that we have today (discussed for instance in these lecture notes of mine). The paper of Roth that had the most impact on my own personal work was his three-page paper proving what is now known as Roth’s theorem on arithmetic progressions:

Theorem 1 (Roth’s theorem on arithmetic progressions) Let ${A}$ be a set of natural numbers of positive upper density (thus ${\limsup_{N \rightarrow\infty} |A \cap \{1,\dots,N\}|/N > 0}$). Then ${A}$ contains infinitely many arithmetic progressions ${a,a+r,a+2r}$ of length three (with ${r}$ non-zero of course).

At the heart of Roth’s elegant argument was the following (surprising at the time) dichotomy: if ${A}$ had some moderately large density within some arithmetic progression ${P}$, either one could use Fourier-analytic methods to detect the presence of an arithmetic progression of length three inside ${A \cap P}$, or else one could locate a long subprogression ${P'}$ of ${P}$ on which ${A}$ had increased density. Iterating this dichotomy by an argument now known as the density increment argument, one eventually obtains Roth’s theorem, no matter which side of the dichotomy actually holds. This argument (and the many descendants of it), based on various “dichotomies between structure and randomness”, became essential in many other results of this type, most famously perhaps in Szemerédi’s proof of his celebrated theorem on arithmetic progressions that generalised Roth’s theorem to progressions of arbitrary length. More recently, my work on the Chowla and Elliott conjectures, which was a crucial component of the solution of the Erdös discrepancy problem, relies on an entropy decrement argument which was directly inspired by the density increment argument of Roth.

The Erdös discrepancy problem also is connected with another well known theorem of Roth:

Theorem 2 (Roth’s discrepancy theorem for arithmetic progressions) Let ${f(1),\dots,f(n)}$ be a sequence in ${\{-1,+1\}}$. Then there exists an arithmetic progression ${a+r, a+2r, \dots, a+kr}$ in ${\{1,\dots,n\}}$ with ${r}$ positive such that

$\displaystyle |\sum_{j=1}^k f(a+jr)| \geq c n^{1/4}$

for an absolute constant ${c>0}$.

In fact, Roth proved a stronger estimate regarding mean square discrepancy, which I am not writing down here; as with the Roth theorem in arithmetic progressions, his proof was short and Fourier-analytic in nature (although non-Fourier-analytic proofs have since been found, for instance the semidefinite programming proof of Lovasz). The exponent ${1/4}$ is known to be sharp (a result of Matousek and Spencer).

As a particular corollary of the above theorem, for an infinite sequence ${f(1), f(2), \dots}$ of signs, the sums ${|\sum_{j=1}^k f(a+jr)|}$ are unbounded in ${a,r,k}$. The Erdös discrepancy problem asks whether the same statement holds when ${a}$ is restricted to be zero. (Roth also established discrepancy theorems for other sets, such as rectangles, which will not be discussed here.)
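
To make the two notions of discrepancy concrete, here is a small brute-force Python sketch. The function name `max_ap_discrepancy` and the list encoding of the signs are my own illustrative choices; this is in no way Roth's Fourier-analytic argument, and it is only feasible for short sequences. It computes the largest value of ${|\sum_{j=1}^k f(a+jr)|}$ over progressions inside ${\{1,\dots,n\}}$, either with ${a}$ arbitrary or with ${a}$ restricted to be zero.

```python
def max_ap_discrepancy(f, homogeneous=False):
    """Largest |f(a+r) + ... + f(a+kr)| over progressions in {1, ..., n}.

    f is a list of +1/-1 values with f[0] standing for f(1).
    With homogeneous=True only progressions r, 2r, ..., kr (a = 0) are used.
    """
    n = len(f)
    best = 0
    for r in range(1, n + 1):
        # The first term a + r can be any point of {1, ..., n},
        # or must equal r in the homogeneous case.
        starts = [r] if homogeneous else range(1, n + 1)
        for start in starts:
            partial = 0
            for m in range(start, n + 1, r):
                partial += f[m - 1]
                best = max(best, abs(partial))
    return best


if __name__ == "__main__":
    signs = [(-1) ** j for j in range(1, 11)]   # f(j) = (-1)^j for j = 1..10
    print(max_ap_discrepancy(signs))                    # 5
    print(max_ap_discrepancy(signs, homogeneous=True))  # 5
```

For the alternating sequence the progressions of even spacing see a constant sign, which is why both values equal the length of the longest such progression.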

Finally, one has to mention Roth’s most famous result, highlighted for instance in his Fields medal citation:

Theorem 3 (Roth’s theorem on Diophantine approximation) Let ${\alpha}$ be an irrational algebraic number. Then for any ${\varepsilon > 0}$ there is a quantity ${c_{\alpha,\varepsilon} > 0}$ such that

$\displaystyle |\alpha - \frac{a}{q}| > \frac{c_{\alpha,\varepsilon}}{q^{2+\varepsilon}}$

for all integers ${a}$ and all positive integers ${q}$.

From the Dirichlet approximation theorem (or from the theory of continued fractions) we know that the exponent ${2+\varepsilon}$ in the denominator cannot be reduced to ${2}$ or below. A classical and easy theorem of Liouville gives the claim with the exponent ${2+\varepsilon}$ replaced by the degree of the algebraic number ${\alpha}$; work of Thue and Siegel reduced this exponent, but Roth was the one who obtained the near-optimal result. An important point is that the constant ${c_{\alpha,\varepsilon}}$ is ineffective – it is a major open problem in Diophantine approximation to produce any bound significantly stronger than Liouville’s theorem with effective constants. This is because the proof of Roth’s theorem does not exclude any single rational ${a/q}$ from being close to ${\alpha}$, but instead very ingeniously shows that one cannot have two different rationals ${a/q}$, ${a'/q'}$ that are unusually close to ${\alpha}$, even when the denominators ${q,q'}$ are very different in size. (I refer to this sort of argument as a “dueling conspiracies” argument; they are strangely prevalent throughout analytic number theory.)

Chantal David, Andrew Granville, Emmanuel Kowalski, Philippe Michel, Kannan Soundararajan, and I are running a program at MSRI in the Spring of 2017 (more precisely, from Jan 17, 2017 to May 26, 2017) in the area of analytic number theory, with the intention of bringing together many of the leading experts in all aspects of the subject and to present recent work on the many active areas of the subject (e.g. the distribution of the prime numbers, refinements of the circle method, a deeper understanding of the asymptotics of bounded multiplicative functions (and applications to Erdös discrepancy type problems!) and of the “pretentious” approach to analytic number theory, more “analysis-friendly” formulations of the theorems of Deligne and others involving trace functions over fields, and new subconvexity theorems for automorphic forms, to name a few).  Like any other semester MSRI program, there will be a number of workshops, seminars, and similar activities taking place while the members are in residence.  I’m personally looking forward to the program, which should be occurring in the midst of a particularly productive time for the subject.  Needless to say, I (and the rest of the organising committee) plan to be present for most of the program.

Applications for Postdoctoral Fellowships and Research Memberships for this program (and for other MSRI programs in this time period, namely the companion program in Harmonic Analysis and the Fall program in Geometric Group Theory, as well as the complementary program in all other areas of mathematics) remain open until Dec 1.  Applications are open to everyone, but require supporting documentation, such as a CV, statement of purpose, and letters of recommendation from other mathematicians; see the application page for more details.

Chantal David, Andrew Granville, Emmanuel Kowalski, Philippe Michel, Kannan Soundararajan, and I are running a program at MSRI in the Spring of 2017 (more precisely, from Jan 17, 2017 to May 26, 2017) in the area of analytic number theory, with the intention of bringing together many of the leading experts in all aspects of the subject and to present recent work on the many active areas of the subject (the discussion on previous blog posts here has mostly focused on advances in the study of the distribution of the prime numbers, but there have been many other notable recent developments too, such as refinements of the circle method, a deeper understanding of the asymptotics of bounded multiplicative functions and of the “pretentious” approach to analytic number theory, more “analysis-friendly” formulations of the theorems of Deligne and others involving trace functions over fields, and new subconvexity theorems for automorphic forms, to name a few).  Like any other semester MSRI program, there will be a number of workshops, seminars, and similar activities taking place while the members are in residence.  I’m personally looking forward to the program, which should be occurring in the midst of a particularly productive time for the subject.  Needless to say, I (and the rest of the organising committee) plan to be present for most of the program.

Applications for Postdoctoral Fellowships, Research Memberships, and Research Professorships for this program (and for other MSRI programs in this time period, namely the companion program in Harmonic Analysis and the Fall program in Geometric Group Theory, as well as the complementary program in all other areas of mathematics) have just opened up today.  Applications are open to everyone (until they close on Dec 1), but require supporting documentation, such as a CV, statement of purpose, and letters of recommendation from other mathematicians; see the application page for more details.

In the winter quarter (starting January 5) I will be teaching a graduate topics course entitled “An introduction to analytic prime number theory”. As the name suggests, this is a course covering many of the analytic number theory techniques used to study the distribution of the prime numbers ${{\mathcal P} = \{2,3,5,7,11,\dots\}}$. I will list the topics I intend to cover in this course below the fold. As with my previous courses, I will place lecture notes online on my blog in advance of the physical lectures.

The type of results about primes that one aspires to prove here is well captured by Landau’s classical list of problems:

1. Even Goldbach conjecture: every even number ${N}$ greater than two is expressible as the sum of two primes.
2. Twin prime conjecture: there are infinitely many pairs ${n,n+2}$ which are simultaneously prime.
3. Legendre’s conjecture: for every natural number ${N}$, there is a prime between ${N^2}$ and ${(N+1)^2}$.
4. There are infinitely many primes of the form ${n^2+1}$.

All four of Landau’s problems remain open, but we have convincing heuristic evidence that they are all true, and in each of the four cases we have some highly non-trivial partial results, some of which will be covered in this course. We also now have some understanding of the barriers we are facing to fully resolving each of these problems, such as the parity problem; this will also be discussed in the course.

One of the main reasons that the prime numbers ${{\mathcal P}}$ are so difficult to deal with rigorously is that they have very little usable algebraic or geometric structure that we know how to exploit; for instance, we do not have any useful prime generating functions. One of course can create non-useful functions of this form, such as the ordered parameterisation ${n \mapsto p_n}$ that maps each natural number ${n}$ to the ${n^{th}}$ prime ${p_n}$, or one could invoke Matiyasevich’s theorem to produce a polynomial of many variables whose only positive values are prime, but these sorts of functions have no usable structure to exploit (for instance, they give no insight into any of the Landau problems listed above; see also Remark 2 below). The various primality tests in the literature, while useful for practical applications (e.g. cryptography) involving primes, have also proven to be of little utility for these sorts of problems; again, see Remark 2. In fact, in order to make plausible heuristic predictions about the primes, it is best to take almost the opposite point of view to the structured viewpoint, using as a starting point the belief that the primes exhibit strong pseudorandomness properties that are largely incompatible with the presence of rigid algebraic or geometric structure. We will discuss such heuristics later in this course.

It may be in the future that some usable structure to the primes (or related objects) will eventually be located (this is for instance one of the motivations in developing a rigorous theory of the “field with one element”, although this theory is far from being fully realised at present). For now, though, analytic and combinatorial methods have proven to be the most effective way forward, as they can often be used even in the near-complete absence of structure.

In this course, we will not discuss combinatorial approaches (such as the deployment of tools from additive combinatorics) in depth, but instead focus on the analytic methods. The basic principles of this approach can be summarised as follows:

1. Rather than try to isolate individual primes ${p}$ in ${{\mathcal P}}$, one works with the set of primes ${{\mathcal P}}$ in aggregate, focusing in particular on asymptotic statistics of this set. For instance, rather than try to find a single pair ${n,n+2}$ of twin primes, one can focus instead on the count ${|\{ n \leq x: n,n+2 \in {\mathcal P} \}|}$ of twin primes up to some threshold ${x}$. Similarly, one can focus on counts such as ${|\{ n \leq N: n, N-n \in {\mathcal P} \}|}$, ${|\{ p \in {\mathcal P}: N^2 < p < (N+1)^2 \}|}$, or ${|\{ n \leq x: n^2 + 1 \in {\mathcal P} \}|}$, which are the natural counts associated to the other three Landau problems. In all four of Landau’s problems, the basic task is now to obtain non-trivial lower bounds on these counts.
2. If one wishes to proceed analytically rather than combinatorially, one should convert all these counts into sums, using the fundamental identity

$\displaystyle |A| = \sum_n 1_A(n),$

(or variants thereof) for the cardinality ${|A|}$ of subsets ${A}$ of the natural numbers ${{\bf N}}$, where ${1_A}$ is the indicator function of ${A}$ (and ${n}$ ranges over ${{\bf N}}$). Thus we are now interested in estimating (and particularly in lower bounding) sums such as

$\displaystyle \sum_{n \leq N} 1_{{\mathcal P}}(n) 1_{{\mathcal P}}(N-n),$

$\displaystyle \sum_{n \leq x} 1_{{\mathcal P}}(n) 1_{{\mathcal P}}(n+2),$

$\displaystyle \sum_{N^2 < n < (N+1)^2} 1_{{\mathcal P}}(n),$

or

$\displaystyle \sum_{n \leq x} 1_{{\mathcal P}}(n^2+1).$

3. Once one expresses number-theoretic problems in this fashion, we are naturally led to the more general question of how to accurately estimate (or, less ambitiously, to lower bound or upper bound) sums such as

$\displaystyle \sum_n f(n)$

or more generally bilinear or multilinear sums such as

$\displaystyle \sum_n \sum_m f(n,m)$

or

$\displaystyle \sum_{n_1,\dots,n_k} f(n_1,\dots,n_k)$

for various functions ${f}$ of arithmetic interest. (Importantly, one should also generalise to include integrals as well as sums, particularly contour integrals or integrals over the unit circle or real line, but we postpone discussion of these generalisations to later in the course.) Indeed, a huge portion of modern analytic number theory is devoted to precisely this sort of question. In many cases, we can predict an expected main term for such sums, and then the task is to control the error term between the true sum and its expected main term. It is often convenient to normalise the expected main term to be zero or negligible (e.g. by subtracting a suitable constant from ${f}$), so that one is now trying to show that a sum of signed real numbers (or perhaps complex numbers) is small. In other words, the question becomes one of rigorously establishing a significant amount of cancellation in one’s sums (also referred to as a gain or savings over a benchmark “trivial bound”). Or to phrase it negatively, the task is to rigorously prevent a conspiracy of non-cancellation, caused for instance by two factors in the summand ${f(n)}$ exhibiting an unexpectedly large correlation with each other.

4. It is often difficult to discern cancellation (or to prevent conspiracy) directly for a given sum (such as ${\sum_n f(n)}$) of interest. However, analytic number theory has developed a large number of techniques to relate one sum to another, and then the strategy is to keep transforming the sum into more and more analytically tractable expressions, until one arrives at a sum for which cancellation can be directly exhibited. (Note though that there is often a short-term tradeoff between analytic tractability and algebraic simplicity; in a typical analytic number theory argument, the sums will get expanded and decomposed into many quite messy-looking sub-sums, until at some point one applies some crude estimation to replace these messy sub-sums by tractable ones again.) There are many transformations available, ranging from such basic tools as the triangle inequality, pointwise domination, or the Cauchy-Schwarz inequality to key identities such as multiplicative number theory identities (such as the Vaughan identity and the Heath-Brown identity), Fourier-analytic identities (e.g. Fourier inversion, Poisson summation, or more advanced trace formulae), or complex analytic identities (e.g. the residue theorem, Perron’s formula, or Jensen’s formula). The sheer range of transformations available can be intimidating at first; there is no shortage of transformations and identities in this subject, and if one applies them randomly then one will typically just transform a difficult sum into an even more difficult and intractable expression. However, one can make progress if one is guided by the strategy of isolating and enhancing a desired cancellation (or conspiracy) to the point where it can be easily established (or dispelled), or alternatively to reach the point where no deep cancellation is needed for the application at hand (or equivalently, that no deep conspiracy can disrupt the application).
5. One particularly powerful technique (albeit one which, ironically, can be highly “ineffective” in a certain technical sense to be discussed later) is to use one potential conspiracy to defeat another, a technique I refer to as the “dueling conspiracies” method. This technique may be unable to prevent a single strong conspiracy, but it can sometimes be used to prevent two or more such conspiracies from occurring, which is particularly useful if conspiracies come in pairs (e.g. through complex conjugation symmetry, or a functional equation). A related (but more “effective”) strategy is to try to “disperse” a single conspiracy into several distinct conspiracies, which can then be used to defeat each other.
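
The four counts attached to Landau's problems in the first two principles above are easy to tabulate for small thresholds. The following Python sketch is purely illustrative: the function names are my own, the indicator function ${1_{\mathcal P}}$ is realised by a sieve of Eratosthenes, and of course such finite computations say nothing about the asymptotic lower bounds that the problems actually require.

```python
from math import isqrt


def prime_indicator(limit):
    """Return a list ind with ind[n] == True exactly when n <= limit is prime."""
    ind = [False, False] + [True] * (limit - 1)
    for p in range(2, isqrt(limit) + 1):
        if ind[p]:
            for m in range(p * p, limit + 1, p):
                ind[m] = False
    return ind


def goldbach_count(N):
    """Sum over n <= N of 1_P(n) 1_P(N - n)."""
    ind = prime_indicator(N)
    return sum(ind[n] and ind[N - n] for n in range(1, N))


def twin_count(x):
    """Sum over n <= x of 1_P(n) 1_P(n + 2)."""
    ind = prime_indicator(x + 2)
    return sum(ind[n] and ind[n + 2] for n in range(1, x + 1))


def legendre_count(N):
    """Sum over N^2 < n < (N+1)^2 of 1_P(n)."""
    ind = prime_indicator((N + 1) ** 2)
    return sum(ind[n] for n in range(N * N + 1, (N + 1) ** 2))


def n2_plus_1_count(x):
    """Sum over n <= x of 1_P(n^2 + 1)."""
    ind = prime_indicator(x * x + 1)
    return sum(ind[n * n + 1] for n in range(1, x + 1))


if __name__ == "__main__":
    print(goldbach_count(10))    # 3, from 3+7, 5+5, 7+3
    print(twin_count(20))        # 4, from n = 3, 5, 11, 17
    print(legendre_count(3))     # 2, the primes 11 and 13
    print(n2_plus_1_count(10))   # 5, from n = 1, 2, 4, 6, 10
```

Note that `goldbach_count` counts ordered representations, matching the sum ${\sum_{n \leq N} 1_{{\mathcal P}}(n) 1_{{\mathcal P}}(N-n)}$ as written.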

As stated before, the above strategy has not been able to establish any of the four Landau problems as stated. However, it can come close to such problems (and we now have some understanding as to why these problems remain out of reach of current methods). For instance, by using these techniques (and a lot of additional effort) one can obtain the following sample partial results in the Landau problems:

1. Chen’s theorem: every sufficiently large even number ${N}$ is expressible as the sum of a prime and an almost prime (the product of at most two primes). The proof proceeds by finding a nontrivial lower bound on ${\sum_{n \leq N} 1_{\mathcal P}(n) 1_{{\mathcal E}_2}(N-n)}$, where ${{\mathcal E}_2}$ is the set of almost primes.
2. Zhang’s theorem: There exist infinitely many pairs ${p_n, p_{n+1}}$ of consecutive primes with ${p_{n+1} - p_n \leq 7 \times 10^7}$. The proof proceeds by giving a positive lower bound on the quantity ${\sum_{x \leq n \leq 2x} (\sum_{i=1}^k 1_{\mathcal P}(n+h_i) - 1)}$ for large ${x}$ and certain distinct integers ${h_1,\dots,h_k}$ between ${0}$ and ${7 \times 10^7}$, which forces at least two of the ${n+h_i}$ to be prime for some ${n}$. (The bound ${7 \times 10^7}$ has since been lowered to ${246}$.)
3. The Baker-Harman-Pintz theorem: for sufficiently large ${x}$, there is a prime between ${x}$ and ${x + x^{0.525}}$. Proven by finding a nontrivial lower bound on ${\sum_{x \leq n \leq x+x^{0.525}} 1_{\mathcal P}(n)}$.
4. The Friedlander-Iwaniec theorem: There are infinitely many primes of the form ${n^2+m^4}$. Proven by finding a nontrivial lower bound on ${\sum_{n,m: n^2+m^4 \leq x} 1_{{\mathcal P}}(n^2+m^4)}$.

We will discuss (simpler versions of) several of these results in this course.

Of course, for the above general strategy to have any chance of succeeding, one must at some point use some information about the set ${{\mathcal P}}$ of primes. As stated previously, usefully structured parametric descriptions of ${{\mathcal P}}$ do not appear to be available. However, we do have two other fundamental and useful ways to describe ${{\mathcal P}}$:

1. (Sieve theory description) The primes ${{\mathcal P}}$ consist of those numbers greater than one that are not divisible by any smaller prime.
2. (Multiplicative number theory description) The primes ${{\mathcal P}}$ are the multiplicative generators of the natural numbers ${{\bf N}}$: every natural number is uniquely factorisable (up to permutation) into the product of primes (the fundamental theorem of arithmetic).

The sieve-theoretic description and its variants lead one to a good understanding of the almost primes, which turn out to be excellent tools for controlling the primes themselves, although there are known limitations as to how much information on the primes one can extract from sieve-theoretic methods alone, which we will discuss later in this course. The multiplicative number theory methods lead one (after some complex or Fourier analysis) to the Riemann zeta function (and other L-functions, particularly the Dirichlet L-functions), with the distribution of zeroes (and poles) of these functions playing a particularly decisive role in the multiplicative methods.

Many of our strongest results in analytic prime number theory are ultimately obtained by incorporating some combination of the above two fundamental descriptions of ${{\mathcal P}}$ (or variants thereof) into the general strategy described above. In contrast, more advanced descriptions of ${{\mathcal P}}$, such as those coming from the various primality tests available, have (until now, at least) been surprisingly ineffective in practice for attacking problems such as Landau’s problems. One reason for this is that such tests generally involve operations such as exponentiation ${a \mapsto a^n}$ or the factorial function ${n \mapsto n!}$, which grow too quickly to be amenable to the analytic techniques discussed above.

To give a simple illustration of these two basic approaches to the primes, let us first give two variants of the usual proof of Euclid’s theorem:

Theorem 1 (Euclid’s theorem) There are infinitely many primes.

Proof: (Multiplicative number theory proof) Suppose for contradiction that there were only finitely many primes ${p_1,\dots,p_n}$. Then, by the fundamental theorem of arithmetic, every natural number is expressible as a product of primes drawn from ${p_1,\dots,p_n}$ (and so, if it is larger than one, it is divisible by at least one of them). But the natural number ${p_1 \dots p_n + 1}$ is larger than one and not divisible by any of the primes ${p_1,\dots,p_n}$, a contradiction.

(Sieve-theoretic proof) Suppose for contradiction that there were only finitely many primes ${p_1,\dots,p_n}$. Then, by the Chinese remainder theorem, the set ${A}$ of natural numbers that are not divisible by any of the ${p_1,\dots,p_n}$ has density ${\prod_{i=1}^n (1-\frac{1}{p_i})}$, that is to say

$\displaystyle \lim_{N \rightarrow \infty} \frac{1}{N} | A \cap \{1,\dots,N\} | = \prod_{i=1}^n (1-\frac{1}{p_i}).$

In particular, ${A}$ has positive density and thus contains an element larger than ${1}$. But the least such element is one further prime in addition to ${p_1,\dots,p_n}$, a contradiction. $\Box$
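
The density computation in the sieve-theoretic proof can be seen concretely on a small example. The Python sketch below is only an illustration (the function name `sieved_density` and the choice of the primes ${2,3,5}$ are mine): since divisibility by ${p_1,\dots,p_n}$ is periodic with period ${p_1 \dots p_n}$, it suffices to count the survivors in one full period, and exact rational arithmetic confirms agreement with the product formula.

```python
from fractions import Fraction
from math import prod


def sieved_density(primes):
    """Density of the natural numbers divisible by none of the given primes.

    Counts survivors in one full period {1, ..., p_1 ... p_n}; returns the
    exact density together with the list of survivors in that period.
    """
    period = prod(primes)
    survivors = [n for n in range(1, period + 1) if all(n % p for p in primes)]
    return Fraction(len(survivors), period), survivors


if __name__ == "__main__":
    primes = [2, 3, 5]
    density, survivors = sieved_density(primes)
    print(density)                                    # 4/15
    print(prod(1 - Fraction(1, p) for p in primes))   # 4/15
    print(survivors)                                  # [1, 7, 11, 13, 17, 19, 23, 29]
```

The least survivor larger than one is ${7}$, which is indeed a further prime, as the argument predicts.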

Remark 1 One can also phrase the proof of Euclid’s theorem in a fashion that largely avoids the use of contradiction; see this previous blog post for more discussion.

Both proofs in fact extend to give a stronger result:

Theorem 2 (Euler’s theorem) The sum ${\sum_{p \in {\mathcal P}} \frac{1}{p}}$ is divergent.

Proof: (Multiplicative number theory proof) By the fundamental theorem of arithmetic, every natural number is expressible uniquely as the product ${p_1^{a_1} \dots p_n^{a_n}}$ of primes in increasing order. In particular, we have the identity

$\displaystyle \sum_{n=1}^\infty \frac{1}{n} = \prod_{p \in {\mathcal P}} ( 1 + \frac{1}{p} + \frac{1}{p^2} + \dots )$

(both sides make sense in ${[0,+\infty]}$ as everything is unsigned). Since the left-hand side is divergent, the right-hand side is as well. But

$\displaystyle ( 1 + \frac{1}{p} + \frac{1}{p^2} + \dots ) = \exp( \frac{1}{p} + O( \frac{1}{p^2} ) )$

and ${\sum_{p \in {\mathcal P}} \frac{1}{p^2}\leq \sum_{n=1}^\infty \frac{1}{n^2} < \infty}$, so ${\sum_{p \in {\mathcal P}} \frac{1}{p}}$ must be divergent.

(Sieve-theoretic proof) Suppose for contradiction that the sum ${\sum_{p \in {\mathcal P}} \frac{1}{p}}$ is convergent. For each natural number ${k}$, let ${A_k}$ be the set of natural numbers not divisible by any of the first ${k}$ primes ${p_1,\dots,p_k}$, and let ${A}$ be the set of numbers not divisible by any prime in ${{\mathcal P}}$. As in the previous proof, each ${A_k}$ has density ${\prod_{i=1}^k (1-\frac{1}{p_i})}$. Also, since ${\{1,\dots,N\}}$ contains at most ${\frac{N}{p}}$ multiples of ${p}$, we have from the union bound that

$\displaystyle | A \cap \{1,\dots,N \}| = |A_k \cap \{1,\dots,N\}| - O( N \sum_{i > k} \frac{1}{p_i} ).$

Since ${\sum_{i=1}^\infty \frac{1}{p_i}}$ is assumed to be convergent, we conclude that the density of ${A_k}$ converges to the density of ${A}$; thus ${A}$ has density ${\prod_{i=1}^\infty (1-\frac{1}{p_i})}$, which is non-zero by the hypothesis that ${\sum_{i=1}^\infty \frac{1}{p_i}}$ converges. On the other hand, since the primes are the only numbers greater than one not divisible by smaller primes, ${A}$ is just ${\{1\}}$, which has density zero, giving the desired contradiction. $\Box$
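
A finite version of the Euler product identity used in the multiplicative proof can be checked numerically. The sketch below is an illustration only, with hypothetical function names and floating point arithmetic: every ${n \leq x}$ factors into primes ${p \leq x}$, so the truncated product ${\prod_{p \leq x} (1 - \frac{1}{p})^{-1}}$ dominates the harmonic sum ${\sum_{n \leq x} \frac{1}{n}}$, while the sum ${\sum_{p \leq x} \frac{1}{p}}$ grows (very slowly) with ${x}$.

```python
from math import isqrt


def primes_up_to(x):
    """List of primes p <= x via the sieve of Eratosthenes (x >= 1)."""
    sieve = [True] * (x + 1)
    sieve[0:2] = [False, False]
    for p in range(2, isqrt(x) + 1):
        if sieve[p]:
            for m in range(p * p, x + 1, p):
                sieve[m] = False
    return [n for n in range(x + 1) if sieve[n]]


def euler_comparison(x):
    """Return (harmonic sum, truncated Euler product, sum of 1/p) up to x."""
    primes = primes_up_to(x)
    harmonic = sum(1.0 / n for n in range(1, x + 1))
    product = 1.0
    for p in primes:
        product *= 1.0 / (1.0 - 1.0 / p)   # geometric series 1 + 1/p + 1/p^2 + ...
    reciprocal_sum = sum(1.0 / p for p in primes)
    return harmonic, product, reciprocal_sum


if __name__ == "__main__":
    for x in (10**2, 10**3, 10**4):
        harmonic, product, reciprocal_sum = euler_comparison(x)
        print(x, round(harmonic, 3), round(product, 3), round(reciprocal_sum, 3))
```

The slow growth of the last column is consistent with divergence but, of course, does not prove it; that is what the argument above is for.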

Remark 2 We have seen how easy it is to prove Euler’s theorem by analytic methods. In contrast, there does not seem to be any known proof of this theorem that proceeds by using any sort of prime-generating formula or a primality test, which is further evidence that such tools are not the most effective way to make progress on problems such as Landau’s problems. (But the weaker theorem of Euclid, Theorem 1, can sometimes be proven by such devices.)

The two proofs of Theorem 2 given above are essentially the same proof, as is hinted at by the geometric series identity

$\displaystyle 1 + \frac{1}{p} + \frac{1}{p^2} + \dots = (1 - \frac{1}{p})^{-1}.$

One can also see the Riemann zeta function begin to make an appearance in both proofs. Once one goes beyond Euler’s theorem, though, the sieve-theoretic and multiplicative methods begin to diverge significantly. On one hand, sieve theory can still handle to some extent sets such as twin primes, despite the lack of multiplicative structure (one simply has to sieve out two residue classes per prime, rather than one); on the other, multiplicative number theory can attain results such as the prime number theorem, which purely sieve-theoretic techniques have not been able to establish. The deepest results in analytic number theory will typically require a combination of both sieve-theoretic methods and multiplicative methods in conjunction with the many transforms discussed earlier (and, in many cases, additional inputs from other fields of mathematics such as arithmetic geometry, ergodic theory, or additive combinatorics).

[This guest post is authored by Matilde Lalin, an Associate Professor in the Département de mathématiques et de statistique at the Université de Montréal.  I have lightly edited the text, mostly by adding some HTML formatting. -T.]

Mathematicians (and likely other academics!) with small children face some unique challenges when traveling to conferences and workshops. The goal of this post is to reflect on these, and to start a constructive discussion of what institutions and event organizers could do to improve the experiences of such participants.

The first necessary step is to recognize that different families have different needs. While it is hard to completely address everybody’s needs, there are some general measures that have a good chance to help most of the people traveling with young children. In this post, I will mostly focus on nursing mothers with infants ($\leq 24$ months old) because that is my personal experience. Many of the suggestions will apply to other cases such as non-nursing babies, children of single parents, children of couples of mathematicians who are interested in attending the same conference, etc.

The mother of a nursing infant who wishes to attend a conference has three options:

1. Bring the infant and a relative/friend to help care for the infant. The main challenge in this case is to fund the trip expenses of the relative. This involves trip costs, lodging, and food. The family may need a hotel room with some special amenities such as crib, fridge, microwave, etc. Location is also important, with easy access to facilities such as a grocery store, pharmacy, etc. The mother will need to take regular breaks from the conference in order to nurse the baby (this could be as often as every three hours or so). Depending on personal preferences, she may need to nurse privately. It is convenient, thus, to make a private room available, located as close to the conference venue as possible. The relative may need to have a place to stay with the baby near the conference such as a playground or a room with toys, particularly if the hotel room is far away.
2. Bring the infant and hire someone local (a nanny) to help care for the infant. The main challenges in this case are two: finding the caregiver and paying for such services. Finding a caregiver in a place where one does not live is hard, as it is difficult to conduct interviews or get references. There are agencies that can do this for a (quite expensive) fee: they will find a professional caregiver with background checks, CPR certification, many references, etc. It may be worth it, though, as professional caregivers tend to provide high-quality services and peace of mind is priceless for the mother mathematician attending a conference. As in the previous case, the mother may have particular needs regarding the hotel room, location, and all the other facilities mentioned for Option 1.
3. Travel without the infant and pump milk regularly. This can be very challenging for the mother, the baby, and the person who stays behind taking care of the baby, but the costs of this arrangement are much lower than in Option 1 or 2 (I am ignoring the possibility that the family needs to hire help at home, which is necessary in some cases). A nursing mother away from her baby has no option but to pump her milk to prevent pain and serious health complications. This mother may have to pump milk very often. Pumping is less efficient than nursing, so she will be gone for longer in each break or she will have more breaks compared to a mother who travels with her baby. For pumping, people need a room which should ideally be private, with a sink, and located as close to the conference venue as possible. It is often impossible for these three conditions to be met at the same time, so different mothers give priority to different features. Some people pump milk in washrooms, to have easy access to water. Other people might prefer to pump in a more comfortable setting, such as an office, and go to the washroom to wash the breast pump accessories after. If the mother expects that the baby will drink breastmilk while she is away, then she will also have to pump milk in advance of her trip. This requires some careful planning. Many pumping mothers try to store the pumped milk and bring it back home. In this case the mother needs a hotel room with a fridge which (ideally, but hard to find) has a freezer. In a perfect world there would also be a fridge in the place where she pumps/where the conference is held.

It is important to keep in mind that each option has its own set of challenges (even when expenses and facilities are all covered) and that different families may be restricted in their choice of options for a variety of reasons. It is therefore important that all these three options be facilitated.

As for the effect these choices have on the conference experience for the mother, Option 1 means that she has to balance her time between the conference and spending time with her relative/friend. This pressure disappears when we consider Option 2, so this option may lead to more participation in the conference’s activities. In Option 3, the mother is in principle free to participate in all the conference activities, but the frequent breaks may limit the type of activity. A mother may choose different options depending on the nature of the conference.

I want to stress, for the three options, that having to make choices about what to miss in the conference is very hard. While talks are important, so are the opportunities to meet people and discuss mathematics that happen during breaks and social events. It is very difficult to balance all of this. This is particularly difficult for the pumping mother in Option 3: because she travels without her baby, she is not perceived to be in a special situation or in need of accommodation. However, this mother is probably choosing between going to the last lecture in the morning or having lunch alone, because if she goes to pump right after the last lecture, by the time she is back, everybody has left for lunch.

Here is the Hall of Fame for those organizations that are already supporting nursing mothers’ travels in mathematics:

• The Natural Sciences and Engineering Research Council of Canada (NSERC) (search for “child care”) allows the costs of child care under Option 2 to be reimbursed out of the mother’s grants. They will also reimburse the travel expenses of a relative under Option 1, up to the amount that it would cost to hire a local caregiver.
• The ENFANT/ELEFANT conference (co-organized by Lillian Pierce and Damaris Schindler) provided a good model to follow regarding accommodation for parents with children during conferences. It included funding for covering the travel costs of accompanying caretakers (the funding was provided by the Deutsche Forschungsgemeinschaft), and lactation rooms and play rooms near the conference venue (the facilities were provided by the Hausdorff Center for Mathematics). Additional information (where to go with kids, etc.) was provided on site by the organizers and was made available to all participants all the time, by means of a display board that was left standing during the whole week of the conference.
• The American Institute of Mathematics (AIM) reimburses up to 500 dollars on childcare for visitors and they have some online resources that assist in finding childcare and nannies.

[UPDATED] Added a few more things to the Hall of Fame

In closing, here is a (possibly incomplete) list of resources that institutes, funding agencies, and conferences could consider providing for nursing mother mathematicians:

1. Funding (for costs associated with child care, either professional or by an accompanying relative).
2. List of childcare resources (nannies, nanny agencies, drop-in childcare centres, etc.).
3. Nursing rooms and playrooms near the conference venue. Nearby fridge.
4. Breaks of at least 20 minutes every 2-3 hours.
5. Information about transportation with infants. More specifically, taxi and/or shuttle companies that provide infant car seats. Information regarding the law on infant seats in taxis and other public transportation.
6. Accessibility for strollers.
7. [UPDATED] A nearby playground location. (comment from Peter).

I also find it important that these resources be listed publicly on the institute/conference website. This serves a double purpose: first, it helps those in need of the resources to access them easily, and second, it contributes to making these accommodations normal, setting a good model for future events, and inspiring organizers of future events.

Finally, I am pretty sure that the options and solutions I described do not cover all cases. I would like to finish this note by inviting readers to make suggestions, share experiences, and/or pose questions about this topic.

I’ve been encountering a sporadic bug over the past few months with the way WordPress renders or displays its LaTeX images on this blog (and occasionally on other WordPress blogs).  On most computers, it seems to work fine, but on some computers, the sizes of images are occasionally way off, leading to extremely distorted and fairly unreadable versions of the images appearing in blog posts and comments.  A sample screenshot (with accompanying HTML source), supplied to me by a reader, can be found here (in which an image whose dimensions should be 321 x 59 is instead being displayed as 552 x 20).  Is anyone else encountering this issue?  The problem sometimes can be resolved by refreshing the page, but not always, so it is a bit unclear where the problem is coming from and how one might mitigate it.  (If nothing else, I can add it to the bug collection post, once it can be reliably replicated.)

It’s time to (somewhat belatedly) roll over the previous thread on writing the first paper from the Polymath8 project, as this thread is overflowing with comments.  We are getting near the end of writing this large (173 pages!) paper, establishing a bound of 4,680 on the gap between primes, with only a few sections left to thoroughly proofread (and the last section should probably be removed, with appropriate changes elsewhere, in view of the more recent progress by Maynard).  As before, one can access the working copy of the paper at this subdirectory, as well as the rest of the directory, and the plan is to submit the paper to Algebra and Number Theory (and the arXiv) once there is consensus to do so.  Even before being submitted, the paper has already had some impact; Andrew Granville’s exposition of the bounded gaps between primes story for the Bulletin of the AMS follows several of the Polymath8 arguments in deriving the result.

After this paper is done, there is interest in continuing onwards with other Polymath8-related topics, and perhaps it is time to start planning for them.  First of all, we have an invitation from the Newsletter of the European Mathematical Society to discuss our experiences and impressions with the project.  I think it would be interesting to collect some impressions or thoughts (both positive and negative) from people who were highly active in the research and/or writing aspects of the project, as well as from more casual participants who were following the progress more quietly.  This project seemed to attract a bit more attention than most other polymath projects (with the possible exception of the very first project, Polymath1).  I think there are several reasons for this; the project builds upon a recent breakthrough (Zhang’s paper) that attracted an impressive amount of attention and publicity; the objective is quite easy to describe, when compared against other mathematical research objectives; and one could summarise the current state of progress by a single natural number $H$, which implied by infinite descent that the project was guaranteed to terminate at some point, but also made it possible to set up a “scoreboard” that could be quickly and easily updated.  From the research side, another appealing feature of the project was that – in the early stages of the project, at least – it was quite easy to grab a new world record by means of making a small observation, which made it fit very well with the polymath spirit (in which the emphasis is on lots of small contributions by many people, rather than a few big contributions by a small number of people).  Indeed, when the project first arose spontaneously as a blog post of Scott Morrison over at the Secret Blogging Seminar, I was initially hesitant to get involved, but soon found the “game” of shaving a few thousand or so off of $H$ to be rather fun and addictive, and with a much greater sense of instant gratification than traditional research projects, which often take months before a satisfactory conclusion is reached.  Anyway, I would welcome other thoughts or impressions on the project in the comments below (I think that the pace of comments regarding proofreading of the paper has slowed down enough that this post can accommodate both types of comments comfortably).

Then of course there is the “Polymath 8b” project in which we build upon the recent breakthroughs of James Maynard, which have simplified the route to bounded gaps between primes considerably, bypassing the need for any Elliott-Halberstam type distribution results beyond the Bombieri-Vinogradov theorem.  James has kindly shown me an advance copy of the preprint, which should be available on the arXiv in a matter of days; it looks like he has made a modest improvement to the previously announced results, improving $k_0$ a bit to 105 (which then improves $H$ to the nice round number of 600).  He also has a companion result on bounding gaps $p_{n+m}-p_n$ between non-consecutive primes for any $m$ (not just $m=1$), with a bound of the shape $H_m := \liminf_{n \to \infty} (p_{n+m}-p_n) \ll m^3 e^{4m}$, which is in fact the first time that the finiteness of this limit inferior has been demonstrated.  I plan to discuss these results (from a slightly different perspective than Maynard) in a subsequent blog post kicking off the Polymath8b project, once Maynard’s paper has been uploaded.  It should be possible to shave the value of $H = H_1$ down further (or to get better bounds for $H_m$ for larger $m$), both unconditionally and under assumptions such as the Elliott-Halberstam conjecture, either by performing more numerical or theoretical optimisation on the variational problem Maynard is faced with, or by using the improved distributional estimates provided by our existing paper; again, I plan to discuss these issues in a subsequent post.  (James, by the way, has expressed interest in participating in this project, which should be very helpful.)

Once again it is time to roll over the previous discussion thread, which has become rather full with comments.  The paper is nearly finished (see also the working copy at this subdirectory, as well as the rest of the directory), but several people are carefully proofreading various sections of the paper.  Once all the people doing so have signed off on it, I think we will be ready to submit (there appears to be no objection to the plan to submit to Algebra and Number Theory).

Another thing to discuss is an invitation to Polymath8 to write a feature article (up to 8000 words or 15 pages) for the Newsletter of the European Mathematical Society on our experiences with this project.  It is perhaps premature to actually start writing this article before the main research paper is finalised, but we can at least plan how to write such an article.  One suggestion, proposed by Emmanuel, is to have individual participants each contribute a brief account of their interaction with the project, which we would compile together with some additional text summarising the project as a whole (and maybe some speculation on any lessons we can apply here to future polymath projects).  Certainly I plan to have a separate blog post collecting feedback on this project once the main writing is done.

The main purpose of this post is to roll over the discussion from the previous Polymath8 thread, which has become rather full with comments.  We are still writing the paper, but it appears to have stabilised in a near-final form (source files available here); the main remaining tasks are proofreading, checking the mathematics, and polishing the exposition.  We also have a tentative consensus to submit the paper to Algebra and Number Theory when the proofreading is all complete.

The paper is quite large now (164 pages!) but it is fortunately rather modular, and thus hopefully somewhat readable (particularly regarding the first half of the paper, which does not  need any of the advanced exponential sum estimates).  The size should not be a major issue for the journal, so I would not seek to artificially shorten the paper at the expense of readability or content.