The “epsilon-delta” nature of analysis can be daunting and unintuitive to students, due in part to its heavy reliance on inequalities rather than equalities. But it occurred to me recently that one might be able to leverage the intuition one already has from “deals” – of the type one often sees advertised by corporations – to get at least some informal understanding of these concepts.

Take for instance the concept of an upper bound ${X \leq A}$ or a lower bound ${X \geq B}$ on some quantity ${X}$. From an economic perspective, one could think of the upper bound as an assertion that ${X}$ can be “bought” for ${A}$ units of currency, and the lower bound can similarly be viewed as an assertion that ${X}$ can be “sold” for ${B}$ units of currency. Thus for instance, a system of inequalities and equations like

$\displaystyle 2 \leq Y \leq 5$

$\displaystyle X+Y \leq 7$

$\displaystyle X+Y+Z = 10$

$\displaystyle Y+Z \leq 6$

could be viewed as analogous to a currency rate exchange board, of the type one sees for instance in airports:

| Currency | We buy at | We sell at |
|---|---|---|
| ${Y}$ | ${2}$ | ${5}$ |
| ${X+Y}$ | – | ${7}$ |
| ${X+Y+Z}$ | ${10}$ | ${10}$ |
| ${Y+Z}$ | – | ${6}$ |

Someone with an eye for spotting “deals” might now realize that one can actually buy ${Y}$ for ${3}$ units of currency rather than ${5}$, by purchasing one copy each of ${X+Y}$ and ${Y+Z}$ for ${7+6=13}$ units of currency, then selling off ${X+Y+Z}$ to recover ${10}$ units of currency back. In more traditional mathematical language, one can improve the upper bound ${Y \leq 5}$ to ${Y \leq 3}$ by taking the appropriate linear combination of the inequalities ${X+Y \leq 7}$, ${Y+Z \leq 6}$, and ${X+Y+Z=10}$. More generally, this way of thinking is useful when faced with a linear programming situation (and of course linear programming is a key foundation for operations research), although this analogy begins to break down when one wants to use inequalities in a more non-linear fashion.
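This kind of deal-spotting is exactly what a linear programming solver automates. As a quick sanity check, here is a sketch (assuming SciPy is available; the variable ordering ${(X,Y,Z)}$ is my own choice) that asks `scipy.optimize.linprog` for the best possible price of ${Y}$ under the above exchange board:

```python
from scipy.optimize import linprog

# Maximize Y subject to:
#   2 <= Y <= 5,   X + Y <= 7,   X + Y + Z = 10,   Y + Z <= 6.
# linprog minimizes, so we minimize -Y; variables are (X, Y, Z).
res = linprog(
    c=[0, -1, 0],                    # objective: minimize -Y
    A_ub=[[1, 1, 0], [0, 1, 1]],     # X + Y <= 7,  Y + Z <= 6
    b_ub=[7, 6],
    A_eq=[[1, 1, 1]],                # X + Y + Z = 10
    b_eq=[10],
    bounds=[(None, None), (2, 5), (None, None)],
)
print(-res.fun)  # best possible "price" for Y
```

The solver finds the same deal as the sharp-eyed shopper: the true going rate for ${Y}$ is ${3}$, not the advertised ${5}$.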

Asymptotic estimates such as ${X = O(Y)}$ (also often written ${X \lesssim Y}$ or ${X \ll Y}$) can be viewed as some sort of liquid market in which ${Y}$ can be used to purchase ${X}$, though depending on market rates, one may need a large number of units of ${Y}$ in order to buy a single unit of ${X}$. An asymptotic estimate like ${X=o(Y)}$ represents an economic situation in which ${Y}$ is so much more highly desired than ${X}$ that, if one is a patient enough haggler, one can eventually convince someone to give up a unit of ${X}$ for even just a tiny amount of ${Y}$.

When it comes to the basic analysis concepts of convergence and continuity, one can similarly view these concepts as various economic transactions involving the buying and selling of accuracy. One could for instance imagine the following hypothetical range of products, in which one needs to spend more money to obtain higher accuracy when measuring a weight in grams:

| Object | Accuracy | Price |
|---|---|---|
| Low-end kitchen scale | ${\pm 1}$ gram | \$5 |
| High-end bathroom scale | ${\pm 0.1}$ grams | \$15 |
| Low-end lab scale | ${\pm 0.01}$ grams | \$50 |
| High-end lab scale | ${\pm 0.001}$ grams | \$250 |

The concept of convergence ${x_n \rightarrow x}$ of a sequence ${x_1,x_2,x_3,\dots}$ to a limit ${x}$ could then be viewed as somewhat analogous to a rewards program, of the type offered for instance by airlines, in which various tiers of perks are offered when one hits a certain level of “currency” (e.g., frequent flyer miles). For instance, the convergence of the sequence ${x_n := 2 + \frac{1}{\sqrt{n}}}$ to its limit ${x := 2}$ offers the following accuracy “perks” depending on one’s level ${n}$ in the sequence:

| Status | Accuracy benefit | Eligibility |
|---|---|---|
| Basic status | ${\lvert x_n - x\rvert \leq 1}$ | ${n \geq 1}$ |
| Bronze status | ${\lvert x_n - x\rvert \leq 0.1}$ | ${n \geq 10^2}$ |
| Silver status | ${\lvert x_n - x\rvert \leq 0.01}$ | ${n \geq 10^4}$ |
| Gold status | ${\lvert x_n - x\rvert \leq 0.001}$ | ${n \geq 10^6}$ |
| ${\dots}$ | ${\dots}$ | ${\dots}$ |

With this conceptual model, convergence means that any status level of accuracy can be unlocked if one’s number ${n}$ of “points earned” is high enough.
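For this particular sequence the eligibility thresholds can be computed in closed form: since ${|x_n - x| = 1/\sqrt{n}}$, accuracy ${\varepsilon}$ is unlocked exactly when ${n \geq 1/\varepsilon^2}$. A quick illustrative computation (the function name is mine, not standard):

```python
import math

def eligibility_threshold(eps: float) -> int:
    """Smallest n at which |x_n - 2| = 1/sqrt(n) is guaranteed to be <= eps."""
    return math.ceil(1 / eps**2)

for eps in [1, 0.1, 0.01, 0.001]:
    n = eligibility_threshold(eps)
    print(f"accuracy {eps}: unlocked at status level n >= {n}")
```

This reproduces the tiers in the table above: each extra decimal digit of accuracy costs a factor of ${100}$ in "points earned".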

In a similar vein, continuity becomes analogous to a conversion program, in which accuracy benefits from one company can be traded in for new accuracy benefits in another company. For instance, the continuity of the function ${f(x) = 2 + \sqrt{x}}$ at the point ${x_0=0}$ can be viewed in terms of the following conversion chart:

| Accuracy benefit of ${x}$ to trade in | Accuracy benefit of ${f(x)}$ obtained |
|---|---|
| ${\lvert x - x_0\rvert \leq 1}$ | ${\lvert f(x) - f(x_0)\rvert \leq 1}$ |
| ${\lvert x - x_0\rvert \leq 0.01}$ | ${\lvert f(x) - f(x_0)\rvert \leq 0.1}$ |
| ${\lvert x - x_0\rvert \leq 0.0001}$ | ${\lvert f(x) - f(x_0)\rvert \leq 0.01}$ |
| ${\dots}$ | ${\dots}$ |

Again, the point is that one can purchase any desired level of accuracy of ${f(x)}$ provided one trades in a suitably high level of accuracy of ${x}$.
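For this particular ${f}$ the conversion chart has a simple closed form: since ${|f(x) - f(x_0)| = \sqrt{x}}$ for ${x \geq 0}$, trading in input accuracy ${\delta = \varepsilon^2}$ buys output accuracy ${\varepsilon}$. A small sketch checking this at the worst case ${x = \delta}$ (function names are my own):

```python
import math

def delta_for(eps: float) -> float:
    """Input accuracy delta so that |x - 0| <= delta implies |f(x) - f(0)| <= eps,
    for f(x) = 2 + sqrt(x) at x_0 = 0."""
    return eps**2

f = lambda x: 2 + math.sqrt(x)
for eps in [1, 0.1, 0.01]:
    delta = delta_for(eps)
    # the worst case on [0, delta] is x = delta, where |f(x) - f(0)| = sqrt(delta) = eps
    assert abs(f(delta) - f(0)) <= eps * (1 + 1e-12)
    print(f"f(x)-accuracy {eps} costs x-accuracy {delta}")
```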

At present, the above conversion chart is only available at the single location ${x_0}$. The concept of uniform continuity can then be viewed as advertising copy declaring that “offer prices are valid in all store locations”. In a similar vein, the concept of equicontinuity for a class ${{\mathcal F}}$ of functions is a guarantee that “offer applies to all functions ${f}$ in the class ${{\mathcal F}}$”, without any price discrimination. The combined notion of uniform equicontinuity is then of course the claim that the offer is valid in all locations and for all functions.

In a similar vein, differentiability can be viewed as a deal in which one can trade in accuracy of the input for approximately linear behavior of the output; to oversimplify slightly, smoothness can similarly be viewed as a deal in which one trades in accuracy of the input for high-accuracy polynomial approximability of the output. Measurability of a set or function can be viewed as a deal in which one trades in a level of resolution for an accurate approximation of that set or function at the given resolution. And so forth.

Perhaps readers can propose some other examples of mathematical concepts being re-interpreted as some sort of economic transaction?

The National Academies of Sciences, Engineering, and Medicine are hosting a virtual workshop on the topic of “AI to Assist Mathematical Reasoning” from June 12-14. The tentative program can be found here. I am one of the members of the organizing committee for this workshop, together with Petros Koumoutsakos, Jordan Ellenberg, Melvin Greer, Brendan Hassett, Yann A. LeCun, Heather Macbeth, Talia Ringer, Kavitha Srinivas, and Michelle Schwalbe. There is some thematic overlap (and a few speakers in common) with the recent IPAM program on machine assisted proof, though with more of a focus on the current and projected technical capabilities of machine learning algorithms for mathematics. Registration for the event is currently open at the web page for the workshop.

As part of my duties on the President’s Council of Advisors on Science and Technology (PCAST), I am co-chairing (with Laura Greene) a working group studying the impacts of generative artificial intelligence technology (which includes popular text-based large language models such as ChatGPT or diffusion model image generators such as DALL-E 2 or Midjourney, as well as models for scientific applications such as protein design or weather prediction), both in science and in society more broadly. To this end, we will have public sessions on these topics during our PCAST meeting next week on Friday, May 19, with presentations by the following speakers, followed by an extensive Q&A session:

The event will be livestreamed on the PCAST meeting page. I am personally very much looking forward to these sessions, as I believe they will be of broad public interest.

In parallel to this, our working group is also soliciting input from the public on how to identify and promote the beneficial deployment of generative AI, and on how best to mitigate risks. Our initial focus is on the challenging topic of how to detect, counteract, and mitigate AI-generated disinformation and “deepfakes”, without sacrificing the freedom of speech and public engagement with elected officials that is needed for a healthy democracy to function; in the future we may also issue further requests centered around other aspects of generative AI. Further details of our request, and how to prepare a submission, can be found at this link.

We also encourage submissions to some additional requests for input on AI-related topics by other agencies:

1. The Office of Science and Technology Policy (OSTP) Request for Information on how automated tools are being used to surveil, monitor, and manage workers.
2. The National Telecommunications and Information Administration (NTIA) request for comment on AI accountability policy.

Readers who wish to know more about existing or ongoing federal AI policy efforts may also be interested in the following resources:

• The White House Blueprint for an AI Bill of Rights lays out core aspirational principles to guide the responsible design and deployment of AI technologies.
• The National Institute of Standards and Technology (NIST) released the AI Risk Management Framework to help organizations and individuals characterize and manage the potential risks of AI technologies.
• Congress created the National Security Commission on AI, which studied opportunities and risks ahead and the importance of guiding the development of AI in accordance with American values around democracy and civil liberties.
• The National Artificial Intelligence Initiative was launched to ensure U.S. leadership in the responsible development and deployment of trustworthy AI and support coordination of U.S. research, development, and demonstration of AI technologies across the Federal government.
• In January 2023, the Congressionally mandated National AI Research Resource (NAIRR) Task Force released an implementation plan for providing computational, data, testbed, and software resources to AI researchers affiliated with U.S. organizations.

The Elias M. Stein Prize for New Perspectives in Analysis is awarded for the development of groundbreaking methods in analysis which demonstrate promise to revitalize established areas or create new opportunities for mathematical discovery. The current prize amount is US\$5,000 and the prize is awarded every three years for work published in the preceding six years.

This prize was endowed in 2022 by students, colleagues, and friends of Elias M. Stein (my former advisor) to honor his remarkable legacy in the area of mathematical analysis. Stein, who passed away in 2018, is remembered for identifying many deep principles and methods which transcend their original context, and for opening entirely new areas of research which captivated the attention and imagination of generations of analysts. This prize seeks to recognize mathematicians at any career stage who, like Stein, have found exciting new avenues for mathematical exploration in subjects old or new or made deep insights which demonstrate promise to reshape thinking across areas.

This will be the inaugural year for the prize, and I have agreed to serve on the prize committee. We welcome nominations for the prize, which will be accepted until June 30, 2023, and are seeking a strong and diverse pool of nominees. Nominations (submitted at this link) should include a letter of nomination and a brief citation to be used in the event that the nomination is successful. Alternatively, if you are aware of a strong potential candidate but are not able to provide the nomination yourself, we welcome your suggestion (by private email) along with — if possible — your suggestions of possible nominators.

Asgar Jamneshan, Or Shalom, and I have just uploaded to the arXiv our preprints “A Host–Kra ${{\bf F}^\omega_2}$-system of order 5 that is not Abramov of order 5, and non-measurability of the inverse theorem for the ${U^6({\bf F}^n_2)}$ norm” and “The structure of totally disconnected Host–Kra–Ziegler factors, and the inverse theorem for the ${U^k}$ Gowers uniformity norms on finite abelian groups of bounded torsion“. These two papers are both concerned with advancing the inverse theory for the Gowers norms and Gowers-Host-Kra seminorms; the first paper provides a counterexample in this theory (in particular disproving a conjecture of Bergelson, Ziegler and myself), and the second paper gives new positive results in the case when the underlying group is bounded torsion, or the ergodic system is totally disconnected. I discuss the two papers more below the fold.

The International Center for Mathematical Sciences in Edinburgh recently launched its “Mathematics for Humanity” initiative with a call for research activity proposals (ranging from small collaborations to courses, workshops and conferences) aimed at using mathematics to contribute to the betterment of humanity. (I have agreed to serve on the scientific committee to evaluate these proposals.) We launched this initiative in January and initially set the deadline for April 15, but several people who had expressed interest felt that this was insufficient time to prepare a quality proposal, so we have now extended the deadline to June 1, and welcome further applications.

See also this Mathstodon post from fellow committee member John Baez last year where he solicited some preliminary suggestions for proposals, and my previous Mathstodon announcement of this programme.

This is a somewhat experimental and speculative post. This week I was at the IPAM workshop on machine assisted proof that I was one of the organizers of. We had an interesting and diverse range of talks, both from computer scientists presenting the latest available tools to formally verify proofs or to automate various aspects of proof writing or proof discovery, as well as mathematicians who described their experiences using these tools to solve their research problems. One can find the videos of these talks on the IPAM youtube channel; I also posted about the talks during the event on my Mathstodon account. I am of course not the most objective person to judge, but from the feedback I received it seems that the conference was able to successfully achieve its aim of bringing together the different communities interested in this topic.

As a result of the conference I started thinking about what possible computer tools might now be developed that could be of broad use to mathematicians, particularly those who do not have prior expertise with the finer aspects of writing code or installing software. One idea that came to mind was a potential tool that could take, say, an arXiv preprint as input, and return some sort of diagram detailing the logical flow of the main theorems and lemmas in the paper. This is currently done by hand by authors in some, but not all, papers (and can often also be automatically generated from formally verified proofs, as seen for instance in the graphic accompanying the IPAM workshop, or this diagram generated from Massot’s blueprint software from a manually inputted set of theorems and dependencies as a precursor to formalization of a proof [thanks to Thomas Bloom for this example]). For instance, here is a diagram that my co-author Rachel Greenfeld and I drew for a recent paper:

This particular diagram incorporated a number of subjective design choices regarding layout, which results were deemed important enough to merit a dedicated box (as opposed to being viewed as a mere tool to get from one box to another), and how to describe each of these results (and how to colour-code them). This is still a very human-intensive task (and my co-author and I went through several iterations of this particular diagram with much back-and-forth discussion until we were both satisfied). But I could see the possibility of creating an automatic tool that could provide an initial “first approximation” to such a diagram, which a human user could then modify as they see fit (perhaps using some convenient GUI interface, for instance some variant of the Quiver online tool for drawing commutative diagrams in LaTeX).

As a crude first attempt at automatically generating such a diagram, one could perhaps develop a tool to scrape a LaTeX file to locate all the instances of the theorem environment in the text (i.e., all the formally identified lemmas, corollaries, and so forth), and for each such theorem, locate a proof environment instance that looks like it is associated to that theorem (doing this with reasonable accuracy may require a small amount of machine learning, though perhaps one could just hope that proximity of the proof environment instance to the theorem environment instance suffices in many cases). Then one could identify all the references within that proof environment to other theorems to start building the tree of implications, which one could then depict in a diagram such as the above. Such an approach would likely miss many of the implications: for instance, many lemmas might not be proven using a formal proof environment, but instead by some more free-flowing text discussion, or perhaps a one-line justification such as “By combining Lemma 3.4 and Proposition 3.6, we conclude”. Also, some references to other results in the paper might not proceed by direct citation, but by more indirect justifications such as “invoking the previous lemma, we obtain” or “by repeating the arguments in Section 3, we have”. Still, even such a crude diagram might be helpful, both as a starting point for authors to make an improved diagram, and for a student trying to understand a lengthy paper to get some initial idea of the logical structure.
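To give a flavour of what this first scraping pass might look like, here is a toy regex-based sketch (the environment names are the common amsthm ones, the proximity heuristic is the naive “the proof immediately follows the statement” one, and it will certainly miss cases that a real tool would need to handle):

```python
import re

def theorem_dependencies(latex: str) -> dict:
    """Map each theorem-like environment's label to the labels cited in the
    proof that immediately follows it.  A crude heuristic, not a parser."""
    envs = r"theorem|lemma|proposition|corollary"
    pattern = re.compile(
        r"\\begin\{(" + envs + r")\}(.*?)\\end\{\1\}\s*"
        r"(?:\\begin\{proof\}(.*?)\\end\{proof\})?",
        re.DOTALL)
    deps = {}
    for m in pattern.finditer(latex):
        body, proof = m.group(2), m.group(3) or ""
        label_m = re.search(r"\\label\{([^}]*)\}", body)
        label = label_m.group(1) if label_m else f"unlabeled@{m.start()}"
        deps[label] = re.findall(r"\\(?:c?ref|eqref)\{([^}]*)\}", proof)
    return deps

sample = r"""
\begin{lemma}\label{lem:a} ... \end{lemma}
\begin{proof} trivial \end{proof}
\begin{theorem}\label{thm:main} ... \end{theorem}
\begin{proof} By \ref{lem:a} and \cref{lem:a}. \end{proof}
"""
print(theorem_dependencies(sample))
```

The resulting dependency dictionary could then be fed to a graph layout tool (Graphviz, say) to produce the "first approximation" diagram described above.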

More advanced features might be to try to use more of the text of the paper to assign some measure of importance to individual results (and then weight the diagram correspondingly to highlight the more important results), to try to give each result a natural language description, and to somehow capture key statements that are not neatly encapsulated in a theorem environment instance, but I would imagine that such tasks should be deferred until some cruder proof-of-concept prototype can be demonstrated.

Anyway, I would be interested to hear opinions about whether this idea (or some modification thereof) is (a) actually feasible with current technology (or better yet, already exists in some form), and (b) of interest to research mathematicians.

Tamar Ziegler and I have just uploaded to the arXiv our paper “Infinite partial sumsets in the primes“. This is a short paper inspired by a recent result of Kra, Moreira, Richter, and Robertson (discussed for instance in this Quanta article from last December) showing that for any set ${A}$ of natural numbers of positive upper density, there exists a sequence ${b_1 < b_2 < b_3 < \dots}$ of natural numbers and a shift ${t}$ such that ${b_i + b_j + t \in A}$ for all ${i < j}$ (this answers a question of Erdős). In view of the “transference principle“, it is then plausible to ask whether the same result holds if ${A}$ is replaced by the primes. We can show the following results:

Theorem 1
• (i) If the Hardy-Littlewood prime tuples conjecture (or the weaker conjecture of Dickson) is true, then there exists an increasing sequence ${b_1 < b_2 < b_3 < \dots}$ of primes such that ${b_i + b_j + 1}$ is prime for all ${i < j}$.
• (ii) Unconditionally, there exist increasing sequences ${a_1 < a_2 < \dots}$ and ${b_1 < b_2 < \dots}$ of natural numbers such that ${a_i + b_j}$ is prime for all ${i < j}$.
• (iii) These conclusions fail if “prime” is replaced by “positive (relative) density subset of the primes” (even if the density is equal to 1).

We remark that it was shown by Balog that there (unconditionally) exist arbitrarily long but finite sequences ${b_1 < \dots < b_k}$ of primes such that ${b_i + b_j + 1}$ is prime for all ${i < j \leq k}$. (This result can also be recovered from the later results of Ben Green, myself, and Tamar Ziegler.) Also, it had previously been shown by Granville that on the Hardy-Littlewood prime tuples conjecture, there existed increasing sequences ${a_1 < a_2 < \dots}$ and ${b_1 < b_2 < \dots}$ of natural numbers such that ${a_i+b_j}$ is prime for all ${i,j}$.
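Small instances of Balog's finite result can be found by brute force. The following throwaway search (all function names are mine) greedily grows a sequence of primes ${b_1 < b_2 < \dots}$ while keeping every pairwise sum ${b_i+b_j+1}$ prime; note that all entries must be odd primes, so that each ${b_i+b_j+1}$ is odd:

```python
def is_prime(n: int) -> bool:
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def extend(seq, limit=200):
    """Greedily extend a sequence of primes keeping all b_i + b_j + 1 prime."""
    for b in range(seq[-1] + 1, limit):
        if is_prime(b) and all(is_prime(a + b + 1) for a in seq):
            seq = seq + [b]
    return seq

seq = extend([5], limit=200)
print(seq)  # a short Balog-type sequence
assert all(is_prime(a + b + 1) for i, a in enumerate(seq) for b in seq[i+1:])
```

Starting from ${b_1 = 5}$, the greedy search finds ${5, 7, 11, 101}$ (e.g. ${5+7+1=13}$, ${7+11+1=19}$, ${11+101+1=113}$ are all prime); of course Balog's theorem, and conjecturally part (i) above, go far beyond what such a naive search can reach.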

The conclusion of (i) is stronger than that of (ii) (which is of course consistent with the former being conditional and the latter unconditional). The conclusion (ii) also implies the well-known theorem of Maynard that for any given ${k}$, there exist infinitely many ${k}$-tuples of primes of bounded diameter, and indeed our proof of (ii) uses the same “Maynard sieve” that powers the proof of that theorem (though we use a formulation of that sieve closer to that in this blog post of mine). Indeed, the failure of (iii) basically arises from the failure of Maynard’s theorem for dense subsets of primes, simply by removing those clusters of primes that are unusually closely spaced.

Our proof of (i) was initially inspired by the topological dynamics methods used by Kra, Moreira, Richter, and Robertson, but we managed to condense it to a purely elementary argument (taking up only half a page) that makes no reference to topological dynamics and builds up the sequence ${b_1 < b_2 < \dots}$ recursively by repeated application of the prime tuples conjecture.

The proof of (ii) takes up the majority of the paper. It is easiest to phrase the argument in terms of “prime-producing tuples” – tuples ${(h_1,\dots,h_k)}$ for which there are infinitely many ${n}$ with ${n+h_1,\dots,n+h_k}$ all prime. Maynard’s theorem is equivalent to the existence of arbitrarily long prime-producing tuples; our theorem is equivalent to the stronger assertion that there exists an infinite sequence ${h_1 < h_2 < \dots}$ such that every initial segment ${(h_1,\dots,h_k)}$ is prime-producing. The main new tool for achieving this is the following cute measure-theoretic lemma of Bergelson:

Lemma 2 (Bergelson intersectivity lemma) Let ${E_1,E_2,\dots}$ be subsets of a probability space ${(X,\mu)}$ of measure uniformly bounded away from zero, thus ${\inf_i \mu(E_i) > 0}$. Then there exists a subsequence ${E_{i_1}, E_{i_2}, \dots}$ such that

$\displaystyle \mu(E_{i_1} \cap \dots \cap E_{i_k} ) > 0$

for all ${k}$.

This lemma has a short proof, though not an entirely obvious one. Firstly, by deleting a null set from ${X}$, one can assume that all finite intersections ${E_{i_1} \cap \dots \cap E_{i_k}}$ are either positive measure or empty. Secondly, a routine application of Fatou’s lemma shows that the maximal function ${\limsup_N \frac{1}{N} \sum_{i=1}^N 1_{E_i}}$ has a positive integral, hence must be positive at some point ${x_0}$. Thus there is a subsequence ${E_{i_1}, E_{i_2}, \dots}$ whose finite intersections all contain ${x_0}$, thus have positive measure as desired by the previous reduction.
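The pigeonhole heart of this argument can be illustrated on a finite probability space: if each ${E_i}$ occupies at least a fixed fraction of a finite set ${X}$ with uniform measure, then averaging the indicator functions produces a point ${x_0}$ lying in at least that fraction of the ${E_i}$, and the sets through ${x_0}$ form the desired subsequence. A toy sketch (the particular sets are made up purely for illustration):

```python
# Toy finite version of the Bergelson intersectivity argument:
# each E_i has measure exactly 1/2 in X = {0,...,99}; find a point x0 lying in
# at least half of the E_i, then take the subsequence of sets containing x0.
X = range(100)
E = [set(x for x in X if (x + 7 * i) % 100 < 50) for i in range(20)]

counts = {x: sum(x in Ei for Ei in E) for x in X}   # N times the averaged indicator
x0 = max(counts, key=counts.get)                    # a point of large "maximal function"
subsequence = [i for i, Ei in enumerate(E) if x0 in Ei]

assert counts[x0] >= len(E) // 2                    # pigeonhole guarantee
# every finite intersection along the subsequence contains x0, hence is nonempty
print(x0, subsequence)
```

In the infinite setting one replaces this finite averaging with Fatou's lemma applied to ${\limsup_N \frac{1}{N} \sum_{i=1}^N 1_{E_i}}$, as in the proof sketched above.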

It turns out that one cannot quite combine the standard Maynard sieve with the intersectivity lemma because the events ${E_i}$ that show up (which roughly correspond to the event that ${n + h_i}$ is prime for some random number ${n}$ (with a well-chosen probability distribution) and some shift ${h_i}$) have their probability going to zero, rather than being uniformly bounded from below. To get around this, we borrow an idea from a paper of Banks, Freiberg, and Maynard, and group the shifts ${h_i}$ into various clusters ${h_{i,1},\dots,h_{i,J_i}}$, chosen in such a way that the probability that at least one of ${n+h_{i,1},\dots,n+h_{i,J_i}}$ is prime is bounded uniformly from below. One then applies the Bergelson intersectivity lemma to those events and uses many applications of the pigeonhole principle to conclude.

Over the last few years, I have served on a committee of the National Academy of Sciences to produce some posters and other related media to showcase twenty-first century mathematics and its applications in the real world, suitable for display in classrooms or math departments. Our posters (together with some associated commentary, webinars on related topics, and even a whimsical “comic“) are now available for download here.

This post is an unofficial sequel to one of my first blog posts from 2007, which was entitled “Quantum mechanics and Tomb Raider“.

One of the oldest and most famous allegories is Plato’s allegory of the cave. This allegory centers around a group of people chained to a wall in a cave, who cannot see themselves or each other, but only the two-dimensional shadows of themselves cast on the wall in front of them by some light source they cannot directly see. Because of this, they identify reality with this two-dimensional representation, and have significant conceptual difficulties in trying to view themselves (or the world as a whole) as three-dimensional, until they are freed from the cave and able to venture into the sunlight.

There is a similar conceptual difficulty when trying to understand Einstein’s theory of special relativity (and more so for general relativity, but let us focus on special relativity for now). We are very much accustomed to thinking of reality as a three-dimensional space endowed with a Euclidean geometry that we traverse through in time, but in order to have the clearest view of the universe of special relativity it is better to think of reality instead as a four-dimensional spacetime that is endowed instead with a Minkowski geometry, which mathematically is similar to a (four-dimensional) Euclidean space but with a crucial change of sign in the underlying metric. Indeed, whereas the distance ${ds}$ between two points in Euclidean space ${{\bf R}^3}$ is given by the three-dimensional Pythagorean theorem

$\displaystyle ds^2 = dx^2 + dy^2 + dz^2$

under some standard Cartesian coordinate system ${(x,y,z)}$ of that space, and the distance ${ds}$ in a four-dimensional Euclidean space ${{\bf R}^4}$ would be similarly given by

$\displaystyle ds^2 = dx^2 + dy^2 + dz^2 + du^2$

under a standard four-dimensional Cartesian coordinate system ${(x,y,z,u)}$, the spacetime interval ${ds}$ in Minkowski space is given by

$\displaystyle ds^2 = dx^2 + dy^2 + dz^2 - c^2 dt^2$

(though in many texts the opposite sign convention ${ds^2 = -dx^2 -dy^2 - dz^2 + c^2dt^2}$ is preferred) in spacetime coordinates ${(x,y,z,t)}$, where ${c}$ is the speed of light. The geometry of Minkowski space is then quite similar algebraically to the geometry of Euclidean space (with the sign change replacing the traditional trigonometric functions ${\sin, \cos, \tan}$, etc. by their hyperbolic counterparts ${\sinh, \cosh, \tanh}$, and with various factors involving “${c}$” inserted in the formulae), but also has some qualitative differences to Euclidean space, most notably a causality structure connected to light cones that has no obvious counterpart in Euclidean space.
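The “hyperbolic rotation” analogy can be made concrete: a Lorentz boost of rapidity ${\phi}$ mixes the ${x}$ and ${t}$ axes via ${\cosh}$ and ${\sinh}$, exactly as a Euclidean rotation mixes two spatial axes via ${\cos}$ and ${\sin}$, and it preserves the interval above. A quick numerical sketch of this invariance (one spatial dimension, units chosen with ${c=1}$):

```python
import math

def boost(x: float, t: float, phi: float) -> tuple:
    """Lorentz boost of rapidity phi in the (x, t) plane, with c = 1."""
    return (x * math.cosh(phi) - t * math.sinh(phi),
            t * math.cosh(phi) - x * math.sinh(phi))

def interval(x: float, t: float) -> float:
    return x**2 - t**2   # ds^2 = dx^2 - c^2 dt^2, restricted to one spatial dimension

x, t = 3.0, 5.0
xb, tb = boost(x, t, phi=0.7)
# invariance follows from cosh^2 - sinh^2 = 1, just as cos^2 + sin^2 = 1
# guarantees that Euclidean rotations preserve x^2 + y^2
assert abs(interval(x, t) - interval(xb, tb)) < 1e-9
print(interval(x, t), interval(xb, tb))
```

The same computation with ${\cos}$ and ${\sin}$ in place of ${\cosh}$ and ${\sinh}$ (and a ${+}$ sign in the interval) would verify that ordinary rotations preserve Euclidean distance, which is precisely the algebraic kinship between the two geometries mentioned above.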

That said, the analogy between Minkowski space and four-dimensional Euclidean space is strong enough that it serves as a useful conceptual aid when first learning special relativity; for instance the excellent introductory text “Spacetime physics” by Taylor and Wheeler very much adopts this view. On the other hand, this analogy doesn’t directly address the conceptual problem mentioned earlier of viewing reality as a four-dimensional spacetime in the first place, rather than as a three-dimensional space that objects move around in as time progresses. Of course, part of the issue is that we aren’t good at directly visualizing four dimensions in the first place. This latter problem can at least be easily addressed by removing one or two spatial dimensions from this framework – and indeed many relativity texts start with the simplified setting of only having one spatial dimension, so that spacetime becomes two-dimensional and can be depicted with relative ease by spacetime diagrams – but still there is conceptual resistance to the idea of treating time as another spatial dimension, since we clearly cannot “move around” in time as freely as we can in space, nor do we seem able to easily “rotate” between the spatial and temporal axes, the way that we can between the three coordinate axes of Euclidean space.

With this in mind, I thought it might be worth attempting a Plato-type allegory to reconcile the spatial and spacetime views of reality, in a way that can be used to describe (analogues of) some of the less intuitive features of relativity, such as time dilation, length contraction, and the relativity of simultaneity. I have (somewhat whimsically) decided to place this allegory in a Tolkienesque fantasy world (similarly to how my previous allegory to describe quantum mechanics was phrased in a world based on the computer game “Tomb Raider”). This is something of an experiment, and (like any other analogy) the allegory will not be able to perfectly capture every aspect of the phenomenon it is trying to represent, so any feedback to improve the allegory would be appreciated.