The (now virtual) 2022 International Congress of Mathematicians, which will be held on July 6-14, now has open registration (free of charge).

I’ll also take this opportunity to mention that there are a large number of supporting satellite events for the virtual ICM, which are listed on this web page. I’d like to draw particular attention to the public lecture satellite event, now hosted by the London Mathematical Society, that will feature three speakers:

(As with many other of the satellite events, these public lectures will require a separate registration from that of the main ICM.)

Let ${G}$ be a finite set of order ${N}$; in applications ${G}$ will be typically something like a finite abelian group, such as the cyclic group ${{\bf Z}/N{\bf Z}}$. Let us define a ${1}$-bounded function to be a function ${f: G \rightarrow {\bf C}}$ such that ${|f(n)| \leq 1}$ for all ${n \in G}$. There are many seminorms ${\| \|}$ of interest that one places on functions ${f: G \rightarrow {\bf C}}$ that are bounded by ${1}$ on ${1}$-bounded functions, such as the Gowers uniformity seminorms ${\| \|_k}$ for ${k \geq 1}$ (which are genuine norms for ${k \geq 2}$). All seminorms in this post will be implicitly assumed to obey this property.

In additive combinatorics, a significant role is played by inverse theorems, which abstractly take the following form for certain choices of seminorm ${\| \|}$, some parameters ${\eta, \varepsilon>0}$, and some class ${{\mathcal F}}$ of ${1}$-bounded functions:

Theorem 1 (Inverse theorem template) If ${f}$ is a ${1}$-bounded function with ${\|f\| \geq \eta}$, then there exists ${F \in {\mathcal F}}$ such that ${|\langle f, F \rangle| \geq \varepsilon}$, where ${\langle,\rangle}$ denotes the usual inner product

$\displaystyle \langle f, F \rangle := {\bf E}_{n \in G} f(n) \overline{F(n)}.$

Informally, one should think of ${\eta}$ as being somewhat small but fixed independently of ${N}$, ${\varepsilon}$ as being somewhat smaller but depending only on ${\eta}$ (and on the seminorm), and ${{\mathcal F}}$ as representing the “structured functions” for these choices of parameters. There is some flexibility in exactly how to choose the class ${{\mathcal F}}$ of structured functions, but intuitively an inverse theorem should become more powerful when this class is small. Accordingly, let us define the ${(\eta,\varepsilon)}$-entropy of the seminorm ${\| \|}$ to be the least cardinality of ${{\mathcal F}}$ for which such an inverse theorem holds. Seminorms with low entropy are ones for which inverse theorems can be expected to be a useful tool. This concept arose in some discussions I had with Ben Green many years ago, but never appeared in print, so I decided to record some observations we had on this concept here on this blog.

Lebesgue norms ${\| f\|_{L^p} := ({\bf E}_{n \in G} |f(n)|^p)^{1/p}}$ for ${1 < p < \infty}$ have exponentially large entropy (and so inverse theorems are not expected to be useful in this case):

Proposition 2 (${L^p}$ norm has exponentially large inverse entropy) Let ${1 < p < \infty}$ and ${0 < \eta < 1}$. Then the ${(\eta,\eta^p/4)}$-entropy of ${\| \|_{L^p}}$ is at most ${(1+8/\eta^p)^N}$. Conversely, for any ${\varepsilon>0}$, the ${(\eta,\varepsilon)}$-entropy of ${\| \|_{L^p}}$ is at least ${\exp( c \varepsilon^2 N)}$ for some absolute constant ${c>0}$.

Proof: If ${f}$ is ${1}$-bounded with ${\|f\|_{L^p} \geq \eta}$, then we have

$\displaystyle |\langle f, |f|^{p-2} f \rangle| \geq \eta^p$

and hence by the triangle inequality we have

$\displaystyle |\langle f, F \rangle| \geq \eta^p/2$

where ${F}$ is either the real or imaginary part of ${|f|^{p-2} f}$, which takes values in ${[-1,1]}$. If we let ${\tilde F}$ be ${F}$ rounded to the nearest multiple of ${\eta^p/4}$, then by the triangle inequality again we have

$\displaystyle |\langle f, \tilde F \rangle| \geq \eta^p/4.$

There are only at most ${1+8/\eta^p}$ possible values for each value ${\tilde F(n)}$ of ${\tilde F}$, and hence at most ${(1+8/\eta^p)^N}$ possible choices for ${\tilde F}$. This gives the first claim.

Now suppose that there is an ${(\eta,\varepsilon)}$-inverse theorem for some ${{\mathcal F}}$ of cardinality ${M}$. If we let ${f}$ be a random sign function (so the ${f(n)}$ are independent random variables taking values in ${-1,+1}$ with equal probability), then there is a random ${F \in {\mathcal F}}$ such that

$\displaystyle |\langle f, F \rangle| \geq \varepsilon$

and hence by the pigeonhole principle there is a deterministic ${F \in {\mathcal F}}$ such that

$\displaystyle {\bf P}( |\langle f, F \rangle| \geq \varepsilon ) \geq 1/M.$

On the other hand, from the Hoeffding inequality one has

$\displaystyle {\bf P}( |\langle f, F \rangle| \geq \varepsilon ) \ll \exp( - c \varepsilon^2 N )$

for some absolute constant ${c}$, hence

$\displaystyle M \geq \exp( c \varepsilon^2 N )$

as claimed. $\Box$

Most seminorms of interest in additive combinatorics, such as the Gowers uniformity norms, are bounded by some finite ${L^p}$ norm thanks to Hölder’s inequality, so from the above proposition and the obvious monotonicity properties of entropy, we conclude that all Gowers norms on finite abelian groups ${G}$ have at most exponential inverse theorem entropy. But we can do significantly better than this:

• For the ${U^1}$ seminorm ${\|f\|_{U^1(G)} := |{\bf E}_{n \in G} f(n)|}$, one can simply take ${{\mathcal F} = \{1\}}$ to consist of the constant function ${1}$, and the ${(\eta,\eta)}$-entropy is clearly equal to ${1}$ for any ${0 < \eta < 1}$.
• For the ${U^2}$ norm, the standard Fourier-analytic inverse theorem asserts that if ${\|f\|_{U^2(G)} \geq \eta}$ then ${|\langle f, e(\xi \cdot) \rangle| \geq \eta^2}$ for some Fourier character ${\xi \in \hat G}$. Thus the ${(\eta,\eta^2)}$-entropy is at most ${N}$.
• For the ${U^k({\bf Z}/N{\bf Z})}$ norm on cyclic groups for ${k > 2}$, the inverse theorem proved by Green, Ziegler, and myself gives an ${(\eta,\varepsilon)}$-inverse theorem for some ${\varepsilon \gg_{k,\eta} 1}$ and ${{\mathcal F}}$ consisting of nilsequences ${n \mapsto F(g(n) \Gamma)}$ for some filtered nilmanifold ${G/\Gamma}$ of degree ${k-1}$ in a finite collection of cardinality ${O_{\eta,k}(1)}$, some polynomial sequence ${g: {\bf Z} \rightarrow G}$ (which was subsequently observed by Candela-Sisask (see also Manners) that one can choose to be ${N}$-periodic), and some Lipschitz function ${F: G/\Gamma \rightarrow {\bf C}}$ of Lipschitz norm ${O_{\eta,k}(1)}$. By the Arzela-Ascoli theorem, the number of possible ${F}$ (up to uniform errors of size at most ${\varepsilon/2}$, say) is ${O_{\eta,k}(1)}$. By standard arguments one can also ensure that the coefficients of the polynomial ${g}$ are ${O_{\eta,k}(1)}$, and then by periodicity there are only ${O(N^{O_{\eta,k}(1)}}$ such polynomials. As a consequence, the ${(\eta,\varepsilon)}$-entropy is of polynomial size ${O_{\eta,k}( N^{O_{\eta,k}(1)} )}$ (a fact that seems to have first been implicitly observed in Lemma 6.2 of this paper of Frantzikinakis; thanks to Ben Green for this reference). One can obtain more precise dependence on ${\eta,k}$ using the quantitative version of this inverse theorem due to Manners; back of the envelope calculations using Section 5 of that paper suggest to me that one can take ${\varepsilon = \eta^{O_k(1)}}$ to be polynomial in ${\eta}$ and the entropy to be of the order ${O_k( N^{\exp(\exp(\eta^{-O_k(1)}))} )}$, or alternatively one can reduce the entropy to ${O_k( \exp(\exp(\eta^{-O_k(1)})) N^{\eta^{-O_k(1)}})}$ at the cost of degrading ${\varepsilon}$ to ${1/\exp\exp( O(\eta^{-O(1)}))}$.
• If one replaces the cyclic group ${{\bf Z}/N{\bf Z}}$ by a vector space ${{\bf F}_p^n}$ over some fixed finite field ${{\bf F}_p}$ of prime order (so that ${N=p^n}$), then the inverse theorem of Ziegler and myself (available in both high and low characteristic) allows one to obtain an ${(\eta,\varepsilon)}$-inverse theorem for some ${\varepsilon \gg_{k,\eta} 1}$ and ${{\mathcal F}}$ the collection of non-classical degree ${k-1}$ polynomial phases from ${{\bf F}_p^n}$ to ${S^1}$, which one can normalize to equal ${1}$ at the origin, and then by the classification of such polynomials one can calculate that the ${(\eta,\varepsilon)}$ entropy is of quasipolynomial size ${\exp( O_{p,k}(n^{k-1}) ) = \exp( O_{p,k}( \log^{k-1} N ) )}$ in ${N}$. By using the recent work of Gowers and Milicevic, one can make the dependence on ${p,k}$ here more precise, but we will not perform these calcualtions here.
• For the ${U^3(G)}$ norm on an arbitrary finite abelian group, the recent inverse theorem of Jamneshan and myself gives (after some calculations) a bound of the polynomial form ${O( q^{O(n^2)} N^{\exp(\eta^{-O(1)})})}$ on the ${(\eta,\varepsilon)}$-entropy for some ${\varepsilon \gg \eta^{O(1)}}$, which one can improve slightly to ${O( q^{O(n^2)} N^{\eta^{-O(1)}})}$ if one degrades ${\varepsilon}$ to ${1/\exp(\eta^{-O(1)})}$, where ${q}$ is the maximal order of an element of ${G}$, and ${n}$ is the rank (the number of elements needed to generate ${G}$). This bound is polynomial in ${N}$ in the cyclic group case and quasipolynomial in general.

For general finite abelian groups ${G}$, we do not yet have an inverse theorem of comparable power to the ones mentioned above that give polynomial or quasipolynomial upper bounds on the entropy. However, there is a cheap argument that at least gives some subexponential bounds:

Proposition 3 (Cheap subexponential bound) Let ${k \geq 2}$ and ${0 < \eta < 1/2}$, and suppose that ${G}$ is a finite abelian group of order ${N \geq \eta^{-C_k}}$ for some sufficiently large ${C_k}$. Then the ${(\eta,c_k \eta^{O_k(1)})}$-complexity of ${\| \|_{U^k(G)}}$ is at most ${O( \exp( \eta^{-O_k(1)} N^{1 - \frac{k+1}{2^k-1}} ))}$.

Proof: (Sketch) We use a standard random sampling argument, of the type used for instance by Croot-Sisask or Briet-Gopi (thanks to Ben Green for this latter reference). We can assume that ${N \geq \eta^{-C_k}}$ for some sufficiently large ${C_k>0}$, since otherwise the claim follows from Proposition 2.

Let ${A}$ be a random subset of ${{\bf Z}/N{\bf Z}}$ with the events ${n \in A}$ being iid with probability ${0 < p < 1}$ to be chosen later, conditioned to the event ${|A| \leq 2pN}$. Let ${f}$ be a ${1}$-bounded function. By a standard second moment calculation, we see that with probability at least ${1/2}$, we have

$\displaystyle \|f\|_{U^k(G)}^{2^k} = {\bf E}_{n, h_1,\dots,h_k \in G} f(n) \prod_{\omega \in \{0,1\}^k \backslash \{0\}} {\mathcal C}^{|\omega|} \frac{1}{p} 1_A f(n + \omega \cdot h)$

$\displaystyle + O((\frac{1}{N^{k+1} p^{2^k-1}})^{1/2}).$

Thus, by the triangle inequality, if we choose ${p := C \eta^{-2^{k+1}/(2^k-1)} / N^{\frac{k+1}{2^k-1}}}$ for some sufficiently large ${C = C_k > 0}$, then for any ${1}$-bounded ${f}$ with ${\|f\|_{U^k(G)} \geq \eta/2}$, one has with probability at least ${1/2}$ that

$\displaystyle |{\bf E}_{n, h_1,\dots,h_k \i2^n G} f(n) \prod_{\omega \in \{0,1\}^k \backslash \{0\}} {\mathcal C}^{|\omega|} \frac{1}{p} 1_A f(n + \omega \cdot h)|$

$\displaystyle \geq \eta^{2^k}/2^{2^k+1}.$

We can write the left-hand side as ${|\langle f, F \rangle|}$ where ${F}$ is the randomly sampled dual function

$\displaystyle F(n) := {\bf E}_{n, h_1,\dots,h_k \in G} f(n) \prod_{\omega \in \{0,1\}^k \backslash \{0\}} {\mathcal C}^{|\omega|+1} \frac{1}{p} 1_A f(n + \omega \cdot h).$

Unfortunately, ${F}$ is not ${1}$-bounded in general, but we have

$\displaystyle \|F\|_{L^2(G)}^2 \leq {\bf E}_{n, h_1,\dots,h_k ,h'_1,\dots,h'_k \in G}$

$\displaystyle \prod_{\omega \in \{0,1\}^k \backslash \{0\}} \frac{1}{p} 1_A(n + \omega \cdot h) \frac{1}{p} 1_A(n + \omega \cdot h')$

and the right-hand side can be shown to be ${1+o(1)}$ on the average, so we can condition on the event that the right-hand side is ${O(1)}$ without significant loss in falure probability.

If we then let ${\tilde f_A}$ be ${1_A f}$ rounded to the nearest Gaussian integer multiple of ${\eta^{2^k}/2^{2^{10k}}}$ in the unit disk, one has from the triangle inequality that

$\displaystyle |\langle f, \tilde F \rangle| \geq \eta^{2^k}/2^{2^k+2}$

where ${\tilde F}$ is the discretised randomly sampled dual function

$\displaystyle \tilde F(n) := {\bf E}_{n, h_1,\dots,h_k \in G} f(n) \prod_{\omega \in \{0,1\}^k \backslash \{0\}} {\mathcal C}^{|\omega|+1} \frac{1}{p} \tilde f_A(n + \omega \cdot h).$

For any given ${A}$, there are at most ${2np}$ places ${n}$ where ${\tilde f_A(n)}$ can be non-zero, and in those places there are ${O_k( \eta^{-2^{k}})}$ possible values for ${\tilde f_A(n)}$. Thus, if we let ${{\mathcal F}_A}$ be the collection of all possible ${\tilde f_A}$ associated to a given ${A}$, the cardinality of this set is ${O( \exp( \eta^{-O_k(1)} N^{1 - \frac{k+1}{2^k-1}} ) )}$, and for any ${f}$ with ${\|f\|_{U^k(G)} \geq \eta/2}$, we have

$\displaystyle \sup_{\tilde F \in {\mathcal F}_A} |\langle f, \tilde F \rangle| \geq \eta^{2^k}/2^{k+2}$

with probability at least ${1/2}$.

Now we remove the failure probability by independent resampling. By rounding to the nearest Gaussian integer multiple of ${c_k \eta^{2^k}}$ in the unit disk for a sufficiently small ${c_k>0}$, one can find a family ${{\mathcal G}}$ of cardinality ${O( \eta^{-O_k(N)})}$ consisting of ${1}$-bounded functions ${\tilde f}$ of ${U^k(G)}$ norm at least ${\eta/2}$ such that for every ${1}$-bounded ${f}$ with ${\|f\|_{U^k(G)} \geq \eta}$ there exists ${\tilde f \in {\mathcal G}}$ such that

$\displaystyle \|f-\tilde f\|_{L^\infty(G)} \leq \eta^{2^k}/2^{k+3}.$

Now, let ${A_1,\dots,A_M}$ be independent samples of ${A}$ for some ${M}$ to be chosen later. By the preceding discussion, we see that with probability at least ${1 - 2^{-M}}$, we have

$\displaystyle \sup_{\tilde F \in \bigcup_{j=1}^M {\mathcal F}_{A_j}} |\langle \tilde f, \tilde F \rangle| \geq \eta^{2^k}/2^{k+2}$

for any given ${\tilde f \in {\mathcal G}}$, so by the union bound, if we choose ${M = \lfloor C N \log \frac{1}{\eta} \rfloor}$ for a large enough ${C = C_k}$, we can find ${A_1,\dots,A_M}$ such that

$\displaystyle \sup_{\tilde F \in \bigcup_{j=1}^M {\mathcal F}_{A_j}} |\langle \tilde f, \tilde F \rangle| \geq \eta^{2^k}/2^{k+2}$

for all ${\tilde f \in {\mathcal G}}$, and hence y the triangle inequality

$\displaystyle \sup_{\tilde F \in \bigcup_{j=1}^M {\mathcal F}_{A_j}} |\langle f, \tilde F \rangle| \geq \eta^{2^k}/2^{k+3}.$

Taking ${{\mathcal F}}$ to be the union of the ${{\mathcal F}_{A_j}}$ (applying some truncation and rescaling to these ${L^2}$-bounded functions to make them ${L^\infty}$-bounded, and then ${1}$-bounded), we obtain the claim. $\Box$

One way to obtain lower bounds on the inverse theorem entropy is to produce a collection of almost orthogonal functions with large norm. More precisely:

Proposition 4 Let ${\| \|}$ be a seminorm, let ${0 < \varepsilon \leq \eta < 1}$, and suppose that one has a collection ${f_1,\dots,f_M}$ of ${1}$-bounded functions such that for all ${i=1,\dots,M}$, ${\|f_i\| \geq \eta}$ one has ${|\langle f_i, f_j \rangle| \leq \varepsilon^2/2}$ for all but at most ${L}$ choices of ${j \in \{1,\dots,M\}}$ for all distinct ${i,j \in \{1,\dots,M\}}$. Then the ${(\eta, \varepsilon)}$-entropy of ${\| \|}$ is at least ${\varepsilon^2 M / 2L}$.

Proof: Suppose we have an ${(\eta,\varepsilon)}$-inverse theorem with some family ${{\mathcal F}}$. Then for each ${i=1,\dots,M}$ there is ${F_i \in {\mathcal F}}$ such that ${|\langle f_i, F_i \rangle| \geq \varepsilon}$. By the pigeonhole principle, there is thus ${F \in {\mathcal F}}$ such that ${|\langle f_i, F \rangle| \geq \varepsilon}$ for all ${i}$ in a subset ${I}$ of ${\{1,\dots,M\}}$ of cardinality at least ${M/|{\mathcal F}|}$:

$\displaystyle |I| \geq M / |{\mathcal F}|.$

We can sum this to obtain

$\displaystyle |\sum_{i \in I} c_i \langle f_i, F \rangle| \geq |I| \varepsilon$

for some complex numbers ${c_i}$ of unit magnitude. By Cauchy-Schwarz, this implies

$\displaystyle \| \sum_{i \in I} c_i f_i \|_{L^2(G)}^2 \geq |I|^2 \varepsilon^2$

and hence by the triangle inequality

$\displaystyle \sum_{i,j \in I} |\langle f_i, f_j \rangle| \geq |I|^2 \varepsilon^2.$

On the other hand, by hypothesis we can bound the left-hand side by ${|I| (L + \varepsilon^2 |I|/2)}$. Rearranging, we conclude that

$\displaystyle |I| \leq 2 L / \varepsilon^2$

and hence

$\displaystyle |{\mathcal F}| \geq \varepsilon^2 M / 2L$

giving the claim. $\Box$

Thus for instance:

• For the ${U^2(G)}$ norm, one can take ${f_1,\dots,f_M}$ to be the family of linear exponential phases ${n \mapsto e(\xi \cdot n)}$ with ${M = N}$ and ${L=1}$, and obtain a linear lower bound of ${\varepsilon^2 N/2}$ for the ${(\eta,\varepsilon)}$-entropy, thus matching the upper bound of ${N}$ up to constants when ${\varepsilon}$ is fixed.
• For the ${U^k({\bf Z}/N{\bf Z})}$ norm, a similar calculation using polynomial phases of degree ${k-1}$, combined with the Weyl sum estimates, gives a lower bound of ${\gg_{k,\varepsilon} N^{k-1}}$ for the ${(\eta,\varepsilon)}$-entropy for any fixed ${\eta,\varepsilon}$; by considering nilsequences as well, together with nilsequence equidistribution theory, one can replace the exponent ${k-1}$ here by some quantity that goes to infinity as ${\eta \rightarrow 0}$, though I have not attempted to calculate the exact rate.
• For the ${U^k({\bf F}_p^n)}$ norm, another similar calculation using polynomial phases of degree ${k-1}$ should give a lower bound of ${\gg_{p,k,\eta,\varepsilon} \exp( c_{p,k,\eta,\varepsilon} n^{k-1} )}$ for the ${(\eta,\varepsilon)}$-entropy, though I have not fully performed the calculation.

We close with one final example. Suppose ${G}$ is a product ${G = A \times B}$ of two sets ${A,B}$ of cardinality ${\asymp \sqrt{N}}$, and we consider the Gowers box norm

$\displaystyle \|f\|_{\Box^2(G)}^4 := {\bf E}_{a,a' \in A; b,b' \in B} f(a,b) \overline{f}(a,b') \overline{f}(a',b) f(a,b).$

One possible choice of class ${{\mathcal F}}$ here are the indicators ${1_{U \times V}}$ of “rectangles” ${U \times V}$ with ${U \subset A}$, ${V \subset B}$ (cf. this previous blog post on cut norms). By standard calculations, one can use this class to show that the ${(\eta, \eta^4/10)}$-entropy of ${\| \|_{\Box^2(G)}}$ is ${O( \exp( O(\sqrt{N}) )}$, and a variant of the proof of the second part of Proposition 2 shows that this is the correct order of growth in ${N}$. In contrast, a modification of Proposition 3 only gives an upper bound of the form ${O( \exp( O( N^{2/3} ) ) )}$ (the bottleneck is ensuring that the randomly sampled dual functions stay bounded in ${L^2}$), which shows that while this cheap bound is not optimal, it can still broadly give the correct “type” of bound (specifically, intermediate growth between polynomial and exponential).

In orthodox first-order logic, variables and expressions are only allowed to take one value at a time; a variable ${x}$, for instance, is not allowed to equal ${+3}$ and ${-3}$ simultaneously. We will call such variables completely specified. If one really wants to deal with multiple values of objects simultaneously, one is encouraged to use the language of set theory and/or logical quantifiers to do so.

However, the ability to allow expressions to become only partially specified is undeniably convenient, and also rather intuitive. A classic example here is that of the quadratic formula:

$\displaystyle \hbox{If } x,a,b,c \in {\bf R} \hbox{ with } a \neq 0, \hbox{ then }$

$\displaystyle ax^2+bx+c=0 \hbox{ if and only if } x = \frac{-b \pm \sqrt{b^2-4ac}}{2a}. \ \ \ \ \ (1)$

Strictly speaking, the expression ${x = \frac{-b \pm \sqrt{b^2-4ac}}{2a}}$ is not well-formed according to the grammar of first-order logic; one should instead use something like

$\displaystyle x = \frac{-b - \sqrt{b^2-4ac}}{2a} \hbox{ or } x = \frac{-b + \sqrt{b^2-4ac}}{2a}$

or

$\displaystyle x \in \left\{ \frac{-b - \sqrt{b^2-4ac}}{2a}, \frac{-b + \sqrt{b^2-4ac}}{2a} \right\}$

or

$\displaystyle x = \frac{-b + \epsilon \sqrt{b^2-4ac}}{2a} \hbox{ for some } \epsilon \in \{-1,+1\}$

in order to strictly adhere to this grammar. But none of these three reformulations are as compact or as conceptually clear as the original one. In a similar spirit, a mathematical English sentence such as

$\displaystyle \hbox{The sum of two odd numbers is an even number} \ \ \ \ \ (2)$

is also not a first-order sentence; one would instead have to write something like

$\displaystyle \hbox{For all odd numbers } x, y, \hbox{ the number } x+y \hbox{ is even} \ \ \ \ \ (3)$

or

$\displaystyle \hbox{For all odd numbers } x,y \hbox{ there exists an even number } z \ \ \ \ \ (4)$

$\displaystyle \hbox{ such that } x+y=z$

instead. These reformulations are not all that hard to decipher, but they do have the aesthetically displeasing effect of cluttering an argument with temporary variables such as ${x,y,z}$ which are used once and then discarded.

Another example of partially specified notation is the innocuous ${\ldots}$ notation. For instance, the assertion

$\displaystyle \pi=3.14\ldots,$

when written formally using first-order logic, would become something like

$\displaystyle \pi = 3 + \frac{1}{10} + \frac{4}{10^2} + \sum_{n=3}^\infty \frac{a_n}{10^n} \hbox{ for some sequence } (a_n)_{n=3}^\infty$

$\displaystyle \hbox{ with } a_n \in \{0,1,2,3,4,5,6,7,8,9\} \hbox{ for all } n,$

which is not exactly an elegant reformulation. Similarly with statements such as

$\displaystyle \tan x = x + \frac{x^3}{3} + \ldots \hbox{ for } |x| < \pi/2$

or

$\displaystyle \tan x = x + \frac{x^3}{3} + O(|x|^5) \hbox{ for } |x| < \pi/2.$

Below the fold I’ll try to assign a formal meaning to partially specified expressions such as (1), for instance allowing one to condense (2), (3), (4) to just

$\displaystyle \hbox{odd} + \hbox{odd} = \hbox{even}.$

When combined with another common (but often implicit) extension of first-order logic, namely the ability to reason using ambient parameters, we become able to formally introduce asymptotic notation such as the big-O notation ${O()}$ or the little-o notation ${o()}$. We will explain how to do this at the end of this post.

Kaisa Matomäki, Xuancheng Shao, Joni Teräväinen, and myself have just uploaded to the arXiv our preprint “Higher uniformity of arithmetic functions in short intervals I. All intervals“. This paper investigates the higher order (Gowers) uniformity of standard arithmetic functions in analytic number theory (and specifically, the Möbius function ${\mu}$, the von Mangoldt function ${\Lambda}$, and the generalised divisor functions ${d_k}$) in short intervals ${(X,X+H]}$, where ${X}$ is large and ${H}$ lies in the range ${X^{\theta+\varepsilon} \leq H \leq X^{1-\varepsilon}}$ for a fixed constant ${0 < \theta < 1}$ (that one would like to be as small as possible). If we let ${f}$ denote one of the functions ${\mu, \Lambda, d_k}$, then there is extensive literature on the estimation of short sums

$\displaystyle \sum_{X < n \leq X+H} f(n)$

and some literature also on the estimation of exponential sums such as

$\displaystyle \sum_{X < n \leq X+H} f(n) e(-\alpha n)$

for a real frequency ${\alpha}$, where ${e(\theta) := e^{2\pi i \theta}}$. For applications in the additive combinatorics of such functions ${f}$, it is also necessary to consider more general correlations, such as polynomial correlations

$\displaystyle \sum_{X < n \leq X+H} f(n) e(-P(n))$

where ${P: {\bf Z} \rightarrow {\bf R}}$ is a polynomial of some fixed degree, or more generally

$\displaystyle \sum_{X < n \leq X+H} f(n) \overline{F}(g(n) \Gamma)$

where ${G/\Gamma}$ is a nilmanifold of fixed degree and dimension (and with some control on structure constants), ${g: {\bf Z} \rightarrow G}$ is a polynomial map, and ${F: G/\Gamma \rightarrow {\bf C}}$ is a Lipschitz function (with some bound on the Lipschitz constant). Indeed, thanks to the inverse theorem for the Gowers uniformity norm, such correlations let one control the Gowers uniformity norm of ${f}$ (possibly after subtracting off some renormalising factor) on such short intervals ${(X,X+H]}$, which can in turn be used to control other multilinear correlations involving such functions.

Traditionally, asymptotics for such sums are expressed in terms of a “main term” of some arithmetic nature, plus an error term that is estimated in magnitude. For instance, a sum such as ${\sum_{X < n \leq X+H} \Lambda(n) e(-\alpha n)}$ would be approximated in terms of a main term that vanished (or is negligible) if ${\alpha}$ is “minor arc”, but would be expressible in terms of something like a Ramanujan sum if ${\alpha}$ was “major arc”, together with an error term. We found it convenient to cancel off such main terms by subtracting an approximant ${f^\sharp}$ from each of the arithmetic functions ${f}$ and then getting upper bounds on remainder correlations such as

$\displaystyle |\sum_{X < n \leq X+H} (f(n)-f^\sharp(n)) \overline{F}(g(n) \Gamma)| \ \ \ \ \ (1)$

(actually for technical reasons we also allow the ${n}$ variable to be restricted further to a subprogression of ${(X,X+H]}$, but let us ignore this minor extension for this discussion). There is some flexibility in how to choose these approximants, but we eventually found it convenient to use the following choices.

• For the Möbius function ${\mu}$, we simply set ${\mu^\sharp = 0}$, as per the Möbius pseudorandomness conjecture. (One could choose a more sophisticated approximant in the presence of a Siegel zero, as I did with Joni in this recent paper, but we do not do so here.)
• For the von Mangoldt function ${\Lambda}$, we eventually went with the Cramér-Granville approximant ${\Lambda^\sharp(n) = \frac{W}{\phi(W)} 1_{(n,W)=1}}$, where ${W = \prod_{p < R} p}$ and ${R = \exp(\log^{1/10} X)}$.
• For the divisor functions ${d_k}$, we used a somewhat complicated-looking approximant ${d_k^\sharp(n) = \sum_{m \leq X^{\frac{k-1}{5k}}} P_m(\log n)}$ for some explicit polynomials ${P_m}$, chosen so that ${d_k^\sharp}$ and ${d_k}$ have almost exactly the same sums along arithmetic progressions (see the paper for details).

The objective is then to obtain bounds on sums such as (1) that improve upon the “trivial bound” that one can get with the triangle inequality and standard number theory bounds such as the Brun-Titchmarsh inequality. For ${\mu}$ and ${\Lambda}$, the Siegel-Walfisz theorem suggests that it is reasonable to expect error terms that have “strongly logarithmic savings” in the sense that they gain a factor of ${O_A(\log^{-A} X)}$ over the trivial bound for any ${A>0}$; for ${d_k}$, the Dirichlet hyperbola method suggests instead that one has “power savings” in that one should gain a factor of ${X^{-c_k}}$ over the trivial bound for some ${c_k>0}$. In the case of the Möbius function ${\mu}$, there is an additional trick (introduced by Matomäki and Teräväinen) that allows one to lower the exponent ${\theta}$ somewhat at the cost of only obtaining “weakly logarithmic savings” of shape ${\log^{-c} X}$ for some small ${c>0}$.

Our main estimates on sums of the form (1) work in the following ranges:

• For ${\theta=5/8}$, one can obtain strongly logarithmic savings on (1) for ${f=\mu,\Lambda}$, and power savings for ${f=d_k}$.
• For ${\theta=3/5}$, one can obtain weakly logarithmic savings for ${f = \mu, d_k}$.
• For ${\theta=5/9}$, one can obtain power savings for ${f=d_3}$.
• For ${\theta=1/3}$, one can obtain power savings for ${f=d_2}$.

Conjecturally, one should be able to obtain power savings in all cases, and lower ${\theta}$ down to zero, but the ranges of exponents and savings given here seem to be the limit of current methods unless one assumes additional hypotheses, such as GRH. The ${\theta=5/8}$ result for correlation against Fourier phases ${e(\alpha n)}$ was established previously by Zhan, and the ${\theta=3/5}$ result for such phases and ${f=\mu}$ was established previously by by Matomäki and Teräväinen.

By combining these results with tools from additive combinatorics, one can obtain a number of applications:

• Direct insertion of our bounds in the recent work of Kanigowski, Lemanczyk, and Radziwill on the prime number theorem on dynamical systems that are analytic skew products gives some improvements in the exponents there.
• We can obtain a “short interval” version of a multiple ergodic theorem along primes established by Frantzikinakis-Host-Kra and Wooley-Ziegler, in which we average over intervals of the form ${(X,X+H]}$ rather than ${[1,X]}$.
• We can obtain a “short interval” version of the “linear equations in primes” asymptotics obtained by Ben Green, Tamar Ziegler, and myself in this sequence of papers, where the variables in these equations lie in short intervals ${(X,X+H]}$ rather than long intervals such as ${[1,X]}$.

We now briefly discuss some of the ingredients of proof of our main results. The first step is standard, using combinatorial decompositions (based on the Heath-Brown identity and (for the ${\theta=3/5}$ result) the Ramaré identity) to decompose ${\mu(n), \Lambda(n), d_k(n)}$ into more tractable sums of the following types:

• Type ${I}$ sums, which are basically of the form ${\sum_{m \leq A:m|n} \alpha(m)}$ for some weights ${\alpha(m)}$ of controlled size and some cutoff ${A}$ that is not too large;
• Type ${II}$ sums, which are basically of the form ${\sum_{A_- \leq m \leq A_+:m|n} \alpha(m)\beta(n/m)}$ for some weights ${\alpha(m)}$, ${\beta(n)}$ of controlled size and some cutoffs ${A_-, A_+}$ that are not too close to ${1}$ or to ${X}$;
• Type ${I_2}$ sums, which are basically of the form ${\sum_{m \leq A:m|n} \alpha(m) d_2(n/m)}$ for some weights ${\alpha(m)}$ of controlled size and some cutoff ${A}$ that is not too large.

The precise ranges of the cutoffs ${A, A_-, A_+}$ depend on the choice of ${\theta}$; our methods fail once these cutoffs pass a certain threshold, and this is the reason for the exponents ${\theta}$ being what they are in our main results.

The Type ${I}$ sums involving nilsequences can be treated by methods similar to those in this previous paper of Ben Green and myself; the main innovations are in the treatment of the Type ${II}$ and Type ${I_2}$ sums.

For the Type ${II}$ sums, one can split into the “abelian” case in which (after some Fourier decomposition) the nilsequence ${F(g(n)\Gamma)}$ is basically of the form ${e(P(n))}$, and the “non-abelian” case in which ${G}$ is non-abelian and ${F}$ exhibits non-trivial oscillation in a central direction. In the abelian case we can adapt arguments of Matomaki and Shao, which uses Cauchy-Schwarz and the equidistribution properties of polynomials to obtain good bounds unless ${e(P(n))}$ is “major arc” in the sense that it resembles (or “pretends to be”) ${\chi(n) n^{it}}$ for some Dirichlet character ${\chi}$ and some frequency ${t}$, but in this case one can use classical multiplicative methods to control the correlation. It turns out that the non-abelian case can be treated similarly. After applying Cauchy-Schwarz, one ends up analyzing the equidistribution of the four-variable polynomial sequence

$\displaystyle (n,m,n',m') \mapsto (g(nm)\Gamma, g(n'm)\Gamma, g(nm') \Gamma, g(n'm'\Gamma))$

as ${n,m,n',m'}$ range in various dyadic intervals. Using the known multidimensional equidistribution theory of polynomial maps in nilmanifolds, one can eventually show in the non-abelian case that this sequence either has enough equidistribution to give cancellation, or else the nilsequence involved can be replaced with one from a lower dimensional nilmanifold, in which case one can apply an induction hypothesis.

For the type ${I_2}$ sum, a model sum to study is

$\displaystyle \sum_{X < n \leq X+H} d_2(n) e(\alpha n)$

which one can expand as

$\displaystyle \sum_{n,m: X < nm \leq X+H} e(\alpha nm).$

We experimented with a number of ways to treat this type of sum (including automorphic form methods, or methods based on the Voronoi formula or van der Corput’s inequality), but somewhat to our surprise, the most efficient approach was an elementary one, in which one uses the Dirichlet approximation theorem to decompose the hyperbolic region ${\{ (n,m) \in {\bf N}^2: X < nm \leq X+H \}}$ into a number of arithmetic progressions, and then uses equidistribution theory to establish cancellation of sequences such as ${e(\alpha nm)}$ on the majority of these progressions. As it turns out, this strategy works well in the regime ${H > X^{1/3+\varepsilon}}$ unless the nilsequence involved is “major arc”, but the latter case is treatable by existing methods as discussed previously; this is why the ${\theta}$ exponent for our ${d_2}$ result can be as low as ${1/3}$.

In a sequel to this paper (currently in preparation), we will obtain analogous results for almost all intervals ${(x,x+H]}$ with ${x}$ in the range ${[X,2X]}$, in which we will be able to lower ${\theta}$ all the way to ${0}$.

Just a brief announcement that the AMS is now accepting (until June 30) nominations for the 2023 Joseph L. Doob Prize, which recognizes a single, relatively recent, outstanding research book that makes a seminal contribution to the research literature, reflects the highest standards of research exposition, and promises to have a deep and long-term impact in its area. The book must have been published within the six calendar years preceding the year in which it is nominated. Books may be nominated by members of the Society, by members of the selection committee, by members of AMS editorial committees, or by publishers.  (I am currently on the committee for this prize.)  A list of previous winners may be found here.  The nomination procedure may be found at the bottom of this page.

Just a brief update to the previous post. Gerhard Paseman and I have now set up a web site for the Short Communication Satellite (SCS) for the virtual International Congress of Mathematicians (ICM), which will be an experimental independent online satellite event in which short communications on topics relevant to one or two of the sections of the ICM can be submitted, reviewed by peers, and (if appropriate for the SCS event) displayed in a virtual “poster room” during the Congress on July 6-14 (which, by the way, has recently released its schedule and list of speakers). Our plan is to open the registration for this event on April 5, and start taking submissions on April 20; we are also currently accepting any expressions of interest in helping out with the event, for instance by serving as a reviewer. For more information about the event, please see the overview page, the guidelines page, and the FAQ page of the web site. As viewers will see, the web site is still somewhat under construction, but will be updated as we move closer to the actual Congress.

UPDATE: for readers who have difficulty accessing the links above, here are backup copies of the overview page and guidelines page.

[As with previous posts regarding ICM satellite events, I am authoring this post as an individual, and not in my capacity as chair of the ICM Structure Committee, which does not have any advisory or supervisory role over ICM satellite events – T.]

One of the traditional features of the International Congress of Mathematicians are the “short communications”, organized by the local organizing committee (as opposed to the International Mathematical Union), which allows participants at the congress to present either a poster or a short talk (typically 15 minutes or so) during the congress. For instance, here are the titles of the short communications and posters from the 2018 ICM, and here are the short communications and posters from the 2014 ICM. While not as high profile as other events of the ICM such as the plenary lectures, sectional lectures, or prize lectures, the short communications and posters can offer a chance for academics from a quite diverse range of institutions worldwide (and even a few independent mathematicians) be able to present their work to a mathematical audience.

There has been some volunteer effort to try to replicate some form of this event for the upcoming virtual ICM this July as a semi-official “satellite” event of the virtual ICM; it would technically not be part of the core ICM program, but I expect it would be recognized by the IMU as an independently organized satellite. Due to lack of time, funding, and technical expertise, we will not be able to offer any video, audio, or physical hosting for such an event, but we believe that a modest virtual event is possible involving submission of either a PDF “poster” or a PDF “slide deck”, together with other metadata such as author, title, abstract, and external links (e.g., to an externally hosted video presentation of the poster or slides), with some reviewing to ensure a certain minimum level of quality of approved submissions (we are thinking about setting guidelines similar to those required for a submission to the arXiv), and some ability to offer feedback on each submission. (For instance, we are thinking of hosting the event on a MediaWiki, with each communication being given a separate page which can attract discussion and responses to queries from the author(s).) We are also thinking of grouping the poster or slides according to the 20 sections of the 2022 ICM. We would then promote these communications during the virtual ICM, for instance on this blog or on the unofficial ICM Discord. Perhaps some of the other proposed online experiments for virtual events discussed in this previous post could also be implemented experimentally on this satellite event to demonstrate proof-of-concept. (If the event turns out to be successful, one could hope that it could serve as a pilot project for a longer-term and better funded platform for virtual short communications to accompany other conferences, but for now we would like to focus just on the virtual ICM satellite event.)

As one of our first actions, we would like to survey the level of interest in such an event, both among potential submitters of posters or slides, and also potential volunteers to help organize the event (in particular we may need some assistance in manually reviewing submissions, though we do plan to enlist peer reviewers by requiring submitters to rate and comment on other submissions in the same section). We have therefore created a form to (very unscientifically) gauge this level in order to decide on the scale of this project (or whether to attempt it at all). All readers of this blog are welcome to offer feedback through that form, or as a comment to this blog.

EDIT (Mar 29): a formal announcement will be made soon, but you can view a draft of the announcement here.

[Note: while I happen to be the chair of the ICM Structure Committee, I am authoring this blog post as an individual, and not as a representative of that committee or of the IMU, as they do not have jurisdiction over satellite conferences. -T.]

The International Mathematical Union (IMU) has just released some updates on the status of the 2022 International Congress of Mathematicians (ICM), which was discussed in this previous post:

• The General Assembly will take place in person in Helsinki, Finland, on July 3-4.
• The IMU award ceremony will be held in the same location on July 5.
• The ICM will take place virtually (with free participation) during the hours 9:00-18:00 CEST of July 6-14, with talks either live or pre-recorded according to speaker preference.

Due to the limited time and resources available, the core ICM program will be kept to the bare essentials; the lectures will be streamed but without opportunity for questions or other audience feedback. However, the IMU encourages grassroots efforts to supplement the core program with additional satellite activities, both “traditional” and “non-traditional”. Examples of such satellite activities include:

A more updated list of these events can be found here.

I will also mention the second Azat Miftakov Days, which are unaffiliated with the ICM but held concurrently with the beginning of the congress (and the prize ceremony).

Strictly speaking, satellite events are not officially part of the Congress, and not directly subject to IMU oversight; they also receive no funding or support from the IMU, other than sharing of basic logistical information, and recognition of the satellite conferences on the ICM web site. Thus this (very exceptional and sui generis) congress will differ in format from previous congresses, in that many of the features of the congress that traditionally were managed by the local organizing committee will be outsourced this one time to the broader mathematical community in a highly decentralized fashion.

In order to coordinate the various grassroots efforts to establish such satellite events, Alexei Borodin, Martin Hairer, and myself have set up a satellite coordination group to share information and advice on these events. (It should be noted that while Alexei, Martin and myself serve on either the structure committee or the program committee of the ICM, we are acting here as individuals rather than as official representatives of the IMU.) Anyone who is interested in organizing, hosting, or supporting such an event is welcome to ask to join the group (though I should say that most of the discussion concerns boring logistical issues). Readers are also welcome to discuss broader issues concerning satellites, or the congress as a whole, in the comments to this post. I will also use this space to announce details of satellite events as they become available (most are currently still only in the early stages of planning).

Jan Grebik, Rachel Greenfeld, Vaclav Rozhon and I have just uploaded to the arXiv our preprint “Measurable tilings by abelian group actions“. This paper is related to an earlier paper of Rachel Greenfeld and myself concerning tilings of lattices ${{\bf Z}^d}$, but now we consider the more general situation of tiling a measure space ${X}$ by a tile ${A \subset X}$ shifted by a finite subset ${F}$ of shifts of an abelian group ${G = (G,+)}$ that acts in a measure-preserving (or at least quasi-measure-preserving) fashion on ${X}$. For instance, ${X}$ could be a torus ${{\bf T}^d = {\bf R}^d/{\bf Z}^d}$, ${A}$ could be a positive measure subset of that torus, and ${G}$ could be the group ${{\bf R}^d}$, acting on ${X}$ by translation.

If ${F}$ is a finite subset of ${G}$ with the property that the translates ${f+A}$, ${f \in F}$ of ${A \subset X}$ partition ${X}$ up to null sets, we write ${F \oplus A =_{a.e.} X}$, and refer to this as a measurable tiling of ${X}$ by ${A}$ (with tiling set ${F}$). For instance, if ${X}$ is the torus ${{\bf T}^2}$, we can create a measurable tiling with ${A = [0,1/2]^2 \hbox{ mod } {\bf Z}^2}$ and ${F = \{0,1/2\}^2}$. Our main results are the following:

• By modifying arguments from previous papers (including the one with Greenfeld mentioned above), we can establish the following “dilation lemma”: a measurable tiling ${F \oplus A =_{a.e.} X}$ automatically implies further measurable tilings ${rF \oplus A =_{a.e.} X}$, whenever ${r}$ is an integer coprime to all primes up to the cardinality ${\# F}$ of ${F}$.
• By averaging the above dilation lemma, we can also establish a “structure theorem” that decomposes the indicator function ${1_A}$ of ${A}$ into components, each of which are invariant with respect to a certain shift in ${G}$. We can establish this theorem in the case of measure-preserving actions on probability spaces via the ergodic theorem, but one can also generalize to other settings by using the device of “measurable medial means” (which relates to the concept of a universally measurable set).
• By applying this structure theorem, we can show that all measurable tilings ${F \oplus A = {\bf T}^1}$ of the one-dimensional torus ${{\bf T}^1}$ are rational, in the sense that ${F}$ lies in a coset of the rationals ${{\bf Q} = {\bf Q}^1}$. This answers a recent conjecture of Conley, Grebik, and Pikhurko; we also give an alternate proof of this conjecture using some previous results of Lagarias and Wang.
• For tilings ${F \oplus A = {\bf T}^d}$ of higher-dimensional tori, the tiling need not be rational. However, we can show that we can “slide” the tiling to be rational by giving each translate ${f + A}$ of ${A}$ a “velocity” ${v_f \in {\bf R}^d}$, and for every time ${t}$, the translates ${f + tv_f + A}$ still form a partition of ${{\bf T}^d}$ modulo null sets, and at time ${t=1}$ the tiling becomes rational. In particular, if a set ${A}$ can tile a torus in an irrational fashion, then it must also be able to tile the torus in a rational fashion.
• In the two-dimensional case ${d=2}$ one can arrange matters so that all the velocities ${v_f}$ are parallel. If we furthermore assume that the tile ${A}$ is connected, we can also show that the union of all the translates ${f+A}$ with a common velocity ${v_f = v}$ form a ${v}$-invariant subset of the torus.
• Finally, we show that tilings ${F \oplus A = {\bf Z}^d \times G}$ of a finitely generated discrete group ${{\bf Z}^d \times G}$, with ${G}$ a finite group, cannot be constructed in a “local” fashion (we formalize this probabilistically using the notion of a “factor of iid process”) unless the tile ${F}$ is contained in a single coset of ${\{0\} \times G}$. (Nonabelian local tilings, for instance of the sphere by rotations, are of interest due to connections with the Banach-Tarski paradox; see the aforementioned paper of Conley, Grebik, and Pikhurko. Unfortunately, our methods seem to break down completely in the nonabelian case.)

In this post I would like to collect a list of resources that are available to mathematicians displaced by conflict. Here are some general resources:

There are also resources specific to the current crisis:

Finally, there are a number of institutes and departments who are willing to extend visiting or adjunct positions to such displaced mathematicians:

If readers have other such resources to contribute (or to update the ones already listed), please do so in the comments and I will modify the above lists as appropriate.

As with the previous post, any purely political comment not focused on such resources will be considered off-topic and thus subject to deletion.