Next quarter, starting March 30, I will be teaching “Math 247B: Classical Fourier Analysis” here at UCLA.  (The course should more accurately be named “Modern real-variable harmonic analysis”, but we have not gotten around to implementing such a name change.) This class (a continuation of Math 247A from previous quarter, taught by my colleague, Monica Visan) will cover the following topics:

• Restriction theory and Strichartz estimates
• Decoupling estimates and applications
• Paraproducts; time frequency analysis; Carleson’s theorem

As usual, lecture notes will be made available on this blog.

Unlike previous courses, this one will be given online as part of UCLA’s social distancing efforts.  In particular, the course will be open to anyone with an internet connection (no UCLA affiliation is required), though non-UCLA participants will not have full access to all aspects of the course, and there is the possibility that some restrictions on participation may be imposed if there are significant disruptions to class activity.  For more information, see the course descriptionUPDATE: due to time limitations, I will not be able to respond to personal email inquiries about this class from non-UCLA participants in the course.  Please use the comment thread to this blog post for such inquiries.  I will also update the course description throughout the course to reflect the latest information about the course, both for UCLA students enrolled in the course and for non-UCLA participants.

Just a short note that the memorial article “Analysis and applications: The mathematical work of Elias Stein” has just been published in the Bulletin of the American Mathematical Society.  This article was a collective effort led by Charlie Fefferman, Alex Ionescu, Steve Wainger and myself to describe the various mathematical contributions of Elias Stein, who passed away in December 2018; it also features contributions from Loredana Lanzani, Akos Magyar, Mariusz Mirek, Alexander Nagel, Duong Phong, Lillian Pierce, Fulvio Ricci, Christopher Sogge, and Brian Street.  (My contribution was mostly focused on Stein’s contribution to restriction theory.)

As part of social distancing efforts to slow down the spread of the novel coronavirus, several universities have now transitioned, or begun transitioning, to online teaching models.  (My home university of UCLA has not yet done so, but is certainly considering the option.  UPDATE: we are transitioning.)  As a consequence, I thought it might be an appropriate time to start a discussion on the pros and cons of various technologies for giving talks and lectures online, particularly in the context of mathematical talks where there may be special considerations coming for instance for the need to do mathematical computations on a blackboard or equivalent.  My own institution is for instance recommending the use of Zoom for lectures and Respondus for giving finals, and has a limited number of classrooms set up for high quality video and audio casting, as well as a platform for discussion forums and course materials for each class.  For smaller meetings, such as one-on-one meetings with graduate students, one can of course improvise using off-the-shelf tools such as Skype.  I would be interested in knowing what other options are available and what success lecturers have had with them.

The same goes for giving mathematical talks.  I learned recently (from Jordan Ellenberg) that Rachel Preis has recently launched a “virtual math seminar on open conjectures in number theory in arithmetic geometry” (VaNTAGe) that is run using the BlueJeans platform.  And for many years there has been a regular joint math seminar between UC Berkeley, U. Paris-Nord, U. Zurich, and U. Bonn (see e.g., this calendar), and nowadays many mathematical institutes stream their talks or at least videotape them to place them online later.  Our own department does not have a dedicated lecture hall for videocasting, so I would be interested in knowing of any successful ways to improvise such casting with more portable technology. (Skype in principle could work here, but I have found this to be clunky even for smaller meetings involving just a handful of partcipants.)

EDIT: in addition to lectures and talks, it would also be topical to discuss online options for office hours, midterms, and final exams.

In the modern theory of additive combinatorics, a large role is played by the Gowers uniformity norms ${\|f\|_{U^k(G)}}$, where ${k \geq 1}$, ${G = (G,+)}$ is a finite abelian group, and ${f: G \rightarrow {\bf C}}$ is a function (one can also consider these norms in finite approximate groups such as ${[N] = \{1,\dots,N\}}$ instead of finite groups, but we will focus on the group case here for simplicity). These norms can be defined by the formula

$\displaystyle \|f\|_{U^k(G)} := (\mathop{\bf E}_{x,h_1,\dots,h_k \in G} \Delta_{h_1} \dots \Delta_{h_k} f(x))^{1/2^k}$

where we use the averaging notation

$\displaystyle \mathop{\bf E}_{x \in A} f(x) := \frac{1}{|A|} \sum_{x \in A} f(x)$

for any non-empty finite set ${A}$ (with ${|A|}$ denoting the cardinality of ${A}$), and ${\Delta_h}$ is the multiplicative discrete derivative operator

$\displaystyle \Delta_h f(x) := f(x+h) \overline{f(x)}.$

One reason why these norms play an important role is that they control various multilinear averages. We give two sample examples here:

Proposition 1 Let ${G = {\bf Z}/N{\bf Z}}$.

• (i) If ${a_1,\dots,a_k}$ are distinct elements of ${G}$ for some ${k \geq 2}$, and ${f_1,\dots,f_k: G \rightarrow {\bf C}}$ are ${1}$-bounded functions (thus ${|f_j(x)| \leq 1}$ for all ${j=1,\dots,k}$ and ${x \in G}$), then

$\displaystyle \mathop{\bf E}_{x, h \in G} f_1(x+a_1 h) \dots f_k(x+a_k h) \leq \|f_i\|_{U^{k-1}(G)} \ \ \ \ \ (1)$

for any ${i=1,\dots,k}$.

• (ii) If ${f_1,f_2,f_3: G \rightarrow {\bf C}}$ are ${1}$-bounded, then one has

$\displaystyle \mathop{\bf E}_{x, h \in G} f_1(x) f_2(x+h) f_3(x+h^2) \ll \|f_3\|_{U^4(G)} + N^{-1/4}.$

We establish these claims a little later in this post.

In some more recent literature (e.g., this paper of Conlon, Fox, and Zhao), the role of Gowers norms have been replaced by (generalisations) of the cut norm, a concept originating from graph theory. In this blog post, it will be convenient to define these cut norms in the language of probability theory (using boldface to denote random variables).

Definition 2 (Cut norm) Let ${{\bf X}_1,\dots,{\bf X}_k, {\bf Y}_1,\dots,{\bf Y}_l}$ be independent random variables with ${k,l \geq 0}$; to avoid minor technicalities we assume that these random variables are discrete and take values in a finite set. Given a random variable ${{\bf F} = F( {\bf X}_1, \dots, {\bf X}_k; {\bf Y}_1,\dots,{\bf Y}_l )}$ of these independent random variables, we define the cut norm

$\displaystyle \| {\bf F} \|_{\mathrm{CUT}( {\bf X}_1, \dots, {\bf X}_k; {\bf Y}_1,\dots,{\bf Y}_l )} := \sup | \mathop{\bf E} {\bf F} {\bf B}_1 \dots {\bf B}_k |$

where the supremum ranges over all choices ${{\bf B}_1,\dots,{\bf B}_k}$ of random variables ${{\bf B}_i = B_i( {\bf X}_1, \dots, {\bf X}_k; {\bf Y}_1,\dots,{\bf Y}_l )}$ that are ${1}$-bounded (thus ${|{\bf B}_i| \leq 1}$ surely), and such that ${{\bf B}_i}$ does not depend on ${{\bf X}_i}$.

If ${l=0}$, we abbreviate ${\| {\bf F} \|_{\mathrm{CUT}( {\bf X}_1, \dots, {\bf X}_k; {\bf Y}_1,\dots,{\bf Y}_l )}}$ as ${\| {\bf F} \|_{\mathrm{CUT}( {\bf X}_1, \dots, {\bf X}_k )}}$.

Strictly speaking, the cut norm is only a cut semi-norm when ${k=0,1}$, but we will abuse notation by referring to it as a norm nevertheless.

Example 3 If ${G = (V_1,V_2,E)}$ is a bipartite graph, and ${\mathbf{v_1}}$, ${\mathbf{v_2}}$ are independent random variables chosen uniformly from ${V_1,V_2}$ respectively, then

$\displaystyle \| 1_E(\mathbf{v_1},\mathbf{v_2}) \|_{\mathrm{CUT}(\mathbf{v_1}, \mathbf{v_2})}$

$\displaystyle = \sup_{\|f\|_\infty, \|g\|_\infty \leq 1} |\mathop{\bf E}_{v_1 \in V_1, v_2 \in V_2} 1_E(v_1,v_2) f(v_1) g(v_2)|$

where the supremum ranges over all ${1}$-bounded functions ${f: V_1 \rightarrow [-1,1]}$, ${g: V_2 \rightarrow [-1,1]}$. The right hand side is essentially the cut norm of the graph ${G}$, as defined for instance by Frieze and Kannan.

The cut norm is basically an expectation when ${k=0,1}$:

Example 4 If ${k=0}$, we see from definition that

$\displaystyle \| {\bf F} \|_{\mathrm{CUT}( ; {\bf Y}_1,\dots,{\bf Y}_l )} =| \mathop{\bf E} {\bf F} |.$

If ${k=1}$, one easily checks that

$\displaystyle \| {\bf F} \|_{\mathrm{CUT}( {\bf X}; {\bf Y}_1,\dots,{\bf Y}_l )} = \mathop{\bf E} | \mathop{\bf E}_{\bf X} {\bf F} |,$

where ${\mathop{\bf E}_{\bf X} {\bf F} = \mathop{\bf E}( {\bf F} | {\bf Y}_1,\dots,{\bf Y}_l )}$ is the conditional expectation of ${{\bf F}}$ to the ${\sigma}$-algebra generated by all the variables other than ${{\bf X}}$, i.e., the ${\sigma}$-algebra generated by ${{\bf Y}_1,\dots,{\bf Y}_l}$. In particular, if ${{\bf X}, {\bf Y}_1,\dots,{\bf Y}_l}$ are independent random variables drawn uniformly from ${X,Y_1,\dots,Y_l}$ respectively, then

$\displaystyle \| F( {\bf X}; {\bf Y}_1,\dots, {\bf Y}_l) \|_{\mathrm{CUT}( {\bf X}; {\bf Y}_1,\dots,{\bf Y}_l )}$

$\displaystyle = \mathop{\bf E}_{y_1 \in Y_1,\dots, y_l \in Y_l} |\mathop{\bf E}_{x \in X} F(x; y_1,\dots,y_l)|.$

Here are some basic properties of the cut norm:

Lemma 5 (Basic properties of cut norm) Let ${{\bf X}_1, \dots, {\bf X}_k; {\bf Y}_1,\dots,{\bf Y}_l}$ be independent discrete random variables, and ${{\bf F} = F({\bf X}_1, \dots, {\bf X}_k; {\bf Y}_1,\dots,{\bf Y}_l)}$ a function of these variables.

• (i) (Permutation invariance) The cut norm ${\| {\bf F} \|_{\mathrm{CUT}( {\bf X}_1, \dots, {\bf X}_k; {\bf Y}_1,\dots,{\bf Y}_l )}}$ is invariant with respect to permutations of the ${{\bf X}_1,\dots,{\bf X}_k}$, or permutations of the ${{\bf Y}_1,\dots,{\bf Y}_l}$.
• (ii) (Conditioning) One has

$\displaystyle \| {\bf F} \|_{\mathrm{CUT}( {\bf X}_1, \dots, {\bf X}_k; {\bf Y}_1,\dots,{\bf Y}_l )} = \mathop{\bf E} \| {\bf F} \|_{\mathrm{CUT}( {\bf X}_1, \dots, {\bf X}_k )}$

where on the right-hand side we view, for each realisation ${y_1,\dots,y_l}$ of ${{\bf Y}_1,\dots,{\bf Y}_l}$, ${{\bf F}}$ as a function ${F( {\bf X}_1,\dots,{\bf X}_k; y_1,\dots,y_l)}$ of the random variables ${{\bf X}_1,\dots, {\bf X}_k}$ alone, thus the right-hand side may be expanded as

$\displaystyle \sum_{y_1,\dots,y_l} \| F( {\bf X}_1,\dots,{\bf X}_k; y_1,\dots,y_l) \|_{\mathrm{CUT}( {\bf X}_1, \dots, {\bf X}_k )}$

$\displaystyle \times \mathop{\bf P}( Y_1=y_1,\dots,Y_l=y_l).$

• (iii) (Monotonicity) If ${k \geq 1}$, we have

$\displaystyle \| {\bf F} \|_{\mathrm{CUT}( {\bf X}_1, \dots, {\bf X}_k; {\bf Y}_1,\dots,{\bf Y}_l )} \geq \| {\bf F} \|_{\mathrm{CUT}( {\bf X}_1, \dots, {\bf X}_{k-1}; {\bf X}_k, {\bf Y}_1,\dots,{\bf Y}_l )}.$

• (iv) (Multiplicative invariances) If ${{\bf B} = B({\bf X}_1,\dots,{\bf X}_k; {\bf Y}_1,\dots,{\bf Y}_l)}$ is a ${1}$-bounded function that does not depend on one of the ${{\bf X}_i}$, then

$\displaystyle \| {\bf B} {\bf F} \|_{\mathrm{CUT}( {\bf X}_1, \dots, {\bf X}_k; {\bf Y}_1,\dots,{\bf Y}_l )} \leq \| {\bf F} \|_{\mathrm{CUT}( {\bf X}_1, \dots, {\bf X}_k; {\bf Y}_1,\dots,{\bf Y}_l )}.$

In particular, if we additionally assume ${|{\bf B}|=1}$, then

$\displaystyle \| {\bf B} {\bf F} \|_{\mathrm{CUT}( {\bf X}_1, \dots, {\bf X}_k; {\bf Y}_1,\dots,{\bf Y}_l )} = \| {\bf F} \|_{\mathrm{CUT}( {\bf X}_1, \dots, {\bf X}_k; {\bf Y}_1,\dots,{\bf Y}_l )}.$

• (v) (Cauchy-Schwarz) If ${k \geq 1}$, one has

$\displaystyle \| {\bf F} \|_{\mathrm{CUT}( {\bf X}_1, \dots, {\bf X}_k; {\bf Y}_1,\dots,{\bf Y}_l )} \leq \| \Box_{{\bf X}_1, {\bf X}'_1} {\bf F} \|_{\mathrm{CUT}( {\bf X}_2, \dots, {\bf X}_k; {\bf X}_1, {\bf X}'_1, {\bf Y}_1,\dots,{\bf Y}_l )}^{1/2}$

where ${{\bf X}'_1}$ is a copy of ${{\bf X}_1}$ that is independent of ${{\bf X}_1,\dots,{\bf X}_k,{\bf Y}_1,\dots,{\bf Y}_l}$ and ${\Box_{{\bf X}_1, {\bf X}'_1} {\bf F}}$ is the random variable

$\displaystyle \Box_{{\bf X}_1, {\bf X}'_1} {\bf F} := F( {\bf X}_1, \dots, {\bf X}_k; {\bf Y}_1,\dots,{\bf Y}_l )$

$\displaystyle \times \overline{F}( {\bf X}'_1, {\bf X}_2, \dots, {\bf X}_k; {\bf Y}_1,\dots,{\bf Y}_l ).$

• (vi) (Averaging) If ${k \geq 1}$ and ${{\bf F} = \mathop{\bf E}_{\bf Z} {\bf F}_{\bf Z}}$, where ${{\bf Z}}$ is another random variable independent of ${{\bf X}_1, \dots, {\bf X}_k; {\bf Y}_1,\dots,{\bf Y}_l}$, and ${{\bf F}_{\bf Z} = F_{\bf Z}( {\bf X}_1, \dots, {\bf X}_k; {\bf Y}_1,\dots,{\bf Y}_l )}$ is a random variable depending on both ${{\bf Z}}$ and ${{\bf X}_1, \dots, {\bf X}_k; {\bf Y}_1,\dots,{\bf Y}_l}$, then

$\displaystyle \| {\bf F} \|_{\mathrm{CUT}( {\bf X}_1, \dots, {\bf X}_k; {\bf Y}_1,\dots,{\bf Y}_l )} \leq \| {\bf F}_{\bf Z} \|_{\mathrm{CUT}( ({\bf X}_1, {\bf Z}), {\bf X}_2, \dots, {\bf X}_k; {\bf Y}_1,\dots,{\bf Y}_l )}$

Proof: The claims (i), (ii) are clear from expanding out all the definitions. The claim (iii) also easily follows from the definitions (the left-hand side involves a supremum over a more general class of multipliers ${{\bf B}_1,\dots,{\bf B}_{k}}$, while the right-hand side omits the ${{\bf B}_k}$ multiplier), as does (iv) (the multiplier ${{\bf B}}$ can be absorbed into one of the multipliers in the definition of the cut norm). The claim (vi) follows by expanding out the definitions, and observing that all of the terms in the supremum appearing in the left-hand side also appear as terms in the supremum on the right-hand side. It remains to prove (v). By definition, the left-hand side is the supremum over all quantities of the form

$\displaystyle |{\bf E} {\bf F} {\bf B}_1 \dots {\bf B}_k|$

where the ${{\bf B}_i}$ are ${1}$-bounded functions of ${{\bf X}_1, \dots, {\bf X}_k; {\bf Y}_1,\dots,{\bf Y}_l}$ that do not depend on ${{\bf X}_i}$. We average out in the ${{\bf X}_1}$ direction (that is, we condition out the variables ${{\bf X}_2, \dots, {\bf X}_k; {\bf Y}_1,\dots,{\bf Y}_l}$), and pull out the factor ${{\bf B}_1}$ (which does not depend on ${{\bf X}_1}$), to write this as

$\displaystyle |{\bf E} {\bf B}_1 {\bf E}_{{\bf X}_1}( {\bf F} {\bf B}_2 \dots {\bf B}_k )|,$

which by Cauchy-Schwarz is bounded by

$\displaystyle ( |{\bf E} |{\bf E}_{{\bf X}_1}( {\bf F} {\bf B}_2 \dots {\bf B}_k )|^2)^{1/2},$

which can be expanded using the copy ${{\bf X}_1}$ as

$\displaystyle |{\bf E} \Box_{{\bf X}_1,{\bf X}'_1} ({\bf F} {\bf B}_2 \dots {\bf B}_k) |^{1/2}.$

Expanding

$\displaystyle \Box_{{\bf X}_1,{\bf X}'_1} ({\bf F} {\bf B}_2 \dots {\bf B}_k) = (\Box_{{\bf X}_1,{\bf X}'_1} {\bf F}) (\Box_{{\bf X}_1,{\bf X}'_1} {\bf B}_2) \dots (\Box_{{\bf X}_1,{\bf X}'_1} {\bf B}_k)$

and noting that each ${\Box_{{\bf X}_1,{\bf X}'_1} {\bf B}_i}$ is ${1}$-bounded and independent of ${{\bf X}_i}$ for ${i=2,\dots,k}$, we obtain the claim. $\Box$

Now we can relate the cut norm to Gowers uniformity norms:

Lemma 6 Let ${G}$ be a finite abelian group, let ${{\bf x}, {\bf h}_1,\dots,{\bf h}_k}$ be independent random variables uniformly drawn from ${G}$ for some ${k \geq 0}$, and let ${f: G \rightarrow {\bf C}}$. Then

$\displaystyle \| f({\bf x} + {\bf h}_1 + \dots + {\bf h}_k) \|_{\mathrm{CUT}( {\bf h}_1,\dots,{\bf h}_k, {\bf x} )} \leq \|f\|_{U^{k+1}(G)} \ \ \ \ \ (2)$

and similarly (if ${k \geq 1}$)

$\displaystyle \| f({\bf x} + {\bf h}_1 + \dots + {\bf h}_k) \|_{\mathrm{CUT}( {\bf h}_1,\dots,{\bf h}_k; {\bf x} )} \leq \|f\|_{U^{k}(G)} \ \ \ \ \ (3)$

If ${f}$ is additionally assumed to be ${1}$-bounded, we have the converse inequalities

$\displaystyle \|f\|_{U^{k+1}(G)}^{2^{k+1}} \leq \| f({\bf x} + {\bf h}_1 + \dots + {\bf h}_k) \|_{\mathrm{CUT}( {\bf h}_1,\dots,{\bf h}_k, {\bf x} )} \ \ \ \ \ (4)$

and (if ${k \geq 1}$)

$\displaystyle \|f\|_{U^{k}(G)}^{2^{k}} \leq \| f({\bf x} + {\bf h}_1 + \dots + {\bf h}_k) \|_{\mathrm{CUT}( {\bf h}_1,\dots,{\bf h}_k; {\bf x} )}. \ \ \ \ \ (5)$

Proof: Applying Lemma 5(v) ${k}$ times, we can bound

$\displaystyle \| f({\bf x} + {\bf h}_1 + \dots + {\bf h}_k) \|_{\mathrm{CUT}( {\bf h_1},\dots,{\bf h_k}, {\bf x} )}$

by

$\displaystyle \| \Box_{{\bf h}_k,{\bf h}'_k} \dots \Box_{{\bf h}_1,{\bf h}'_1} (f({\bf x} + {\bf h}_1 + \dots + {\bf h}_k)) \|_{\mathrm{CUT}( {\bf x}; {\bf h}_1, {\bf h}'_1, \dots, {\bf h}_k, {\bf h}'_k )}^{1/2^k} \ \ \ \ \ (6)$

where ${{\bf h}'_1,\dots,{\bf h}'_k}$ are independent copies of ${{\bf h}_1,\dots,{\bf h}_k}$ that are also independent of ${{\bf x}}$. The expression inside the norm can also be written as

$\displaystyle \Delta_{{\bf h}_k - {\bf h}'_k} \dots \Delta_{{\bf h}_1 - {\bf h}'_1} f({\bf x} + {\bf h}'_1 + \dots + {\bf h}'_k)$

so by Example 4 one can write (6) as

$\displaystyle |\mathop{\bf E}_{h_1,\dots,h_k,h'_1,\dots,h'_k \in G} |\mathop{\bf E}_{x \in G} \Delta_{h_k - h'_k} \dots \Delta_{h_1 - h'_1} f(x+h'_1+\dots+h'_k)||^{1/2^k}$

which after some change of variables simplifies to

$\displaystyle |\mathop{\bf E}_{h_1,\dots,h_k \in G} |\mathop{\bf E}_{x \in G} \Delta_{h_k} \dots \Delta_{h_1} f(x)||^{1/2^k}$

which by Cauchy-Schwarz is bounded by

$\displaystyle |\mathop{\bf E}_{h_1,\dots,h_k \in G} |\mathop{\bf E}_{x \in G} \Delta_{h_k} \dots \Delta_{h_1} f(x)|^2|^{1/2^{k+1}}$

which one can rearrange as

$\displaystyle |\mathop{\bf E}_{h_1,\dots,h_k,h_{k+1},x \in G} \Delta_{h_{k+1}} \Delta_{h_k} \dots \Delta_{h_1} f(x)|^{1/2^{k+1}}$

giving (2). A similar argument bounds

$\displaystyle \| f({\bf x} + {\bf h}_1 + \dots + {\bf h}_k) \|_{\mathrm{CUT}( {\bf h_1},\dots,{\bf h_k}; {\bf x} )}$

by

$\displaystyle |\mathop{\bf E}_{h_1,\dots,h_k \in G} \mathop{\bf E}_{x \in G} \Delta_{h_k} \dots \Delta_{h_1} f(x)|^{1/2^k}$

which gives (3).

For (4), we can reverse the above steps and expand ${\|f\|_{U^{k+1}(G)}^{2^{k+1}}}$ as

$\displaystyle \mathop{\bf E}_{h_1,\dots,h_k \in G} |\mathop{\bf E}_{x \in G} \Delta_{h_k} \dots \Delta_{h_1} f(x)|^2$

which we can write as

$\displaystyle |\mathop{\bf E}_{h_1,\dots,h_k \in G} b(h_1,\dots,h_k) \mathop{\bf E}_{x \in G} \Delta_{h_k} \dots \Delta_{h_1} f(x)|$

for some ${1}$-bounded function ${b}$. This can in turn be expanded as

$\displaystyle |\mathop{\bf E}_{h_1,\dots,h_k,x \in G} f(x+h_1+\dots+h_k) b(h_1,\dots,h_k) \prod_{i=1}^k b_i(x,h_1,\dots,h_k)|$

for some ${1}$-bounded functions ${b_i}$ that do not depend on ${h_i}$. By Example 4, this can be written as

$\displaystyle \| f({\bf x} + {\bf h_1}+\dots+{\bf h}_k) b({\bf h}_1,\dots,{\bf h}_k) \prod_{i=1}^k b_i(x,h_1,\dots,h_k) \|_{\mathrm{CUT}(; {\bf h}_1,\dots,{\bf h}_k, {\bf x})}$

which by several applications of Theorem 5(iii) and then Theorem 5(iv) can be bounded by

$\displaystyle \| f({\bf x} + {\bf h_1}+\dots+{\bf h}_k) \|_{\mathrm{CUT}( {\bf h}_1,\dots,{\bf h}_k, {\bf x})},$

giving (4). A similar argument gives (5). $\Box$

Now we can prove Proposition 1. We begin with part (i). By permutation we may assume ${i=k}$, then by translation we may assume ${a_k=0}$. Replacing ${x}$ by ${x+h_1+\dots+h_{k-1}}$ and ${h}$ by ${h - a_1^{-1} h_1 - \dots - a_{k-1}^{-1} h_{k-1}}$, we can write the left-hand side of (1) as

$\displaystyle \mathop{\bf E}_{x,h,h_1,\dots,h_{k-1} \in G} f_k(x+h_1+\dots+h_{k-1}) \prod_{i=1}^{k-1} b_i(x,h,h_1,\dots,h_{k-1})$

where

$\displaystyle b_i(x,h,h_1,\dots,h_{k-1})$

$\displaystyle := f_i( x + h_1+\dots+h_{k-1}+ a_i(h - a_1^{-1} h_1 - \dots - a_k^{-1} h_{k-1}))$

is a ${1}$-bounded function that does not depend on ${h_i}$. Taking ${{\bf x}, {\bf h}, {\bf h}_1,\dots,{\bf h}_k}$ to be independent random variables drawn uniformly from ${G}$, the left-hand side of (1) can then be written as

$\displaystyle \mathop{\bf E} f_k({\bf x}+{\bf h}_1+\dots+{\bf h}_{k-1}) \prod_{i=1}^{k-1} b_i({\bf x},{\bf h},{\bf h}_1,\dots,{\bf h}_{k-1})$

which by Example 4 is bounded in magnitude by

$\displaystyle \| f_k({\bf x}+{\bf h}_1+\dots+{\bf h}_{k-1}) \prod_{i=1}^{k-1} b_i({\bf x},{\bf h},{\bf h}_1,\dots,{\bf h}_{k-1}) \|_{\mathrm{CUT}(; {\bf h}_1,\dots,{\bf h}_{k-1}, {\bf x}, {\bf h})}.$

After many applications of Lemma 5(iii), (iv), this is bounded by

$\displaystyle \| f_k({\bf x}+{\bf h_1}+\dots+{\bf h_{k-1}}) \|_{\mathrm{CUT}({\bf h}_1,\dots,{\bf h}_{k-1}; {\bf x}, {\bf h})}$

By Lemma 5(ii) we may drop the ${{\bf h}}$ variable, and then the claim follows from Lemma 6.

For part (ii), we replace ${x}$ by ${x+a-h^2}$ and ${h}$ by ${h-a+b}$ to write the left-hand side as

$\displaystyle \mathop{\bf E}_{x, a,b,h \in G} f_1(x+a-h^2) f_2(x+h+b-h^2) f_3(x+a+(h-a+b)^2-h^2);$

the point here is that the first factor does not involve ${b}$, the second factor does not involve ${a}$, and the third factor has no quadratic terms in ${h}$. Letting ${{\bf x}, {\bf a}, {\bf b}, {\bf h}}$ be independent variables drawn uniformly from ${G}$, we can use Example 4 to bound this in magnitude by

$\displaystyle \| f_1({\bf x}+{\bf a}-{\bf h}^2) f_2({\bf x}+{\bf h}+{\bf b}-{\bf h}^2)$

$\displaystyle f_3( {\bf x}+{\bf a}+({\bf h}-{\bf a}+{\bf b})^2-{\bf h}^2 ) \|_{\mathrm{CUT}(; {\bf x}, {\bf a}, {\bf b}, {\bf h})}$

which by Lemma 5(i),(iii),(iv) is bounded by

$\displaystyle \| f_3( {\bf x}+{\bf a}+({\bf h}-{\bf a}+{\bf b})^2 - {\bf h}^2 ) \|_{\mathrm{CUT}({\bf a}, {\bf b}; {\bf x}, {\bf h})}$

and then by Lemma 5(v) we may bound this by

$\displaystyle \| \Box_{{\bf a}, {\bf a}'} \Box_{{\bf b}, {\bf b}'} f_3( {\bf x}+{\bf a}+({\bf h}-{\bf a}+{\bf b})^2 - {\bf h}^2 ) \|_{\mathrm{CUT}(;{\bf a}, {\bf a}', {\bf b}, {\bf b}', {\bf x}, {\bf h})}^{1/4}$

which by Example 4 is

$\displaystyle |\mathop{\bf E} \Box_{{\bf a}, {\bf a}'} \Box_{{\bf b}, {\bf b}'} f_3( {\bf x}+{\bf a}+({\bf h}-{\bf a}+{\bf b})^2 - {\bf h}^2 )|^{1/4}$

Now the expression inside the expectation is the product of four factors, each of which is ${f_3}$ or ${\overline{f}_3}$ applied to an affine form ${{\bf x} + {\bf c} + {\bf a} {\bf h}}$ where ${{\bf c}}$ depends on ${{\bf a}, {\bf a}', {\bf b}, {\bf b}'}$ and ${{\bf a}}$ is one of ${2({\bf b}-{\bf a})}$, ${2({\bf b}'-{\bf a})}$, ${2({\bf b}-{\bf a}')}$, ${2({\bf b}'-{\bf a}')}$. With probability ${1-O(1/N)}$, the four different values of ${{\bf a}}$ are distinct, and then by part (i) we have

$\displaystyle |\mathop{\bf E}(\Box_{{\bf a}, {\bf a}'} \Box_{{\bf b}, {\bf b}'} f_3( {\bf x}+{\bf a}+({\bf h}-{\bf a}+{\bf b})^2 - {\bf h}^2 )|{\bf a}, {\bf a}', {\bf b}, {\bf b}')| \leq \|f_3\|_{U^4({\bf Z}/N{\bf Z})}.$

When they are not distinct, we can instead bound this quantity by ${1}$. Taking expectations in ${{\bf a}, {\bf a}', {\bf b}, {\bf b}'}$, we obtain the claim. $\Box$

The analogue of the inverse ${U^2}$ theorem for cut norms is the following claim (which I learned from Ben Green):

Lemma 7 (${U^2}$-type inverse theorem) Let ${\mathbf{x}, \mathbf{h}}$ be independent random variables drawn from a finite abelian group ${G}$, and let ${f: G \rightarrow {\bf C}}$ be ${1}$-bounded. Then we have

$\displaystyle \| f(\mathbf{x} + \mathbf{h}) \|_{\mathrm{CUT}(\mathbf{x}, \mathbf{h})} = \sup_{\xi \in\hat G} \| f(\mathbf{x}) e(\xi \cdot \mathbf{x}) \|_{\mathrm{CUT}(\mathbf{x})}$

where ${\hat G}$ is the group of homomorphisms ${\xi: x \mapsto \xi \cdot x}$ is a homomorphism from ${G}$ to ${{\bf R}/{\bf Z}}$, and ${e(\theta) := e^{2\pi i \theta}}$.

Proof: Suppose first that ${\| f(\mathbf{x} + \mathbf{h}) \|_{\mathrm{CUT}(\mathbf{x}, \mathbf{h})} > \delta}$ for some ${\delta}$, then by definition

$\displaystyle |\mathop{\bf E}_{x,h \in G} f(x+h) a(x) b(h)| > \delta$

for some ${1}$-bounded ${a,b: G \rightarrow {\bf C}}$. By Fourier expansion, the left-hand side is also

$\displaystyle \sum_{\xi \in \hat G} \hat f(-\xi) \hat a(\xi) \hat b(\xi)$

where ${\hat f(\xi) := \mathop{\bf E}_{x \in G} f(x) e(-\xi \cdot x)}$. From Plancherel’s theorem we have

$\displaystyle \sum_{\xi \in \hat G} |\hat a(\xi)|^2, \sum_{\xi \in \hat G} |\hat b(\xi)|^2 \leq 1$

hence by Hölder’s inequality one has ${|\hat f(-\xi)| > \delta}$ for some ${\xi \in \hat G}$, and hence

$\displaystyle \sup_{\xi \in\hat G} \| f(\mathbf{x}) e(\xi \cdot \mathbf{x}) \|_{\mathrm{CUT}(\mathbf{x})} > \delta. \ \ \ \ \ (7)$

Conversely, suppose (7) holds. Then there is ${\xi \in \hat G}$ such that

$\displaystyle \| f(\mathbf{x}) e(\xi \cdot \mathbf{x}) \|_{\mathrm{CUT}(\mathbf{x})} > \delta$

which on substitution and Example 4 implies

$\displaystyle \| f(\mathbf{x}+\mathbf{h}) e(\xi \cdot (\mathbf{x}+\mathbf{h})) \|_{\mathrm{CUT}(;\mathbf{x}, \mathbf{h})} > \delta.$

The term ${e(\xi \cdot (\mathbf{x}+\mathbf{h}))}$ splits into the product of a factor ${e(\xi \cdot \mathbf{x})}$ not depending on ${\mathbf{h}}$, and a factor ${e(\xi \cdot \mathbf{h})}$ not depending on ${\mathbf{x}}$. Applying Lemma 5(iii), (iv) we conclude that

$\displaystyle \| f(\mathbf{x}+\mathbf{h}) \|_{\mathrm{CUT}(\mathbf{x}, \mathbf{h})} > \delta.$

The claim follows. $\Box$

The higher order inverse theorems are much less trivial (and the optimal quantitative bounds are not currently known). However, there is a useful degree lowering argument, due to Peluse and Prendiville, that can allow one to lower the order of a uniformity norm in some cases. We give a simple version of this argument here:

Lemma 8 (Degree lowering argument, special case) Let ${G}$ be a finite abelian group, let ${Y}$ be a non-empty finite set, and let ${f: G \rightarrow {\bf C}}$ be a function of the form ${f(x) := \mathop{\bf E}_{y \in Y} F_y(x)}$ for some ${1}$-bounded functions ${F_y: G \rightarrow {\bf C}}$ indexed by ${y \in Y}$. Suppose that

$\displaystyle \|f\|_{U^k(G)} \geq \delta$

for some ${k \geq 2}$ and ${0 < \delta \leq 1}$. Then one of the following claims hold (with implied constants allowed to depend on ${k}$):

• (i) (Degree lowering) one has ${\|f\|_{U^{k-1}(G)} \gg \delta^{O(1)}}$.
• (ii) (Non-zero frequency) There exist ${h_1,\dots,h_{k-2} \in G}$ and non-zero ${\xi \in \hat G}$ such that

$\displaystyle |\mathop{\bf E}_{x \in G, y \in Y} \Delta_{h_1} \dots \Delta_{h_{k-2}} F_y(x) e( \xi \cdot x )| \gg \delta^{O(1)}.$

There are more sophisticated versions of this argument in which the frequency ${\xi}$ is “minor arc” rather than “zero frequency”, and then the Gowers norms are localised to suitable large arithmetic progressions; this is implicit in the above-mentioned paper of Peluse and Prendiville.

Proof: One can write

$\displaystyle \|f\|_{U^k(G)}^{2^k} = \mathop{\bf E}_{h_1,\dots,h_{k-2} \in G} \|\Delta_{h_1} \dots \Delta_{h_{k-2}} f \|_{U^2(G)}^4$

and hence we conclude that

$\displaystyle \|\Delta_{h_1} \dots \Delta_{h_{k-2}} f \|_{U^2(G)} \gg \delta^{O(1)}$

for a set ${\Sigma}$ of tuples ${(h_1,\dots,h_{k-2}) \in G^{k-2}}$ of density ${h_1,\dots,h_{k-2}}$. Applying Lemma 6 and Lemma 7, we see that for each such tuple, there exists ${\phi(h_1,\dots,h_{k-2}) \in \hat G}$ such that

$\displaystyle \| \Delta_{h_1} \dots \Delta_{h_{k-2}} f({\bf x}) e( \phi(h_1,\dots,h_{k-2}) \cdot {\bf x} ) \|_{\mathrm{CUT}({\bf x})} \gg \delta^{O(1)}, \ \ \ \ \ (8)$

where ${{\bf x}}$ is drawn uniformly from ${G}$.

Let us adopt the convention that ${e( \phi( _1,\dots,h_{k-2}) \cdot {\bf x} ) }$ vanishes for ${(h_1,\dots,h_{k-2})}$ not in ${\Sigma}$, then from Lemma 5(ii) we have

$\displaystyle \| \Delta_{{\bf h}_1} \dots \Delta_{{\bf h}_{k-2}} f({\bf x}) e( \phi({\bf h}_1,\dots,{\bf h}_{k-2}) \cdot {\bf x} ) \|_{\mathrm{CUT}({\bf x}; {\bf h}_1,\dots, {\bf h}_{k-2})} \gg \delta^{O(1)},$

where ${{\bf h}_1,\dots,{\bf h}_{k-2}}$ are independent random variables drawn uniformly from ${G}$ and also independent of ${{\bf x}}$. By repeated application of Lemma 5(iii) we then have

$\displaystyle \| \Delta_{{\bf h}_1} \dots \Delta_{{\bf h}_{k-2}} f({\bf x}) e( \phi({\bf h}_1,\dots,{\bf h}_{k-2}) \cdot {\bf x} ) \|_{\mathrm{CUT}({\bf x},{\bf h}_1,\dots, {\bf h}_{k-2})} \gg \delta^{O(1)}.$

Expanding out ${\Delta_{h_1} \dots \Delta_{h_{k-2}} f({\bf x})}$ and using Lemma 5(iv) repeatedly we conclude that

$\displaystyle \| f({\bf x} + {\bf h}_1 + \dots + {\bf h}_{k-2}) e( \phi({\bf h}_1,\dots,{\bf h}_{k-2}) \cdot {\bf x} ) \|_{\mathrm{CUT}({\bf x},{\bf h}_1,\dots, {\bf h}_{k-2})} \gg \delta^{O(1)}.$

From definition of ${f}$ we then have

$\displaystyle \| {\bf E}_{y \in Y} F_y({\bf x} + {\bf h}_1 + \dots + {\bf h}_{k-2}) e( \phi({\bf h}_1,\dots,{\bf h}_{k-2}) \cdot {\bf x} ) \|_{\mathrm{CUT}({\bf x},{\bf h}_1,\dots, {\bf h}_{k-2})}$

$\displaystyle \gg \delta^{O(1)}.$

By Lemma 5(vi), we see that the left-hand side is less than

$\displaystyle \| F_{\bf y}({\bf x} + {\bf h}_1 + \dots + {\bf h}_{k-2}) e( \phi({\bf h}_1,\dots,{\bf h}_{k-2}) \cdot {\bf x} ) \|_{\mathrm{CUT}(({\bf x}, {\bf y}),{\bf h}_1,\dots, {\bf h}_{k-2})},$

where ${{\bf y}}$ is drawn uniformly from ${Y}$, independently of ${{\bf x}, {\bf h}_1,\dots,{\bf h}_{k-2}}$. By repeated application of Lemma 5(i), (v) repeatedly, we conclude that

$\displaystyle \| \Box_{{\bf h}_1, {\bf h}'_1} \dots \Box_{{\bf h}_{k-2}, {\bf h}'_{k-2}} (F_{\bf y}({\bf x} + {\bf h}_1 + \dots + {\bf h}_{k-2}) e( \phi({\bf h}_1,\dots,{\bf h}_{k-2}) \cdot {\bf x} )) \|_{\mathrm{CUT}(({\bf x},{\bf y}); {\bf h}_1,{\bf h}'_1,\dots, {\bf h}_{k-2}, {\bf h}'_{k-2})} \gg \delta^{O(1)},$

where ${{\bf h}'_1,\dots,{\bf h}'_{k-2}}$ are independent copies of ${{\bf h}_1,\dots,{\bf h}_{k-2}}$ that are also independent of ${{\bf x}}$, ${{\bf y}}$. By Lemma 5(ii) and Example 4 we conclude that

$\displaystyle |\mathop{\bf E}( \Box_{{\bf h}_1, {\bf h}'_1} \dots \Box_{{\bf h}_{k-2}, {\bf h}'_{k-2}} (F_{\bf y}({\bf x} + {\bf h}_1 + \dots + {\bf h}_{k-2}) e( \phi({\bf h}_1,\dots,{\bf h}_{k-2}) \cdot {\bf x} )) | {\bf h}_1,{\bf h}'_1,\dots, {\bf h}_{k-2}, {\bf h}'_{k-2}) )| \gg \delta^{O(1)} \ \ \ \ \ (9)$

with probability ${\gg \delta^{O(1)}}$.

The left-hand side can be rewritten as

$\displaystyle |\mathop{\bf E}_{x \in G, y \in Y} \Delta_{{\bf h}_1 - {\bf h}'_1} \dots \Delta_{{\bf h}_{k-2} - {\bf h}'_{k-2}} F_y( x + {\bf h}'_1 + \dots + {\bf h}'_{k-2})$

$\displaystyle e( \delta_{{\bf h}_1, {\bf h}'_1} \dots \delta_{{\bf h}_{k-2}, {\bf h}'_{k-2}} \phi({\bf h}_1,\dots,{\bf h}_{k-2}) \cdot x )|$

where ${\delta_{{\bf h}_1, {\bf h}'_1}}$ is the additive version of ${\Box_{{\bf h}_1, {\bf h}'_1}}$, thus

$\displaystyle \delta_{{\bf h}_1, {\bf h}'_1} \phi({\bf h}_1,\dots,{\bf h}_{k-2}) := \phi({\bf h}_1,\dots,{\bf h}_{k-2}) - \phi({\bf h}'_1,\dots,{\bf h}_{k-2}).$

Translating ${x}$, we can simplify this a little to

$\displaystyle |\mathop{\bf E}_{x \in G, y \in Y} \Delta_{{\bf h}_1 - {\bf h}'_1} \dots \Delta_{{\bf h}_k - {\bf h}'_k} F_y( x ) e( \delta_{{\bf h}_1, {\bf h}'_1} \dots \delta_{{\bf h}_{k-2}, {\bf h}'_{k-2}} \phi({\bf h}_1,\dots,{\bf h}_{k-2}) \cdot x )|$

If the frequency ${\delta_{{\bf h}_1, {\bf h}'_1} \dots \delta_{{\bf h}_{k-2}, {\bf h}'_{k-2}} \phi({\bf h}_1,\dots,{\bf h}_{k-2})}$ is ever non-vanishing in the event (9) then conclusion (ii) applies. We conclude that

$\displaystyle \delta_{{\bf h}_1, {\bf h}'_1} \dots \delta_{{\bf h}_{k-2}, {\bf h}'_{k-2}} \phi({\bf h}_1,\dots,{\bf h}_{k-2}) = 0$

with probability ${\gg \delta^{O(1)}}$. In particular, by the pigeonhole principle, there exist ${h'_1,\dots,h'_{k-2} \in G}$ such that

$\displaystyle \delta_{{\bf h}_1, h'_1} \dots \delta_{{\bf h}_{k-2}, h'_{k-2}} \phi({\bf h}_1,\dots,{\bf h}_{k-2}) = 0$

with probability ${\gg \delta^{O(1)}}$. Expanding this out, we obtain a representation of the form

$\displaystyle \phi({\bf h}_1,\dots,{\bf h}_{k-2}) = \sum_{i=1}^{k-2} \phi_i({\bf h}_1,\dots,{\bf h}_{k-2})$

holding with probability ${\gg \delta^{O(1)}}$, where the ${\phi_i: G^{k-2} \rightarrow {\bf R}/{\bf Z}}$ are functions that do not depend on the ${i^{th}}$ coordinate. From (8) we conclude that

$\displaystyle \| \Delta_{h_1} \dots \Delta_{h_{k-2}} f({\bf x}) e( \sum_{i=1}^{k-2} \phi_i(h_1,\dots,h_{k-2}) \cdot {\bf x} ) \|_{\mathrm{CUT}({\bf x})} \gg \delta^{O(1)}$

for ${\gg \delta^{O(1)}}$ of the tuples ${(h_1,\dots,h_{k-2}) \in G^{k-2}}$. Thus by Lemma 5(ii)

$\displaystyle \| \Delta_{{\bf h}_1} \dots \Delta_{{\bf h}_{k-2}} f({\bf x}) e( \sum_{i=1}^{k-2} \phi_i({\bf h}_1,\dots,{\bf h}_{k-2}) \cdot {\bf x} ) \|_{\mathrm{CUT}({\bf x}; {\bf h}_1,\dots,{\bf h}_{k-2})} \gg \delta^{O(1)}.$

By repeated application of Lemma 5(iii) we then have

$\displaystyle \| \Delta_{{\bf h}_1} \dots \Delta_{{\bf h}_{k-2}} f({\bf x}) e( \sum_{i=1}^{k-2} \phi_i({\bf h}_1,\dots,{\bf h}_{k-2}) \cdot {\bf x} ) \|_{\mathrm{CUT}({\bf x}, {\bf h}_1,\dots,{\bf h}_{k-2})} \gg \delta^{O(1)}$

and then by repeated application of Lemma 5(iv)

$\displaystyle \| f({\bf x} + {\bf h}_1 + \dots + {\bf h}_{k-2}) \|_{\mathrm{CUT}({\bf x}, {\bf h}_1,\dots,{\bf h}_{k-2})} \gg \delta^{O(1)}$

and then the conclusion (i) follows from Lemma 6. $\Box$

As an application of degree lowering, we give an inverse theorem for the average in Proposition 1(ii), first established by Bourgain-Chang and later reproved by Peluse (by different methods from those given here):

Proposition 9 Let ${G = {\bf Z}/N{\bf Z}}$ be a cyclic group of prime order. Suppose that one has ${1}$-bounded functions ${f_1,f_2,f_3: G \rightarrow {\bf C}}$ such that

$\displaystyle |\mathop{\bf E}_{x, h \in G} f_1(x) f_2(x+h) f_3(x+h^2)| \geq \delta \ \ \ \ \ (10)$

for some ${\delta > 0}$. Then either ${N \ll \delta^{-O(1)}}$, or one has

$\displaystyle |\mathop{\bf E}_{x \in G} f_1(x)|, |\mathop{\bf E}_{x \in G} f_2(x)| \gg \delta^{O(1)}.$

We remark that a modification of the arguments below also give ${|\mathop{\bf E}_{x \in G} f_3(x)| \gg \delta^{O(1)}}$.

Proof: The left-hand side of (10) can be written as

$\displaystyle |\mathop{\bf E}_{x \in G} F(x) f_3(x)|$

where ${F}$ is the dual function

$\displaystyle F(x) := \mathop{\bf E}_{h \in G} f_1(x-h^2) f_2(x-h^2+h).$

By Cauchy-Schwarz one thus has

$\displaystyle |\mathop{\bf E}_{x \in G} F(x) \overline{F}(x)| \geq \delta^2$

and hence by Proposition 1, we either have ${N \ll \delta^{-O(1)}}$ (in which case we are done) or

$\displaystyle \|F\|_{U^4(G)} \gg \delta^2.$

Writing ${F = \mathop{\bf E}_{h \in G} F_h}$ with ${F_h(x) := f_1(x-h^2) f_2(x-h^2+h)}$, we conclude that either ${\|F\|_{U^3(G)} \gg \delta^{O(1)}}$, or that

$\displaystyle |\mathop{\bf E}_{x,h \in G} \Delta_{h_1} \Delta_{h_2} F_h(x) e(\xi x / N )| \gg \delta^{O(1)}$

for some ${h_1,h_2 \in G}$ and non-zero ${\xi \in G}$. The left-hand side can be rewritten as

$\displaystyle |\mathop{\bf E}_{x,h \in G} g_1(x-h^2) g_2(x-h^2+h) e(\xi x/N)|$

where ${g_1 = \Delta_{h_1} \Delta_{h_2} f_1}$ and ${g_2 = \Delta_{h_1} \Delta_{h_2} f_2}$. We can rewrite this in turn as

$\displaystyle |\mathop{\bf E}_{x,y \in G} g_1(x) g_2(y) e(\xi (x + (y-x)^2) / N)|$

which is bounded by

$\displaystyle \| e(\xi({\bf x} + ({\bf y}-{\bf x})^2)/N) \|_{\mathrm{CUT}({\bf x}, {\bf y})}$

where ${{\bf x}, {\bf y}}$ are independent random variables drawn uniformly from ${G}$. Applying Lemma 5(v), we conclude that

$\displaystyle \| \Box_{{\bf y}, {\bf y}'} e(\xi({\bf x} + ({\bf y}-{\bf x})^2)/N) \|_{\mathrm{CUT}({\bf x}; {\bf y}, {\bf y}')} \gg \delta^{O(1)}.$

However, a routine Gauss sum calculation reveals that the left-hand side is ${O(N^{-c})}$ for some absolute constant ${c>0}$ because ${\xi}$ is non-zero, so that ${N \ll \delta^{-O(1)}}$. The only remaining case to consider is when

$\displaystyle \|F\|_{U^3(G)} \gg \delta^{O(1)}.$

Repeating the above arguments we then conclude that

$\displaystyle \|F\|_{U^2(G)} \gg \delta^{O(1)},$

and then

$\displaystyle \|F\|_{U^1(G)} \gg \delta^{O(1)}.$

The left-hand side can be computed to equal ${|\mathop{\bf E}_{x \in G} f_1(x)| |\mathop{\bf E}_{x \in G} f_2(x)|}$, and the claim follows. $\Box$

This argument was given for the cyclic group setting, but the argument can also be applied to the integers (see Peluse-Prendiville) and can also be used to establish an analogue over the reals (that was first obtained by Bourgain).

The National Academies of Sciences, Engineering, and Medicine have initiated a project on “Illustrating the Impact of the Mathematical Sciences“, in which various media will be produced to showcase how mathematics impacts the modern world.  (I am serving on the committee for creating this media, which has been an interesting experience; the first time for instance that I have had to seriously interact with graphic designers.)  One of the first products is a “webinar” series on the ten topics our committee have chosen to focus on, that is currently running weekly on Tuesdays.  Last week I moderated the first such webinar, titled “From Solving to Seeing”, in which Profs. Gunther Uhlmann and Anna Gilbert presented ways in which inverse problems, compressed sensing, and other modern mathematical techniques have been used to obtain images (such as MRI images) that would not otherwise be accessible.  Next week I will moderate another webinar, titled “Abstract Geometry, Concrete Impact”, in which Profs. Katherine Stange and Jordan Ellenberg will discuss how modern abstract geometries are used in modern applications such as cryptography.  The full list of webinars and the latest information on the speakers can be found at this website.  (Past webinars can be viewed directly from the web site; live webinars require a (free) registration, and offer the ability to submit text questions to the speakers via the moderator.)

We are currently in the process of designing posters (and possibly even a more interactive online resource) for each of the ten topics listed in the webinars; hopefully these will be available in a few months.

I have just returned from Basel, Switzerland, on the occasion of the awarding of the 2019 Ostrowski prize to Assaf Naor.  I was invited to give the laudatio for Assaf’s work, which I have uploaded here.  I also gave a public lecture (intended at the high school student level) at the University of Basel entitled “The Notorious Collatz conjecture”; I have uploaded the slides for that here. (Note that the slides here are somewhat unpolished as I was not initially planning to make them public until I was recently requested to do so.  In particular I do not have full attribution for some of the images used in the slides.)

Basel has historically been home to a number of very prominent mathematicians, most notably Jacob Bernoulli, whose headstone I saw at the Basel Minster,

and also Leonhard Euler, for which I could not find a formal memorial, but I did at least see a hotel bearing his name:

I just heard the news that Louis Nirenberg died a few days ago, aged 94.  Nirenberg made a vast number of contributions to analysis and PDE (and his work has come up repeatedly on my own blog); I wrote about his beautiful moving planes argument with Gidas and Ni to establish symmetry of ground states in this post on the occasion of him receiving the Chern medal, and on how his extremely useful interpolation inequality with Gagliardo (generalising a previous inequality of Ladyzhenskaya) can be viewed as an amplification of the usual Sobolev inequality in this post.  Another fundamentally useful inequality of Nirenberg is the John-Nirenberg inequality established with Fritz John: if a (locally integrable) function $f: {\bf R} \to {\bf R}$ (which for simplicity of exposition we place in one dimension) obeys the bounded mean oscillation property

$\displaystyle \frac{1}{|I|} \int_I |f(x)-f_I|\ dx \leq A \quad (1)$

for all intervals $I$, where $f_I := \frac{1}{|I|} \int_I f$ is the average value of $f$ on $I$, then one has exponentially good large deviation estimates

$\displaystyle \frac{1}{|I|} |\{ x \in I: |f(x)-f_I| \geq \lambda A \}| \leq \exp( - c \lambda ) \quad (2)$

for all $\lambda>0$ and some absolute constant $c$.  This can be compared with Markov’s inequality, which only gives the far weaker decay

$\displaystyle \frac{1}{|I|} |\{ x \in I: |f(x)-f_I| \geq \lambda A \}| \leq \frac{1}{\lambda}. \quad (3)$

The point is that (1) is assumed to hold not just for a given interval $I$, but also all subintervals of $I$, and this is a much more powerful hypothesis, allowing one for instance to use the standard Calderon-Zygmund technique of stopping time arguments to “amplify” (3) to (2).  Basically, for any given interval $I$, one can use (1) and repeated halving of the interval $I$ until significant deviation from the mean is encountered to locate some disjoint exceptional subintervals $J$ where $f_J$ deviates from $f_I$ by $O(A)$, with the total measure of the $J$ being a small fraction of that of $I$ (thanks to a variant of (3)), and with $f$ staying within $O(A)$ of $f_I$ at almost every point of $I$ outside of these exceptional intervals.  One can then establish (2) by an induction on $\lambda$.  (There are other proofs of this inequality also, e.g., one can use Bellman functions, as discussed in this old set of notes of mine.)   Informally, the John-Nirenberg inequality asserts that functions of bounded mean oscillation are “almost as good” as bounded functions, in that they almost always stay within a bounded distance from their mean, and in fact the space BMO of functions of bounded mean oscillation ends up being superior to the space $L^\infty$ of bounded measurable functions for many harmonic analysis purposes (among other things, the space is more stable with respect to singular integral operators).

I met Louis a few times in my career; even in his later years when he was wheelchair-bound, he would often come to conferences and talks, and ask very insightful questions at the end of the lecture (even when it looked like he was asleep during much of the actual talk!).  I have a vague memory of him asking me some questions in one of the early talks I gave as a postdoc; I unfortunately do not remember exactly what the topic was (some sort of PDE, I think), but I was struck by how kindly the questions were posed, and how patiently he would listen to my excited chattering about my own work.

Define the Collatz map ${\mathrm{Col}: {\bf N}+1 \rightarrow {\bf N}+1}$ on the natural numbers ${{\bf N}+1 = \{1,2,\dots\}}$ by setting ${\mathrm{Col}(N)}$ to equal ${3N+1}$ when ${N}$ is odd and ${N/2}$ when ${N}$ is even, and let ${\mathrm{Col}^{\bf N}(N) := \{ N, \mathrm{Col}(N), \mathrm{Col}^2(N), \dots \}}$ denote the forward Collatz orbit of ${N}$. The notorious Collatz conjecture asserts that ${1 \in \mathrm{Col}^{\bf N}(N)}$ for all ${N \in {\bf N}+1}$. Equivalently, if we define the backwards Collatz orbit ${(\mathrm{Col}^{\bf N})^*(N) := \{ M \in {\bf N}+1: N \in \mathrm{Col}^{\bf N}(M) \}}$ to be all the natural numbers ${M}$ that encounter ${N}$ in their forward Collatz orbit, then the Collatz conjecture asserts that ${(\mathrm{Col}^{\bf N})^*(1) = {\bf N}+1}$. As a partial result towards this latter statement, Krasikov and Lagarias in 2003 established the bound

$\displaystyle \# \{ N \leq x: N \in (\mathrm{Col}^{\bf N})^*(1) \} \gg x^\gamma \ \ \ \ \ (1)$

for all ${x \geq 1}$ and ${\gamma = 0.84}$. (This improved upon previous values of ${\gamma = 0.81}$ obtained by Applegate and Lagarias in 1995, ${\gamma = 0.65}$ by Applegate and Lagarias in 1995 by a different method, ${\gamma=0.48}$ by Wirsching in 1993, ${\gamma=0.43}$ by Krasikov in 1989, ${\gamma=0.3}$ by Sander in 1990, and some ${\gamma>0}$ by Crandall in 1978.) This is still the largest value of ${\gamma}$ for which (1) has been established. Of course, the Collatz conjecture would imply that we can take ${\gamma}$ equal to ${1}$, which is the assertion that a positive density set of natural numbers obeys the Collatz conjecture. This is not yet established, although the results in my previous paper do at least imply that a positive density set of natural numbers iterates to an (explicitly computable) bounded set, so in principle the ${\gamma=1}$ case of (1) could now be verified by an (enormous) finite computation in which one verifies that every number in this explicit bounded set iterates to ${1}$. In this post I would like to record a possible alternate route to this problem that depends on the distribution of a certain family of random variables that appeared in my previous paper, that I called Syracuse random variables.

Definition 1 (Syracuse random variables) For any natural number ${n}$, a Syracuse random variable ${\mathbf{Syrac}({\bf Z}/3^n{\bf Z})}$ on the cyclic group ${{\bf Z}/3^n{\bf Z}}$ is defined as a random variable of the form

$\displaystyle \mathbf{Syrac}({\bf Z}/3^n{\bf Z}) = \sum_{m=1}^n 3^{n-m} 2^{-{\mathbf a}_m-\dots-{\mathbf a}_n} \ \ \ \ \ (2)$

where ${\mathbf{a}_1,\dots,\mathbf{a_n}}$ are independent copies of a geometric random variable ${\mathbf{Geom}(2)}$ on the natural numbers with mean ${2}$, thus

$\displaystyle \mathop{\bf P}( \mathbf{a}_1=a_1,\dots,\mathbf{a}_n=a_n) = 2^{-a_1-\dots-a_n}$

} for ${a_1,\dots,a_n \in {\bf N}+1}$. In (2) the arithmetic is performed in the ring ${{\bf Z}/3^n{\bf Z}}$.

Thus for instance

$\displaystyle \mathrm{Syrac}({\bf Z}/3{\bf Z}) = 2^{-\mathbf{a}_1} \hbox{ mod } 3$

$\displaystyle \mathrm{Syrac}({\bf Z}/3^2{\bf Z}) = 2^{-\mathbf{a}_1-\mathbf{a}_2} + 3 \times 2^{-\mathbf{a}_2} \hbox{ mod } 3^2$

$\displaystyle \mathrm{Syrac}({\bf Z}/3^3{\bf Z}) = 2^{-\mathbf{a}_1-\mathbf{a}_2-\mathbf{a}_3} + 3 \times 2^{-\mathbf{a}_2-\mathbf{a}_3} + 3^2 \times 2^{-\mathbf{a}_3} \hbox{ mod } 3^3$

and so forth. After reversing the labeling of the ${\mathbf{a}_1,\dots,\mathbf{a}_n}$, one could also view ${\mathrm{Syrac}({\bf Z}/3^n{\bf Z})}$ as the mod ${3^n}$ reduction of a ${3}$-adic random variable

$\displaystyle \mathbf{Syrac}({\bf Z}_3) = \sum_{m=1}^\infty 3^{m-1} 2^{-{\mathbf a}_1-\dots-{\mathbf a}_m}.$

The probability density function ${b \mapsto \mathbf{P}( \mathbf{Syrac}({\bf Z}/3^n{\bf Z}) = b )}$ of the Syracuse random variable can be explicitly computed by a recursive formula (see Lemma 1.12 of my previous paper). For instance, when ${n=1}$, ${\mathbf{P}( \mathbf{Syrac}({\bf Z}/3{\bf Z}) = b )}$ is equal to ${0,1/3,2/3}$ for ${x=b,1,2 \hbox{ mod } 3}$ respectively, while when ${n=2}$, ${\mathbf{P}( \mathbf{Syrac}({\bf Z}/3^2{\bf Z}) = b )}$ is equal to

$\displaystyle 0, \frac{8}{63}, \frac{16}{63}, 0, \frac{11}{63}, \frac{4}{63}, 0, \frac{2}{63}, \frac{22}{63}$

when ${b=0,\dots,8 \hbox{ mod } 9}$ respectively.

The relationship of these random variables to the Collatz problem can be explained as follows. Let ${2{\bf N}+1 = \{1,3,5,\dots\}}$ denote the odd natural numbers, and define the Syracuse map ${\mathrm{Syr}: 2{\bf N}+1 \rightarrow 2{\bf N}+1}$ by

$\displaystyle \mathrm{Syr}(N) := \frac{3n+1}{2^{\nu_2(3N+1)}}$

where the ${2}$valuation ${\nu_2(3n+1) \in {\bf N}}$ is the number of times ${2}$ divides ${3N+1}$. We can define the forward orbit ${\mathrm{Syr}^{\bf N}(n)}$ and backward orbit ${(\mathrm{Syr}^{\bf N})^*(N)}$ of the Syracuse map as before. It is not difficult to then see that the Collatz conjecture is equivalent to the assertion ${(\mathrm{Syr}^{\bf N})^*(1) = 2{\bf N}+1}$, and that the assertion (1) for a given ${\gamma}$ is equivalent to the assertion

$\displaystyle \# \{ N \leq x: N \in (\mathrm{Syr}^{\bf N})^*(1) \} \gg x^\gamma \ \ \ \ \ (3)$

for all ${x \geq 1}$, where ${N}$ is now understood to range over odd natural numbers. A brief calculation then shows that for any odd natural number ${N}$ and natural number ${n}$, one has

$\displaystyle \mathrm{Syr}^n(N) = 3^n 2^{-a_1-\dots-a_n} N + \sum_{m=1}^n 3^{n-m} 2^{-a_m-\dots-a_n}$

where the natural numbers ${a_1,\dots,a_n}$ are defined by the formula

$\displaystyle a_i := \nu_2( 3 \mathrm{Syr}^{i-1}(N) + 1 ),$

so in particular

$\displaystyle \mathrm{Syr}^n(N) = \sum_{m=1}^n 3^{n-m} 2^{-a_m-\dots-a_n} \hbox{ mod } 3^n.$

Heuristically, one expects the ${2}$-valuation ${a = \nu_2(N)}$ of a typical odd number ${N}$ to be approximately distributed according to the geometric distribution ${\mathbf{Geom}(2)}$, so one therefore expects the residue class ${\mathrm{Syr}^n(N) \hbox{ mod } 3^n}$ to be distributed approximately according to the random variable ${\mathbf{Syrac}({\bf Z}/3^n{\bf Z})}$.

The Syracuse random variables ${\mathbf{Syrac}({\bf Z}/3^n{\bf Z})}$ will always avoid multiples of three (this reflects the fact that ${\mathrm{Syr}(N)}$ is never a multiple of three), but attains any non-multiple of three in ${{\bf Z}/3^n{\bf Z}}$ with positive probability. For any natural number ${n}$, set

$\displaystyle c_n := \inf_{b \in {\bf Z}/3^n{\bf Z}: 3 \nmid b} \mathbf{P}( \mathbf{Syrac}({\bf Z}/3^n{\bf Z}) = b ).$

Equivalently, ${c_n}$ is the greatest quantity for which we have the inequality

$\displaystyle \sum_{(a_1,\dots,a_n) \in S_{n,N}} 2^{-a_1-\dots-a_m} \geq c_n \ \ \ \ \ (4)$

for all integers ${N}$ not divisible by three, where ${S_{n,N} \subset ({\bf N}+1)^n}$ is the set of all tuples ${(a_1,\dots,a_n)}$ for which

$\displaystyle N = \sum_{m=1}^n 3^{m-1} 2^{-a_1-\dots-a_m} \hbox{ mod } 3^n.$

Thus for instance ${c_0=1}$, ${c_1 = 1/3}$, and ${c_2 = 2/63}$. On the other hand, since all the probabilities ${\mathbf{P}( \mathbf{Syrac}({\bf Z}/3^n{\bf Z}) = b)}$ sum to ${1}$ as ${b \in {\bf Z}/3^n{\bf Z}}$ ranges over the non-multiples of ${3}$, we have the trivial upper bound

$\displaystyle c_n \leq \frac{3}{2} 3^{-n}.$

There is also an easy submultiplicativity result:

Lemma 2 For any natural numbers ${n_1,n_2}$, we have

$\displaystyle c_{n_1+n_2-1} \geq c_{n_1} c_{n_2}.$

Proof: Let ${N}$ be an integer not divisible by ${3}$, then by (4) we have

$\displaystyle \sum_{(a_1,\dots,a_{n_1}) \in S_{n_1,N}} 2^{-a_1-\dots-a_{n_1}} \geq c_{n_1}.$

If we let ${S'_{n_1,N}}$ denote the set of tuples ${(a_1,\dots,a_{n_1-1})}$ that can be formed from the tuples in ${S_{n_1,N}}$ by deleting the final component ${a_{n_1}}$ from each tuple, then we have

$\displaystyle \sum_{(a_1,\dots,a_{n_1-1}) \in S'_{n_1,N}} 2^{-a_1-\dots-a_{n_1-1}} \geq c_{n_1}. \ \ \ \ \ (5)$

Next, observe that if ${(a_1,\dots,a_{n_1-1}) \in S'_{n_1,N}}$, then

$\displaystyle N = \sum_{m=1}^{n_1-1} 3^{m-1} 2^{-a_1-\dots-a_m} + 3^{n_1-1} 2^{-a_1-\dots-a_{n_1-1}} M$

with ${M = M_{N,n_1,a_1,\dots,a_{n_1-1}}}$ an integer not divisible by three. By definition of ${S_{n_2,M}}$ and a relabeling, we then have

$\displaystyle M = \sum_{m=1}^{n_2} 3^{m-1} 2^{-a_{n_1}-\dots-a_{m+n_1-1}} \hbox{ mod } 3^{n_2}$

for all ${(a_{n_1},\dots,a_{n_1+n_2-1}) \in S_{n_2,M}}$. For such tuples we then have

$\displaystyle N = \sum_{m=1}^{n_1+n_2-1} 3^{m-1} 2^{-a_1-\dots-a_{n_1+n_2-1}} \hbox{ mod } 3^{n_1+n_2-1}$

so that ${(a_1,\dots,a_{n_1+n_2-1}) \in S_{n_1+n_2-1,N}}$. Since

$\displaystyle \sum_{(a_{n_1},\dots,a_{n_1+n_2-1}) \in S_{n_2,M}} 2^{-a_{n_1}-\dots-a_{n_1+n_2-1}} \geq c_{n_2}$

for each ${M}$, the claim follows. $\Box$

From this lemma we see that ${c_n = 3^{-\beta n + o(n)}}$ for some absolute constant ${\beta \geq 1}$. Heuristically, we expect the Syracuse random variables to be somewhat approximately equidistributed amongst the multiples of ${{\bf Z}/3^n{\bf Z}}$ (in Proposition 1.4 of my previous paper I prove a fine scale mixing result that supports this heuristic). As a consequence it is natural to conjecture that ${\beta=1}$. I cannot prove this, but I can show that this conjecture would imply that we can take the exponent ${\gamma}$ in (1), (3) arbitrarily close to one:

Proposition 3 Suppose that ${\beta=1}$ (that is to say, ${c_n = 3^{-n+o(n)}}$ as ${n \rightarrow \infty}$). Then

$\displaystyle \# \{ N \leq x: N \in (\mathrm{Syr}^{\bf N})^*(1) \} \gg x^{1-o(1)}$

as ${x \rightarrow \infty}$, or equivalently

$\displaystyle \# \{ N \leq x: N \in (\mathrm{Col}^{\bf N})^*(1) \} \gg x^{1-o(1)}$

as ${x \rightarrow \infty}$. In other words, (1), (3) hold for all ${\gamma < 1}$.

I prove this proposition below the fold. A variant of the argument shows that for any value of ${\beta}$, (1), (3) holds whenever ${\gamma < f(\beta)}$, where ${f: [0,1] \rightarrow [0,1]}$ is an explicitly computable function with ${f(\beta) \rightarrow 1}$ as ${\beta \rightarrow 1}$. In principle, one could then improve the Krasikov-Lagarias result ${\gamma = 0.84}$ by getting a sufficiently good upper bound on ${\beta}$, which is in principle achievable numerically (note for instance that Lemma 2 implies the bound ${c_n \leq 3^{-\beta(n-1)}}$ for any ${n}$, since ${c_{kn-k+1} \geq c_n^k}$ for any ${k}$).

Just a brief post to record some notable papers in my fields of interest that appeared on the arXiv recently.

• A sharp square function estimate for the cone in ${\bf R}^3$“, by Larry Guth, Hong Wang, and Ruixiang Zhang.  This paper establishes an optimal (up to epsilon losses) square function estimate for the three-dimensional light cone that was essentially conjectured by Mockenhaupt, Seeger, and Sogge, which has a number of other consequences including Sogge’s local smoothing conjecture for the wave equation in two spatial dimensions, which in turn implies the (already known) Bochner-Riesz, restriction, and Kakeya conjectures in two dimensions.   Interestingly, modern techniques such as polynomial partitioning and decoupling estimates are not used in this argument; instead, the authors mostly rely on an induction on scales argument and Kakeya type estimates.  Many previous authors (including myself) were able to get weaker estimates of this type by an induction on scales method, but there were always significant inefficiencies in doing so; in particular knowing the sharp square function estimate at smaller scales did not imply the sharp square function estimate at the given larger scale.  The authors here get around this issue by finding an even stronger estimate that implies the square function estimate, but behaves significantly better with respect to induction on scales.
• On the Chowla and twin primes conjectures over ${\mathbb F}_q[T]$“, by Will Sawin and Mark Shusterman.  This paper resolves a number of well known open conjectures in analytic number theory, such as the Chowla conjecture and the twin prime conjecture (in the strong form conjectured by Hardy and Littlewood), in the case of function fields where the field is a prime power $q=p^j$ which is fixed (in contrast to a number of existing results in the “large $q$” limit) but has a large exponent $j$.  The techniques here are orthogonal to those used in recent progress towards the Chowla conjecture over the integers (e.g., in this previous paper of mine); the starting point is an algebraic observation that in certain function fields, the Mobius function behaves like a quadratic Dirichlet character along certain arithmetic progressions.  In principle, this reduces problems such as Chowla’s conjecture to problems about estimating sums of Dirichlet characters, for which more is known; but the task is still far from trivial.
• Bounds for sets with no polynomial progressions“, by Sarah Peluse.  This paper can be viewed as part of a larger project to obtain quantitative density Ramsey theorems of Szemeredi type.  For instance, Gowers famously established a relatively good quantitative bound for Szemeredi’s theorem that all dense subsets of integers contain arbitrarily long arithmetic progressions $a, a+r, \dots, a+(k-1)r$.  The corresponding question for polynomial progressions $a+P_1(r), \dots, a+P_k(r)$ is considered more difficult for a number of reasons.  One of them is that dilation invariance is lost; a dilation of an arithmetic progression is again an arithmetic progression, but a dilation of a polynomial progression will in general not be a polynomial progression with the same polynomials $P_1,\dots,P_k$.  Another issue is that the ranges of the two parameters $a,r$ are now at different scales.  Peluse gets around these difficulties in the case when all the polynomials $P_1,\dots,P_k$ have distinct degrees, which is in some sense the opposite case to that considered by Gowers (in particular, she avoids the need to obtain quantitative inverse theorems for high order Gowers norms; which was recently obtained in this integer setting by Manners but with bounds that are probably not strong enough to for the bounds in Peluse’s results, due to a degree lowering argument that is available in this case).  To resolve the first difficulty one has to make all the estimates rather uniform in the coefficients of the polynomials $P_j$, so that one can still run a density increment argument efficiently.  To resolve the second difficulty one needs to find a quantitative concatenation theorem for Gowers uniformity norms.  Many of these ideas were developed in previous papers of Peluse and Peluse-Prendiville in simpler settings.
• On blow up for the energy super critical defocusing non linear Schrödinger equations“, by Frank Merle, Pierre Raphael, Igor Rodnianski, and Jeremie Szeftel.  This paper (when combined with two companion papers) resolves a long-standing problem as to whether finite time blowup occurs for the defocusing supercritical nonlinear Schrödinger equation (at least in certain dimensions and nonlinearities).  I had a previous paper establishing a result like this if one “cheated” by replacing the nonlinear Schrodinger equation by a system of such equations, but remarkably they are able to tackle the original equation itself without any such cheating.  Given the very analogous situation with Navier-Stokes, where again one can create finite time blowup by “cheating” and modifying the equation, it does raise hope that finite time blowup for the incompressible Navier-Stokes and Euler equations can be established…  In fact the connection may not just be at the level of analogy; a surprising key ingredient in the proofs here is the observation that a certain blowup ansatz for the nonlinear Schrodinger equation is governed by solutions to the (compressible) Euler equation, and finite time blowup examples for the latter can be used to construct finite time blowup examples for the former.

Let ${u: {\bf R}^3 \rightarrow {\bf R}^3}$ be a divergence-free vector field, thus ${\nabla \cdot u = 0}$, which we interpret as a velocity field. In this post we will proceed formally, largely ignoring the analytic issues of whether the fields in question have sufficient regularity and decay to justify the calculations. The vorticity field ${\omega: {\bf R}^3 \rightarrow {\bf R}^3}$ is then defined as the curl of the velocity:

$\displaystyle \omega = \nabla \times u.$

(From a differential geometry viewpoint, it would be more accurate (especially in other dimensions than three) to define the vorticity as the exterior derivative ${\omega = d(g \cdot u)}$ of the musical isomorphism ${g \cdot u}$ of the Euclidean metric ${g}$ applied to the velocity field ${u}$; see these previous lecture notes. However, we will not need this geometric formalism in this post.)

Assuming suitable regularity and decay hypotheses of the velocity field ${u}$, it is possible to recover the velocity from the vorticity as follows. From the general vector identity ${\nabla \times \nabla \times X = \nabla(\nabla \cdot X) - \Delta X}$ applied to the velocity field ${u}$, we see that

$\displaystyle \nabla \times \omega = -\Delta u$

and thus (by the commutativity of all the differential operators involved)

$\displaystyle u = - \nabla \times \Delta^{-1} \omega.$

Using the Newton potential formula

$\displaystyle -\Delta^{-1} \omega(x) := \frac{1}{4\pi} \int_{{\bf R}^3} \frac{\omega(y)}{|x-y|}\ dy$

and formally differentiating under the integral sign, we obtain the Biot-Savart law

$\displaystyle u(x) = \frac{1}{4\pi} \int_{{\bf R}^3} \frac{\omega(y) \times (x-y)}{|x-y|^3}\ dy. \ \ \ \ \ (1)$

This law is of fundamental importance in the study of incompressible fluid equations, such as the Euler equations

$\displaystyle \partial_t u + (u \cdot \nabla) u = -\nabla p; \quad \nabla \cdot u = 0$

since on applying the curl operator one obtains the vorticity equation

$\displaystyle \partial_t \omega + (u \cdot \nabla) \omega = (\omega \cdot \nabla) u \ \ \ \ \ (2)$

and then by substituting (1) one gets an autonomous equation for the vorticity field ${\omega}$. Unfortunately, this equation is non-local, due to the integration present in (1).

In a recent work, it was observed by Elgindi that in a certain regime, the Biot-Savart law can be approximated by a more “low rank” law, which makes the non-local effects significantly simpler in nature. This simplification was carried out in spherical coordinates, and hinged on a study of the invertibility properties of a certain second order linear differential operator in the latitude variable ${\theta}$; however in this post I would like to observe that the approximation can also be seen directly in Cartesian coordinates from the classical Biot-Savart law (1). As a consequence one can also initiate the beginning of Elgindi’s analysis in constructing somewhat regular solutions to the Euler equations that exhibit self-similar blowup in finite time, though I have not attempted to execute the entirety of the analysis in this setting.

Elgindi’s approximation applies under the following hypotheses:

A model example of a divergence-free vector field obeying these properties (but without good decay at infinity) is the linear vector field

$\displaystyle X(x) = (x_1, x_2, -2x_3) \ \ \ \ \ (5)$

which is of the form (3) with ${u_r(r,x_3) = r}$ and ${u_3(r,x_3) = -2x_3}$. The associated vorticity ${\omega}$ vanishes.

We can now give an illustration of Elgindi’s approximation:

Proposition 1 (Elgindi’s approximation) Under the above hypotheses (and assuing suitable regularity and decay), we have the pointwise bounds

$\displaystyle u(x) = \frac{1}{2} {\mathcal L}_{12}(\omega)(|x|) X(x) + O( |x| \|\omega\|_{L^\infty({\bf R}^3)} )$

for any ${x \in {\bf R}^3}$, where ${X}$ is the vector field (5), and ${{\mathcal L}_{12}(\omega): {\bf R}^+ \rightarrow {\bf R}}$ is the scalar function

$\displaystyle {\mathcal L}_{12}(\omega)(\rho) := \frac{3}{4\pi} \int_{|y| \geq \rho} \frac{r y_3}{|y|^5} \omega_{r3}(r,y_3)\ dy.$

Thus under the hypotheses (i), (ii), and assuming that ${\omega}$ is slowly varying, we expect ${u}$ to behave like the linear vector field ${X}$ modulated by a radial scalar function. In applications one needs to control the error in various function spaces instead of pointwise, and with ${\omega}$ similarly controlled in other function space norms than the ${L^\infty}$ norm, but this proposition already gives a flavour of the approximation. If one uses spherical coordinates

$\displaystyle \omega_{r3}( \rho \cos \theta, \rho \sin \theta ) = \Omega( \rho, \theta )$

then we have (using the spherical change of variables formula ${dy = \rho^2 \cos \theta d\rho d\theta d\phi}$ and the odd nature of ${\Omega}$)

$\displaystyle {\mathcal L}_{12}(\omega) = L_{12}(\Omega),$

where

$\displaystyle L_{12}(\Omega)(\rho) = 3 \int_\rho^\infty \int_0^{\pi/2} \frac{\Omega(r, \theta) \sin(\theta) \cos^2(\theta)}{r}\ d\theta dr$

is the operator introduced in Elgindi’s paper.

Proof: By a limiting argument we may assume that ${x}$ is non-zero, and we may normalise ${\|\omega\|_{L^\infty({\bf R}^3)}=1}$. From the triangle inequality we have

$\displaystyle \int_{|y| \leq 10|x|} \frac{\omega(y) \times (x-y)}{|x-y|^3}\ dy \leq \int_{|y| \leq 10|x|} \frac{1}{|x-y|^2}\ dy$

$\displaystyle \leq \int_{|z| \leq 11 |x|} \frac{1}{|z|^2}\ dz$

$\displaystyle = O( |x| )$

and hence by (1)

$\displaystyle u(x) = \frac{1}{4\pi} \int_{|y| > 10|x|} \frac{\omega(y) \times (x-y)}{|x-y|^3}\ dy + O(|x|).$

In the regime ${|y| > 2|x|}$ we may perform the Taylor expansion

$\displaystyle \frac{x-y}{|x-y|^3} = \frac{x-y}{|y|^3} (1 - \frac{2 x \cdot y}{|y|^2} + \frac{|x|^2}{|y|^2})^{-3/2}$

$\displaystyle = \frac{x-y}{|y|^3} (1 + \frac{3 x \cdot y}{|y|^2} + O( \frac{|x|^2}{|y|^2} ) )$

$\displaystyle = -\frac{y}{|y|^3} + \frac{x}{|y|^3} - \frac{3 (x \cdot y) y}{|y|^5} + O( \frac{|x|^2}{|y|^4} ).$

Since

$\displaystyle \int_{|y| > 10|x|} \frac{|x|^2}{|y|^4}\ dy = O(|x|)$

we see from the triangle inequality that the error term contributes ${O(|x|)}$ to ${u(x)}$. We thus have

$\displaystyle u(x) = -A_0(x) + A_1(x) - 3A'_1(x) + O(|x|)$

where ${A_0}$ is the constant term

$\displaystyle A_0 := \int_{|y| > 10|x|} \frac{\omega(y) \times y}{|y|^3}\ dy,$

and ${A_1, A'_1}$ are the linear term

$\displaystyle A_1 := \int_{|y| > 10|x|} \frac{\omega(y) \times x}{|y|^3}\ dy,$

$\displaystyle A'_1 := \int_{|y| > 10|x|} (x \cdot y) \frac{\omega(y) \times y}{|y|^5}\ dy.$

By the hypotheses (i), (ii), we have the symmetries

$\displaystyle \omega(y_1,y_2,-y_3) = - \omega(y_1,y_2,y_3) \ \ \ \ \ (6)$

and

$\displaystyle \omega(-y_1,-y_2,y_3) = - \omega(y_1,y_2,y_3) \ \ \ \ \ (7)$

and hence also

$\displaystyle \omega(-y_1,-y_2,-y_3) = \omega(y_1,y_2,y_3). \ \ \ \ \ (8)$

The even symmetry (8) ensures that the integrand in ${A_0}$ is odd, so ${A_0}$ vanishes. The symmetry (6) or (7) similarly ensures that ${\int_{|y| > 10|x|} \frac{\omega(y)}{|y|^3}\ dy = 0}$, so ${A_1}$ vanishes. Since ${\int_{|x| < y \leq 10|x|} \frac{|x \cdot y| |y|}{|y|^5}\ dy = O( |x| )}$, we conclude that

$\displaystyle \omega(x) = -3\int_{|y| \geq |x|} (x \cdot y) \frac{\omega(y) \times y}{|y|^5}\ dy + O(|x|).$

Using (4), the right-hand side is

$\displaystyle -3\int_{|y| \geq |x|} (x_1 y_1 + x_2 y_2 + x_3 y_3) \frac{\omega_{r3}(r,y_3) (-y_1 y_3, -y_2 y_3, y_1^2+y_2^2)}{r|y|^5}\ dy$

$\displaystyle + O(|x|)$

where ${r := \sqrt{y_1^2+y_2^2}}$. Because of the odd nature of ${\omega_{r3}}$, only those terms with one factor of ${y_3}$ give a non-vanishing contribution to the integral. Using the rotation symmetry ${(y_1,y_2,y_3) \mapsto (-y_2,y_1,y_3)}$ we also see that any term with a factor of ${y_1 y_2}$ also vanishes. We can thus simplify the above expression as

$\displaystyle -3\int_{|y| \geq |x|} \frac{\omega_{r3}(r,y_3) (-x_1 y_1^2 y_3, -x_2 y_2^2 y_3, x_3 (y_1^2+y_2^2) y_3)}{r|y|^5}\ dy + O(|x|).$

Using the rotation symmetry ${(y_1,y_2,y_3) \mapsto (-y_2,y_1,y_3)}$ again, we see that the term ${y_1^2}$ in the first component can be replaced by ${y_2^2}$ or by ${\frac{1}{2} (y_1^2+y_2^2) = \frac{r^2}{2}}$, and similarly for the ${y_2^2}$ term in the second component. Thus the above expression is

$\displaystyle \frac{3}{2} \int_{|y| \geq |x|} \frac{\omega_{r3}(r,y_3) (x_1 , x_2, -2x_3) r y_3}{|y|^5}\ dy + O(|x|)$

giving the claim. $\Box$

Example 2 Consider the divergence-free vector field ${u := \nabla \times \psi}$, where the vector potential ${\psi}$ takes the form

$\displaystyle \psi(x_1,x_2,x_3) := (x_2 x_3, -x_1 x_3, 0) \eta(|x|)$

for some bump function ${\eta: {\bf R} \rightarrow {\bf R}}$ supported in ${(0,+\infty)}$. We can then calculate

$\displaystyle u(x_1,x_2,x_3) = X(x) \eta(|x|) + (x_1 x_3, x_2 x_3, -x_1^2-x_2^2) \frac{\eta'(|x|) x_3}{|x|}.$

and

$\displaystyle \omega(x_1,x_2,x_3) = (-6x_2 x_3, 6x_1 x_3, 0) \frac{\eta'(|x|)}{|x|} + (-x_2 x_3, x_1 x_3, 0) \eta''(|x|).$

In particular the hypotheses (i), (ii) are satisfied with

$\displaystyle \omega_{r3}(r,x_3) = - 6 \eta'(|x|) \frac{x_3 r}{|x|} - \eta''(|x|) x_3 r.$

One can then calculate

$\displaystyle L_{12}(\omega)(\rho) = -\frac{3}{4\pi} \int_{|y| \geq \rho} (6\frac{\eta'(|y|)}{|y|^6} + \frac{\eta''(|y|)}{|y|^5}) r^2 y_3^2\ dy$

$\displaystyle = -\frac{2}{5} \int_\rho^\infty 6\eta'(s) + s\eta''(s)\ ds$

$\displaystyle = 2\eta(\rho) + \frac{2}{5} \rho \eta'(\rho).$

If we take the specific choice

$\displaystyle \eta(\rho) = \varphi( \rho^\alpha )$

where ${\varphi}$ is a fixed bump function supported some interval ${[c,C] \subset (0,+\infty)}$ and ${\alpha>0}$ is a small parameter (so that ${\eta}$ is spread out over the range ${\rho \in [c^{1/\alpha},C^{1/\alpha}]}$), then we see that

$\displaystyle \| \omega \|_{L^\infty} = O( \alpha )$

(with implied constants allowed to depend on ${\varphi}$),

$\displaystyle L_{12}(\omega)(\rho) = 2\eta(\rho) + O(\alpha),$

and

$\displaystyle u = X(x) \eta(|x|) + O( \alpha |x| ),$

which is completely consistent with Proposition 1.

One can use this approximation to extract a plausible ansatz for a self-similar blowup to the Euler equations. We let ${\alpha>0}$ be a small parameter and let ${\omega_{rx_3}}$ be a time-dependent vorticity field obeying (i), (ii) of the form

$\displaystyle \omega_{rx_3}(t,r,x_3) \approx \alpha \Omega( t, R ) \mathrm{sgn}(x_3)$

where ${R := |x|^\alpha = (r^2+x_3^2)^{\alpha/2}}$ and ${\Omega: {\bf R} \times [0,+\infty) \rightarrow {\bf R}}$ is a smooth field to be chosen later. Admittedly the signum function ${\mathrm{sgn}}$ is not smooth at ${x_3}$, but let us ignore this issue for now (to rigorously make an ansatz one will have to smooth out this function a little bit; Elgindi uses the choice ${(|\sin \theta| \cos^2 \theta)^{\alpha/3} \mathrm{sgn}(x_3)}$, where ${\theta := \mathrm{arctan}(x_3/r)}$). With this ansatz one may compute

$\displaystyle {\mathcal L}_{12}(\omega(t))(\rho) \approx \frac{3\alpha}{2\pi} \int_{|y| \geq \rho; y_3 \geq 0} \Omega(t,R) \frac{r y_3}{|y|^5}\ dy$

$\displaystyle = \alpha \int_\rho^\infty \Omega(t, s^\alpha) \frac{ds}{s}$

$\displaystyle = \int_{\rho^\alpha}^\infty \Omega(t,s) \frac{ds}{s}.$

By Proposition 1, we thus expect to have the approximation

$\displaystyle u(t,x) \approx \frac{1}{2} \int_{|x|^\alpha}^\infty \Omega(t,s) \frac{ds}{s} X(x).$

We insert this into the vorticity equation (2). The transport term ${(u \cdot \nabla) \omega}$ will be expected to be negligible because ${R}$, and hence ${\omega_{rx_3}}$, is slowly varying (the discontinuity of ${\mathrm{sgn}(x_3)}$ will not be encountered because the vector field ${X}$ is parallel to this singularity). The modulating function ${\frac{1}{2} \int_{|x|^\alpha}^\infty \Omega(t,s) \frac{ds}{s}}$ is similarly slowly varying, so derivatives falling on this function should be lower order. Neglecting such terms, we arrive at the approximation

$\displaystyle (\omega \cdot \nabla) u \approx \frac{1}{2} \int_{|x|^\alpha}^\infty \Omega(t,s) \frac{ds}{s} \omega$

and so in the limit ${\alpha \rightarrow 0}$ we expect obtain a simple model equation for the evolution of the vorticity envelope ${\Omega}$:

$\displaystyle \partial_t \Omega(t,R) = \frac{1}{2} \int_R^\infty \Omega(t,S) \frac{dS}{S} \Omega(t,R).$

If we write ${L(t,R) := \int_R^\infty \Omega(t,S)\frac{dS}{S}}$ for the logarithmic primitive of ${\Omega}$, then we have ${\Omega = - R \partial_R L}$ and hence

$\displaystyle \partial_t (R \partial_R L) = \frac{1}{2} L (R \partial_R L)$

which integrates to the Ricatti equation

$\displaystyle \partial_t L = \frac{1}{4} L^2$

which can be explicitly solved as

$\displaystyle L(t,R) = \frac{2}{f(R) - t/2}$

where ${f(R)}$ is any function of ${R}$ that one pleases. (In Elgindi’s work a time dilation is used to remove the unsightly factor of ${1/2}$ appearing here in the denominator.) If for instance we set ${f(R) = 1+R}$, we obtain the self-similar solution

$\displaystyle L(t,R) = \frac{2}{1+R-t/2}$

and then on applying ${-R \partial_R}$

$\displaystyle \Omega(t,R) = \frac{2R}{(1+R-t/2)^2}.$

Thus, we expect to be able to construct a self-similar blowup to the Euler equations with a vorticity field approximately behaving like

$\displaystyle \omega(t,x) \approx \alpha \frac{2R}{(1+R-t/2)^2} \mathrm{sgn}(x_3) (\frac{x_2}{r}, -\frac{x_1}{r}, 0)$

and velocity field behaving like

$\displaystyle u(t,x) \approx \frac{1}{1+R-t/2} X(x).$

In particular, ${u}$ would be expected to be of regularity ${C^{1,\alpha}}$ (and smooth away from the origin), and blows up in (say) ${L^\infty}$ norm at time ${t/2 = 1}$, and one has the self-similarity

$\displaystyle u(t,x) = (1-t/2)^{\frac{1}{\alpha}-1} u( 0, \frac{x}{(1-t/2)^{1/\alpha}} )$

and

$\displaystyle \omega(t,x) = (1-t/2)^{-1} \omega( 0, \frac{x}{(1-t/2)^{1/\alpha}} ).$

A self-similar solution of this approximate shape is in fact constructed rigorously in Elgindi’s paper (using spherical coordinates instead of the Cartesian approach adopted here), using a nonlinear stability analysis of the above ansatz. It seems plausible that one could also carry out this stability analysis using this Cartesian coordinate approach, although I have not tried to do this in detail.