In Notes 5, we saw that the Gowers uniformity norms on vector spaces ${{\bf F}^n}$ in high characteristic were controlled by classical polynomial phases ${e(\phi)}$.

Now we study the analogous situation on cyclic groups ${{\bf Z}/N{\bf Z}}$. Here, there is an unexpected surprise: the polynomial phases (classical or otherwise) are no longer sufficient to control the Gowers norms ${U^{s+1}({\bf Z}/N{\bf Z})}$ once ${s}$ exceeds ${1}$. To resolve this problem, one must enlarge the space of polynomials to a larger class. It turns out that there are at least three closely related options for this class: the local polynomials, the bracket polynomials, and the nilsequences. Each of the three classes has its own strengths and weaknesses, but in my opinion the nilsequences seem to be the most natural class, due to the rich algebraic and dynamical structure coming from the nilpotent Lie group undergirding such sequences. For reasons of space we shall focus primarily on the nilsequence viewpoint here.

Traditionally, nilsequences have been defined in terms of linear orbits ${n \mapsto g^n x}$ on nilmanifolds ${G/\Gamma}$; however, in recent years it has been realised that it is convenient for technical reasons (particularly for the quantitative “single-scale” theory) to generalise this setup to that of polynomial orbits ${n \mapsto g(n) \Gamma}$, and this is the perspective we will take here.

A polynomial phase ${n \mapsto e(\phi(n))}$ on a finite abelian group ${H}$ is formed by starting with a polynomial ${\phi: H \rightarrow {\bf R}/{\bf Z}}$ to the unit circle, and then composing it with the exponential function ${e: {\bf R}/{\bf Z} \rightarrow {\bf C}}$. To create a nilsequence ${n \mapsto F(g(n) \Gamma)}$, we generalise this construction by starting with a polynomial ${g \Gamma: H \rightarrow G/\Gamma}$ into a nilmanifold ${G/\Gamma}$, and then composing this with a Lipschitz function ${F: G/\Gamma \rightarrow {\bf C}}$. (The Lipschitz regularity class is convenient for minor technical reasons, but one could also use other regularity classes here if desired.) These classes of sequences certainly include the polynomial phases, but are somewhat more general; for instance, they almost include bracket polynomial phases such as ${n \mapsto e( \lfloor \alpha n \rfloor \beta n )}$. (The “almost” here is because the relevant functions ${F: G/\Gamma \rightarrow {\bf C}}$ involved are only piecewise Lipschitz rather than Lipschitz, but this is primarily a technical issue and one should view bracket polynomial phases as “morally” being nilsequences.)

In these notes we set out the basic theory for these nilsequences, including their equidistribution theory (which generalises the equidistribution theory of polynomial flows on tori from Notes 1), and show that they are indeed obstructions to the Gowers norm being small. This leads to the inverse conjecture for the Gowers norms, which asserts that the Gowers norms on cyclic groups are indeed controlled by these sequences.

— 1. General theory of polynomial maps —

In previous notes, we defined the notion of a (non-classical) polynomial map ${\phi}$ of degree at most ${d}$ between two additive groups ${H, G}$, to be a map ${\phi: H \rightarrow G}$ obeying the identity

$\displaystyle \partial_{h_1} \ldots \partial_{h_{d+1}} \phi(x) = 0$

for all ${x,h_1,\ldots,h_{d+1} \in H}$, where ${\partial_h \phi(x) := \phi(x+h)-\phi(x)}$ is the additive discrete derivative operator.
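As a quick sanity check of this definition, the following Python sketch tests the derivative identity by brute force on ${{\bf Z}/N{\bf Z}}$. The helper names `derivative` and `has_degree_at_most`, the modulus ${N=7}$ and the test map ${\phi(x) = x^2}$ are illustrative assumptions, not notation from these notes.

```python
from itertools import product

def derivative(phi, h, N):
    """Additive discrete derivative on Z/NZ: x -> phi(x+h) - phi(x) (mod N)."""
    return lambda x: (phi((x + h) % N) - phi(x)) % N

def has_degree_at_most(phi, d, N):
    """Brute-force check that every (d+1)-fold derivative of phi vanishes identically."""
    for hs in product(range(N), repeat=d + 1):
        psi = phi
        for h in hs:
            psi = derivative(psi, h, N)
        if any(psi(x) != 0 for x in range(N)):
            return False
    return True

N = 7
square = lambda x: (x * x) % N

is_degree_le_1 = has_degree_at_most(square, 1, N)  # fails: second derivative is 2*h1*h2
is_degree_le_2 = has_degree_at_most(square, 2, N)  # holds: third derivative vanishes
print(is_degree_le_1, is_degree_le_2)
```

The search is exponential in ${d}$, so this is only meant for very small ${N}$ and ${d}$.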

There is another way to view this concept. For any ${k,d \geq 0}$, define the Host-Kra group ${HK^{k}(H, \leq d)}$ of ${H}$ of dimension ${k}$ and degree ${d}$ to be the subgroup of ${H^{\{0,1\}^{k}}}$ consisting of all tuples ${(x_\omega)_{\omega \in \{0,1\}^k}}$ obeying the constraints

$\displaystyle \sum_{\omega \in F} (-1)^{|\omega|} x_\omega = 0$

for all faces ${F}$ of the unit cube ${\{0,1\}^k}$ of dimension at least ${d+1}$, where ${|(\omega_1,\ldots,\omega_k)| := \omega_1+\ldots+\omega_k}$. (These constraints are of course trivial if ${k \leq d}$.) An ${r}$-dimensional face of the unit cube ${\{0,1\}^k}$ is formed by freezing ${k-r}$ of the coordinates to a fixed value in ${\{0,1\}}$, and letting the remaining ${r}$ coordinates vary freely in ${\{0,1\}}$.

Thus for instance ${HK^2(H, \leq 1)}$ is (essentially) the space of parallelograms ${(x,x+h,x+k,x+h+k)}$ in ${H^4}$, while ${HK^2(H,\leq 0)}$ is the diagonal group ${\{ (x,x,x,x): x \in H \}}$, and ${HK^2(H,\leq 2)}$ is all of ${H^4}$.
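These three cases can be confirmed by enumeration when ${H}$ is a small cyclic group. The sketch below is only an illustration: the function names `faces` and `host_kra_additive` and the choice ${H = {\bf Z}/3{\bf Z}}$ are assumptions made for the example. It lists all tuples obeying the alternating-sum constraints and compares the ${d=1}$ case with the set of parallelograms.

```python
from itertools import product

def faces(k):
    """Yield (dimension, vertices) for every face of {0,1}^k.

    Each coordinate is frozen to 0, frozen to 1, or left free (None)."""
    for pattern in product((0, 1, None), repeat=k):
        free = [i for i, p in enumerate(pattern) if p is None]
        verts = []
        for bits in product((0, 1), repeat=len(free)):
            v = list(pattern)
            for i, b in zip(free, bits):
                v[i] = b
            verts.append(tuple(v))
        yield len(free), verts

def host_kra_additive(N, k, d):
    """Tuples in (Z/NZ)^{{0,1}^k} whose alternating sums vanish on all faces of dimension >= d+1."""
    cube = list(product((0, 1), repeat=k))
    constraints = [verts for dim, verts in faces(k) if dim >= d + 1]
    found = []
    for values in product(range(N), repeat=len(cube)):
        x = dict(zip(cube, values))
        if all(sum((-1) ** sum(w) * x[w] for w in verts) % N == 0
               for verts in constraints):
            found.append(values)
    return found

N = 3
sizes = {d: len(host_kra_additive(N, 2, d)) for d in (0, 1, 2)}
parallelograms = {(x, (x + h) % N, (x + k) % N, (x + h + k) % N)
                  for x, h, k in product(range(N), repeat=3)}
print(sizes)
```

The counts ${N}$, ${N^3}$ and ${N^4}$ correspond to the diagonal, the parallelograms, and all of ${H^4}$ respectively.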

Exercise 1 Let ${\phi: H \rightarrow G}$ be a map between additive groups, and let ${k > d \geq 0}$. Show that ${\phi}$ is a (non-classical) polynomial of degree at most ${d}$ if it maps ${HK^k(H,\leq 1)}$ to ${HK^k(G,\leq d)}$, i.e. that ${(\phi(x_\omega))_{\omega \in \{0,1\}^k} \in HK^{k}(G,\leq d)}$ whenever ${(x_\omega)_{\omega \in \{0,1\}^{k}} \in HK^{k}(H,\leq 1)}$.

It turns out (somewhat remarkably) that these notions can be satisfactorily generalised to the non-abelian setting; this was first observed by Leibman (in these papers, and also later by personal communication, in which the role of the Host-Kra group was emphasised). The (now multiplicative) groups ${H,G}$ need to be equipped with an additional structure, namely that of a filtration.

Definition 1 (Filtration) A filtration on a multiplicative group ${G}$ is a family ${(G_{\geq i})_{i=0}^\infty}$ of subgroups of ${G}$ obeying the nesting property

$\displaystyle G \geq G_{\geq 0} \geq G_{\geq 1} \geq \ldots$

and the filtration property

$\displaystyle [G_{\geq i}, G_{\geq j}] \subset G_{\geq i+j}$

for all ${i,j \geq 0}$, where ${[H,K]}$ is the group generated by ${\{ [h,k]: h \in H, k \in K \}}$, where ${[h,k] := hkh^{-1}k^{-1}}$ is the commutator of ${h}$ and ${k}$. We will refer to the pair ${G_\bullet = (G,(G_{\geq i})_{i=0}^\infty)}$ as a filtered group. We say that an element ${g}$ of ${G}$ has degree ${\geq i}$ if it belongs to ${G_{\geq i}}$, thus for instance a degree ${\geq i}$ and degree ${\geq j}$ element will commute modulo ${\geq i+j}$ errors.

In practice we usually have ${G_{\geq 0} = G}$. As such, we see that ${[G, G_{\geq j}] \subset G_{\geq j}}$ for all ${j}$, and so all the ${G_{\geq j}}$ are normal subgroups of ${G}$.

Exercise 2 Define the lower central series

$\displaystyle G = G_0 = G_1 \geq G_2 \geq \ldots$

of a group ${G}$ by setting ${G_0, G_1 := G}$ and ${G_{i+1} := [G,G_i]}$ for ${i \geq 1}$. Show that the lower central series ${(G_j)_{j =0}^\infty}$ is a filtration of ${G}$. Furthermore, show that the lower central series is the minimal filtration that starts at ${G}$, in the sense that if ${(G'_{\geq j})_{j=0}^\infty}$ is any other filtration with ${G'_{\geq 0}=G'_{\geq 1}=G}$, then ${G'_{\geq j} \supset G_j}$ for all ${j}$.

Example 1 If ${G}$ is an abelian group, and ${d \geq 0}$, we define the degree ${d}$ filtration ${(G,\leq d)}$ on ${G}$ by setting ${G_{\geq i} := G}$ if ${i \leq d}$ and ${G_{\geq i} = \{\hbox{id}\}}$ for ${i>d}$.

Example 2 If ${G_\bullet=(G,(G_{\geq i})_{i=0}^\infty)}$ is a filtered group, and ${k \geq 0}$, we define the shifted filtered group ${G_\bullet^{+k} := (G,(G_{\geq i+k})_{i=0}^\infty)}$; this is clearly again a filtered group.

Definition 2 (Host-Kra groups) Let ${G_\bullet=(G,(G_{\geq i})_{i=0}^\infty)}$ be a filtered group, and let ${k \geq 0}$ be an integer. The Host-Kra group ${HK^k(G_\bullet)}$ is the subgroup of ${G^{\{0,1\}^k}}$ generated by the elements ${g_F}$ with ${F}$ an arbitrary face in ${\{0,1\}^k}$ and ${g}$ an element of ${G_{\geq k - \hbox{dim}(F)}}$, where ${g_F}$ is the element of ${G^{\{0,1\}^k}}$ whose coordinate at ${\omega}$ is equal to ${g}$ when ${\omega \in F}$ and equal to ${\hbox{id}}$ otherwise.

From construction we see that the Host-Kra group is symmetric with respect to the symmetry group ${S_k \ltimes ({\bf Z}/2{\bf Z})^k}$ of the unit cube ${\{0,1\}^k}$. We will use these symmetries implicitly in the sequel without further comment.

Example 3 Let us parameterise an element of ${G^{\{0,1\}^2}}$ as ${(g_{00}, g_{01}, g_{10}, g_{11})}$. Then ${HK^2(G_\bullet)}$ is generated by elements of the form ${(g_0,g_0,g_0,g_0)}$, ${(\hbox{id},\hbox{id},g_1,g_1)}$, ${(\hbox{id},g_1,\hbox{id},g_1)}$, and ${(\hbox{id},\hbox{id},\hbox{id},g_2)}$ for ${g_0 \in G_{\geq 0}, g_1 \in G_{\geq 1}, g_2 \in G_{\geq 2}}$. (This does not cover all the possible faces of ${\{0,1\}^2}$, but it is easy to see that the remaining faces are redundant.) In other words, ${HK^2(G_\bullet)}$ consists of all group elements of the form ${(g_0,g_0 g_1, g_0 g'_1, g_0 g_1 g'_1 g_2)}$, where ${g_0 \in G_{\geq 0}}$, ${g_1,g'_1 \in G_{\geq 1}}$, and ${g_2 \in G_{\geq 2}}$. This example is generalised in the exercise below.

Exercise 3 Define a lower face to be a face of a discrete cube ${\{0,1\}^k}$ in which all the frozen coordinates are equal to ${0}$. Let us order the lower faces as ${F_1,\ldots,F_{2^k}}$ in such a way that ${i \geq j}$ whenever ${F_i}$ is a subface of ${F_j}$. Let ${G_\bullet}$ be a filtered group. Show that every element of ${HK^k(G_\bullet)}$ has a unique representation of the form ${\prod_{i=1}^{2^k} (g_i)_{F_i}}$, where ${g_i \in G_{\geq k - \hbox{dim}(F_i)}}$ and the product is taken from left to right (say).

Exercise 4 If ${G}$ is an abelian group, show that the group ${HK^k(G, \leq d)}$ defined in Definition 2 agrees with the group defined at the beginning of this section for additive groups (after transcribing the former to multiplicative notation).

Exercise 5 Let ${G_\bullet}$ be a filtered group. Let ${F}$ be an ${r}$-dimensional face of ${\{0,1\}^k}$. Identifying ${F}$ with ${\{0,1\}^r}$ in an obvious manner, we then obtain a restriction homomorphism from ${G^{\{0,1\}^k}}$ to ${G^F \equiv G^{\{0,1\}^r}}$. Show that the restriction of any element of ${HK^k(G_\bullet)}$ to ${G^{\{0,1\}^r}}$ then lies in ${HK^r(G_\bullet)}$.

Exercise 6 Let ${G_\bullet}$ be a filtered group, let ${k \geq 0}$ and ${l \geq 1}$ be integers, and let ${g = (g_\omega)_{\omega \in \{0,1\}^k}}$ and ${h = (h_\omega)_{\omega \in \{0,1\}^k}}$ be elements of ${G^{\{0,1\}^k}}$. Let ${f = (f_\omega)_{\omega \in \{0,1\}^{k+l}}}$ be the element of ${G^{\{0,1\}^{k+l}}}$ defined by setting ${f_{\omega_k,\omega_l}}$ for ${\omega_k \in \{0,1\}^k, \omega_l \in \{0,1\}^l}$ to equal ${g_{\omega_k}}$ for ${\omega_l \neq (1,\ldots,1)}$, and equal to ${g_{\omega_k} h_{\omega_k}}$ otherwise. Show that ${f \in HK^{k+l}(G_\bullet)}$ if and only if ${g \in HK^k(G_\bullet)}$ and ${h \in HK^k(G_\bullet^{+l})}$, where ${G_\bullet^{+l}}$ is defined in Example 2. (Hint: use Exercises 3, 5.)

Exercise 7 Let ${G_\bullet}$ be a filtered group, let ${k \geq 1}$, and let ${g = (g_\omega)_{\omega \in \{0,1\}^k}}$ be an element of ${G^{\{0,1\}^k}}$. We define the derivative ${\partial_1 g \in G^{\{0,1\}^{k-1}}}$ in the first variable to be the tuple ${(g_{\omega,1} g_{\omega,0}^{-1})_{\omega \in \{0,1\}^{k-1}}}$. Show that ${g \in HK^k(G_\bullet)}$ if and only if the restriction of ${g}$ to ${\{0,1\}^{k-1}}$ lies in ${HK^{k-1}(G_\bullet)}$ and ${\partial_1 g}$ lies in ${HK^{k-1}(G_\bullet^{+1})}$, where ${G_\bullet^{+1}}$ is defined in Example 2.

Remark 1 The Host-Kra groups of a filtered group in fact form a cubic complex, a concept used in topology; but we will not pursue this connection here.

In analogy with Exercise 1, we can now define the general notion of a polynomial map:

Definition 3 A map ${\phi: H \rightarrow G}$ between two filtered groups ${H_\bullet, G_\bullet}$ is said to be polynomial if it maps ${HK^k(H_\bullet)}$ to ${HK^k(G_\bullet)}$ for each ${k \geq 0}$. The space of all such maps is denoted ${\hbox{Poly}(H_\bullet \rightarrow G_\bullet)}$.

Since ${HK^k(H_\bullet), HK^k(G_\bullet)}$ are groups, we immediately obtain

Theorem 4 (Lazard-Leibman theorem) ${\hbox{Poly}(H_\bullet \rightarrow G_\bullet)}$ forms a group under pointwise multiplication.

(From our choice of definitions, this theorem is a triviality, but the theorem is less trivial when using an alternate but non-trivially equivalent definition of a polynomial, which we will give shortly.) In a similar spirit, we have

Theorem 5 (Filtered groups and polynomial maps form a category) If ${\phi: H \rightarrow G}$ and ${\psi: G \rightarrow K}$ are polynomial maps between filtered groups ${H_\bullet, G_\bullet, K_\bullet}$, then ${\psi \circ \phi: H \rightarrow K}$ is also a polynomial map.

We can also give some basic examples of polynomial maps. Any constant map from ${H}$ to ${G}$ taking values in ${G_{\geq 0}}$ is polynomial, as is any map ${\phi: H \rightarrow G}$ which is a filtered homomorphism in the sense that it is a homomorphism from ${H_{\geq i}}$ to ${G_{\geq i}}$ for any ${i \geq 0}$.

Now we turn to an alternate definition of a polynomial map. For any ${h \in H}$ and any map ${\phi: H \rightarrow G}$, define the multiplicative derivative ${\Delta_h \phi: H \rightarrow G}$ by the formula ${\Delta_h \phi(x) := \phi(hx) \phi(x)^{-1}}$.

Theorem 6 (Alternate description of polynomials) Let ${\phi: H \rightarrow G}$ be a map between two filtered groups ${H_\bullet, G_\bullet}$. Then ${\phi}$ is polynomial if and only if, for any ${i_1,\ldots,i_m \geq 0}$, ${x \in H_{\geq 0}}$, and ${h_j \in H_{\geq i_j}}$ for ${j=1,\ldots,m}$, one has ${\Delta_{h_1} \ldots \Delta_{h_m} \phi(x) \in G_{\geq i_1+\ldots+i_m}}$.

In particular, from Exercise 1, we see that a non-classical polynomial of degree ${d}$ from one additive group ${H}$ to another ${G}$ is the same thing as a polynomial map from ${(H,\leq 1)}$ to ${(G,\leq d)}$. More generally, a map ${\phi}$ from ${(H,\leq 1)}$ to a filtered group ${G_\bullet}$ is polynomial if and only if

$\displaystyle \Delta_{h_1} \ldots \Delta_{h_i} \phi(x) \in G_{\geq i}$

for all ${i \geq 0}$ and ${x,h_1,\ldots,h_i \in H}$.

Proof: We first prove the “only if” direction. It is clear (by using ${0}$-dimensional cubes) that a polynomial map must map ${H_{\geq 0}}$ to ${G_{\geq 0}}$. To obtain the remaining cases, it suffices by induction on ${m}$ to show that if ${\phi}$ is polynomial from ${H_\bullet}$ to ${G_\bullet}$, and ${h \in H_{\geq i}}$ for some ${i \geq 0}$, then ${\Delta_h \phi}$ is polynomial from ${H_\bullet}$ to ${G_\bullet^{+i}}$. But this is easily seen from Exercise 7.

Now we establish the “if” direction. We need to show that ${\phi}$ maps ${HK^k(H_\bullet)}$ to ${HK^k(G_\bullet)}$ for each ${k}$. We establish this by induction on ${k}$. The case ${k=0}$ is trivial, so suppose that ${k \geq 1}$ and that the claim has already been established for all smaller values of ${k}$.

Let ${h \in HK^k(H_\bullet)}$. We split ${H^{\{0,1\}^k}}$ as ${H^{\{0,1\}^{k-1}} \times H^{\{0,1\}^{k-1}}}$. From Exercise 7 we see that we can write ${h = (h_0, h_1 h_0)}$ where ${h_0 \in HK^{k-1}(H_\bullet)}$ and ${h_1 \in HK^{k-1}(H_\bullet^{+1})}$, thus ${\phi(h) = (\phi(h_0), \phi(h_1 h_0))}$ (extending ${\phi}$ to act on ${H^{\{0,1\}^{k-1}}}$ or ${H^{\{0,1\}^k}}$ in the obvious manner). By induction hypothesis, ${\phi(h_0) \in HK^{k-1}(G_\bullet)}$, so by Exercise 7, it suffices to show that ${\phi(h_1 h_0) \phi^{-1}(h_0) \in HK^{k-1}(G_\bullet^{+1})}$.

By telescoping series, it suffices to establish this when ${h_1 = h_F}$ for some face ${F}$ of some dimension ${r}$ in ${\{0,1\}^{k-1}}$ and some ${h \in H_{\geq k-r}}$, as these elements generate ${HK^{k-1}(H_\bullet^{+1})}$. But then ${\phi(h_1 h_0) \phi^{-1}(h_0)}$ vanishes outside of ${F}$ and is equal to ${\Delta_{h_1} \phi(h_0)}$ on ${F}$, so by Exercise 6 it will suffice to show that ${\Delta_{h_1} \phi(h'_0) \in HK^{r}(G_\bullet^{+k-r})}$, where ${h'_0}$ is ${h_0}$ restricted to ${F}$ (which one then identifies with ${\{0,1\}^r}$). But by the induction hypothesis, ${\Delta_{h_1} \phi}$ maps ${HK^r(H_\bullet)}$ to ${HK^r(G_\bullet^{+k-r})}$, and the claim then follows from Exercise 5. $\Box$

Exercise 8 Let ${i_1,\ldots,i_k \geq 0}$ be integers. If ${G_\bullet}$ is a filtered group, define ${HK^{(i_1,\ldots,i_k)}(G_\bullet)}$ to be the subgroup of ${G^{\{0,1\}^k}}$ generated by the elements ${g_F}$, where ${F}$ ranges over all faces of ${\{0,1\}^k}$ and ${g \in G_{\geq i_{j_1}+\ldots+i_{j_r}}}$, where ${1 \leq j_1 < \ldots < j_r \leq k}$ are the coordinates of ${F}$ that are frozen. This generalises the Host-Kra groups ${HK^k(G_\bullet)}$, which correspond to the case ${i_1=\ldots=i_k=1}$. Show that if ${\phi}$ is a polynomial map from ${H_\bullet}$ to ${G_\bullet}$, then ${\phi}$ maps ${HK^{(i_1,\ldots,i_k)}(H_\bullet)}$ to ${HK^{(i_1,\ldots,i_k)}(G_\bullet)}$.

Exercise 9 Suppose that ${\phi: H \rightarrow G}$ is a non-classical polynomial of degree ${\leq d}$ from one additive group to another. Show that ${\phi}$ is a polynomial map from ${(H,\leq m)}$ to ${(G,\leq dm)}$ for every ${m \geq 1}$. Conclude in particular that the composition of a non-classical polynomial of degree ${\leq d}$ and a non-classical polynomial of degree ${\leq d'}$ is a non-classical polynomial of degree ${\leq dd'}$.

Exercise 10 Let ${\phi_1: H \rightarrow G_1}$, ${\phi_2: H \rightarrow G_2}$ be non-classical polynomials of degrees ${\leq d_1}$, ${\leq d_2}$ respectively between additive groups ${H, G_1, G_2}$, and let ${B: G_1 \times G_2 \rightarrow G}$ be a bihomomorphism to another additive group (i.e. ${B}$ is a homomorphism in each variable separately). Show that ${B(\phi_1,\phi_2): H \rightarrow G}$ is a non-classical polynomial of degree ${\leq d_1+d_2}$.

— 2. Nilsequences —

We now specialise the above theory of polynomial maps ${\phi: H \rightarrow G}$ to the case when ${H}$ is just the integers ${{\bf Z} = ({\bf Z},\leq 1)}$ (viewed additively) and ${G}$ is a nilpotent group. Recall that a group ${G}$ is nilpotent of step at most ${s}$ if the ${(s+1)^{th}}$ group ${G_{s+1}}$ in the lower central series vanishes; thus for instance a group is nilpotent of step at most ${1}$ if and only if it is abelian. Analogously, let us call a filtered group ${G_\bullet}$ nilpotent of degree at most ${s}$ if ${G}$ is nilpotent and ${G_{\geq s+1}}$ vanishes. Note that if ${G_{\geq 1} = G}$ and ${G_\bullet}$ is nilpotent of degree at most ${s}$, then ${G}$ is nilpotent of step at most ${s}$. On the other hand, the degree of a filtered group can exceed the step; for instance, given an additive group ${G}$ and an integer ${d \geq 1}$, ${(G,\leq d)}$ has degree ${d}$ but step ${1}$. The step is the traditional measure of nilpotency for groups, but the degree seems to be a more suitable measure in the filtered group category. One is primarily interested in the case when ${G_{\geq 0} = G_{\geq 1} = G}$, but for technical reasons it is occasionally convenient to allow ${G_{\geq 1}}$ to be strictly less than ${G}$, although this does not add much generality (see Exercise 18 below).

We refer to sequences ${g: {\bf Z} \rightarrow G}$ which are polynomial maps from ${({\bf Z},\leq 1)}$ to ${G_\bullet}$ as polynomial sequences or Hall-Petresco sequences adapted to ${G_\bullet}$. The space of all such sequences is denoted ${\hbox{Poly}({\bf Z} \rightarrow G)}$; by the machinery of the previous section, this is a multiplicative group. These sequences can be described explicitly:

Exercise 11 Let ${s \geq 0}$ be an integer, and let ${G_\bullet}$ be a filtered group which is nilpotent of degree ${s}$. Show that a sequence ${g: {\bf Z} \rightarrow G}$ is a Hall-Petresco sequence if and only if one has

$\displaystyle g(n) = g_0 g_1^{\binom{n}{1}} g_2^{\binom{n}{2}} \ldots g_s^{\binom{n}{s}} \ \ \ \ \ (1)$

for all ${n \in {\bf Z}}$ and some ${g_i \in G_{\geq i}}$ for ${i=0,\ldots,s}$, where ${\binom{n}{i} := \frac{n(n-1)\ldots(n-i+1)}{i!}}$. Furthermore, show that the ${g_i}$ are unique. We refer to the ${g_0,\ldots,g_s}$ as the Taylor coefficients of ${g}$ at the origin.

Exercise 12 In a degree ${2}$ nilpotent group ${G}$, establish the formula

$\displaystyle g^n h^n = (gh)^n [g,h]^{\binom{n}{2}}$

for all ${g,h \in G}$ and ${n \in {\bf Z}}$. This is the first non-trivial case of the Hall-Petresco formula, a discrete analogue of the Baker-Campbell-Hausdorff formula that expresses the polynomial sequence ${n \mapsto g^n h^n}$ explicitly in the form (1).
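A formula of this type is easy to test numerically in the integer Heisenberg group. The sketch below encodes a unipotent matrix with entries ${x,y,z}$ (top row ${1,x,y}$, middle row ${0,1,z}$) as a triple; the helper names `mul`, `inv`, `power`, `commutator` are assumptions of this example. It checks the identity ${g^n h^n = (gh)^n [g,h]^{\binom{n}{2}}}$ with the commutator convention ${[g,h] = ghg^{-1}h^{-1}}$ of Definition 1, over a small range of ${g,h,n}$ only.

```python
from itertools import product

def mul(p, q):
    """Product of two unipotent 3x3 matrices encoded as (x, y, z)."""
    (x, y, z), (a, b, c) = p, q
    return (x + a, y + b + x * c, z + c)

def inv(p):
    """Inverse of the matrix encoded by p."""
    x, y, z = p
    return (-x, x * z - y, -z)

def power(p, n):
    """n-th power for any integer n, by repeated multiplication."""
    base = p if n >= 0 else inv(p)
    result = (0, 0, 0)
    for _ in range(abs(n)):
        result = mul(result, base)
    return result

def commutator(g, h):
    """[g,h] = g h g^{-1} h^{-1}."""
    return mul(mul(g, h), mul(inv(g), inv(h)))

elements = list(product(range(-1, 2), repeat=3))
identity_holds = all(
    mul(power(g, n), power(h, n))
    == mul(power(mul(g, h), n), power(commutator(g, h), n * (n - 1) // 2))
    for g in elements for h in elements for n in range(-4, 5)
)
print(identity_holds)
```

Here `n * (n - 1) // 2` is the binomial coefficient ${\binom{n}{2}}$, which is a non-negative integer for every integer ${n}$.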

Define a nilpotent filtered Lie group of degree ${\leq s}$ to be a nilpotent filtered group of degree ${\leq s}$, in which ${G = G_{\geq 0}}$ and all of the ${G_{\geq i}}$ are connected, simply connected finite-dimensional Lie groups. A model example here is the Heisenberg group, which is the degree ${2}$ nilpotent filtered Lie group

$\displaystyle G = G_{\geq 0} = G_{\geq 1} := \begin{pmatrix} 1 & {\bf R} & {\bf R} \\ 0 & 1 & {\bf R} \\ 0 & 0 & 1 \end{pmatrix}$

(i.e. the group of upper-triangular unipotent matrices with arbitrary real entries in the upper triangular positions) with

$\displaystyle G_{\geq 2} := \begin{pmatrix} 1 & 0 & {\bf R} \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$

and ${G_{\geq i}}$ trivial for ${i>2}$ (so in this case, ${G_{\geq i}}$ is also the lower central series).

Exercise 13 Show that a sequence

$\displaystyle g(n) = \begin{pmatrix} 1 & x(n) & y(n) \\ 0 & 1 & z(n) \\ 0 & 0 & 1 \end{pmatrix}$

from ${{\bf Z}}$ to the Heisenberg group ${G}$ is a polynomial sequence if and only if ${x, z}$ are linear polynomials and ${y}$ is a quadratic polynomial.
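One direction of this exercise can be explored numerically via the derivative criterion of Theorem 6: for a sequence into the Heisenberg group, second derivatives should land in ${G_{\geq 2}}$ (only the top-right entry non-zero) and third derivatives should be trivial. The sketch below tests this on a finite window only, so it is a necessary check rather than a proof; the names `delta` and `passes_on_window` and the two sample sequences are assumptions of the example. Matrices are encoded as triples ${(x,y,z)}$ with ${y}$ the top-right entry.

```python
from itertools import product

def mul(p, q):
    (x, y, z), (a, b, c) = p, q
    return (x + a, y + b + x * c, z + c)

def inv(p):
    x, y, z = p
    return (-x, x * z - y, -z)

def delta(h, g):
    """Multiplicative derivative of g: Z -> G, n -> g(n+h) g(n)^{-1}."""
    return lambda n: mul(g(n + h), inv(g(n)))

def passes_on_window(g, window=range(-3, 4)):
    """Check the degree-2 and degree-3 derivative conditions for n, h_i in the window."""
    for n, h1, h2, h3 in product(window, repeat=4):
        second = delta(h1, delta(h2, g))(n)
        third = delta(h1, delta(h2, delta(h3, g)))(n)
        if (second[0], second[2]) != (0, 0) or third != (0, 0, 0):
            return False
    return True

def good(n):
    """x, z linear and y quadratic."""
    return (2 * n + 1, n * n - n + 5, -3 * n + 2)

def bad(n):
    """x quadratic: the second derivative has x-entry 2*h1*h2, outside G_{>=2}."""
    return (n * n, 0, n)

print(passes_on_window(good), passes_on_window(bad))
```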

It is a standard fact in the theory of Lie groups that a connected, simply connected nilpotent Lie group ${G}$ is topologically equivalent to its Lie algebra ${{\mathfrak g}}$, with the homeomorphism given by the exponential map ${\exp: {\mathfrak g} \rightarrow G}$ (or its inverse, the logarithm function ${\log: G \rightarrow {\mathfrak g}}$). Indeed, the Baker-Campbell-Hausdorff formula lets one use the nilpotent Lie algebra ${{\mathfrak g}}$ to build a connected, simply connected Lie group with that Lie algebra, which is then necessarily isomorphic to ${G}$. One can thus classify filtered nilpotent Lie groups in terms of filtered nilpotent Lie algebras, i.e. nilpotent Lie algebras ${{\mathfrak g} = {\mathfrak g}_{\geq 0}}$ together with a nested family of sub-Lie algebras

$\displaystyle {\mathfrak g}_{\geq 0} \geq {\mathfrak g}_{\geq 1} \geq \ldots \geq {\mathfrak g}_{\geq s+1} = \{0\}$

with the inclusions ${[{\mathfrak g}_{\geq i}, {\mathfrak g}_{\geq j}] \subset {\mathfrak g}_{\geq i+j}}$ (in which the bracket is now the Lie bracket rather than the commutator). One can describe such filtered nilpotent Lie algebras even more precisely using Mal’cev bases; see these papers of Mal’cev and of Leibman. For instance, in the case of the Heisenberg group, one has

$\displaystyle {\mathfrak g} = {\mathfrak g}_{\geq 0} = {\mathfrak g}_{\geq 1} := \begin{pmatrix} 0 & {\bf R} & {\bf R} \\ 0 & 0 & {\bf R} \\ 0 & 0 & 0 \end{pmatrix}$

and

$\displaystyle {\mathfrak g}_{\geq 2} := \begin{pmatrix} 0 & 0 & {\bf R} \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}.$

From the filtration property, we see that for ${i \geq 0}$, each ${G_{\geq i+1}}$ is a normal closed subgroup of ${G_{\geq i}}$, and for ${i \geq 1}$, the quotient group ${G_{\geq i}/G_{\geq i+1}}$ is a connected, simply connected abelian Lie group (with Lie algebra ${{\mathfrak g}_{\geq i}/{\mathfrak g}_{\geq i+1}}$), and is thus isomorphic to a vector space (with the additive group law). Related to this, one can view ${G = G_{\geq 0}}$ as a group extension of the quotient group ${G/G_{\geq s}}$ (with the degree ${s-1}$ filtration ${(G_{\geq i}/G_{\geq s})}$) by the central vector space ${G_{\geq s}}$. Thus one can view degree ${s}$ filtered nilpotent groups as an ${s}$-fold iterated tower of central extensions by finite-dimensional vector spaces starting from the base space ${G/G_{\geq 1}}$ (which is a point in the most important case ${G_{\geq 1}=G}$); for instance, the Heisenberg group is an extension of ${{\bf R}^2}$ by ${{\bf R}}$.

We thus see that nilpotent filtered Lie groups are generalisations of vector spaces (which correspond to the degree ${1}$ case). We now turn to filtered nilmanifolds, which are generalisations of tori. A degree ${s}$ filtered nilmanifold ${G/\Gamma = (G/\Gamma, G_\bullet, \Gamma)}$ is a filtered degree ${s}$ nilpotent Lie group ${G_\bullet}$, together with a discrete subgroup ${\Gamma}$ of ${G}$, such that all the subgroups ${G_{\geq i}}$ in the filtration are rational relative to ${\Gamma}$, which means that the subgroup ${\Gamma_{\geq i} := \Gamma \cap G_{\geq i}}$ is a cocompact subgroup of ${G_{\geq i}}$ (i.e. the quotient space ${G_{\geq i}/\Gamma_{\geq i}}$ is compact, or equivalently one can write ${G_{\geq i} = \Gamma_{\geq i} \cdot K_{\geq i}}$ for some compact subset ${K_{\geq i}}$ of ${G_{\geq i}}$). Note that the subgroups ${\Gamma_{\geq i}}$ give ${\Gamma}$ the structure of a degree ${s}$ filtered nilpotent group ${\Gamma_\bullet}$.

Exercise 14 Let ${G:={\bf R}^2}$ and ${\Gamma := {\bf Z}^2}$, and let ${\alpha \in {\bf R}}$. Show that the subgroup ${\{ (x,\alpha x): x \in {\bf R} \}}$ of ${G}$ is rational relative to ${\Gamma}$ if and only if ${\alpha}$ is a rational number; this may help explain the terminology “rational”.

By hypothesis, the quotient space ${G/\Gamma = G_{\geq 0}/\Gamma_{\geq 0}}$ is a smooth compact manifold. The space ${G_{\geq s}/\Gamma_{\geq s}}$ is a compact connected abelian Lie group, and is thus a torus; the degree ${s}$ filtered nilmanifold ${G/\Gamma}$ can then be viewed as a principal torus bundle over the degree ${s-1}$ filtered nilmanifold ${G/(G_{\geq s} \Gamma)}$ with ${G_{\geq s}/\Gamma_{\geq s}}$ as the structure group; thus one can view degree ${s}$ filtered nilmanifolds as an ${s}$-fold iterated tower of torus extensions starting from ${G/(G_{\geq 1}\Gamma)}$, which is a point in the most important case ${G_{\geq 1} = G}$. For instance, the Heisenberg nilmanifold

$\displaystyle G/\Gamma := \begin{pmatrix} 1 & {\bf R} & {\bf R} \\ 0 & 1 & {\bf R} \\ 0 & 0 & 1 \end{pmatrix} / \begin{pmatrix} 1 & {\bf Z} & {\bf Z} \\ 0 & 1 & {\bf Z} \\ 0 & 0 & 1 \end{pmatrix}$

is an extension of the two-dimensional torus ${{\bf R}^2/{\bf Z}^2}$ by the circle ${{\bf R}/{\bf Z}}$.

Every torus of some dimension ${d}$ can be viewed as a unit cube ${[0,1]^d}$ with opposite faces glued together; up to measure zero sets, the cube then serves as a fundamental domain for the torus. Nilmanifolds can be viewed the same way, but the gluing can be somewhat “twisted”:

Exercise 15 Let ${G/\Gamma}$ be the Heisenberg nilmanifold. If we abbreviate

$\displaystyle [x,y,z] := \begin{pmatrix} 1 & x & y \\ 0 & 1 & z \\ 0 & 0 & 1 \end{pmatrix} \Gamma \in G/\Gamma$

for all ${x,y,z \in {\bf R}}$, show that for almost all ${x,y,z}$, the point ${[x,y,z]}$ has exactly one representation of the form ${[a,b,c]}$ with ${a,b,c \in [0,1]}$, which is given by the identity

$\displaystyle [x,y,z] = [ \{x\}, \{ y - x \lfloor z \rfloor \}, \{z\} ]$

where ${\lfloor x \rfloor}$ is the greatest integer part of ${x}$, and ${\{x\} := x - \lfloor x \rfloor \in [0,1)}$ is the fractional part function. Conclude that ${G/\Gamma}$ is topologically equivalent to the unit cube ${[0,1]^3}$ quotiented by the identifications

$\displaystyle (0,y,z) \sim (1,y,z)$

$\displaystyle (x,0,z) \sim (x,1,z)$

$\displaystyle (x,y,0) \sim (x,\{y+x\},1)$

between opposite faces.

Note that by using the projection ${(x,y,z) \mapsto (x,z)}$, we can view the Heisenberg nilmanifold ${G/\Gamma}$ as a twisted circle bundle over ${({\bf R}/{\bf Z})^2}$, with the fibers being isomorphic to the unit circle ${{\bf R}/{\bf Z}}$. Show that ${G/\Gamma}$ is not homeomorphic to ${({\bf R}/{\bf Z})^3}$. (Hint: show that there are some non-trivial homotopies between loops that force the fundamental group of ${G/\Gamma}$ to be smaller than ${{\bf Z}^3}$.)
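The reduction to the fundamental domain can be experimented with in exact rational arithmetic. The following sketch is an illustration under stated assumptions: the names `reduce_to_cube`, `mul`, `inv`, `frac` and the sample points are not from these notes, and elements of ${G}$ are encoded as triples ${(x,y,z)}$ as in the abbreviation ${[x,y,z]}$. It checks that the map ${(x,y,z) \mapsto (\{x\}, \{y - x \lfloor z \rfloor\}, \{z\})}$ is invariant under right multiplication by integer matrices, that it stays in the same coset, and that it sends a point ${(x, \{y+x\}, 1)}$ of the top face to ${(x,y,0)}$.

```python
from fractions import Fraction
from itertools import product
from math import floor

def mul(p, q):
    """Product of two unipotent 3x3 matrices encoded as (x, y, z)."""
    (x, y, z), (a, b, c) = p, q
    return (x + a, y + b + x * c, z + c)

def inv(p):
    x, y, z = p
    return (-x, x * z - y, -z)

def frac(t):
    """Fractional part in [0, 1)."""
    return t - floor(t)

def reduce_to_cube(p):
    """Representative of the coset p*Gamma in the unit cube."""
    x, y, z = p
    return (frac(x), frac(y - x * floor(z)), frac(z))

samples = [(Fraction(7, 3), Fraction(-5, 4), Fraction(11, 5)),
           (Fraction(-1, 2), Fraction(9, 7), Fraction(-13, 6))]
lattice = list(product(range(-2, 3), repeat=3))

# (a) Right-multiplying by a lattice element does not change the reduced point.
invariant = all(reduce_to_cube(mul(p, gamma)) == reduce_to_cube(p)
                for p in samples for gamma in lattice)

# (b) p^{-1} * reduce_to_cube(p) has integer entries, i.e. lies in Gamma.
same_coset = all(all(c.denominator == 1 for c in mul(inv(p), reduce_to_cube(p)))
                 for p in samples)

# (c) Gluing of the faces z = 0 and z = 1, for a sample point of the unit square.
x, y = Fraction(1, 3), Fraction(3, 4)
glued = reduce_to_cube((x, frac(y + x), 1)) == (x, y, 0)
print(invariant, same_coset, glued)
```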

The logarithm ${\log(\Gamma)}$ of the discrete cocompact subgroup ${\Gamma}$ can be shown to be a lattice of the Lie algebra ${{\mathfrak g}}$. After a change of basis, one can thus view the latter algebra as a standard vector space ${{\bf R}^d}$ and the lattice as ${{\bf Z}^d}$. Denoting the standard generators of the lattice (and the standard basis of ${{\bf R}^d}$) as ${e_1,\ldots,e_d}$, we then see that the Lie bracket ${[e_i,e_j]}$ of two such generators must be an integer combination of more generators:

$\displaystyle [e_i,e_j] = \sum_{k=1}^d c_{ijk} e_k.$

The structure constants ${c_{ijk}}$ describe completely the Lie group structure of ${G}$ and ${\Gamma}$. The rational subgroups ${G_{\geq i}}$ can also be described by picking some generators for ${\log(\Gamma_{\geq i})}$, which are integer combinations of the ${e_1,\ldots,e_d}$. We say that the filtered nilmanifold has complexity at most ${M}$ if the dimension and degree are at most ${M}$, and the structure constants and coefficients of the generators also have magnitude at most ${M}$. This is an admittedly artificial definition, but for quantitative applications it is necessary to have some means to quantify the complexity of a nilmanifold.

A polynomial orbit in a filtered nilmanifold ${G/\Gamma}$ is a map ${{\mathcal O}: {\bf Z} \rightarrow G/\Gamma}$ of the form ${{\mathcal O}(n) := g(n) \Gamma}$, where ${g: {\bf Z} \rightarrow G}$ is a polynomial sequence. For instance, any linear orbit ${{\mathcal O}(n) = g^n x}$, where ${x \in G/\Gamma}$ and ${g \in G}$, is a polynomial orbit.

Exercise 16 For any ${\alpha,\beta \in {\bf R}}$, show that the sequence

$\displaystyle n \mapsto [ \{ -\alpha n \}, \{ \alpha n \lfloor \beta n \rfloor \}, \{ \beta n \} ]$

(using the notation from Exercise 15) is a polynomial orbit in the Heisenberg nilmanifold.
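One candidate lift for this exercise is the sequence with entries ${x(n) = -\alpha n}$, ${y(n) = 0}$, ${z(n) = \beta n}$. The sketch below checks, for rational sample values of ${\alpha, \beta}$ and a finite range of ${n}$, that this sequence has the Taylor form (1) with coefficients ${g_1 = (-\alpha, 0, \beta)}$ and ${g_2 = (0, \alpha\beta, 0) \in G_{\geq 2}}$, and that reducing it to the unit cube via the identity of Exercise 15 yields the bracket expression above. The choice of lift, the sample values and the helper names are assumptions of this illustration.

```python
from fractions import Fraction
from math import floor

def mul(p, q):
    (x, y, z), (a, b, c) = p, q
    return (x + a, y + b + x * c, z + c)

def inv(p):
    x, y, z = p
    return (-x, x * z - y, -z)

def power(p, n):
    """n-th power for any integer n, by repeated multiplication."""
    base = p if n >= 0 else inv(p)
    result = (Fraction(0), Fraction(0), Fraction(0))
    for _ in range(abs(n)):
        result = mul(result, base)
    return result

def frac(t):
    return t - floor(t)

def reduce_to_cube(p):
    x, y, z = p
    return (frac(x), frac(y - x * floor(z)), frac(z))

alpha, beta = Fraction(3, 7), Fraction(5, 11)
g1 = (-alpha, Fraction(0), beta)               # degree >= 1 coefficient
g2 = (Fraction(0), alpha * beta, Fraction(0))  # central, degree >= 2 coefficient

def g(n):
    """g_1^{binom(n,1)} g_2^{binom(n,2)}, with g_0 the identity."""
    return mul(power(g1, n), power(g2, n * (n - 1) // 2))

window = range(-10, 11)
taylor_ok = all(g(n) == (-alpha * n, 0, beta * n) for n in window)
bracket_ok = all(
    reduce_to_cube(g(n))
    == (frac(-alpha * n), frac(alpha * n * floor(beta * n)), frac(beta * n))
    for n in window
)
print(taylor_ok, bracket_ok)
```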

With the above example, we see the emergence of bracket polynomials when representing polynomial orbits in a fundamental domain. Indeed, one can view the entire machinery of orbits in nilmanifolds as a means of efficiently capturing such polynomials in an algebraically tractable framework (namely, that of polynomial sequences in nilpotent groups). The piecewise continuous nature of the bracket polynomials is then ultimately tied to the twisted gluing needed to identify the fundamental domain with the nilmanifold.

Finally, we can define the notion of a (basic Lipschitz) nilsequence of degree ${\leq s}$. This is a sequence ${\psi: {\bf Z} \rightarrow {\bf C}}$ of the form ${\psi(n) := F( {\mathcal O}(n) )}$, where ${{\mathcal O}: {\bf Z} \rightarrow G/\Gamma}$ is a polynomial orbit in a filtered nilmanifold of degree ${\leq s}$, and ${F: G/\Gamma \rightarrow {\bf C}}$ is a Lipschitz function. (One needs a metric on ${G/\Gamma}$ to define the Lipschitz constant, but this can be done for instance by using a basis ${e_1,\ldots,e_d}$ of ${\Gamma}$ to identify ${G/\Gamma}$ with a fundamental domain ${[0,1]^d}$, and using this to construct some (artificial) metric on ${G/\Gamma}$. The details of such a construction will not be important here.) We say that the nilsequence has complexity at most ${M}$ if the filtered nilmanifold has complexity at most ${M}$, and the (inhomogeneous) Lipschitz norm of ${F}$ is also at most ${M}$.

A basic example of a degree ${\leq s}$ nilsequence is a polynomial phase ${n \mapsto e( P(n) )}$, where ${P: {\bf Z} \rightarrow {\bf R}/{\bf Z}}$ is a polynomial of degree ${\leq s}$. A bit more generally, ${n \mapsto F(P(n))}$ is a degree ${\leq s}$ nilsequence, whenever ${F: {\bf R}/{\bf Z} \rightarrow {\bf C}}$ is a Lipschitz function. In view of Exercises 15, 16, we also see that

$\displaystyle n \mapsto e( \alpha n \lfloor \beta n \rfloor ) \psi( \{ \alpha n \} ) \psi( \{ \beta n \} ) \ \ \ \ \ (2)$

or more generally

$\displaystyle n \mapsto F( \alpha n \lfloor \beta n \rfloor ) \psi( \{ \alpha n \} ) \psi( \{ \beta n \} )$

are also degree ${\leq 2}$ nilsequences, where ${\psi: [0,1] \rightarrow {\bf C}}$ is a Lipschitz function that vanishes near ${0}$ and ${1}$. The ${\psi(\{\alpha n \})}$ factor is not needed (as there is no twisting in the ${x}$ coordinate in Exercise 15), but the ${\psi(\{\beta n\})}$ factor is (unfortunately) necessary, as otherwise one encounters the discontinuity inherent in the ${\lfloor \beta n \rfloor}$ term (and one would merely have a piecewise Lipschitz nilsequence rather than a genuinely Lipschitz nilsequence). Because of this discontinuity, bracket polynomial phases ${n \mapsto e( \alpha n \lfloor \beta n \rfloor )}$ cannot quite be viewed as Lipschitz nilsequences, but from a heuristic viewpoint it is often helpful to pretend as if bracket polynomial phases are model instances of nilsequences.

The only degree ${\leq 0}$ nilsequences are the constants. The degree ${\leq 1}$ nilsequences are essentially the quasiperiodic functions:

Exercise 17 Show that a degree ${\leq 1}$ nilsequence of complexity ${M}$ is Fourier-measurable with growth function ${{\mathcal F}_M}$ depending only on ${M}$, where Fourier measurability was defined in Notes 2.

Exercise 18 Show that the class of nilsequences of degree ${\leq s}$ does not change if we drop the condition ${G=G_{\geq 0}}$, or if we add the additional condition ${G=G_{\geq 1}}$.

Remark 2 The space of nilsequences is also unchanged if one insists that the polynomial orbit be linear, and that the filtration be the lower central series filtration; and this is in fact the original definition of a nilsequence. The proof of this equivalence is a little tricky, though, and will appear in a forthcoming paper of Green, Ziegler, and myself.

— 3. Connection with the Gowers norms —

We define the Gowers norm ${\|f\|_{U^d[N]}}$ of a function ${f: [N] \rightarrow {\bf C}}$ by the formula

$\displaystyle \|f\|_{U^d[N]} := \|f\|_{U^d({\bf Z}/N'{\bf Z})} / \|1_{[N]}\|_{U^d({\bf Z}/N'{\bf Z})}$

where ${N'}$ is any integer greater than ${(d+1)N}$, ${[N]}$ is embedded inside ${{\bf Z}/N'{\bf Z}}$, and ${f}$ is extended by zero outside of ${[N]}$. It is easy to see that this definition is independent of the choice of ${N'}$. Note also that the normalisation factor ${\|1_{[N]}\|_{U^d({\bf Z}/N'{\bf Z})}}$ is comparable to ${1}$ when ${d}$ is fixed and ${N'}$ is comparable to ${N}$.
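For small ${N}$ this definition can be evaluated by brute force. The following Python sketch does so using the recursive formula ${\|f\|_{U^{d}}^{2^{d}} = \mathop{\bf E}_h \|\Delta_h f\|_{U^{d-1}}^{2^{d-1}}}$ with ${\|f\|_{U^1} = |\mathop{\bf E} f|}$ and ${\Delta_h f(n) := f(n+h) \overline{f(n)}}$. The function names are illustrative, and the modulus `n_prime` is assumed to be chosen large enough that no wraparound occurs.

```python
import cmath
import math


def e(x):
    """The standard character e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * math.pi * x)


def mult_derivative(f, h):
    """Multiplicative derivative on Z/MZ: n -> f(n+h) * conj(f(n))."""
    M = len(f)
    return [f[(n + h) % M] * f[n].conjugate() for n in range(M)]


def gowers_norm_cyclic(f, d):
    """U^d norm (d >= 1) of f on Z/MZ, M = len(f). Cost is O(M^d)."""
    def power(g, k):
        # Returns ||g||_{U^k}^{2^k}, a non-negative real number.
        M = len(g)
        if k == 1:
            return abs(sum(g) / M) ** 2
        return sum(power(mult_derivative(g, h), k - 1) for h in range(M)) / M

    return power(f, d) ** (1.0 / 2 ** d)


def gowers_norm_interval(f, d, n_prime):
    """U^d[N] norm of f = (f(1), ..., f(N)): extend by zero to Z/n_prime Z
    and normalise by the norm of the indicator of [N]."""
    N = len(f)
    padded = list(f) + [0j] * (n_prime - N)
    indicator = [1.0] * N + [0.0] * (n_prime - N)
    return gowers_norm_cyclic(padded, d) / gowers_norm_cyclic(indicator, d)


if __name__ == "__main__":
    quadratic = [e(math.sqrt(2) * n * n) for n in range(1, 7)]
    print("U^2[6]:", gowers_norm_interval(quadratic, 2, 30))
    print("U^3[6]:", gowers_norm_interval(quadratic, 3, 49))
```

A quadratic phase has ${U^3[N]}$ norm equal to ${1}$, since its third multiplicative derivative is identically ${1}$ on every parallelepiped inside ${[N]}$, while its ${U^2[N]}$ norm is strictly smaller.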

One of the main reasons why nilsequences are relevant to the theory of the Gowers norms is that they are an obstruction to that norm being small. More precisely, we have

Theorem 7 (Converse to the inverse conjecture for the Gowers norms) Let ${f: [N] \rightarrow {\bf C}}$ be such that ${\|f\|_{L^\infty[N]} \leq 1}$ and ${|\langle f, \psi \rangle_{L^2([N])}| \geq \delta}$ for some degree ${\leq s}$ nilsequence ${\psi}$ of complexity at most ${M}$. Then ${\|f\|_{U^{s+1}[N]} \gg_{s,\delta,M} 1}$.

We now prove this theorem, following an argument of Green, Ziegler, and myself. It is convenient to introduce a few more notions. Define a vertical character of a degree ${\leq s}$ filtered nilmanifold ${G/\Gamma}$ to be a continuous homomorphism ${\eta: G_{\geq s} \rightarrow {\bf R}/{\bf Z}}$ that annihilates ${\Gamma_{\geq s}}$, or equivalently an element of the Pontryagin dual ${\widehat{G_{\geq s}/\Gamma_{\geq s}}}$ of the torus ${G_{\geq s}/\Gamma_{\geq s}}$. A function ${F: G/\Gamma \rightarrow {\bf C}}$ is said to have vertical frequency ${\eta}$ if ${F}$ obeys the equation

$\displaystyle F(g_s x) = e( \eta(g_s) ) F(x)$

for all ${g_s \in G_{\geq s}}$ and ${x \in G/\Gamma}$. A degree ${\leq s}$ nilsequence is said to have a vertical frequency if it can be represented in the form ${n \mapsto F({\mathcal O}(n))}$ for some Lipschitz ${F}$ with a vertical frequency.

For instance, a polynomial phase ${n \mapsto e(P(n))}$, where ${P: {\bf Z} \rightarrow {\bf R}/{\bf Z}}$ is a polynomial of degree ${\leq s}$, is a degree ${\leq s}$ nilsequence with a vertical frequency. Any nilsequence of degree ${\leq s-1}$ is trivially a nilsequence of degree ${\leq s}$ with a vertical frequency of ${0}$. Finally, observe that the space of degree ${\leq s}$ nilsequences with a vertical frequency is closed under multiplication and complex conjugation.

Exercise 19 Show that a degree ${\leq 1}$ nilsequence with a vertical frequency necessarily takes the form ${\psi(n) = c e( \alpha n)}$ for some ${c \in {\bf C}}$ and ${\alpha \in {\bf R}}$ (and conversely, all such sequences are degree ${\leq 1}$ nilsequences with a vertical frequency). Thus, up to constants, degree ${\leq 1}$ nilsequences with a vertical frequency are the same as Fourier characters.

A basic fact (generalising the invertibility of the Fourier transform in the degree ${\leq 1}$ case) is that the nilsequences with vertical frequency generate all the other nilsequences:

Exercise 20 Show that any degree ${\leq s}$ nilsequence can be approximated to arbitrary accuracy in the uniform norm by a linear combination of nilsequences with a vertical frequency. (Hint: use the Stone-Weierstrass theorem.)

More quantitatively, show that a degree ${\leq s}$ nilsequence of complexity ${\leq M}$ can be approximated uniformly to error ${\epsilon}$ by a sum of ${O_{M,\epsilon,s}(1)}$ nilsequences, each with a representation with a vertical frequency that is of complexity ${O_{M,\epsilon,s}(1)}$. (Hint: this can be deduced from the qualitative result by a compactness argument using the Arzelà-Ascoli theorem.)

A derivative ${\Delta_h e(P(n))}$ of a polynomial phase is a polynomial phase of one lower degree. There is an analogous fact for nilsequences with a vertical frequency:

Lemma 8 (Differentiating nilsequences with a vertical frequency) Let ${s \geq 1}$, and let ${\psi}$ be a degree ${\leq s}$ nilsequence with a vertical frequency. Then for any ${h \in {\bf Z}}$, ${\Delta_h \psi}$ is a degree ${\leq s-1}$ nilsequence. Furthermore, if ${\psi}$ has complexity ${\leq M}$ (with a vertical frequency representation), then ${\Delta_h \psi}$ has complexity ${O_{M,s}(1)}$.

Proof: We just prove the first claim, as the second claim follows by refining the argument.

We write ${\psi(n) = F( g(n) \Gamma )}$ for some polynomial sequence ${g: {\bf Z} \rightarrow G}$ and some Lipschitz function ${F: G/\Gamma \rightarrow {\bf C}}$ with a vertical frequency. We then express

$\displaystyle \Delta_h \psi(n) = \tilde F( \tilde g(n) (\Gamma \times \Gamma))$

where ${\tilde F: (G \times G)/(\Gamma \times \Gamma) \rightarrow {\bf C}}$ is the function

$\displaystyle \tilde F(x,y) := F(y) \overline{F(x)}$

and ${\tilde g: {\bf Z} \rightarrow G \times G}$ is the sequence

$\displaystyle \tilde g(n) := (g(n), \partial_h g(n) g(n)).$

Now we give a filtration on ${G \times G}$ by setting

$\displaystyle (G \times G)_{\geq j} := G_{\geq j} \times_{G_{\geq j+1}} G_{\geq j}$

for ${j \geq 0}$, where ${G_{\geq j} \times_{G_{\geq j+1}} G_{\geq j}}$ is the subgroup of ${G_{\geq j} \times G_{\geq j}}$ generated by ${G_{\geq j+1} \times G_{\geq j+1}}$ and the diagonal group ${G_{\geq j}^\Delta := \{ (g_j,g_j): g_j \in G_{\geq j} \}}$. One easily verifies that this is a filtration on ${G \times G}$. The sequences ${(g(n), g(n))}$ and ${(\hbox{id}, \partial_h g(n))}$ are both polynomial with respect to this filtration, and hence by the Lazard-Leibman theorem, ${\tilde g}$ is polynomial also.

Next, we use the hypothesis that ${F}$ has a vertical frequency to conclude that ${\tilde F}$ is invariant with respect to the action of the diagonal group ${G_s^\Delta = (G \times G)_{\geq s}}$. If we then define ${G^\Box}$ to be the Lie group ${G^\Box := (G \times G)_{\geq 0}/G_s^\Delta}$ with filtration ${G^\Box_{\geq j} := (G \times G)_{\geq j} / G_s^\Delta}$, then ${G^\Box}$ is a degree ${\leq s-1}$ filtered nilpotent Lie group; setting ${\Gamma^\Box := (\Gamma \times\Gamma) \cap G^\Box}$, we conclude that ${G^\Box/\Gamma^\Box}$ is a degree ${\leq s-1}$ nilmanifold and

$\displaystyle \Delta_h \psi(n) = F^\Box(g^\Box(n) \Gamma^\Box)$

where ${F^\Box, g^\Box}$ are the projections of ${\tilde F, \tilde g}$ from ${G \times G}$ to ${G^\Box}$. The claim follows. $\Box$
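The abelian model case of Lemma 8, namely that the derivative of a polynomial phase is a polynomial phase of one lower degree, is easy to check numerically. The Python sketch below does this for a quadratic phase ${e(an^2+bn+c)}$, whose derivative ${\Delta_h}$ should be the linear phase ${e(2ahn + ah^2 + bh)}$; the function names and the sample coefficients are illustrative assumptions.

```python
import cmath
import math


def e(x):
    """The standard character e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * math.pi * x)


def quadratic_phase(a, b, c):
    """The degree <= 2 polynomial phase n -> e(a n^2 + b n + c)."""
    return lambda n: e(a * n * n + b * n + c)


def mult_derivative(psi, h):
    """Multiplicative derivative: n -> psi(n+h) * conj(psi(n))."""
    return lambda n: psi(n + h) * psi(n).conjugate()


def max_discrepancy(a, b, c, h, n_max):
    """Largest gap, over |n| <= n_max, between the derivative of the
    quadratic phase and the predicted linear phase."""
    d_psi = mult_derivative(quadratic_phase(a, b, c), h)
    predicted = lambda n: e(2 * a * h * n + a * h * h + b * h)
    return max(abs(d_psi(n) - predicted(n)) for n in range(-n_max, n_max + 1))


if __name__ == "__main__":
    # Should print a value at the level of floating-point rounding error.
    print(max_discrepancy(0.3, 0.7, 0.2, h=5, n_max=50))
```

Note that the constant term ${c}$ cancels in the derivative, which mirrors the way the vertical frequency hypothesis is used in the proof to quotient out the top-degree diagonal group.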

We now prove Theorem 7 by induction on ${s}$. The claim is trivial for ${s=0}$, so we assume that ${s \geq 1}$ and that the claim has already been proven for smaller values of ${s}$.

Let ${f, \delta, \psi}$ be as in Theorem 7. From Exercise 20 we see (after modifying ${\delta, M}$) that we may assume that ${\psi}$ has a vertical frequency. Next, we use the identity

$\displaystyle |\mathop{\bf E}_{n \in {\bf Z}/N'{\bf Z}} f(n) \overline{\psi(n)}|^2 = \mathop{\bf E}_{h \in {\bf Z}/N'{\bf Z}} \mathop{\bf E}_{n\in {\bf Z}/N'{\bf Z}} \Delta_h f(n) \overline{\Delta_h \psi(n)}$

(extending ${f}$ by zero outside of ${[N]}$, and extending ${\psi}$ arbitrarily) to conclude that

$\displaystyle |\mathop{\bf E}_{n \in [N]} \Delta_h f(n) \overline{\Delta_h \psi(n)}| \gg_\delta 1$

for ${\gg N}$ values of ${h \in [-N,N]}$. By induction hypothesis and Lemma 8, we conclude that

$\displaystyle \| \Delta_h f\|_{U^s[N]} \gg_{\delta, M} 1$

for ${\gg N}$ values of ${h \in [-N,N]}$. Using the identity

$\displaystyle \|f\|_{U^{s+1}({\bf Z}/N'{\bf Z})}^{2^{s+1}} = \mathop{\bf E}_{h \in {\bf Z}/N'{\bf Z}} \|\Delta_h f\|_{U^s({\bf Z}/N'{\bf Z})}^{2^s}$

we close the induction and obtain the claim.
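The first identity used in this argument, relating the squared correlation of ${f}$ and ${\psi}$ to the averaged correlation of their derivatives, holds for arbitrary functions on a cyclic group and can be sanity-checked numerically. The Python sketch below does this for pseudo-random unimodular sequences; the helper names and the choice of test data are assumptions made purely for illustration.

```python
import cmath
import math
import random


def mult_derivative(f, h):
    """Multiplicative derivative on Z/MZ: n -> f(n+h) * conj(f(n))."""
    M = len(f)
    return [f[(n + h) % M] * f[n].conjugate() for n in range(M)]


def correlation_squared(f, psi):
    """Left-hand side: |E_n f(n) conj(psi(n))|^2 over Z/MZ."""
    M = len(f)
    return abs(sum(f[n] * psi[n].conjugate() for n in range(M)) / M) ** 2


def averaged_derivative_correlation(f, psi):
    """Right-hand side: E_h E_n Delta_h f(n) conj(Delta_h psi(n))."""
    M = len(f)
    total = 0j
    for h in range(M):
        df = mult_derivative(f, h)
        dpsi = mult_derivative(psi, h)
        total += sum(df[n] * dpsi[n].conjugate() for n in range(M)) / M
    return total / M


def random_phases(M, seed):
    """Pseudo-random unimodular test sequence of length M (illustrative)."""
    rng = random.Random(seed)
    return [cmath.exp(2j * math.pi * rng.random()) for _ in range(M)]


if __name__ == "__main__":
    f = random_phases(17, seed=1)
    psi = random_phases(17, seed=2)
    print(correlation_squared(f, psi))
    print(averaged_derivative_correlation(f, psi))
```

The two printed quantities should agree up to rounding error, with the second having negligible imaginary part.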

In the other direction, we have

Theorem 9 (Inverse conjecture for the Gowers norms on ${{\bf Z}}$) Let ${f: [N] \rightarrow {\bf C}}$ be such that ${\|f\|_{L^\infty[N]} \leq 1}$ and ${\|f\|_{U^{s+1}[N]} \geq \delta}$. Then ${|\langle f, \psi \rangle_{L^2([N])}| \gg_{s,\delta} 1}$ for some degree ${\leq s}$ nilsequence ${\psi}$ of complexity ${O_{s,\delta}(1)}$.

This conjecture has recently been proven by Green, Ziegler, and myself; an announcement of this result, which will contain extensive heuristic discussion of how this conjecture is proven, will appear very shortly, and the paper itself soon after that. For a discussion of the history of the conjecture, including the cases ${s \leq 3}$, see our previous paper.

Exercise 21 (${99\%}$ inverse theorem)

1. (Straightening an approximately linear function) Let ${\epsilon, \kappa > 0}$. Let ${\xi: [-N,N] \rightarrow {\bf R}/{\bf Z}}$ be a function such that ${|\xi(a+b) - \xi(a)-\xi(b)| \leq \kappa}$ for all but ${\epsilon N^2}$ of all ${a, b \in [-N,N]}$ with ${a+b \in [-N,N]}$. If ${\epsilon}$ is sufficiently small, show that there exists an affine linear function ${n \mapsto \alpha n + \beta}$ with ${\alpha, \beta \in {\bf R}/{\bf Z}}$ such that ${|\xi(n) - \alpha n - \beta| \ll_\epsilon \kappa}$ for all but ${\delta(\epsilon) N}$ values of ${n \in [-N,N]}$, where ${\delta(\epsilon) \rightarrow 0}$ as ${\epsilon \rightarrow 0}$. (Hint: One can take ${\kappa}$ to be small. First find a way to lift ${\xi}$ in a nice manner from ${{\bf R}/{\bf Z}}$ to ${{\bf R}}$.)
2. Let ${f: [N] \rightarrow {\bf C}}$ be such that ${\|f\|_{L^\infty[N]} \leq 1}$ and ${\|f\|_{U^{s+1}[N]} \geq 1-\epsilon}$. Show that there exists a polynomial ${P: {\bf Z} \rightarrow {\bf R}/{\bf Z}}$ of degree ${\leq s}$ such that ${\|f-e(P)\|_{L^2([N])} \leq \delta}$, where ${\delta = \delta_s(\epsilon) \rightarrow 0}$ as ${\epsilon \rightarrow 0}$ (holding ${s}$ fixed). Hint: Adapt the argument of the analogous finite field statement. One cannot exploit the discrete nature of polynomials any more; and so one must use the preceding part of the exercise as a substitute.

The inverse conjecture for the Gowers norms, when combined with the equidistribution theory for nilsequences that we will turn to next, has a number of consequences, analogous to the consequences for the finite field analogues of these facts; see this paper of Green and myself for further discussion.

— 4. Equidistribution of nilsequences —

In the subject of higher order Fourier analysis, and in particular in the proof of the inverse conjecture for the Gowers norms, as well as in several of the applications of this conjecture, it will be of importance to be able to compute statistics of nilsequences ${\psi}$, such as their averages ${\mathop{\bf E}_{n \in [N]} \psi(n)}$ for a large integer ${N}$; this generalises the computation of exponential sums such as ${\mathop{\bf E}_{n \in [N]} e(P(n))}$ that occurred in Notes 1. This is closely related to the equidistribution of polynomial orbits ${{\mathcal O}: {\bf Z} \rightarrow G/\Gamma}$ in nilmanifolds. Note that as ${G/\Gamma}$ is a compact quotient of a locally compact group ${G}$, it comes endowed with a unique left-invariant Haar measure ${\mu_{G/\Gamma}}$ (which is isomorphic to the Lebesgue measure on a fundamental domain ${[0,1]^d}$ of that nilmanifold). By default, when we talk about equidistribution in a nilmanifold, we mean with respect to the Haar measure; thus ${{\mathcal O}}$ is asymptotically equidistributed if and only if

$\displaystyle \lim_{N \rightarrow \infty} \mathop{\bf E}_{n \in [N]} F({\mathcal O}(n)) = \int_{G/\Gamma} F\ d\mu_{G/\Gamma}$

for all Lipschitz ${F: G/\Gamma \rightarrow {\bf C}}$. One can also describe single-scale equidistribution (and non-standard equidistribution) in a similar fashion, but for the sake of discussion let us restrict attention to the simpler and more classical situation of asymptotic equidistribution here (although it is the single-scale equidistribution theory which is ultimately relevant to questions relating to the Gowers norms).

When studying equidistribution of polynomial sequences in a torus ${{\bf T}^d}$, a key tool was the van der Corput lemma. This lemma asserts that if a sequence ${x: {\bf Z} \rightarrow {\bf T}^d}$ is such that all derivatives ${\partial_h x: {\bf Z} \rightarrow {\bf T}^d}$ with ${h \neq 0}$ are asymptotically equidistributed, then ${x}$ itself is also asymptotically equidistributed.

The notion of a derivative requires the ability to perform subtraction on the range space ${{\bf T}^d}$: ${\partial_h x(n) := x(n+h) - x(n)}$. When working in a higher degree nilmanifold ${G/\Gamma}$, which is not a torus, we do not have a notion of subtraction. However, such manifolds are still torus bundles with torus ${{\bf T} := G_{\geq s}/\Gamma_{\geq s}}$. This gives a weaker notion of subtraction, namely the map ${\pi: G/\Gamma \times G/\Gamma \rightarrow (G/\Gamma \times G/\Gamma) / {\bf T}^\Delta}$, where ${{\bf T}^\Delta}$ is the diagonal action ${g_s: (x,y) \mapsto (g_s x, g_s y)}$ of the torus ${{\bf T}}$ on the product space ${G/\Gamma \times G/\Gamma}$. This leads to a generalisation of the van der Corput lemma:

Lemma 10 (Relative van der Corput lemma) Let ${x: {\bf Z} \rightarrow G/\Gamma}$ be a sequence in a degree ${\leq s}$ nilmanifold for some ${s \geq 1}$. Suppose that the projection of ${x}$ to the degree ${\leq s-1}$ filtered nilmanifold ${G/G_s \Gamma}$ is asymptotically equidistributed, and suppose also that for each non-zero ${h \in {\bf Z}}$, the sequence ${\partial_h x: n \mapsto \pi( x(n+h), x(n) )}$ is asymptotically equidistributed with respect to some ${{\bf T}}$-invariant measure ${\mu_h}$ on ${(G/\Gamma \times G/\Gamma) / {\bf T}^\Delta}$. Then ${x}$ is asymptotically equidistributed in ${G/\Gamma}$.

Proof: It suffices to show that, for each Lipschitz function ${F: G/\Gamma \rightarrow {\bf C}}$,

$\displaystyle \lim_{N \rightarrow \infty} \mathop{\bf E}_{n \in [N]} F(x(n)) = \int_{G/\Gamma} F\ d\mu_{G/\Gamma}.$

By Exercise 20, we may assume that ${F}$ has a vertical frequency. If this vertical frequency is zero, then ${F}$ descends to a function on the degree ${\leq s-1}$ filtered nilmanifold ${G/G_s \Gamma}$, and the claim then follows from the equidistribution hypothesis on this space. So suppose instead that ${F}$ has a non-zero vertical frequency. By vertically rotating ${F}$ (and using the ${G_s}$-invariance of ${\mu_{G/\Gamma}}$) we conclude that ${\int_{G/\Gamma} F\ d\mu_{G/\Gamma} = 0}$. Applying the van der Corput inequality (see Notes 1), we now see that it suffices to show that

$\displaystyle \lim_{N \rightarrow \infty} \mathop{\bf E}_{n \in [N]} F(x(n+h)) \overline{F(x(n))} = 0$

for each non-zero ${h}$. The function ${(x,y) \mapsto F(x) \overline{F(y)}}$ on ${G/\Gamma \times G/\Gamma}$ is ${{\bf T}^\Delta}$-invariant (because of the vertical frequency hypothesis) and so descends to a function ${\tilde F}$ on ${(G/\Gamma \times G/\Gamma)/{\bf T}^\Delta}$. We thus have

$\displaystyle \lim_{N \rightarrow \infty} \mathop{\bf E}_{n \in [N]} F(x(n+h)) \overline{F(x(n))} = \int_{(G/\Gamma \times G/\Gamma)/{\bf T}^\Delta} \tilde F\ d\mu_h.$

The function ${\tilde F}$ has a non-zero vertical frequency with respect to the residual action of ${{\bf T}}$ (or more precisely, of ${({\bf T} \times {\bf T})/{\bf T}^\Delta}$, which is isomorphic to ${{\bf T}}$). As ${\mu_h}$ is invariant with respect to this action, the integral thus vanishes, as required. $\Box$

This gives a useful criterion for equidistribution of polynomial orbits. Define a horizontal character to be a continuous homomorphism ${\eta}$ from ${G}$ to ${{\bf R}/{\bf Z}}$ that annihilates ${\Gamma}$ (or equivalently, an element of the Pontryagin dual of the horizontal torus ${G/([G,G]\Gamma)}$; this quotient is easily seen to be a torus). Let ${\pi_i: G_{\geq i} \rightarrow {\bf T}_i}$ be the projection map.

Theorem 11 (Leibman equidistribution criterion) Let ${{\mathcal O}: n \mapsto g(n) \Gamma}$ be a polynomial orbit on a degree ${\leq s}$ filtered nilmanifold ${G/\Gamma}$. Suppose that ${G=G_{\geq 0}=G_{\geq 1}}$. Then ${{\mathcal O}}$ is asymptotically equidistributed in ${G/\Gamma}$ if and only if ${\eta \circ g}$ is non-constant for each non-trivial horizontal character ${\eta}$.

This theorem was first established by Leibman (by a slightly different method), and also follows from the above van der Corput lemma and some tedious additional computations; see this paper of Green and myself for details. For linear orbits, this result was established by Parry and by Leon Green. Using this criterion (together with more quantitative analogues for single-scale equidistribution), one can develop Ratner-type decompositions that generalise those in (Notes 1). Again, the details are technical and I refer to my paper with Green for details. We give a special case of Theorem 11 as an exercise:

Exercise 22 Use Lemma 10 to show that if ${\alpha, \beta}$ are two real numbers such that ${\alpha, \beta, \alpha \beta}$ are linearly independent modulo ${1}$ over the integers, then the polynomial orbit

$\displaystyle n \mapsto \begin{pmatrix} 1 & \alpha n & 0 \\ 0 & 1 & \beta n \\ 0 & 0 & 1 \end{pmatrix} \Gamma$

is asymptotically equidistributed in the Heisenberg nilmanifold ${G/\Gamma}$; note that this is a special case of Theorem 11. Conclude that the map ${n \mapsto \alpha n \lfloor \beta n \rfloor \hbox{ mod } 1}$ is asymptotically equidistributed in the unit circle.

Unfortunately Lemma 10 is not strong enough to cover all cases of Theorem 11; in particular, if ${\alpha,\beta}$ are independent but ${\alpha,\beta,\alpha \beta}$ are not, then the hypotheses of Lemma 10 are not obeyed for any fixed non-zero ${h}$, although they are in some sense asymptotically obeyed in the limit when ${h}$ is large. To obtain Theorem 11 in this case one either needs a quantitative (single-scale) version of Lemma 10, or else one has to invoke the ergodic theorem in a number of places. The former approach is the one taken in the above mentioned paper of Green and myself, and the latter in the paper of Leibman.

One application of this equidistribution theory is to show that bracket polynomial objects such as (2) have a negligible correlation with any genuinely quadratic phase ${n \mapsto e(\alpha n^2 + \beta n + \gamma)}$ (or more generally, with any genuinely polynomial phase of bounded degree); this result was first established by Haland. On the other hand, from Theorem 7 we know that (2) has a large ${U^3[N]}$ norm. This shows that even when ${s=2}$, one cannot invert the Gowers norm purely using polynomial phases. This observation first appeared in the work of Gowers (with a related observation due to Furstenberg and Weiss).

Exercise 23 Let the notation be as in Exercise 22. Show that

$\displaystyle \lim_{N \rightarrow\infty} \mathop{\bf E}_{n \in [N]} e( \alpha n \lfloor \beta n \rfloor - \gamma n^2 - \delta n ) = 0$

for any ${\gamma, \delta \in {\bf R}}$. (You can either apply Theorem 11, or go back to Lemma 10.)
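The averages in Exercises 22 and 23 are easy to explore numerically, which can be a useful sanity check before attempting a proof. The following Python sketch computes ${\mathop{\bf E}_{n \in [N]} e( \alpha n \lfloor \beta n \rfloor - \gamma n^2 - \delta n )}$ in floating-point arithmetic. The function name `bracket_average` and the sample values ${\alpha = \sqrt{2}}$, ${\beta = \sqrt{3}}$ are illustrative assumptions, and rounding error limits how large ${N}$ can usefully be taken.

```python
import cmath
import math


def e(x):
    """The standard character e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * math.pi * x)


def frac(x):
    """Fractional part {x} in [0, 1)."""
    return x - math.floor(x)


def bracket_average(N, alpha, beta, gamma=0.0, delta=0.0):
    """E_{n in [N]} e(alpha n floor(beta n) - gamma n^2 - delta n).

    The phase is reduced mod 1 before exponentiating, to limit the loss
    of precision when the argument is large."""
    total = 0j
    for n in range(1, N + 1):
        phase = alpha * n * math.floor(beta * n) - gamma * n * n - delta * n
        total += e(frac(phase))
    return total / N


if __name__ == "__main__":
    alpha, beta = math.sqrt(2), math.sqrt(3)
    for N in (100, 1000, 10000):
        # No twist, then a twist by a sample quadratic phase.
        print(N,
              abs(bracket_average(N, alpha, beta)),
              abs(bracket_average(N, alpha, beta, gamma=math.sqrt(5), delta=0.25)))
```

If the exercises are correct, the printed magnitudes should decay as ${N}$ grows; no particular rate or numerical value is claimed here. Setting `alpha=0` gives the degenerate case in which every term equals ${1}$.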