
This week I have been at a Banff workshop “Combinatorics meets Ergodic theory”, focused on the combinatorics surrounding Szemerédi’s theorem and the Gowers uniformity norms on one hand, and the ergodic theory surrounding Furstenberg’s multiple recurrence theorem and the Host-Kra structure theory on the other. This was quite a fruitful workshop, and directly inspired the various posts this week on this blog. Incidentally, BIRS being as efficient as it is, videos for this week’s talks are already online.

As mentioned in the previous two posts, Ben Green, Tamar Ziegler, and myself proved the following inverse theorem for the Gowers norms:

Theorem 1 (Inverse theorem for Gowers norms) Let ${N \geq 1}$ and ${s \geq 1}$ be integers, and let ${\delta > 0}$. Suppose that ${f: {\bf Z} \rightarrow [-1,1]}$ is a function supported on ${[N] := \{1,\dots,N\}}$ such that

$\displaystyle \frac{1}{N^{s+2}} \sum_{n,h_1,\dots,h_{s+1} \in {\bf Z}} \prod_{\omega \in \{0,1\}^{s+1}} f(n+\omega_1 h_1 + \dots + \omega_{s+1} h_{s+1}) \geq \delta.$

Then there exists a filtered nilmanifold ${G/\Gamma}$ of degree ${\leq s}$ and complexity ${O_{s,\delta}(1)}$, a polynomial sequence ${g: {\bf Z} \rightarrow G}$, and a Lipschitz function ${F: G/\Gamma \rightarrow {\bf R}}$ of Lipschitz constant ${O_{s,\delta}(1)}$ such that

$\displaystyle \frac{1}{N} \sum_n f(n) F(g(n) \Gamma) \gg_{s,\delta} 1.$
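To make the hypothesis of Theorem 1 concrete, here is a minimal brute-force sketch in Python that evaluates the normalised sum for small ${N}$. The function names `gowers_sum` and `normalised_gowers_sum`, and the dictionary representation of ${f}$, are illustrative choices and not taken from any paper; the code is only practical for very small ${N}$ and ${s}$.

```python
from itertools import product

def gowers_sum(f, N, s):
    """Unnormalised sum over n, h_1, ..., h_{s+1} of the product of f over the cube.

    f is a dict mapping integers in [1, N] to real values; it is treated as
    zero outside its keys (f is supported on [N]).
    """
    k = s + 1
    total = 0
    # f vanishes outside [N], so only n in [1, N] and |h_i| <= N - 1 contribute.
    shifts = range(-(N - 1), N)
    for n in range(1, N + 1):
        for hs in product(shifts, repeat=k):
            term = 1
            for omega in product((0, 1), repeat=k):
                point = n + sum(w * h for w, h in zip(omega, hs))
                term *= f.get(point, 0)
                if term == 0:
                    break
            total += term
    return total

def normalised_gowers_sum(f, N, s):
    """The left-hand side of the hypothesis of Theorem 1."""
    return gowers_sum(f, N, s) / N ** (s + 2)

N = 6
indicator = {n: 1 for n in range(1, N + 1)}
print(normalised_gowers_sum(indicator, N, 1))
```

For ${f = 1_{[N]}}$ and ${s=1}$ the raw sum counts parallelograms ${(n, n+h_1, n+h_2, n+h_1+h_2)}$ in ${[N]}$, which is the additive energy ${(2N^3+N)/3}$ of the interval.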

There is a higher dimensional generalisation, which first appeared explicitly (in a more general form) in this preprint of Szegedy (which used a slightly different argument than the one of Ben, Tammy, and myself; see also this previous preprint of Szegedy with related results):

Theorem 2 (Inverse theorem for multidimensional Gowers norms) Let ${N \geq 1}$ and ${s,d \geq 1}$ be integers, and let ${\delta > 0}$. Suppose that ${f: {\bf Z}^d \rightarrow [-1,1]}$ is a function supported on ${[N]^d}$ such that

$\displaystyle \frac{1}{N^{d(s+2)}} \sum_{n,h_1,\dots,h_{s+1} \in {\bf Z}^d} \prod_{\omega \in \{0,1\}^{s+1}} f(n+\omega_1 h_1 + \dots + \omega_{s+1} h_{s+1}) \geq \delta. \ \ \ \ \ (1)$

Then there exists a filtered nilmanifold ${G/\Gamma}$ of degree ${\leq s}$ and complexity ${O_{s,\delta,d}(1)}$, a polynomial sequence ${g: {\bf Z}^d \rightarrow G}$, and a Lipschitz function ${F: G/\Gamma \rightarrow {\bf R}}$ of Lipschitz constant ${O_{s,\delta,d}(1)}$ such that

$\displaystyle \frac{1}{N^d} \sum_{n \in {\bf Z}^d} f(n) F(g(n) \Gamma) \gg_{s,\delta,d} 1.$

The ${d=2}$ case of this theorem was recently used by Wenbo Sun. One can replace the polynomial sequence with a linear sequence if desired by using a lifting trick (essentially due to Furstenberg, but which appears explicitly in Appendix C of my paper with Ben and Tammy).

In this post I would like to record a very neat and simple observation of Ben Green and Nikos Frantzikinakis, that uses the tool of Freiman isomorphisms to derive Theorem 2 as a corollary of the one-dimensional theorem. Namely, consider the linear map ${\phi: {\bf Z}^d \rightarrow {\bf Z}}$ defined by

$\displaystyle \phi( n_1,\dots,n_d ) := \sum_{i=1}^d (10 N)^{i-1} n_i,$

that is to say ${\phi}$ is the digit string base ${10N}$ that has digits ${n_d \dots n_1}$. This map is a linear map from ${[N]^d}$ to a subset of ${[d 10^d N^d]}$ of density ${1/(d10^d)}$. Furthermore it has the following “Freiman isomorphism” property: if ${n, h_1,\dots,h_{s+1}}$ lie in ${{\bf Z}}$ with ${n + \omega_1 h_1 + \dots + \omega_{s+1} h_{s+1}}$ in the image set ${\phi( [N]^d )}$ of ${[N]^d}$ for all ${\omega}$, then there exist (unique) lifts ${\tilde n, \tilde h_1,\dots,\tilde h_{s+1} \in {\bf Z}^d}$ such that

$\displaystyle \tilde n + \omega_1 \tilde h_1 + \dots + \omega_{s+1} \tilde h_{s+1} \in [N]^d$

and

$\displaystyle \phi( \tilde n + \omega_1 \tilde h_1 + \dots + \omega_{s+1} \tilde h_{s+1} ) = n + \omega_1 h_1 + \dots + \omega_{s+1} h_{s+1}$

for all ${\omega}$. Indeed, the injectivity of ${\phi}$ on ${[N]^d}$ uniquely determines the sum ${\tilde n + \omega_1 \tilde h_1 + \dots + \omega_{s+1} \tilde h_{s+1}}$ for each ${\omega}$, and one can use base ${10N}$ arithmetic to verify that the alternating sum of these sums on any ${2}$-facet of the cube ${\{0,1\}^{s+1}}$ vanishes, which gives the claim. (In the language of additive combinatorics, the point is that ${\phi}$ is a Freiman isomorphism of order (say) ${8}$ on ${[N]^d}$.)
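The properties of ${\phi}$ used above can be checked by brute force for small parameters. The following Python sketch (the helper names `phi` and `check_freiman` are hypothetical) verifies injectivity on ${[N]^d}$, the containment of the image in ${[d 10^d N^d]}$, and, as a simplified stand-in for the order ${8}$ statement, the Freiman property of order ${2}$ only.

```python
from itertools import product

def phi(n, N):
    """phi(n_1, ..., n_d) = sum_i (10 N)^(i-1) n_i, with n given as a tuple."""
    return sum((10 * N) ** i * n_i for i, n_i in enumerate(n))

def check_freiman(N, d):
    """Return (injective, image in range, order-2 Freiman property) on [N]^d."""
    box = list(product(range(1, N + 1), repeat=d))
    images = [phi(n, N) for n in box]
    injective = len(set(images)) == len(box)
    in_range = all(1 <= m <= d * 10 ** d * N ** d for m in images)
    # Order 2: phi(a) + phi(b) = phi(c) + phi(e) should force a + b = c + e.
    seen = {}
    order2 = True
    for a, b in product(box, repeat=2):
        key = phi(a, N) + phi(b, N)
        vec = tuple(x + y for x, y in zip(a, b))
        if seen.setdefault(key, vec) != vec:
            order2 = False
    return injective, in_range, order2

print(phi((1, 2), 3))        # digits "2 1" in base 30
print(check_freiman(3, 2))
```

The reason the check succeeds is the absence of carries: a sum of two digits from ${[N]}$ is at most ${2N < 10N}$, so base ${10N}$ digits of a sum can be read off coordinate by coordinate.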

Now let ${\tilde f: {\bf Z} \rightarrow [-1,1]}$ be the function defined by setting ${\tilde f( \phi(n) ) := f(n)}$ whenever ${n \in [N]^d}$, with ${\tilde f}$ vanishing outside of ${\phi([N]^d)}$. If ${f}$ obeys (1), then from the above Freiman isomorphism property we have

$\displaystyle \frac{1}{N^{d(s+2)}} \sum_{n, h_1,\dots,h_{s+1} \in {\bf Z}} \prod_{\omega \in \{0,1\}^{s+1}} \tilde f(n+\omega_1 h_1 + \dots + \omega_{s+1} h_{s+1}) \geq \delta.$

Applying the one-dimensional inverse theorem (Theorem 1), with ${\delta}$ reduced by a factor of ${(d 10^d)^{s+2}}$ and ${N}$ replaced by ${d 10^d N^d}$, this implies the existence of a filtered nilmanifold ${G/\Gamma}$ of degree ${\leq s}$ and complexity ${O_{s,\delta,d}(1)}$, a polynomial sequence ${g: {\bf Z} \rightarrow G}$, and a Lipschitz function ${F: G/\Gamma \rightarrow {\bf R}}$ of Lipschitz constant ${O_{s,\delta,d}(1)}$ such that

$\displaystyle \frac{1}{N^{d}} \sum_{n \in {\bf Z}} \tilde f(n) F(g(n) \Gamma) \gg_{s,\delta,d} 1$

which by the Freiman isomorphism property again implies that

$\displaystyle \frac{1}{N^{d}} \sum_{n \in {\bf Z}^d} f(n) F(g(\phi(n)) \Gamma) \gg_{s,\delta,d} 1.$

But the map ${n \mapsto g(\phi(n))}$ is clearly a polynomial map from ${{\bf Z}^d}$ to ${G}$ (the composition of two polynomial maps is polynomial, see e.g. Appendix B of my paper with Ben and Tammy), and the claim follows.

Remark 3 This trick appears to be largely restricted to the case of boundedly generated groups such as ${{\bf Z}^d}$; I do not see any easy way to deduce an inverse theorem for, say, ${\bigcup_{n=1}^\infty {\mathbb F}_p^n}$ from the ${{\bf Z}}$-inverse theorem by this method.

Remark 4 By combining this argument with the one in the previous post, one can obtain a weak ergodic inverse theorem for ${{\bf Z}^d}$-actions. Interestingly, the Freiman isomorphism argument appears to be difficult to implement directly in the ergodic category; in particular, there does not appear to be an obvious direct way to derive the Host-Kra inverse theorem for ${{\bf Z}^d}$ actions (a result first obtained in the PhD thesis of Griesmer) from the counterpart for ${{\bf Z}}$ actions.

Note: this post is of a particularly technical nature, in particular presuming familiarity with nilsequences, nilsystems, characteristic factors, etc., and is primarily intended for experts.

As mentioned in the previous post, Ben Green, Tamar Ziegler, and myself proved the following inverse theorem for the Gowers norms:

Theorem 1 (Inverse theorem for Gowers norms) Let ${N \geq 1}$ and ${s \geq 1}$ be integers, and let ${\delta > 0}$. Suppose that ${f: {\bf Z} \rightarrow [-1,1]}$ is a function supported on ${[N] := \{1,\dots,N\}}$ such that

$\displaystyle \frac{1}{N^{s+2}} \sum_{n,h_1,\dots,h_{s+1}} \prod_{\omega \in \{0,1\}^{s+1}} f(n+\omega_1 h_1 + \dots + \omega_{s+1} h_{s+1}) \geq \delta.$

Then there exists a filtered nilmanifold ${G/\Gamma}$ of degree ${\leq s}$ and complexity ${O_{s,\delta}(1)}$, a polynomial sequence ${g: {\bf Z} \rightarrow G}$, and a Lipschitz function ${F: G/\Gamma \rightarrow {\bf R}}$ of Lipschitz constant ${O_{s,\delta}(1)}$ such that

$\displaystyle \frac{1}{N} \sum_n f(n) F(g(n) \Gamma) \gg_{s,\delta} 1.$

This result was conjectured earlier by Ben Green and myself; this conjecture was strongly motivated by an analogous inverse theorem in ergodic theory by Host and Kra, which we formulate here in a form designed to resemble Theorem 1 as closely as possible:

Theorem 2 (Inverse theorem for Gowers-Host-Kra seminorms) Let ${s \geq 1}$ be an integer, and let ${(X, T)}$ be an ergodic, countably generated measure-preserving system. Suppose that one has

$\displaystyle \lim_{N \rightarrow \infty} \frac{1}{N^{s+1}} \sum_{h_1,\dots,h_{s+1} \in [N]} \int_X \prod_{\omega \in \{0,1\}^{s+1}} f(T^{\omega_1 h_1 + \dots + \omega_{s+1} h_{s+1}}x)\ d\mu(x)$

$\displaystyle > 0$

for all non-zero ${f \in L^\infty(X)}$ (all ${L^p}$ spaces are real-valued in this post). Then ${(X,T)}$ is an inverse limit (in the category of measure-preserving systems, up to almost everywhere equivalence) of ergodic degree ${\leq s}$ nilsystems, that is to say systems of the form ${(G/\Gamma, x \mapsto gx)}$ for some degree ${\leq s}$ filtered nilmanifold ${G/\Gamma}$ and a group element ${g \in G}$ that acts ergodically on ${G/\Gamma}$.

It is a natural question to ask if there is any logical relationship between the two theorems. In the finite field category, one can deduce the combinatorial inverse theorem from the ergodic inverse theorem by a variant of the Furstenberg correspondence principle, as worked out by Tamar Ziegler and myself; however, in the current context of ${{\bf Z}}$-actions, the connection is less clear.

One can split Theorem 2 into two components:

Theorem 3 (Weak inverse theorem for Gowers-Host-Kra seminorms) Let ${s \geq 1}$ be an integer, and let ${(X, T)}$ be an ergodic, countably generated measure-preserving system. Suppose that one has

$\displaystyle \lim_{N \rightarrow \infty} \frac{1}{N^{s+1}} \sum_{h_1,\dots,h_{s+1} \in [N]} \int_X \prod_{\omega \in \{0,1\}^{s+1}} T^{\omega_1 h_1 + \dots + \omega_{s+1} h_{s+1}} f\ d\mu$

$\displaystyle > 0$

for all non-zero ${f \in L^\infty(X)}$, where ${T^h f := f \circ T^h}$. Then ${(X,T)}$ is a factor of an inverse limit of ergodic degree ${\leq s}$ nilsystems.

Theorem 4 (Pro-nilsystems closed under factors) Let ${s \geq 1}$ be an integer. Then any factor of an inverse limit of ergodic degree ${\leq s}$ nilsystems, is again an inverse limit of ergodic degree ${\leq s}$ nilsystems.

Indeed, it is clear that Theorem 2 implies both Theorem 3 and Theorem 4, and conversely that the two latter theorems jointly imply the former. Theorem 4 is, in principle, purely a fact about nilsystems, and should have an independent proof, but this is not known; the only known proofs go through the full machinery needed to prove Theorem 2 (or the closely related theorem of Ziegler). (However, the fact that a factor of a nilsystem is again a nilsystem was established previously by Parry.)

The purpose of this post is to record a partial implication in the reverse direction to the correspondence principle:

Proposition 5 Theorem 1 implies Theorem 3.

As mentioned at the start of the post, a fair amount of familiarity with the area is presumed here, and some routine steps will be presented with only a fairly brief explanation.

A few years ago, Ben Green, Tamar Ziegler, and myself proved the following (rather technical-looking) inverse theorem for the Gowers norms:

Theorem 1 (Discrete inverse theorem for Gowers norms) Let ${N \geq 1}$ and ${s \geq 1}$ be integers, and let ${\delta > 0}$. Suppose that ${f: {\bf Z} \rightarrow [-1,1]}$ is a function supported on ${[N] := \{1,\dots,N\}}$ such that

$\displaystyle \frac{1}{N^{s+2}} \sum_{n,h_1,\dots,h_{s+1}} \prod_{\omega \in \{0,1\}^{s+1}} f(n+\omega_1 h_1 + \dots + \omega_{s+1} h_{s+1}) \geq \delta.$

Then there exists a filtered nilmanifold ${G/\Gamma}$ of degree ${\leq s}$ and complexity ${O_{s,\delta}(1)}$, a polynomial sequence ${g: {\bf Z} \rightarrow G}$, and a Lipschitz function ${F: G/\Gamma \rightarrow {\bf R}}$ of Lipschitz constant ${O_{s,\delta}(1)}$ such that

$\displaystyle \frac{1}{N} \sum_n f(n) F(g(n) \Gamma) \gg_{s,\delta} 1.$

For the definitions of “filtered nilmanifold”, “degree”, “complexity”, and “polynomial sequence”, see the paper of Ben, Tammy, and myself. (I should caution the reader that this blog post will presume a fair amount of familiarity with this subfield of additive combinatorics.) This result has a number of applications, for instance to establishing asymptotics for linear equations in the primes, but this will not be the focus of discussion here.

The purpose of this post is to record the observation that this “discrete” inverse theorem, together with an equidistribution theorem for nilsequences that Ben and I worked out in a separate paper, implies a continuous version:

Theorem 2 (Continuous inverse theorem for Gowers norms) Let ${s \geq 1}$ be an integer, and let ${\delta>0}$. Suppose that ${f: {\bf R} \rightarrow [-1,1]}$ is a measurable function supported on ${[0,1]}$ such that

$\displaystyle \int_{{\bf R}^{s+1}} \prod_{\omega \in \{0,1\}^{s+1}} f(t+\omega_1 h_1 + \dots + \omega_{s+1} h_{s+1})\ dt dh_1 \dots dh_{s+1} \geq \delta. \ \ \ \ \ (1)$

Then there exists a filtered nilmanifold ${G/\Gamma}$ of degree ${\leq s}$ and complexity ${O_{s,\delta}(1)}$, a (smooth) polynomial sequence ${g: {\bf R} \rightarrow G}$, and a Lipschitz function ${F: G/\Gamma \rightarrow {\bf R}}$ of Lipschitz constant ${O_{s,\delta}(1)}$ such that

$\displaystyle \int_{\bf R} f(t) F(g(t) \Gamma)\ dt \gg_{s,\delta} 1.$

The interval ${[0,1]}$ can be easily replaced with any other fixed interval by a change of variables. A key point here is that the bounds are completely uniform in the choice of ${f}$. Note though that the coefficients of ${g}$ can be arbitrarily large (and this is necessary, as can be seen just by considering functions of the form ${f(t) = \cos( \xi t)}$ for some arbitrarily large frequency ${\xi}$).

It is likely that one could prove Theorem 2 by carefully going through the proof of Theorem 1 and replacing all instances of ${{\bf Z}}$ with ${{\bf R}}$ (and making appropriate modifications to the argument to accommodate this). However, the proof of Theorem 1 is quite lengthy. Here, we shall proceed by the usual limiting process of viewing the continuous interval ${[0,1]}$ as a limit of the discrete interval ${\frac{1}{N} \cdot [N]}$ as ${N \rightarrow \infty}$. However there will be some problems taking the limit due to a failure of compactness, and specifically with regards to the coefficients of the polynomial sequence ${g: {\bf Z} \rightarrow G}$ produced by Theorem 1, after normalising these coefficients by ${N}$. Fortunately, a factorisation theorem from a paper of Ben Green and myself resolves this problem by splitting ${g}$ into a “smooth” part which does enjoy good compactness properties, as well as “totally equidistributed” and “periodic” parts which can be eliminated using the measurability (and thus, approximate smoothness) of ${f}$.
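As a small illustration of this limiting process (and not of the actual proof), one can watch the discrete normalised sum of Theorem 1 approach the continuous integral (1) in the simplest case ${s=1}$, ${f = 1_{[0,1]}}$, where a direct computation gives the value ${2/3}$ for the integral. The Python sketch below uses a hypothetical helper `parallelogram_density`, which counts parallelograms in ${[N]}$ via the representation function of ${[N]+[N]}$.

```python
from fractions import Fraction

def parallelogram_density(N):
    """Normalised s = 1 sum of Theorem 1 for f equal to the indicator of [N].

    Parallelograms (n, n+h1, n+h2, n+h1+h2) in [N] correspond to quadruples
    (a, b, c, e) in [N]^4 with a + b = c + e, so we count those instead.
    """
    reps = {}
    for a in range(1, N + 1):
        for b in range(1, N + 1):
            reps[a + b] = reps.get(a + b, 0) + 1
    count = sum(r * r for r in reps.values())
    return Fraction(count, N ** 3)

for N in (1, 10, 100):
    print(N, float(parallelogram_density(N)))
```

The exact value is ${(2N^3+N)/(3N^3) = \frac{2}{3} + \frac{1}{3N^2}}$, so in this example the discrete quantity converges to the continuous one at rate ${O(1/N^2)}$.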

Szemerédi’s theorem asserts that any subset of the integers of positive upper density contains arbitrarily large arithmetic progressions. Here is an equivalent quantitative form of this theorem:

Theorem 1 (Szemerédi’s theorem) Let ${N}$ be a positive integer, and let ${f: {\bf Z}/N{\bf Z} \rightarrow [0,1]}$ be a function with ${{\bf E}_{x \in {\bf Z}/N{\bf Z}} f(x) \geq \delta}$ for some ${\delta>0}$, where we use the averaging notation ${{\bf E}_{x \in A} f(x) := \frac{1}{|A|} \sum_{x \in A} f(x)}$, ${{\bf E}_{x,r \in A} f(x) := \frac{1}{|A|^2} \sum_{x, r \in A} f(x)}$, etc. Then for ${k \geq 3}$ we have

$\displaystyle {\bf E}_{x,r \in {\bf Z}/N{\bf Z}} f(x) f(x+r) \dots f(x+(k-1)r) \geq c(k,\delta)$

for some ${c(k,\delta)>0}$ depending only on ${k,\delta}$.
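The average appearing in this quantitative form is easy to compute directly for small ${N}$. Here is a minimal Python sketch (the name `ap_average` is an illustrative choice) that uses exact rational arithmetic; note that the degenerate progressions with ${r=0}$ are included in the average, exactly as in the statement above.

```python
from fractions import Fraction

def ap_average(f, N, k):
    """E_{x, r in Z/NZ} f(x) f(x+r) ... f(x+(k-1)r), with f given as a list of length N."""
    total = Fraction(0)
    for x in range(N):
        for r in range(N):
            term = Fraction(1)
            for j in range(k):
                term *= f[(x + j * r) % N]
            total += term
    return total / N ** 2

N = 5
print(ap_average([1] * N, N, 3))           # constant function 1
print(ap_average([1, 0, 0, 0, 0], N, 3))   # indicator of a single point
```

For the indicator of a single point only the pair ${(x,r) = (0,0)}$ contributes, which shows that the lower bound ${c(k,\delta)}$ must be allowed to depend on ${\delta}$.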

The equivalence is basically thanks to an averaging argument of Varnavides; see for instance Chapter 11 of my book with Van Vu or this previous blog post for a discussion. We have removed the cases ${k=1,2}$ as they are trivial and somewhat degenerate.

There are now many proofs of this theorem. Some time ago, I took an ergodic-theoretic proof of Furstenberg and converted it to a purely finitary proof of the theorem. The argument used some simplifying innovations that had been developed since the original work of Furstenberg (in particular, deployment of the Gowers uniformity norms, as well as a “dual” norm that I called the uniformly almost periodic norm, and an emphasis on van der Waerden’s theorem for handling the “compact extension” component of the argument). But the proof was still quite messy. However, as discussed in this previous blog post, messy finitary proofs can often be cleaned up using nonstandard analysis. Thus, there should be a nonstandard version of the Furstenberg ergodic theory argument that is relatively clean. I decided (after some encouragement from Ben Green and Isaac Goldbring) to write down most of the details of this argument in this blog post, though for the sake of brevity I will skim rather quickly over arguments that were already discussed at length in other blog posts. In particular, I will presume familiarity with nonstandard analysis (in particular, the notion of a standard part of a bounded real number, and the Loeb measure construction); see for instance this previous blog post for a discussion.

The von Neumann ergodic theorem (the Hilbert space version of the mean ergodic theorem) asserts that if ${U: H \rightarrow H}$ is a unitary operator on a Hilbert space ${H}$, and ${v \in H}$ is a vector in that Hilbert space, then one has

$\displaystyle \lim_{N \rightarrow \infty} \frac{1}{N} \sum_{n=1}^N U^n v = \pi_{H^U} v$

in the strong topology, where ${H^U := \{ w \in H: Uw = w \}}$ is the ${U}$-invariant subspace of ${H}$, and ${\pi_{H^U}}$ is the orthogonal projection to ${H^U}$. (See e.g. these previous lecture notes for a proof.) The same proof extends to more general amenable groups: if ${G}$ is a countable amenable group acting on a Hilbert space ${H}$ by unitary transformations ${T^g: H \rightarrow H}$ for ${g \in G}$, and ${v \in H}$ is a vector in that Hilbert space, then one has

$\displaystyle \lim_{N \rightarrow \infty} \mathop{\bf E}_{g \in \Phi_N} T^g v = \pi_{H^G} v \ \ \ \ \ (1)$

for any Følner sequence ${\Phi_N}$ of ${G}$, where ${H^G := \{ w \in H: T^g w = w \hbox{ for all }g \in G \}}$ is the ${G}$-invariant subspace, and ${\mathop{\bf E}_{a \in A} f(a) := \frac{1}{|A|} \sum_{a \in A} f(a)}$ is the average of ${f}$ on ${A}$. Thus one can interpret ${\pi_{H^G} v}$ as a certain average of elements of the orbit ${Gv := \{ T^g v: g \in G \}}$ of ${v}$.
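For a toy instance of the von Neumann ergodic theorem, take ${H = {\bf R}^3}$ and let ${U}$ be the cyclic shift of coordinates, so that ${H^U}$ consists of the constant vectors and ${\pi_{H^U} v}$ replaces each coordinate of ${v}$ by the mean. The following Python sketch (with hypothetical helper names `shift` and `ergodic_average`) computes the averages ${\frac{1}{N} \sum_{n=1}^N U^n v}$ numerically.

```python
def shift(v):
    """A unitary (orthogonal) map on R^d: cyclically permute the coordinates."""
    return [v[-1]] + v[:-1]

def ergodic_average(v, N):
    """Return (1/N) * sum_{n=1}^{N} U^n v for the cyclic shift U."""
    total = [0.0] * len(v)
    w = list(v)
    for _ in range(N):
        w = shift(w)  # w is now U^n v
        total = [t + x for t, x in zip(total, w)]
    return [t / N for t in total]

v = [3.0, 0.0, 0.0]
for N in (1, 3, 10, 1000):
    print(N, ergodic_average(v, N))
```

In this example the averages are exactly ${(1,1,1)}$ whenever ${N}$ is a multiple of ${3}$, and within ${O(1/N)}$ of it otherwise.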

In a previous blog post, I noted a variant of this ergodic theorem (due to Alaoglu and Birkhoff) that holds even when the group ${G}$ is not amenable (or not discrete), using a more abstract notion of averaging:

Theorem 1 (Abstract ergodic theorem) Let ${G}$ be an arbitrary group acting unitarily on a Hilbert space ${H}$, and let ${v}$ be a vector in ${H}$. Then ${\pi_{H^G} v}$ is the element in the closed convex hull of ${Gv := \{ T^g v: g \in G \}}$ of minimal norm, and is also the unique element of ${H^G}$ in this closed convex hull.

I recently stumbled upon a different way to think about this theorem, in the additive case ${G = (G,+)}$ when ${G}$ is abelian, which has a closer resemblance to the classical mean ergodic theorem. Given an arbitrary additive group ${G = (G,+)}$ (not necessarily discrete, or countable), let ${{\mathcal F}}$ denote the collection of finite non-empty multisets in ${G}$ – that is to say, unordered collections ${\{a_1,\dots,a_n\}}$ of elements ${a_1,\dots,a_n}$ of ${G}$, not necessarily distinct, for some positive integer ${n}$. Given two multisets ${A = \{a_1,\dots,a_n\}}$, ${B = \{b_1,\dots,b_m\}}$ in ${{\mathcal F}}$, we can form the sum set ${A + B := \{ a_i + b_j: 1 \leq i \leq n, 1 \leq j \leq m \}}$. Note that the sum set ${A+B}$ can contain multiplicity even when ${A, B}$ do not; for instance, ${\{ 1,2\} + \{1,2\} = \{2,3,3,4\}}$. Given a multiset ${A = \{a_1,\dots,a_n\}}$ in ${{\mathcal F}}$, and a function ${f: G \rightarrow H}$ from ${G}$ to a vector space ${H}$, we define the average ${\mathop{\bf E}_{a \in A} f(a)}$ as

$\displaystyle \mathop{\bf E}_{a \in A} f(a) = \frac{1}{n} \sum_{j=1}^n f(a_j).$

Note that the multiplicity function of the multiset ${A}$ affects the average; for instance, we have ${\mathop{\bf E}_{a \in \{1,2\}} a = \frac{3}{2}}$, but ${\mathop{\bf E}_{a \in \{1,2,2\}} a = \frac{5}{3}}$.
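These multiset conventions can be mirrored in a few lines of Python, representing a finite multiset of integers as a list. This is only a sketch; the names `sum_set` and `average` are illustrative.

```python
from fractions import Fraction

def sum_set(A, B):
    """The multiset A + B = {a + b : a in A, b in B}, counted with multiplicity."""
    return sorted(a + b for a in A for b in B)

def average(A, f=lambda a: a):
    """E_{a in A} f(a), where repeated elements of A are counted repeatedly."""
    return sum(Fraction(f(a)) for a in A) / len(A)

print(sum_set([1, 2], [1, 2]))   # the element 3 appears twice
print(average([1, 2]))
print(average([1, 2, 2]))
```

Sorting in `sum_set` is only there to give a canonical representative, since the order of elements in a multiset carries no meaning.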

We can define a directed set structure on ${{\mathcal F}}$ as follows: given two multisets ${A,B \in {\mathcal F}}$, we write ${A \geq B}$ if we have ${A = B+C}$ for some ${C \in {\mathcal F}}$. Thus for instance we have ${\{ 1, 2, 2, 3\} \geq \{1,2\}}$. It is easy to verify that this relation is transitive and reflexive, and is directed because any two elements ${A,B}$ of ${{\mathcal F}}$ have a common upper bound, namely ${A+B}$. (This is where we need ${G}$ to be abelian.) The notion of convergence along a net now allows us to define the notion of convergence along ${{\mathcal F}}$; given a family ${x_A}$ of points in a topological space ${X}$ indexed by elements ${A}$ of ${{\mathcal F}}$, and a point ${x}$ in ${X}$, we say that ${x_A}$ converges to ${x}$ along ${{\mathcal F}}$ if, for every open neighbourhood ${U}$ of ${x}$ in ${X}$, one has ${x_A \in U}$ for sufficiently large ${A}$, that is to say there exists ${B \in {\mathcal F}}$ such that ${x_A \in U}$ for all ${A \geq B}$. If the topological space ${X}$ is Hausdorff, then the limit ${x}$ is unique (if it exists), and we then write

$\displaystyle x = \lim_{A \rightarrow G} x_A.$

When ${x_A}$ takes values in the reals, one can also define the limit superior or limit inferior along such nets in the obvious fashion.

We can then give an alternate formulation of the abstract ergodic theorem in the abelian case:

Theorem 2 (Abelian abstract ergodic theorem) Let ${G = (G,+)}$ be an arbitrary additive group acting unitarily on a Hilbert space ${H}$, and let ${v}$ be a vector in ${H}$. Then we have

$\displaystyle \pi_{H^G} v = \lim_{A \rightarrow G} \mathop{\bf E}_{a \in A} T^a v$

in the strong topology of ${H}$.

Proof: Suppose that ${A \geq B}$, so that ${A=B+C}$ for some ${C \in {\mathcal F}}$, then

$\displaystyle \mathop{\bf E}_{a \in A} T^a v = \mathop{\bf E}_{c \in C} T^c ( \mathop{\bf E}_{b \in B} T^b v )$

so by unitarity and the triangle inequality we have

$\displaystyle \| \mathop{\bf E}_{a \in A} T^a v \|_H \leq \| \mathop{\bf E}_{b \in B} T^b v \|_H,$

thus ${\| \mathop{\bf E}_{a \in A} T^a v \|_H^2}$ is monotone non-increasing in ${A}$. Since this quantity is bounded between ${0}$ and ${\|v\|_H^2}$, we conclude that the limit ${\lim_{A \rightarrow G} \| \mathop{\bf E}_{a \in A} T^a v \|_H^2}$ exists. Thus, for any ${\varepsilon > 0}$, we have for sufficiently large ${A}$ that

$\displaystyle \| \mathop{\bf E}_{b \in B} T^b v \|_H^2 \geq \| \mathop{\bf E}_{a \in A} T^a v \|_H^2 - \varepsilon$

for all ${B \geq A}$. In particular, for any ${g \in G}$, we have

$\displaystyle \| \mathop{\bf E}_{b \in A + \{0,g\}} T^b v \|_H^2 \geq \| \mathop{\bf E}_{a \in A} T^a v \|_H^2 - \varepsilon.$

We can write

$\displaystyle \mathop{\bf E}_{b \in A + \{0,g\}} T^b v = \frac{1}{2} \mathop{\bf E}_{a \in A} T^a v + \frac{1}{2} T^g \mathop{\bf E}_{a \in A} T^a v$

and so from the parallelogram law and unitarity we have

$\displaystyle \| \mathop{\bf E}_{a \in A} T^a v - T^g \mathop{\bf E}_{a \in A} T^a v \|_H^2 \leq 4 \varepsilon$

for all ${g \in G}$, and hence by the triangle inequality (averaging ${g}$ over a finite multiset ${C}$)

$\displaystyle \| \mathop{\bf E}_{a \in A} T^a v - \mathop{\bf E}_{b \in A+C} T^b v \|_H^2 \leq 4 \varepsilon$

for any ${C \in {\mathcal F}}$. This shows that ${\mathop{\bf E}_{a \in A} T^a v}$ is a Cauchy sequence in ${H}$ (in the strong topology), and hence (by the completeness of ${H}$) tends to a limit. Shifting ${A}$ by a group element ${g}$, we have

$\displaystyle \lim_{A \rightarrow G} \mathop{\bf E}_{a \in A} T^a v = \lim_{A \rightarrow G} \mathop{\bf E}_{a \in A + \{g\}} T^a v = T^g \lim_{A \rightarrow G} \mathop{\bf E}_{a \in A} T^a v$

and hence ${\lim_{A \rightarrow G} \mathop{\bf E}_{a \in A} T^a v}$ is invariant under shifts, and thus lies in ${H^G}$. On the other hand, for any ${w \in H^G}$ and ${A \in {\mathcal F}}$, we have

$\displaystyle \langle \mathop{\bf E}_{a \in A} T^a v, w \rangle_H = \mathop{\bf E}_{a \in A} \langle v, T^{-a} w \rangle_H = \langle v, w \rangle_H$

and thus on taking strong limits

$\displaystyle \langle \lim_{A \rightarrow G} \mathop{\bf E}_{a \in A} T^a v, w \rangle_H = \langle v, w \rangle_H$

and so ${v - \lim_{A \rightarrow G} \mathop{\bf E}_{a \in A} T^a v}$ is orthogonal to ${H^G}$. Combining these two facts we see that ${\lim_{A \rightarrow G} \mathop{\bf E}_{a \in A} T^a v}$ is equal to ${\pi_{H^G} v}$ as claimed. $\Box$
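The two identities driving this proof can be tested numerically in the simplest non-trivial case: ${G = {\bf Z}}$ acting on ${H = {\bf C}}$ by ${T^a v := e^{2\pi i a \theta} v}$, for which ${H^G = \{0\}}$ when ${\theta}$ is not an integer. The Python sketch below is an illustration under these assumptions (the value ${\theta = 0.3}$ and the helper names are arbitrary choices); it checks the factorisation ${\mathop{\bf E}_{a \in A+C} T^a v = \mathop{\bf E}_{c \in C} T^c ( \mathop{\bf E}_{a \in A} T^a v )}$ and the resulting monotonicity of the norm.

```python
import cmath

def sum_set(A, B):
    """The multiset A + B, with multiplicity, as a list."""
    return [a + b for a in A for b in B]

def average(A, theta, v=1.0):
    """E_{a in A} T^a v, where T^a v = exp(2 pi i a theta) v on H = C."""
    return sum(cmath.exp(2j * cmath.pi * a * theta) * v for a in A) / len(A)

def iterated_sum_set(A, m):
    """The m-fold sum set A + ... + A."""
    S = [0]
    for _ in range(m):
        S = sum_set(S, A)
    return S

theta = 0.3
A = [0, 1, 2]
C = [0, 5, 7, 7]

lhs = average(sum_set(A, C), theta)
rhs = average(C, theta) * average(A, theta)  # T^c acts by a scalar, so averages multiply
print(abs(lhs - rhs))                        # zero up to rounding
print(abs(lhs), "<=", abs(average(A, theta)))

# Along the increasing chain {0,1}, {0,1}+{0,1}, ... the norm decays geometrically.
print(abs(average(iterated_sum_set([0, 1], 10), theta)))
```

In this scalar model the average over an ${m}$-fold sum set of ${\{0,1\}}$ has modulus ${|\cos(\pi \theta)|^m}$, which tends to ${0 = \pi_{H^G} v}$, consistent with Theorem 2.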

To relate this result to the classical ergodic theorem, we observe

Lemma 3 Let ${G}$ be a countable additive group, with a Følner sequence ${\Phi_n}$, and let ${f_g}$ be a bounded sequence in a normed vector space indexed by ${G}$. If ${\lim_{A \rightarrow G} \mathop{\bf E}_{a \in A} f_a}$ exists, then ${\lim_{n \rightarrow \infty} \mathop{\bf E}_{a \in \Phi_n} f_a}$ exists, and the two limits are equal.

Proof: From the Følner property, we see that for any ${A}$ and any ${\varepsilon>0}$, the averages ${\mathop{\bf E}_{a \in \Phi_n} f_a}$ and ${\mathop{\bf E}_{a \in A+\Phi_n} f_a}$ differ by at most ${\varepsilon}$ in norm if ${n}$ is sufficiently large depending on ${A}$, ${\varepsilon}$ (and the ${f_a}$). On the other hand, by the existence of the limit ${\lim_{A \rightarrow G} \mathop{\bf E}_{a \in A} f_a}$, the averages ${\mathop{\bf E}_{a \in A} f_a}$ and ${\mathop{\bf E}_{a \in A + \Phi_n} f_a}$ differ by at most ${\varepsilon}$ in norm if ${A}$ is sufficiently large depending on ${\varepsilon}$ (regardless of how large ${n}$ is). The claim follows. $\Box$

It turns out that this approach can also be used as an alternate way to construct the Gowers-Host-Kra seminorms in ergodic theory, which has the feature that it does not explicitly require any amenability on the group ${G}$ (or separability on the underlying measure space), though, as pointed out to me in comments, even uncountable abelian groups are amenable in the sense of possessing an invariant mean, even if they do not have a Følner sequence.

Given an arbitrary additive group ${G}$, define a ${G}$-system ${({\mathrm X}, T)}$ to be a probability space ${{\mathrm X} = (X, {\mathcal X}, \mu)}$ (not necessarily separable or standard Borel), together with a collection ${T^g: X \rightarrow X}$ of invertible, measure-preserving maps, such that ${T^0}$ is the identity and ${T^g T^h = T^{g+h}}$ (modulo null sets) for all ${g,h \in G}$. This then gives isomorphisms ${T^g: L^p({\mathrm X}) \rightarrow L^p({\mathrm X})}$ for ${1 \leq p \leq \infty}$ by setting ${T^g f(x) := f(T^{-g} x)}$. From the above abstract ergodic theorem, we see that

$\displaystyle {\mathbf E}( f | {\mathcal X}^G ) = \lim_{A \rightarrow G} \mathop{\bf E}_{a \in A} T^a f$

in the strong topology of ${L^2({\mathrm X})}$ for any ${f \in L^2({\mathrm X})}$, where ${{\mathcal X}^G}$ is the collection of measurable sets ${E}$ that are essentially ${G}$-invariant in the sense that ${T^g E = E}$ modulo null sets for all ${g \in G}$, and ${{\mathbf E}(f|{\mathcal X}^G)}$ is the conditional expectation of ${f}$ with respect to ${{\mathcal X}^G}$.

In a similar spirit, we have

Theorem 4 (Convergence of Gowers-Host-Kra seminorms) Let ${({\mathrm X},T)}$ be a ${G}$-system for some additive group ${G}$. Let ${d}$ be a natural number, and for every ${\omega \in\{0,1\}^d}$, let ${f_\omega \in L^{2^d}({\mathrm X})}$, which for simplicity we take to be real-valued. Then the expression

$\displaystyle \langle (f_\omega)_{\omega \in \{0,1\}^d} \rangle_{U^d({\mathrm X})} := \lim_{A_1,\dots,A_d \rightarrow G}$

$\displaystyle \mathop{\bf E}_{h_1 \in A_1-A_1,\dots,h_d \in A_d-A_d} \int_X \prod_{\omega \in \{0,1\}^d} T^{\omega_1 h_1 + \dots + \omega_d h_d} f_\omega\ d\mu$

converges, where we write ${\omega = (\omega_1,\dots,\omega_d)}$, and we are using the product directed set on ${{\mathcal F}^d}$ to define the convergence ${A_1,\dots,A_d \rightarrow G}$. In particular, for ${f \in L^{2^d}({\mathrm X})}$, the limit

$\displaystyle \| f \|_{U^d({\mathrm X})}^{2^d} = \lim_{A_1,\dots,A_d \rightarrow G}$

$\displaystyle \mathop{\bf E}_{h_1 \in A_1-A_1,\dots,h_d \in A_d-A_d} \int_X \prod_{\omega \in \{0,1\}^d} T^{\omega_1 h_1 + \dots + \omega_d h_d} f\ d\mu$

converges.

We prove this theorem below the fold. It implies a number of other known descriptions of the Gowers-Host-Kra seminorms ${\|f\|_{U^d({\mathrm X})}}$, for instance that

$\displaystyle \| f \|_{U^d({\mathrm X})}^{2^d} = \lim_{A \rightarrow G} \mathop{\bf E}_{h \in A-A} \| f T^h f \|_{U^{d-1}({\mathrm X})}^{2^{d-1}}$

for ${d > 1}$, while from the ergodic theorem we have

$\displaystyle \| f \|_{U^1({\mathrm X})} = \| {\mathbf E}( f | {\mathcal X}^G ) \|_{L^2({\mathrm X})}.$

This definition also manifestly demonstrates the cube symmetries of the Host-Kra measures ${\mu^{[d]}}$ on ${X^{\{0,1\}^d}}$, defined via duality by requiring that

$\displaystyle \langle (f_\omega)_{\omega \in \{0,1\}^d} \rangle_{U^d({\mathrm X})} = \int_{X^{\{0,1\}^d}} \bigotimes_{\omega \in \{0,1\}^d} f_\omega\ d\mu^{[d]}.$

In a subsequent blog post I hope to present a more detailed study of the ${U^2}$ norm and its relationship with eigenfunctions and the Kronecker factor, without assuming any amenability on ${G}$ or any separability or topological structure on ${{\mathrm X}}$.

I’ve just finished writing the first draft of my third book coming out of the 2010 blog posts, namely “Higher order Fourier analysis”, which was based primarily on my graduate course in the topic, though it also contains material from some additional posts related to linear and higher order Fourier analysis on the blog. It is available online here. As usual, comments and corrections are welcome. There is also a stub page for the book, which at present does not contain much more than the above link.

Tanja Eisner and I have just uploaded to the arXiv our paper “Large values of the Gowers-Host-Kra seminorms“, submitted to Journal d’Analyse Mathematique. This paper is concerned with the properties of three closely related families of (semi)norms, indexed by a positive integer ${k}$:

• The Gowers uniformity norms ${\|f\|_{U^k(G)}}$ of a (bounded, measurable, compactly supported) function ${f: G \rightarrow {\bf C}}$ defined on a locally compact abelian group ${G}$, equipped with a Haar measure ${\mu}$;
• The Gowers uniformity norms ${\|f\|_{U^k([N])}}$ of a function ${f: [N] \rightarrow {\bf C}}$ on a discrete interval ${\{1,\ldots,N\}}$; and
• The Gowers-Host-Kra seminorms ${\|f\|_{U^k(X)}}$ of a function ${f \in L^\infty(X)}$ on an ergodic measure-preserving system ${X = (X,{\mathcal X},\mu,T)}$.

These norms have been discussed in depth in previous blog posts, so I will just quickly review the definition of the first norm here (the other two (semi)norms are defined similarly). The ${U^k(G)}$ norm is defined recursively by setting

$\displaystyle \| f \|_{U^1(G)} := |\int_G f\ d\mu|$

and

$\displaystyle \|f\|_{U^k(G)}^{2^k} := \int_G \| \Delta_h f \|_{U^{k-1}(G)}^{2^{k-1}}\ d\mu(h)$

where ${\Delta_h f(x) := f(x+h) \overline{f(x)}}$. Equivalently, one has

$\displaystyle \|f\|_{U^k(G)} := (\int_G \ldots \int_G \Delta_{h_1} \ldots \Delta_{h_k} f(x)\ d\mu(x) d\mu(h_1) \ldots d\mu(h_k))^{1/2^k}.$

Informally, the Gowers uniformity norm ${\|f\|_{U^k(G)}}$ measures the extent to which (the phase of ${f}$) behaves like a polynomial of degree less than ${k}$. Indeed, if ${\|f\|_{L^\infty(G)} \leq 1}$ and ${G}$ is compact with normalised Haar measure ${\mu(G)=1}$, it is not difficult to show that ${\|f\|_{U^k(G)}}$ is at most ${1}$, with equality if and only if ${f}$ takes the form ${f = e(P) := e^{2\pi iP}}$ almost everywhere, where ${P: G \rightarrow {\bf R}/{\bf Z}}$ is a polynomial of degree less than ${k}$ (which means that ${\partial_{h_1} \ldots \partial_{h_k} P(x) = 0}$ for all ${x,h_1,\ldots,h_k \in G}$).
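The recursive definition can be run directly on a finite cyclic group ${G = {\bf Z}/N{\bf Z}}$ with normalised counting measure. The Python sketch below (with illustrative names `u_power` and `u_norm`, and the arbitrary choice ${N=7}$) checks the extremiser statement on the quadratic phase ${f(x) = e(x^2/N)}$, which is a polynomial phase of degree ${2}$: its ${U^3}$ norm should equal ${1}$, while its ${U^2}$ norm should be strictly smaller.

```python
import cmath

def e(t):
    """The standard character e(t) = exp(2 pi i t)."""
    return cmath.exp(2j * cmath.pi * t)

def mult_derivative(f, h):
    """Delta_h f(x) = f(x + h) * conj(f(x)) on Z/NZ, with f given as a list."""
    N = len(f)
    return [f[(x + h) % N] * f[x].conjugate() for x in range(N)]

def u_power(f, k):
    """Return ||f||_{U^k}^{2^k} via the recursive definition (cost about N^k)."""
    N = len(f)
    if k == 1:
        return abs(sum(f) / N) ** 2
    return sum(u_power(mult_derivative(f, h), k - 1) for h in range(N)) / N

def u_norm(f, k):
    return u_power(f, k) ** (1 / 2 ** k)

N = 7
quadratic_phase = [e(x * x / N) for x in range(N)]
print(u_norm(quadratic_phase, 3))   # degree 2 < 3, so this should be 1
print(u_power(quadratic_phase, 2))  # for odd N this works out to 1/N
```

For odd ${N}$ the derivative ${\Delta_h f}$ of this quadratic phase is a non-trivial character unless ${h=0}$, which is why only ${h=0}$ contributes to ${\|f\|_{U^2}^4}$ and the value is ${1/N}$.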

Our first result is to show that this result is robust, uniformly over all choices of group ${G}$:

Theorem 1 (${L^\infty}$-near extremisers) Let ${G}$ be a compact abelian group with normalised Haar measure ${\mu(G)=1}$, and let ${f \in L^\infty(G)}$ be such that ${\|f\|_{L^\infty(G)} \leq 1}$ and ${\|f\|_{U^k(G)} \geq 1-\epsilon}$ for some ${\epsilon > 0}$ and ${k \geq 1}$. Then there exists a polynomial ${P: G \rightarrow {\bf R}/{\bf Z}}$ of degree at most ${k-1}$ such that ${\|f-e(P)\|_{L^1(G)} = o(1)}$, where ${o(1)}$ is bounded by a quantity ${c_k(\epsilon)}$ that goes to zero as ${\epsilon \rightarrow 0}$ for fixed ${k}$.

The quantity ${o(1)}$ can be described effectively (it is of polynomial size in ${\epsilon}$), but we did not seek to optimise it here. This result was already known in the case of vector spaces ${G = {\bf F}_p^n}$ over a fixed finite field ${{\bf F}_p}$ (where it is essentially equivalent to the assertion that the property of being a polynomial of degree at most ${k-1}$ is locally testable); the extension to general groups ${G}$ turns out to be fairly routine. The basic idea is to use the recursive structure of the Gowers norms, which tells us in particular that if ${\|f\|_{U^k(G)}}$ is close to one, then ${\|\Delta_h f\|_{U^{k-1}(G)}}$ is close to one for most ${h}$, which by induction implies that ${\Delta_h f}$ is close to ${e(Q_h)}$ for some polynomials ${Q_h}$ of degree at most ${k-2}$ and for most ${h}$. (Actually, it is not difficult to use cocycle equations such as ${\Delta_{h+k} f = \Delta_h f \times T^h \Delta_k f}$ (when ${|f|=1}$) to upgrade “for most ${h}$” to “for all ${h}$”.) To finish the job, one would like to express the ${Q_h}$ as derivatives ${Q_h = \partial_h P}$ of a polynomial ${P}$ of degree at most ${k-1}$. This turns out to be equivalent to requiring that the ${Q_h}$ obey the cocycle equation

$\displaystyle Q_{h+k} = Q_h + T^h Q_k$

where ${T^h F(x) := F(x+h)}$ is the translate of ${F}$ by ${h}$. (In the paper, the sign conventions are reversed, so that ${T^h F(x) := F(x-h)}$, in order to be compatible with ergodic theory notation, but this makes no substantial difference to the arguments or results.) However, one does not quite get this right away; instead, by using some separation properties of polynomials, one can show the weaker statement that

$\displaystyle Q_{h+k} = Q_h + T^h Q_k + c_{h,k} \ \ \ \ \ (1)$

where the ${c_{h,k}}$ are small real constants. To eliminate these constants, one exploits the trivial cohomology of the real line. From (1) one soon concludes that the ${c_{h,k}}$ obey the ${2}$-cocycle equation

$\displaystyle c_{h,k} + c_{h+k,l} = c_{h,k+l} + c_{k,l}$

and an averaging argument then shows that ${c_{h,k}}$ is a ${2}$-coboundary in the sense that

$\displaystyle c_{h,k} = b_{h+k} - b_h - b_k$

for some small scalar ${b_h}$ depending on ${h}$. Subtracting ${b_h}$ from ${Q_h}$ then gives the claim.
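To see where the ${2}$-cocycle equation comes from, one can expand ${Q_{h+k+l}}$ in two different ways using (1). The following is only a sketch in the notation above; it uses nothing beyond ${T^h T^k = T^{h+k}}$ and the fact that translations fix constants, and the symbol ${\tilde Q_h}$ is introduced just for this computation. Grouping the shifts as ${(h+k)+l}$ gives

$\displaystyle Q_{h+k+l} = Q_h + T^h Q_k + T^{h+k} Q_l + c_{h,k} + c_{h+k,l},$

while grouping them as ${h+(k+l)}$ gives

$\displaystyle Q_{h+k+l} = Q_h + T^h Q_k + T^{h+k} Q_l + c_{k,l} + c_{h,k+l}.$

Comparing the two expressions (a priori only modulo ${1}$, but the constants are small) yields the ${2}$-cocycle equation. Similarly, if ${c_{h,k} = b_{h+k} - b_h - b_k}$ and one sets ${\tilde Q_h := Q_h - b_h}$, then

$\displaystyle \tilde Q_{h+k} - \tilde Q_h - T^h \tilde Q_k = c_{h,k} - (b_{h+k} - b_h - b_k) = 0,$

so the modified polynomials ${\tilde Q_h}$ obey the exact cocycle equation.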

Similar results and arguments also hold for the ${U^k([N])}$ and ${U^k(X)}$ norms, which we will not detail here.

Dimensional analysis reveals that the ${L^\infty}$ norm is not actually the most natural norm with which to compare the ${U^k}$ norms against. An application of Young’s convolution inequality in fact reveals that one has the inequality

$\displaystyle \|f\|_{U^k(G)} \leq \|f\|_{L^{p_k}(G)} \ \ \ \ \ (2)$

where ${p_k}$ is the critical exponent ${p_k := 2^k/(k+1)}$, without any compactness or normalisation hypothesis on the group ${G}$ and the Haar measure ${\mu}$. This allows us to extend the ${U^k(G)}$ norm to all of ${L^{p_k}(G)}$. There is then a stronger inverse theorem available:

Theorem 2 (${L^{p_k}}$-near extremisers) Let ${G}$ be a locally compact abelian group, and let ${f \in L^{p_k}(G)}$ be such that ${\|f\|_{L^{p_k}(G)} \leq 1}$ and ${\|f\|_{U^k(G)} \geq 1-\epsilon}$ for some ${\epsilon > 0}$ and ${k \geq 1}$. Then there exists a coset ${H}$ of a compact open subgroup of ${G}$, and a polynomial ${P: H \rightarrow {\bf R}/{\bf Z}}$ of degree at most ${k-1}$ such that ${\|f-e(P) 1_H\|_{L^{p_k}(G)} = o(1)}$.

Conversely, it is not difficult to show that equality in (2) is attained when ${f}$ takes the form ${e(P) 1_H}$ as above. The main idea of proof is to use an inverse theorem for Young’s inequality due to Fournier to reduce matters to the ${L^\infty}$ case that was already established. An analogous result is also obtained for the ${U^k(X)}$ norm on an ergodic system; but for technical reasons, the methods do not seem to apply easily to the ${U^k([N])}$ norm. (This norm is essentially equivalent to the ${U^k({\bf Z}/\tilde N{\bf Z})}$ norm up to constants, with ${\tilde N}$ comparable to ${N}$, but when working with near-extremisers, norms that are only equivalent up to constants can have quite different near-extremal behaviour.)
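The inequality (2) and its equality case are easy to test numerically on a small finite group. The sketch below works on ${G = {\bf Z}/12{\bf Z}}$ with unnormalised counting measure as the Haar measure; the group, the coset ${1 + \{0,3,6,9\}}$, the random seed, and all function names are choices made for this illustration only.

```python
import random

def gowers_power(f, k):
    """Return ||f||_{U^k}^{2^k} on Z/NZ with (unnormalised) counting measure."""
    N = len(f)
    if k == 1:
        return abs(sum(f)) ** 2
    return sum(
        gowers_power([f[(x + h) % N] * f[x].conjugate() for x in range(N)], k - 1)
        for h in range(N)
    )

def gowers_norm(f, k):
    return max(gowers_power(f, k), 0.0) ** (1.0 / 2 ** k)

def lp_norm(f, p):
    """L^p norm with respect to counting measure."""
    return sum(abs(v) ** p for v in f) ** (1.0 / p)

def critical_exponent(k):
    """The exponent p_k = 2^k / (k+1) appearing in (2)."""
    return 2 ** k / (k + 1)

N = 12
# Indicator function of the coset 1 + {0, 3, 6, 9} of a subgroup of Z/12Z.
coset = [1.0 if x % 3 == 1 else 0.0 for x in range(N)]
# A generic complex-valued function for comparison (fixed seed for repeatability).
rng = random.Random(0)
generic = [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(N)]

for k in (2, 3):
    p_k = critical_exponent(k)
    print(k, "coset:  ", gowers_norm(coset, k), lp_norm(coset, p_k))
    print(k, "generic:", gowers_norm(generic, k), lp_norm(generic, p_k))
```

For the coset indicator the two printed norms should agree up to rounding (both equal ${m^{(k+1)/2^k}}$, where ${m=4}$ is the size of the subgroup), while for the generic function the ${U^k}$ norm should be the smaller of the two.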

In the case when ${G}$ is a Euclidean group ${{\bf R}^d}$, it is possible to use the sharp Young inequality of Beckner and of Brascamp-Lieb to improve (2) somewhat. For instance, when ${k=3}$, one has

$\displaystyle \|f\|_{U^3({\bf R}^d)} \leq 2^{-d/8} \|f\|_{L^2({\bf R}^d)}$

with equality attained if and only if ${f}$ is a gaussian modulated by a quadratic polynomial phase. This additional gain of ${2^{-d/8}}$ allows one to pinpoint the threshold ${1-\epsilon}$ for the previous near-extremiser results in the case of ${U^3}$ norms. For instance, by using the Host-Kra machinery of characteristic factors for the ${U^3(X)}$ norm, combined with an explicit and concrete analysis of the ${2}$-step nilsystems generated by that machinery, we can show that

$\displaystyle \|f\|_{U^3(X)} \leq 2^{-1/8} \|f\|_{L^2(X)}$

whenever ${X}$ is a totally ergodic system and ${f}$ is orthogonal to all linear and quadratic eigenfunctions (which would otherwise form immediate counterexamples to the above inequality), with the factor ${2^{-1/8}}$ being best possible. We can also establish analogous results for the ${U^3([N])}$ and ${U^3({\bf Z}/N{\bf Z})}$ norms (using the inverse ${U^3}$ theorem of Ben Green and myself, in place of the Host-Kra machinery), although it is not clear to us whether the ${2^{-1/8}}$ threshold remains best possible in this case.
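As a sanity check on the constant ${2^{-d/8}}$ in the Euclidean inequality above, one can compute both sides by hand when ${d=1}$ and ${f(x) := e^{-\pi x^2}}$ (this normalisation of the gaussian is chosen here purely for convenience). Completing the square gives

$\displaystyle \sum_{\omega \in \{0,1\}^3} (x + \omega_1 h_1 + \omega_2 h_2 + \omega_3 h_3)^2 = 8 (x + \frac{h_1+h_2+h_3}{2})^2 + 2 (h_1^2 + h_2^2 + h_3^2),$

and hence, using ${\int_{\bf R} e^{-\pi a t^2}\ dt = a^{-1/2}}$ for ${a > 0}$,

$\displaystyle \|f\|_{U^3({\bf R})}^8 = \int_{{\bf R}^4} \prod_{\omega \in \{0,1\}^3} f(x + \omega_1 h_1 + \omega_2 h_2 + \omega_3 h_3)\ dx dh_1 dh_2 dh_3 = 8^{-1/2} \cdot (2^{-1/2})^3 = 2^{-3}.$

Thus ${\|f\|_{U^3({\bf R})} = 2^{-3/8}}$, while ${\|f\|_{L^2({\bf R})} = (\int_{\bf R} e^{-2\pi x^2}\ dx)^{1/2} = 2^{-1/4}}$, so the ratio of the two norms is exactly ${2^{-1/8}}$. In higher dimensions the gaussian ${e^{-\pi |x|^2}}$ factorises into a product of one-dimensional gaussians, and the same computation gives the ratio ${2^{-d/8}}$.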

In Notes 5, we saw that the Gowers uniformity norms on vector spaces ${{\bf F}^n}$ in high characteristic were controlled by classical polynomial phases ${e(\phi)}$.

Now we study the analogous situation on cyclic groups ${{\bf Z}/N{\bf Z}}$. Here, there is an unexpected surprise: the polynomial phases (classical or otherwise) are no longer sufficient to control the Gowers norms ${U^{s+1}({\bf Z}/N{\bf Z})}$ once ${s}$ exceeds ${1}$. To resolve this problem, one must enlarge the space of polynomials to a larger class. It turns out that there are at least three closely related options for this class: the local polynomials, the bracket polynomials, and the nilsequences. Each of the three classes has its own strengths and weaknesses, but in my opinion the nilsequences seem to be the most natural class, due to the rich algebraic and dynamical structure coming from the nilpotent Lie group undergirding such sequences. For reasons of space we shall focus primarily on the nilsequence viewpoint here.

Traditionally, nilsequences have been defined in terms of linear orbits ${n \mapsto g^n x}$ on nilmanifolds ${G/\Gamma}$; however, in recent years it has been realised that it is convenient for technical reasons (particularly for the quantitative “single-scale” theory) to generalise this setup to that of polynomial orbits ${n \mapsto g(n) \Gamma}$, and this is the perspective we will take here.

A polynomial phase ${n \mapsto e(\phi(n))}$ on a finite abelian group ${H}$ is formed by starting with a polynomial ${\phi: H \rightarrow {\bf R}/{\bf Z}}$ to the unit circle, and then composing it with the exponential function ${e: {\bf R}/{\bf Z} \rightarrow {\bf C}}$. To create a nilsequence ${n \mapsto F(g(n) \Gamma)}$, we generalise this construction by starting with a polynomial ${g \Gamma: H \rightarrow G/\Gamma}$ into a nilmanifold ${G/\Gamma}$, and then composing this with a Lipschitz function ${F: G/\Gamma \rightarrow {\bf C}}$. (The Lipschitz regularity class is convenient for minor technical reasons, but one could also use other regularity classes here if desired.) These classes of sequences certainly include the polynomial phases, but are somewhat more general; for instance, they almost include bracket polynomial phases such as ${n \mapsto e( \lfloor \alpha n \rfloor \beta n )}$. (The “almost” here is because the relevant functions ${F: G/\Gamma \rightarrow {\bf C}}$ involved are only piecewise Lipschitz rather than Lipschitz, but this is primarily a technical issue and one should view bracket polynomial phases as “morally” being nilsequences.)

In these notes we set out the basic theory for these nilsequences, including their equidistribution theory (which generalises the equidistribution theory of polynomial flows on tori from Notes 1) and show that they are indeed obstructions to the Gowers norm being small. This leads to the inverse conjecture for the Gowers norms that shows that the Gowers norms on cyclic groups are indeed controlled by these sequences.

In Notes 3, we saw that the number of additive patterns in a given set was (in principle, at least) controlled by the Gowers uniformity norms of functions associated to that set.

Such norms can be defined on any finite additive group (and also on some other types of domains, though we will not discuss this point here). In particular, they can be defined on the finite-dimensional vector spaces ${V}$ over a finite field ${{\bf F}}$.

In this case, the Gowers norms ${U^{d+1}(V)}$ are closely tied to the space ${\hbox{Poly}_{\leq d}(V \rightarrow {\bf R}/{\bf Z})}$ of polynomials of degree at most ${d}$. Indeed, as noted in Exercise 20 of Notes 4, a function ${f: V \rightarrow {\bf C}}$ of ${L^\infty(V)}$ norm ${1}$ has ${U^{d+1}(V)}$ norm equal to ${1}$ if and only if ${f = e(\phi)}$ for some ${\phi \in \hbox{Poly}_{\leq d}(V \rightarrow {\bf R}/{\bf Z})}$; thus polynomials solve the “${100\%}$ inverse problem” for the trivial inequality ${\|f\|_{U^{d+1}(V)} \leq \|f\|_{L^\infty(V)}}$. They are also a crucial component of the solution to the “${99\%}$ inverse problem” and “${1\%}$ inverse problem”. For the former, we will soon show:

Proposition 1 (${99\%}$ inverse theorem for ${U^{d+1}(V)}$) Let ${f: V \rightarrow {\bf C}}$ be such that ${\|f\|_{L^\infty(V)} \leq 1}$ and ${\|f\|_{U^{d+1}(V)} \geq 1-\epsilon}$ for some ${\epsilon > 0}$. Then there exists ${\phi \in \hbox{Poly}_{\leq d}(V \rightarrow {\bf R}/{\bf Z})}$ such that ${\| f - e(\phi)\|_{L^1(V)} = O_{d, {\bf F}}( \epsilon^c )}$, where ${c = c_d > 0}$ is a constant depending only on ${d}$.

Thus, for the Gowers norm to be almost completely saturated, one must be very close to a polynomial. The converse assertion is easily established:

Exercise 1 (Converse to ${99\%}$ inverse theorem for ${U^{d+1}(V)}$) If ${\|f\|_{L^\infty(V)} \leq 1}$ and ${\|f-e(\phi)\|_{L^1(V)} \leq \epsilon}$ for some ${\phi \in \hbox{Poly}_{\leq d}(V \rightarrow {\bf R}/{\bf Z})}$, then ${\|f\|_{U^{d+1}(V)} \geq 1 - O_{d,{\bf F}}( \epsilon^c )}$, where ${c = c_d > 0}$ is a constant depending only on ${d}$.

In the ${1\%}$ world, one no longer expects to be close to a polynomial. Instead, one expects to correlate with a polynomial. Indeed, one has

Lemma 2 (Converse to the ${1\%}$ inverse theorem for ${U^{d+1}(V)}$) If ${f: V \rightarrow {\bf C}}$ and ${\phi \in \hbox{Poly}_{\leq d}(V \rightarrow {\bf R}/{\bf Z})}$ are such that ${|\langle f, e(\phi) \rangle_{L^2(V)}| \geq \epsilon}$, where ${\langle f, g \rangle_{L^2(V)} := {\bf E}_{x \in V} f(x) \overline{g(x)}}$, then ${\|f\|_{U^{d+1}(V)} \geq \epsilon}$.

Proof: From the definition of the ${U^1}$ norm (equation (18) from Notes 3), the monotonicity of the Gowers norms (Exercise 19 of Notes 3), and the polynomial phase modulation invariance of the Gowers norms (Exercise 21 of Notes 3), one has

$\displaystyle |\langle f, e(\phi) \rangle| = \| f e(-\phi) \|_{U^1(V)}$

$\displaystyle \leq \|f e(-\phi) \|_{U^{d+1}(V)}$

$\displaystyle = \|f\|_{U^{d+1}(V)}$

and the claim follows. $\Box$
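The three steps in the proof of Lemma 2 can be checked numerically in a toy case. The sketch below takes ${V = {\bf F}_5}$ (a one-dimensional vector space), ${d=2}$, a randomly chosen ${f}$, and the polynomial ${\phi(x) = (2x^2+3x+1)/5}$; all of these choices, and the helper names, are assumptions made for the illustration.

```python
import cmath
import random

def e(t):
    return cmath.exp(2j * cmath.pi * t)

def gowers_power(f, k):
    """Return ||f||_{U^k}^{2^k} on Z/pZ with normalised counting measure."""
    N = len(f)
    if k == 1:
        return abs(sum(f) / N) ** 2
    return sum(
        gowers_power([f[(x + h) % N] * f[x].conjugate() for x in range(N)], k - 1)
        for h in range(N)
    ) / N

def gowers_norm(f, k):
    return max(gowers_power(f, k), 0.0) ** (1.0 / 2 ** k)

p, d = 5, 2
rng = random.Random(1)
f = [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(p)]
phi = [(2 * x * x + 3 * x + 1) / p for x in range(p)]   # a polynomial of degree 2

# |<f, e(phi)>| with the normalised inner product E_x f(x) conj(g(x)).
correlation = abs(sum(f[x] * e(-phi[x]) for x in range(p)) / p)
twisted = [f[x] * e(-phi[x]) for x in range(p)]          # the function f e(-phi)

u1_twisted = gowers_norm(twisted, 1)         # step 1: equals the correlation
ud_twisted = gowers_norm(twisted, d + 1)     # step 2: monotonicity, at least u1_twisted
ud_plain = gowers_norm(f, d + 1)             # step 3: modulation invariance

print(correlation, u1_twisted, ud_twisted, ud_plain)
```

The first two printed numbers should coincide, as should the last two, with the first pair no larger than the second pair.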

In the high characteristic case ${\hbox{char}({\bf F}) > d}$ at least, this can be reversed:

Theorem 3 (${1\%}$ inverse theorem for ${U^{d+1}(V)}$) Suppose that ${\hbox{char}({\bf F}) > d \geq 0}$. If ${f: V \rightarrow {\bf C}}$ is such that ${\|f\|_{L^\infty(V)} \leq 1}$ and ${\|f\|_{U^{d+1}(V)} \geq \epsilon}$, then there exists ${\phi \in \hbox{Poly}_{\leq d}(V \rightarrow {\bf R}/{\bf Z})}$ such that ${|\langle f, e(\phi) \rangle_{L^2(V)}| \gg_{\epsilon,d,{\bf F}} 1}$.

This result is sometimes referred to as the inverse conjecture for the Gowers norm (in high, but bounded, characteristic). For small ${d}$, the claim is easy:

Exercise 2 Verify the cases ${d=0,1}$ of this theorem. (Hint: to verify the ${d=1}$ case, use the Fourier-analytic identities ${\|f\|_{U^2(V)} = (\sum_{\xi \in \hat V} |\hat f(\xi)|^4)^{1/4}}$ and ${\|f\|_{L^2(V)} = (\sum_{\xi \in \hat V} |\hat f(\xi)|^2)^{1/2}}$, where ${\hat V}$ is the space of all homomorphisms ${\xi: x \mapsto \xi \cdot x}$ from ${V}$ to ${{\bf R}/{\bf Z}}$, and ${\hat f(\xi) := \mathop{\bf E}_{x \in V} f(x) e(-\xi \cdot x)}$ are the Fourier coefficients of ${f}$.)
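Before attempting the exercise, it may be reassuring to confirm the two Fourier-analytic identities in the hint numerically. The following sketch does this on ${V = {\bf F}_3^2}$ for a randomly generated ${f}$; the choice of ${V}$, the seed, and the identification of ${\hat V}$ with ${V}$ via ${\xi \cdot x := (\xi_1 x_1 + \xi_2 x_2)/3}$ are assumptions of the sketch.

```python
import cmath
import random
from itertools import product

p, n = 3, 2                                  # V = F_3^2, a small illustrative choice
V = list(product(range(p), repeat=n))

def e(t):
    return cmath.exp(2j * cmath.pi * t)

def add(x, y):
    """Coordinatewise addition in F_p^n."""
    return tuple((a + b) % p for a, b in zip(x, y))

def fourier(f, xi):
    """hat f(xi) = E_x f(x) e(-xi.x), with xi.x := (sum_i xi_i x_i)/p mod 1."""
    return sum(
        f[x] * e(-sum(a * b for a, b in zip(xi, x)) / p) for x in V
    ) / len(V)

rng = random.Random(2)
f = {x: complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for x in V}

# ||f||_{U^2}^4 = E_{x,h1,h2} f(x) conj f(x+h1) conj f(x+h2) f(x+h1+h2).
u2_fourth_power = sum(
    f[x] * f[add(x, h1)].conjugate() * f[add(x, h2)].conjugate()
    * f[add(add(x, h1), h2)]
    for x in V for h1 in V for h2 in V
).real / len(V) ** 3

l2_squared = sum(abs(f[x]) ** 2 for x in V) / len(V)

coeffs = [fourier(f, xi) for xi in V]        # one coefficient per frequency xi
sum_fourth_powers = sum(abs(c) ** 4 for c in coeffs)
sum_squares = sum(abs(c) ** 2 for c in coeffs)

print(u2_fourth_power, sum_fourth_powers)    # the U^2 identity, raised to the 4th power
print(l2_squared, sum_squares)               # the Plancherel identity, squared
```

Each printed pair should agree up to rounding error.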

The conjecture for larger values of ${d}$ is more difficult to establish. The ${d=2}$ case of the theorem was established by Ben Green and myself in the high characteristic case ${\hbox{char}({\bf F}) > 2}$; the low characteristic case ${\hbox{char}({\bf F}) = d = 2}$ was independently and simultaneously established by Samorodnitsky. The cases ${d>2}$ in the high characteristic case were established in two stages, firstly using a modification of the Furstenberg correspondence principle, due to Ziegler and myself, to convert the problem to an ergodic theory counterpart, and then using a modification of the methods of Host-Kra and Ziegler to solve that counterpart, as done in this paper of Bergelson, Ziegler, and myself.

The situation with the low characteristic case in general is still unclear. In the high characteristic case, we saw from Notes 4 that one could replace the space of non-classical polynomials ${\hbox{Poly}_{\leq d}(V \rightarrow {\bf R}/{\bf Z})}$ in the above conjecture with the essentially equivalent space of classical polynomials ${\hbox{Poly}_{\leq d}(V \rightarrow {\bf F})}$. However, as we shall see below, this turns out not to be the case in certain low characteristic cases (a fact first observed by Lovett, Meshulam, and Samorodnitsky, and independently by Ben Green and myself), for instance if ${\hbox{char}({\bf F}) = 2}$ and ${d \geq 3}$; this is ultimately due to the existence in those cases of non-classical polynomials which exhibit no significant correlation with classical polynomials of equal or lesser degree. This distinction between classical and non-classical polynomials appears to be a rather non-trivial obstruction to understanding the low characteristic setting; it may be necessary to obtain a more complete theory of non-classical polynomials in order to fully settle this issue.

The inverse conjecture has a number of consequences. For instance, it can be used to establish the analogue of Szemerédi’s theorem in this setting:

Theorem 4 (Szemerédi’s theorem for finite fields) Let ${{\bf F} = {\bf F}_p}$ be a finite field, let ${\delta > 0}$, and let ${A \subset {\bf F}^n}$ be such that ${|A| \geq \delta |{\bf F}^n|}$. If ${n}$ is sufficiently large depending on ${p,\delta}$, then ${A}$ contains an (affine) line ${\{ x, x+r, \ldots, x+(p-1)r\}}$ for some ${x,r \in {\bf F}^n}$ with ${ r\neq 0}$.

Exercise 3 Use Theorem 4 to establish the following generalisation: with the notation as above, if ${k \geq 1}$ and ${n}$ is sufficiently large depending on ${p,\delta,k}$, then ${A}$ contains an affine ${k}$-dimensional subspace.

We will prove this theorem in two different ways, one using a density increment method, and the other using an energy increment method. We discuss some other applications below the fold.

A (complex, semi-definite) inner product space is a complex vector space ${V}$ equipped with a sesquilinear form ${\langle, \rangle: V \times V \rightarrow {\bf C}}$ which is conjugate symmetric, in the sense that ${\langle w, v \rangle = \overline{\langle v, w \rangle}}$ for all ${v,w \in V}$, and non-negative in the sense that ${\langle v, v \rangle \geq 0}$ for all ${v \in V}$. By inspecting the non-negativity of ${\langle v+\lambda w, v+\lambda w\rangle}$ for complex numbers ${\lambda \in {\bf C}}$, one obtains the Cauchy-Schwarz inequality

$\displaystyle |\langle v, w \rangle| \leq |\langle v, v \rangle|^{1/2} |\langle w, w \rangle|^{1/2};$

if one then defines ${\|v\| := |\langle v, v \rangle|^{1/2}}$, one then quickly concludes the triangle inequality

$\displaystyle \|v + w \| \leq \|v\| + \|w\|$

which then soon implies that ${\| \|}$ is a semi-norm on ${V}$. If we make the additional assumption that the inner product ${\langle,\rangle}$ is positive definite, i.e. that ${\langle v, v \rangle > 0}$ whenever ${v}$ is non-zero, then this semi-norm becomes a norm. If ${V}$ is complete with respect to the metric ${d(v,w) := \|v-w\|}$ induced by this norm, then ${V}$ is called a Hilbert space.

The above material is extremely standard, and can be found in any graduate real analysis course; I myself covered it here. But what is perhaps less well known (except inside the fields of additive combinatorics and ergodic theory) is that the above theory of classical Hilbert spaces is just the first case of a hierarchy of higher order Hilbert spaces, in which the binary inner product ${f, g \mapsto \langle f, g \rangle}$ is replaced with a ${2^d}$-ary inner product ${(f_\omega)_{\omega \in \{0,1\}^d} \mapsto \langle (f_\omega)_{\omega \in \{0,1\}^d} \rangle}$ that obeys an appropriate generalisation of the conjugate symmetry, sesquilinearity, and positive semi-definiteness axioms. Such inner products then obey a higher order Cauchy-Schwarz inequality, known as the Cauchy-Schwarz-Gowers inequality, and then also obey a triangle inequality and become semi-norms (or norms, if the inner product was non-degenerate). Examples of such norms and spaces include the Gowers uniformity norms ${\| \|_{U^d(G)}}$, the Gowers box norms ${\| \|_{\Box^d(X_1 \times \ldots \times X_d)}}$, and the Gowers-Host-Kra seminorms ${\| \|_{U^d(X)}}$; a more elementary example is the family of Lebesgue spaces ${L^{2^d}(X)}$ when the exponent is a power of two. They play a central role in modern additive combinatorics and in certain aspects of ergodic theory, particularly those relating to Szemerédi’s theorem (or its ergodic counterpart, the Furstenberg multiple recurrence theorem); they also arise in the regularity theory of hypergraphs (which is not unrelated to the other two topics).

A simple example to keep in mind here is the order two Hilbert space ${L^4(X)}$ on a measure space ${X = (X,{\mathcal B},\mu)}$, where the inner product takes the form

$\displaystyle \langle f_{00}, f_{01}, f_{10}, f_{11} \rangle_{L^4(X)} := \int_X f_{00}(x) \overline{f_{01}(x)} \overline{f_{10}(x)} f_{11}(x)\ d\mu(x).$
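For this ${L^4(X)}$ example, the Cauchy-Schwarz-Gowers inequality reduces to Hölder's inequality, and can be tested on a finite measure space. The sketch below uses a five-point space with arbitrary positive weights in the role of ${\mu}$; the weights, the random functions, and the names `inner4` and `l4_norm` are all hypothetical choices made for the illustration.

```python
import random

def inner4(f00, f01, f10, f11, weights):
    """The order two inner product on L^4(X) for a finite weighted space X:
    the integral of f00 * conj(f01) * conj(f10) * f11 against the weights."""
    return sum(
        w * a * b.conjugate() * c.conjugate() * d
        for w, a, b, c, d in zip(weights, f00, f01, f10, f11)
    )

def l4_norm(f, weights):
    return sum(w * abs(v) ** 4 for w, v in zip(weights, f)) ** 0.25

rng = random.Random(3)
points = 5
weights = [rng.uniform(0.1, 2.0) for _ in range(points)]     # the measure mu
functions = [
    [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(points)]
    for _ in range(4)
]
f00, f01, f10, f11 = functions

lhs = abs(inner4(f00, f01, f10, f11, weights))
rhs = 1.0
for g in functions:
    rhs *= l4_norm(g, weights)

print(lhs, rhs)                                              # lhs should not exceed rhs
print(inner4(f00, f00, f00, f00, weights).real, l4_norm(f00, weights) ** 4)
```

The first line illustrates the inequality, and the second illustrates that the norm is recovered from the diagonal of the inner product, in the sense that ${\langle f, f, f, f \rangle_{L^4(X)} = \|f\|_{L^4(X)}^4}$.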

In this brief note I would like to set out the abstract theory of such higher order Hilbert spaces. This is not new material, being already implicit in the breakthrough papers of Gowers and Host-Kra, but I just wanted to emphasise the fact that the material is abstract, and is not particularly tied to any explicit choice of norm so long as a certain axiom is satisfied. (Also, I wanted to write things down so that I would not have to reconstruct this formalism again in the future.) Unfortunately, the notation is quite heavy and the abstract axiom is a little strange; it may be that there is a better way to formulate things. In this particular case it does seem that a concrete approach is significantly clearer, but abstraction is at least possible.

Note: the discussion below is likely to be comprehensible only to readers who already have some exposure to the Gowers norms.