You are currently browsing the tag archive for the ‘Tamar Ziegler’ tag.

One of the basic problems in analytic number theory is to estimate sums of the form

$\displaystyle \sum_{p

as ${x \rightarrow \infty}$, where ${p}$ ranges over primes and ${f}$ is some explicit function of interest (e.g. a linear phase function ${f(p) = e^{2\pi i \alpha p}}$ for some real number ${\alpha}$). This is essentially the same task as obtaining estimates on the sum

$\displaystyle \sum_{n

where ${\Lambda}$ is the von Mangoldt function. If ${f}$ is bounded, ${f(n)=O(1)}$, then from the prime number theorem one has the trivial bound

$\displaystyle \sum_{n

but often (when ${f}$ is somehow “oscillatory” in nature) one is seeking the refinement

$\displaystyle \sum_{n

or equivalently

$\displaystyle \sum_{p

Thanks to identities such as

$\displaystyle \Lambda(n) = \sum_{d|n} \mu(d) \log(\frac{n}{d}), \ \ \ \ \ (3)$

where ${\mu}$ is the Möbius function, refinements such as (1) are similar in spirit to estimates of the form

$\displaystyle \sum_{n

Unfortunately, the connection between (1) and (4) is not particularly tight; roughly speaking, one needs to improve the bounds in (4) (and variants thereof) by about two factors of ${\log x}$ before one can use identities such as (3) to recover (1). Still, one generally thinks of (1) and (4) as being “morally” equivalent, even if they are not formally equivalent.

When ${f}$ is oscillating in a sufficiently “irrational” way, then one standard way to proceed is the method of Type I and Type II sums, which uses truncated versions of divisor identities such as (3) to expand out either (1) or (4) into linear (Type I) or bilinear sums (Type II) with which one can exploit the oscillation of ${f}$. For instance, Vaughan’s identity lets one rewrite the sum in (1) as the sum of the Type I sum

$\displaystyle \sum_{d \leq U} \mu(d) (\sum_{V/d \leq r \leq x/d} (\log r) f(rd)),$

the Type I sum

$\displaystyle -\sum_{d \leq UV} a(d) \sum_{V/d \leq r \leq x/d} f(rd),$

the Type II sum

$\displaystyle -\sum_{V \leq d \leq x/U} \sum_{U < m \leq x/V} \Lambda(d) b(m) f(dm),$

and the error term ${\sum_{d \leq V} \Lambda(n) f(n)}$, whenever ${1 \leq U, V \leq x}$ are parameters, and ${a, b}$ are the sequences

$\displaystyle a(d) := \sum_{e \leq U, f \leq V: ef = d} \Lambda(d) \mu(e)$

and

$\displaystyle b(m) := \sum_{d|m: d \leq U} \mu(d).$

Similarly one can express (4) as the Type I sum

$\displaystyle -\sum_{d \leq UV} c(d) \sum_{UV/d \leq r \leq x/d} f(rd),$

the Type II sum

$\displaystyle - \sum_{V < d \leq x/U} \sum_{U < m \leq x/d} \mu(m) b(d) f(dm)$

and the error term ${\sum_{d \leq UV} \mu(n) f(N)}$, whenever ${1 \leq U,V \leq x}$ with ${UV \leq x}$, and ${c}$ is the sequence

$\displaystyle c(d) := \sum_{e \leq U, f \leq V: ef = d} \mu(d) \mu(e).$

After eliminating troublesome sequences such as ${a(), b(), c()}$ via Cauchy-Schwarz or the triangle inequality, one is then faced with the task of estimating Type I sums such as

$\displaystyle \sum_{r \leq y} f(rd)$

or Type II sums such as

$\displaystyle \sum_{r \leq y} f(rd) \overline{f(rd')}$

for various ${y, d, d' \geq 1}$. Here, the trivial bound is ${O(y)}$, but due to a number of logarithmic inefficiencies in the above method, one has to obtain bounds that are more like ${O( \frac{y}{\log^C y})}$ for some constant ${C}$ (e.g. ${C=5}$) in order to end up with an asymptotic such as (1) or (4).

However, in a recent paper of Bourgain, Sarnak, and Ziegler, it was observed that as long as one is only seeking the Mobius orthogonality (4) rather than the von Mangoldt orthogonality (1), one can avoid losing any logarithmic factors, and rely purely on qualitative equidistribution properties of ${f}$. A special case of their orthogonality criterion (which actually dates back to an earlier paper of Katai, as was pointed out to me by Nikos Frantzikinakis) is as follows:

Proposition 1 (Orthogonality criterion) Let ${f: {\bf N} \rightarrow {\bf C}}$ be a bounded function such that

$\displaystyle \sum_{n \leq x} f(pn) \overline{f(qn)} = o(x) \ \ \ \ \ (5)$

for any distinct primes ${p, q}$ (where the decay rate of the error term ${o(x)}$ may depend on ${p}$ and ${q}$). Then

$\displaystyle \sum_{n \leq x} \mu(n) f(n) =o(x). \ \ \ \ \ (6)$

Actually, the Bourgain-Sarnak-Ziegler paper establishes a more quantitative version of this proposition, in which ${\mu}$ can be replaced by an arbitrary bounded multiplicative function, but we will content ourselves with the above weaker special case. (See also these notes of Harper, which uses the Katai argument to give a slightly weaker quantitative bound in the same spirit.) This criterion can be viewed as a multiplicative variant of the classical van der Corput lemma, which in our notation asserts that ${\sum_{n \leq x} f(n) = o(x)}$ if one has ${\sum_{n \leq x} f(n+h) \overline{f(n)} = o(x)}$ for each fixed non-zero ${h}$.

As a sample application, Proposition 1 easily gives a proof of the asymptotic

$\displaystyle \sum_{n \leq x} \mu(n) e^{2\pi i \alpha n} = o(x)$

for any irrational ${\alpha}$. (For rational ${\alpha}$, this is a little trickier, as it is basically equivalent to the prime number theorem in arithmetic progressions.) The paper of Bourgain, Sarnak, and Ziegler also apply this criterion to nilsequences (obtaining a quick proof of a qualitative version of a result of Ben Green and myself, see these notes of Ziegler for details) and to horocycle flows (for which no Möbius orthogonality result was previously known).

Informally, the connection between (5) and (6) comes from the multiplicative nature of the Möbius function. If (6) failed, then ${\mu(n)}$ exhibits strong correlation with ${f(n)}$; by change of variables, we then expect ${\mu(pn)}$ to correlate with ${f(pn)}$ and ${\mu(pm)}$ to correlate with ${f(qn)}$, for “typical” ${p,q}$ at least. On the other hand, since ${\mu}$ is multiplicative, ${\mu(pn)}$ exhibits strong correlation with ${\mu(qn)}$. Putting all this together (and pretending correlation is transitive), this would give the claim (in the contrapositive). Of course, correlation is not quite transitive, but it turns out that one can use the Cauchy-Schwarz inequality as a substitute for transitivity of correlation in this case.

I will give a proof of Proposition 1 below the fold (which is not quite based on the argument in the above mentioned paper, but on a variant of that argument communicated to me by Tamar Ziegler, and also independently discovered by Adam Harper). The main idea is to exploit the following observation: if ${P}$ is a “large” but finite set of primes (in the sense that the sum ${A := \sum_{p \in P} \frac{1}{p}}$ is large), then for a typical large number ${n}$ (much larger than the elements of ${P}$), the number of primes in ${P}$ that divide ${n}$ is pretty close to ${A = \sum_{p \in P} \frac{1}{p}}$:

$\displaystyle \sum_{p \in P: p|n} 1 \approx A. \ \ \ \ \ (7)$

A more precise formalisation of this heuristic is provided by the Turan-Kubilius inequality, which is proven by a simple application of the second moment method.

In particular, one can sum (7) against ${\mu(n) f(n)}$ and obtain an approximation

$\displaystyle \sum_{n \leq x} \mu(n) f(n) \approx \frac{1}{A} \sum_{p \in P} \sum_{n \leq x: p|n} \mu(n) f(n)$

that approximates a sum of ${\mu(n) f(n)}$ by a bunch of sparser sums of ${\mu(n) f(n)}$. Since

$\displaystyle x = \frac{1}{A} \sum_{p \in P} \frac{x}{p},$

we see (heuristically, at least) that in order to establish (4), it would suffice to establish the sparser estimates

$\displaystyle \sum_{n \leq x: p|n} \mu(n) f(n) = o(\frac{x}{p})$

for all ${p \in P}$ (or at least for “most” ${p \in P}$).

Now we make the change of variables ${n = pm}$. As the Möbius function is multiplicative, we usually have ${\mu(n) = \mu(p) \mu(m) = - \mu(m)}$. (There is an exception when ${n}$ is divisible by ${p^2}$, but this will be a rare event and we will be able to ignore it.) So it should suffice to show that

$\displaystyle \sum_{m \leq x/p} \mu(m) f(pm) = o( x/p )$

for most ${p \in P}$. However, by the hypothesis (5), the sequences ${m \mapsto f(pm)}$ are asymptotically orthogonal as ${p}$ varies, and this claim will then follow from a Cauchy-Schwarz argument.

Tamar Ziegler and I have just uploaded to the arXiv our paper “The inverse conjecture for the Gowers norm over finite fields in low characteristic“, submitted to Annals of Combinatorics. This paper completes another case of the inverse conjecture for the Gowers norm, this time for vector spaces ${{\bf F}^n}$ over a fixed finite field ${{\bf F} = {\bf F}_p}$ of prime order; with Vitaly Bergelson, we had previously established this claim when the characteristic of the field was large, so the main new result here is the extension to the low characteristic case. (The case of a cyclic group ${{\bf Z}/N{\bf Z}}$ or interval ${[N]}$ was established by Ben Green and ourselves in another recent paper. For an arbitrary abelian (or nilpotent) group, a general but less explicit description of the obstructions to Gowers uniformity was recently obtained by Szegedy; the latter result recovers the high-characteristic case of our result (as was done in a subsequent paper of Szegedy), as well as our results with Green, but it is not immediately evident whether Szegedy’s description of the obstructions matches up with the one predicted by the inverse conjecture in low characteristic.)

The statement of the main theorem is as follows. Given a finite-dimensional vector space ${V = {\bf F}^n}$ and a function ${f: V \rightarrow {\bf C}}$, and an integer ${s \geq 0}$, one can define the Gowers uniformity norm ${\|f\|_{U^{s+1}(V)}}$ by the formula

$\displaystyle \|f\|_{U^{s+1}(V)} := \left( \mathop{\bf E}_{x,h_1,\ldots,h_{s+1} \in V} \Delta_{h_1} \ldots \Delta_{h_{s+1}} f(x) \right)^{1/2^{s+1}}$

where ${\Delta_h f(x) := f(x+h) \overline{f(x)}}$. If ${f}$ is bounded in magnitude by ${1}$, it is easy to see that ${\|f\|_{U^{s+1}(V)}}$ is bounded by ${1}$ also, with equality if and only if ${f(x) = e(P)}$ for some non-classical polynomial ${P: V \rightarrow {\bf R}/{\bf Z}}$ of degree at most ${s}$, where ${e(x) := e^{2\pi ix}}$, and a non-classical polynomial of degree at most ${s}$ is a function whose ${s+1^{th}}$ “derivatives” vanish in the sense that

$\displaystyle \partial_{h_1} \ldots \partial_{h_{s+1}} P(x) = 0$

for all ${x,h_1,\ldots,h_{s+1} \in V}$, where ${\partial_h P(x) := P(x+h) - P(x)}$. Our result generalises this to the case when the uniformity norm is not equal to ${1}$, but is still bounded away from zero:

Theorem 1 (Inverse conjecture) Let ${f: V \rightarrow {\bf C}}$ be bounded by ${1}$ with ${\|f\|_{U^{s+1}(V)} \geq \delta > 0}$ for some ${s \geq 0}$. Then there exists a non-classical polynomial ${P: V \rightarrow {\bf R}/{\bf Z}}$ of degree at most ${s}$ such that ${|\langle f, e(P) \rangle_{L^2(V)}| := |{\bf E}_{x \in V} f(x) e(-P(x))| \geq c(s,p, \delta) > 0}$, where ${c(s,p, \delta)}$ is a positive quantity depending only on the indicated parameters.

This theorem is trivial for ${s=0}$, and follows easily from Fourier analysis for ${s=1}$. The case ${s=2}$ was done in odd characteristic by Ben Green and myself, and in even characteristic by Samorodnitsky. In two papers, one with Vitaly Bergelson, we established this theorem in the “high characteristic” case when the characteristic ${p}$ of ${{\bf F}}$ was greater than ${s}$ (in which case there is essentially no distinction between non-classical polynomials and their classical counterparts, as discussed previously on this blog). The need to deal with genuinely non-classical polynomials is the main new difficulty in this paper that was not dealt with in previous literature.

In our previous paper with Bergelson, a “weak” version of the above theorem was proven, in which the polynomial ${P}$ in the conclusion had bounded degree ${O_{s,p}(1)}$, rather than being of degree at most ${s}$. In the current paper, we use this weak inverse theorem to reduce the inverse conjecture to a statement purely about polynomials:

Theorem 2 (Inverse conjecture for polynomials) Let ${s \geq 0}$, and let ${P: V \rightarrow {\bf C}}$ be a non-classical polynomial of degree at most ${s+1}$ such that ${\|e(P)\|_{U^{s+1}(V)} \geq \delta > 0}$. Then ${P}$ has bounded rank in the sense that ${P}$ is a function of ${O_{s,p,\delta}(1)}$ polynomials of degree at most ${s}$.

This type of inverse theorem was first introduced by Bogdanov and Viola. The deduction of Theorem 1 from Theorem 2 and the weak inverse Gowers conjecture is fairly standard, so the main difficulty is to show Theorem 2.

The quantity ${-\log_{|{\bf F}|} \|e(P)\|_{U^{s+1}(V)}^{1/2^{s+1}}}$ of a polynomial ${P}$ of degree at most ${s+1}$ was denoted the analytic rank of ${P}$ by Gowers and Wolf. They observed that the analytic rank of ${P}$ was closely related to the rank of ${P}$, defined as the least number of degree ${s}$ polynomials needed to express ${P}$. For instance, in the quadratic case ${s=1}$ the two ranks are identical (in odd characteristic, at least). For general ${s}$, it was easy to see that bounded rank implied bounded analytic rank; Theorem 2 is the converse statement.

We tried a number of ways to show that bounded analytic rank implied bounded rank, in particular spending a lot of time on ergodic-theoretic approaches, but eventually we settled on a “brute force” approach that relies on classifying those polynomials of bounded analytic rank as precisely as possible. The argument splits up into establishing three separate facts:

1. (Classical case) If a classical polynomial has bounded analytic rank, then it has bounded rank.
2. (Multiplication by ${p}$) If a non-classical polynomial ${P}$ (of degree at most ${s+1}$) has bounded analytic rank, then ${pP}$ (which can be shown to have degree at most ${\max(s-p,0)}$) also has bounded analytic rank.
3. (Division by ${p}$) If ${Q}$ is a non-clsasical polynomial of degree ${\max(s-p,0)}$ of bounded rank, then there is a non-classical polynomial ${P}$ of degree at most ${s+1}$ of bounded rank such that ${pQ=P}$.

The multiplication by ${p}$ and division by ${p}$ facts allow one to easily extend the classical case of the theorem to the non-classical case of the theorem, basically because classical polynomials are the kernel of the multiplication-by-${p}$ homomorphism. Indeed, if ${P}$ is a non-classical polynomial of bounded analytic rank of the right degree, then the multiplication by ${p}$ claim tells us that ${pP}$ also has bounded analytic rank, which by an induction hypothesis implies that ${pP}$ has bounded rank. Applying the division by ${p}$ claim, we find a bounded rank polynomial ${P'}$ such that ${pP = pP'}$, thus ${P}$ differs from ${P'}$ by a classical polynomial, which necessarily has bounded analytic rank, hence bounded rank by the classical claim, and the claim follows.

Of the three claims, the multiplication-by-${p}$ claim is the easiest to prove using known results; after a bit of Fourier analysis, it turns out to follow more or less immediately from the multidimensional Szemerédi theorem over finite fields of Bergelson, Leibman, and McCutcheon (one can also use the density Hales-Jewett theorem here if one desires).

The next easiest claim is the classical case. Here, the idea is to analyse a degree ${s+1}$ classical polynomial ${P: V \rightarrow {\bf F}}$ via its derivative ${d^{s+1} P: V^{s+1} \rightarrow {\bf F}}$, defined by the formula

$\displaystyle d^{s+1} P( h_1,\ldots,h_{s+1}) := \partial_{h_1} \ldots \partial_{h_{s+1}} P(x)$

for any ${x,h_1,\ldots,h_{s+1} \in V}$ (the RHS is independent of ${x}$ as ${P}$ has degree ${s+1}$). This is a multilinear form, and if ${P}$ has bounded analytic rank, this form is biased (in the sense that the mean of ${e(d^{s+1} P)}$ is large). Applying a general equidistribution theorem of Kaufman and Lovett (based on this earlier paper of Green and myself) this implies that ${d^{s+1} P}$ is a function of a bounded number of multilinear forms of lower degree. Using some “regularity lemma” theory to clean up these forms so that they have good equidistribution properties, it is possible to understand exactly how the original multilinear form ${d^{s+1} P}$ depends on these lower degree forms; indeed, the description one eventually obtains is so explicit that one can write down by inspection another bounded rank polynomial ${Q}$ such that ${d^{s+1} P}$ is equal to ${d^{s+1} Q}$. Thus ${P}$ differs from the bounded rank polynomial ${Q}$ by a lower degree error, which is automatically of bounded rank also, and the claim follows.

The trickiest thing to establish is the division by ${p}$ claim. The polynomial ${Q}$ is some function ${F(R_1,\ldots,R_m)}$ of lower degree polynomials ${R_1,\ldots,R_m}$. Ideally, one would like to find a function ${F'(R_1,\ldots,R_m)}$ of the same polynomials with ${pF' = F}$, such that ${F'(R_1,\ldots,R_m)}$ has the correct degree; however, we have counterexamples that show that this is not always possible. (These counterexamples are the main obstruction to making the ergodic theory approach work: in ergodic theory, one is only allowed to work with “measurable” functions, which are roughly analogous in this context to functions of the indicated polynomials ${Q, R_1,\ldots,R_m}$ and their shifts.) To get around this we have to first apply a regularity lemma to place ${R_1,\ldots,R_m}$ in a suitably equidistributed form (although the fact that ${R_1,\ldots,R_m}$ may be non-classical leads to a rather messy and technical description of this equidistribution), and then we have to extend each ${R_j}$ to a higher degree polynomial ${R'_j}$ with ${pR'_j = R_j}$. There is a crucial “exact roots” property of polynomials that allows one to do this, with ${R'_j}$ having degree exactly ${p-1}$ higher than ${R_j}$. It turns out that it is possible to find a function ${P = F'(R'_1,\ldots,R'_m)}$ of these extended polynomials that have the right degree and which solves the required equation ${pP=Q}$; this is established by classifying completely all functions of the equidistributed polynomials ${R_1,\ldots,R_m}$ or ${R'_1,\ldots,R'_m}$ that are of a given degree.

Ben Green, Tamar Ziegler, and I have just uploaded to the arXiv our paper “An inverse theorem for the Gowers U^{s+1}[N] norm“, which was previously announced on this blog.  We are still planning one final round of reviewing the preprint before submitting the paper, but it has gotten to the stage where we are comfortable with having the paper available on the arXiv.

The main result of the paper is to establish the inverse conjecture for the Gowers norm over the integers, which has a number of applications, in particular to counting solutions to various linear equations in primes.  In spirit, the proof of the paper follows the 21-page announcement that was uploaded previously.  However, for various rather annoying technical reasons, the 117-page paper has to devote a large amount of space to setting up various bits of auxiliary machinery (as well as a dozen or so pages worth of examples and discussion).  For instance, the announcement motivates many of the steps of the argument by heuristically identifying nilsequences $n \mapsto F(g(n) \Gamma)$ with bracket polynomial phases such as $n \mapsto e( \{ \alpha n \} \beta n )$.  However, a rather significant amount of theory (which was already worked out to a large extent by Leibman) is needed to formalise the “bracket algebra” needed to manipulate such bracket polynomials and to connect them with nilsequences.  Furthermore, the “piecewise smooth” nature of bracket polynomials causes some technical issues with the equidistribution theory for these sequences.  Our original version of the paper (which was even longer than the current version) set out this theory.  But we eventually decided that it was best to eschew almost all use of bracket polynomials (except as motivation and examples), and run the argument almost entirely within the language of nilsequences, to keep the argument a bit more notationally focused (and to make the equidistribution theory easier to establish).  But this was not without a tradeoff; some statements that are almost trivially true for bracket polynomials, required some “nilpotent algebra” to convert to the language of nilsequences.  Here are some examples of this:

1. It is intuitively clear that a bracket polynomial phase e(P(n)) of degree k in one variable n can be “multilinearised” to a polynomial $e(Q(n_1,\ldots,n_k))$ of multi-degree $(1,\ldots,1)$ in k variables $n_1,\ldots,n_k$, such that $e(P(n))$ and $e(Q(n,\ldots,n))$ agree modulo lower order terms.  For instance, if $e(P(n)) = e(\alpha n \{ \beta n \{ \gamma n \} \})$ (so k=3), then one could take $e(Q(n_1,n_2,n_3)) = e( \alpha n_1 \{ \beta n_2 \{ \gamma n_3 \} \})$.   The analogue of this statement for nilsequences is true, but required a moderately complicated nilpotent algebra construction using the Baker-Campbell-Hausdorff formula.
2. Suppose one has a bracket polynomial phase e(P_h(n)) of degree k in one variable n that depends on an additional parameter h, in such a way that exactly one of the coefficients in each monomial depends on h.  Furthermore, suppose this dependence is bracket linear in h.  Then it is intuitively clear that this phase can be rewritten (modulo lower order terms) as e( Q(h,n) ) where Q is a bracket polynomial of multidegree (1,k) in (h,n).  For instance, if $e(P_h(n)) = e( \{ \alpha_h n \} \beta n )$ and $\alpha_h = \{\gamma h \} \delta$, then we can take $e(Q(h,n)) = e(\{ \{\gamma h\} \delta n\} \beta n )$.  The nilpotent algebra analogue of this claim is true, but requires another moderately complicated nilpotent algebra construction based on semi-direct products.
3. A bracket polynomial has a fairly visible concept of a “degree” (analogous to the corresponding notion for true polynomials), as well as a “rank” (which, roughly speaking measures the number of parentheses in the bracket monomials, plus one).  Thus, for instance, the bracket monomial $\{\{ \alpha n^4 \} \beta n \} \gamma n^2$ has degree 7 and rank 3.  Defining degree and rank for nilsequences requires one to generalise the notion of a (filtered) nilmanifold to one in which the lower central series is replaced by a filtration indexed by both the degree and the rank.

There are various other tradeoffs of this type in this paper.  For instance, nonstandard analysis tools were introduced to eliminate what would otherwise be quite a large number of epsilons and regularity lemmas to manage, at the cost of some notational overhead; and the piecewise discontinuities mentioned earlier were eliminated by the use of vector-valued nilsequences, though this again caused some further notational overhead.    These difficulties may be a sign that we do not yet have the “right” proof of this conjecture, but one will probably have to wait a few years before we get a proper amount of perspective and understanding on this circle of ideas and results.

Ben Green, Tamar Ziegler, and I have just uploaded to the arXiv the note “An inverse theorem for the Gowers norm ${U^{s+1}[N]}$ (announcement)“, not intended for publication. This is an announcement of our forthcoming solution of the inverse conjecture for the Gowers norm, which roughly speaking asserts that ${U^{s+1}[N]}$ norm of a bounded function is large if and only if that function correlates with an ${s}$-step nilsequence of bounded complexity.

The full argument is quite lengthy (our most recent draft is about 90 pages long), but this is in large part due to the presence of various technical details which are necessary in order to make the argument fully rigorous. In this 20-page announcement, we instead sketch a heuristic proof of the conjecture, relying in a number of “cheats” to avoid the above-mentioned technical details. In particular:

• In the announcement, we rely on somewhat vaguely defined terms such as “bounded complexity” or “linearly independent with respect to bounded linear combinations” or “equivalent modulo lower step errors” without specifying them rigorously. In the full paper we will use the machinery of nonstandard analysis to rigorously and precisely define these concepts.
• In the announcement, we deal with the traditional linear nilsequences rather than the polynomial nilsequences that turn out to be better suited for finitary equidistribution theory, but require more notation and machinery in order to use.
• In a similar vein, we restrict attention to scalar-valued nilsequences in the announcement, though due to topological obstructions arising from the twisted nature of the torus bundles used to build nilmanifolds, we will have to deal instead with vector-valued nilsequences in the main paper.
• In the announcement, we pretend that nilsequences can be described by bracket polynomial phases, at least for the sake of making examples, although strictly speaking bracket polynomial phases only give examples of piecewise Lipschitz nilsequences rather than genuinely Lipschitz nilsequences.

With these cheats, it becomes possible to shorten the length of the argument substantially. Also, it becomes clearer that the main task is a cohomological one; in order to inductively deduce the inverse conjecture for a given step ${s}$ from the conjecture for the preceding step ${s-1}$, the basic problem is to show that a certain (quasi-)cocycle is necessarily a (quasi-)coboundary. This in turn requires a detailed analysis of the top order and second-to-top order terms in the cocycle, which requires a certain amount of nilsequence equidistribution theory and additive combinatorics, as well as a “sunflower decomposition” to arrange the various nilsequences one encounters into a usable “normal form”.

It is often the case in modern mathematics that the informal heuristic way to explain an argument looks quite different (and is significantly shorter) than the way one would formally present the argument with all the details. This seems to be particularly true in this case; at a superficial level, the full paper has a very different set of notation than the announcement, and a lot of space is invested in setting up additional machinery that one can quickly gloss over in the announcement. We hope though that the announcement can provide a “road map” to help navigate the much longer paper to come.

Ben Green, Tamar Ziegler and I have just uploaded to the arXiv our paper “An inverse theorem for the Gowers $U^4$ norm“.  This paper establishes the next case of the inverse conjecture for the Gowers norm for the integers (after the $U^3$ case, which was done by Ben and myself a few years ago).  This conjecture has a number of combinatorial and number-theoretic consequences, for instance by combining this new inverse theorem with previous results, one can now get the correct asymptotic for the number of arithmetic progressions of primes of length five in any large interval $[N] = \{1,\ldots,N\}$.

To state the inverse conjecture properly requires a certain amount of notation.  Given a function $f: {\Bbb Z} \to {\Bbb C}$ and a shift $h \in {\Bbb Z}$, define the multiplicative derivative

$\Delta_h f(x) := f(x+h) \overline{f(x)}$

and then define the Gowers $U^{s+1}[N]$ norm of a function $f: [N] \to {\Bbb C}$ to (essentially) be the quantity

$\| f\|_{U^{s+1}[N]} := ({\Bbb E}_{h_1,\ldots,h_{s+1} \in [-N,N]} {\Bbb E}_{x \in [N]} |\Delta_{h_1} \ldots \Delta_{h_{s+1}} f(x)|)^{1/2^{s+1}},$

where we extend f by zero outside of $[N]$.  (Actually, we use a slightly different normalisation to ensure that the function 1 has a $U^{s+1}$ norm of 1, but never mind this for now.)

Informally, the Gowers norm $\|f\|_{U^{s+1}[N]}$ measures the amount of bias present in the $(s+1)^{st}$ multiplicative derivatives of $f$.  In particular, if $f = e(P) := e^{2\pi i P}$ for some polynomial $P: {\Bbb Z} \to {\Bbb C}$, then the $(s+1)^{th}$ derivative of $f$ is identically 1, and so is the Gowers norm.

However, polynomial phases are not the only functions with large Gowers norm.  For instance, consider the function $f(n) := e( \lfloor \sqrt{2} n \rfloor \sqrt{3} n )$, which is what we call a quadratic bracket polynomial phase.  This function isn’t quite quadratic, but it is close enough to being quadratic (because one has the approximate linearity relationship $\lfloor x+y \rfloor = \lfloor x \rfloor + \lfloor y \rfloor$ holding a good fraction of the time) that it turns out that third derivative is trivial fairly often, and the Gowers norm $\|f\|_{U^3[N]}$ is comparable to 1.  This bracket polynomial phase can be modeled as a nilsequence $n \mapsto F( g(n) \Gamma )$, where $n \mapsto g(n) \Gamma$ is a polynomial orbit on a nilmanifold $G/\Gamma$, which in this case has step 2.  (The function F is only piecewise smooth, due to the discontinuity in the floor function $\lfloor \rfloor$, so strictly speaking we would classify this as an almost nilsequence rather than a nilsequence, but let us ignore this technical issue here.)  In fact, there is a very close relationship between nilsequences and bracket polynomial phases, but I will detail this in a later post.

The inverse conjecture for the Gowers norm, GI(s), asserts that such nilsequences are the only obstruction to the Gowers norm being small.  Roughly speaking, it goes like this:

Inverse conjecture, GI(s). (Informal statement)  Suppose that $f: [N] \to {\Bbb C}$ is bounded but has large $U^{s+1}[N]$ norm.  Then there is an s-step nilsequence $n \mapsto F( g(n) \Gamma )$ of “bounded complexity” that correlates with f.

This conjecture is trivial for s=0, is a short consequence of Fourier analysis when s=1, and was proven for s=2 by Ben and myself.  In this paper we establish the s=3 case.  An equivalent formulation in this case is that any bounded function $f$ of large $U^4$ norm must correlate with a “bracket cubic phase”, which is the product of a bounded number of phases from the following list

$e( \alpha n^3 + \beta n^2 + \gamma n), e( \lfloor \alpha n \rfloor \beta n^2 ), e( \lfloor \alpha n \rfloor \lfloor \beta n \rfloor \gamma n ), e( \lfloor \alpha n \rfloor \beta n )$ (*)

for various real numbers $\alpha,\beta,\gamma$.

It appears that our methods also work in higher step, though for technical reasons it is convenient to make a number of adjustments to our arguments to do so, most notably a switch from standard analysis to non-standard analysis, about which I hope to say more later.  But there are a number of simplifications available on the s=3 case which make the argument significantly shorter, and so we will be writing the higher s argument in a separate paper.

The arguments largely follow those for the s=2 case (which in turn are based on this paper of Gowers).  Two major new ingredients are a deployment of a normal form and equidistribution theory for bracket quadratic phases, and a combinatorial decomposition of frequency space which we call the sunflower decomposition.  I will sketch these ideas below the fold.

Vitaly Bergelson, Tamar Ziegler, and I have just uploaded to the arXiv our paper “An inverse theorem for the uniformity seminorms associated with the action of $F^\infty_p$“. This paper establishes the ergodic inverse theorems that are needed in our other recent paper to establish the inverse conjecture for the Gowers norms over finite fields in high characteristic (and to establish a partial result in low characteristic), as follows:

Theorem. Let ${\Bbb F}$ be a finite field of characteristic p.  Suppose that $X = (X,{\mathcal B},\mu)$ is a probability space with an ergodic measure-preserving action $(T_g)_{g \in {\Bbb F}^\omega}$ of ${\Bbb F}^\omega$.  Let $f \in L^\infty(X)$ be such that the Gowers-Host-Kra seminorm $\|f\|_{U^k(X)}$ (defined in a previous post) is non-zero.

1. In the high-characteristic case $p \geq k$, there exists a phase polynomial g of degree <k (as defined in the previous post) such that $|\int_X f \overline{g}\ d\mu| > 0$.
2. In general characteristic, there exists a phase polynomial of degree <C(k) for some C(k) depending only on k such that $|\int_X f \overline{g}\ d\mu| > 0$.

This theorem is closely analogous to a similar theorem of Host and Kra on ergodic actions of ${\Bbb Z}$, in which the role of phase polynomials is played by functions that arise from nilsystem factors of X.  Indeed, our arguments rely heavily on the machinery of Host and Kra.

The paper is rather technical (60+ pages!) and difficult to describe in detail here, but I will try to sketch out (in very broad brush strokes) what the key steps in the proof of part 2 of the theorem are.  (Part 1 is similar but requires a more delicate analysis at various stages, keeping more careful track of the degrees of various polynomials.)

Let $k \geq 0$ be an integer.  The concept of a polynomial $P: {\Bbb R} \to {\Bbb R}$ of one variable of degree $ (or $\leq k-1$) can be defined in one of two equivalent ways:

• (Global definition) $P: {\Bbb R} \to {\Bbb R}$ is a polynomial of degree $ iff it can be written in the form $P(x) = \sum_{0 \leq j < k} c_j x^j$ for some coefficients $c_j \in {\Bbb R}$.
• (Local definition) $P: {\Bbb R} \to {\Bbb R}$ is a polynomial of degree $ if it is k-times continuously differentiable and $\frac{d^k}{dx^k} P \equiv 0$.

From single variable calculus we know that if P is a polynomial in the global sense, then it is a polynomial in the local sense; conversely, if P is a polynomial in the local sense, then from the Taylor series expansion

$\displaystyle P(x) = \sum_{0 \leq j < k} \frac{P^{(j)}(0)}{j!} x^j$

we see that P is a polynomial in the global sense. We make the trivial remark that we have no difficulty dividing by $j!$ here, because the field ${\Bbb R}$ is of characteristic zero.

The above equivalence carries over to higher dimensions:

• (Global definition) $P: {\Bbb R}^n \to {\Bbb R}$ is a polynomial of degree $ iff it can be written in the form $P(x_1,\ldots,x_n) = \sum_{0 \leq j_1,\ldots,j_n; j_1+\ldots+j_n < k} c_{j_1,\ldots,j_n} x_1^{j_1} \ldots x_n^{j_n}$ for some coefficients $c_{j_1,\ldots,j_n} \in {\Bbb R}$.
• (Local definition) $P: {\Bbb R}^n \to {\Bbb R}$ is a polynomial of degree $ if it is k-times continuously differentiable and $(h_1 \cdot \nabla) \ldots (h_k \cdot \nabla) P \equiv 0$ for all $h_1,\ldots,h_k \in {\Bbb R}^n$.

Again, it is not difficult to use several variable calculus to show that these two definitions of a polynomial are equivalent.

The purpose of this (somewhat technical) post here is to record some basic analogues of the above facts in finite characteristic, in which the underlying domain of the polynomial P is F or $F^n$ for some finite field F.  In the “classical” case when the range of P is also the field F, it is a well-known fact (which we reproduce here) that the local and global definitions of polynomial are equivalent.  But in the “non-classical” case, when P ranges in a more general group (and in particular in the unit circle ${\Bbb R}/{\Bbb Z}$), the global definition needs to be corrected somewhat by adding some new monomials to the classical ones $x_1^{j_1} \ldots x_n^{j_n}$.  Once one does this, one can recover the equivalence between the local and global definitions.

(The results here are derived from forthcoming work with Vitaly Bergelson and Tamar Ziegler.)

Tamar Ziegler and I have just uploaded to the arXiv our paper, “The inverse conjecture for the Gowers norm over finite fields via the correspondence principle“, submitted to Analysis & PDE.  As announced a few months ago in this blog post, this paper establishes (most of) the inverse conjecture for the Gowers norm from an ergodic theory analogue of this conjecture (in a forthcoming paper by Vitaly Bergelson, Tamar Ziegler, and myself, which should be ready shortly), using a variant of the Furstenberg correspondence principle.  Our papers were held up for a while due to some unexpected technical difficulties arising in the low characteristic case; as a consequence, our paper only establishes the full inverse conjecture in the high characteristic case $p \geq k$, and gives a partial result in the low characteristic case $p < k$.

In the rest of this post, I would like to describe the inverse conjecture (in both combinatorial and ergodic forms), and sketch how one deduces one from the other via the correspondence principle (together with two additional ingredients, namely a statistical sampling lemma and a local testability result for polynomials).