You are currently browsing the tag archive for the ‘Gowers uniformity norms’ tag.

The von Neumann ergodic theorem (the Hilbert space version of the mean ergodic theorem) asserts that if is a unitary operator on a Hilbert space , and is a vector in that Hilbert space, then one has

in the strong topology, where is the -invariant subspace of , and is the orthogonal projection to . (See e.g. these previous lecture notes for a proof.) The same proof extends to more general amenable groups: if is a countable amenable group acting on a Hilbert space by unitary transformations for , and is a vector in that Hilbert space, then one has

for any Folner sequence of , where is the -invariant subspace, and is the average of on . Thus one can interpret as a certain average of elements of the orbit of .

In a previous blog post, I noted a variant of this ergodic theorem (due to Alaoglu and Birkhoff) that holds even when the group is not amenable (or not discrete), using a more abstract notion of averaging:

Theorem 1 (Abstract ergodic theorem)Let be an arbitrary group acting unitarily on a Hilbert space , and let be a vector in . Then is the element in the closed convex hull of of minimal norm, and is also the unique element of in this closed convex hull.

I recently stumbled upon a different way to think about this theorem, in the additive case when is abelian, which has a closer resemblance to the classical mean ergodic theorem. Given an arbitrary additive group (not necessarily discrete, or countable), let denote the collection of finite non-empty multisets in – that is to say, unordered collections of elements of , not necessarily distinct, for some positive integer . Given two multisets , in , we can form the sum set . Note that the sum set can contain multiplicity even when do not; for instance, . Given a multiset in , and a function from to a vector space , we define the average as

Note that the multiplicity function of the set affects the average; for instance, we have , but .

We can define a directed set on as follows: given two multisets , we write if we have for some . Thus for instance we have . It is easy to verify that this operation is transitive and reflexive, and is directed because any two elements of have a common upper bound, namely . (This is where we need to be abelian.) The notion of convergence along a net, now allows us to define the notion of convergence along ; given a family of points in a topological space indexed by elements of , and a point in , we say that *converges* to along if, for every open neighbourhood of in , one has for sufficiently large , that is to say there exists such that for all . If the topological space is Hausdorff, then the limit is unique (if it exists), and we then write

When takes values in the reals, one can also define the limit superior or limit inferior along such nets in the obvious fashion.

We can then give an alternate formulation of the abstract ergodic theorem in the abelian case:

Theorem 2 (Abelian abstract ergodic theorem)Let be an arbitrary additive group acting unitarily on a Hilbert space , and let be a vector in . Then we havein the strong topology of .

*Proof:* Suppose that , so that for some , then

so by unitarity and the triangle inequality we have

thus is monotone non-increasing in . Since this quantity is bounded between and , we conclude that the limit exists. Thus, for any , we have for sufficiently large that

for all . In particular, for any , we have

We can write

and so from the parallelogram law and unitarity we have

for all , and hence by the triangle inequality (averaging over a finite multiset )

for any . This shows that is a Cauchy sequence in (in the strong topology), and hence (by the completeness of ) tends to a limit. Shifting by a group element , we have

and hence is invariant under shifts, and thus lies in . On the other hand, for any and , we have

and thus on taking strong limits

and so is orthogonal to . Combining these two facts we see that is equal to as claimed.

To relate this result to the classical ergodic theorem, we observe

Lemma 3Let be a countable additive group, with a F{\o}lner sequence , and let be a bounded sequence in a normed vector space indexed by . If exists, then exists, and the two limits are equal.

*Proof:* From the F{\o}lner property, we see that for any and any , the averages and differ by at most in norm if is sufficiently large depending on , (and the ). On the other hand, by the existence of the limit , the averages and differ by at most in norm if is sufficiently large depending on (regardless of how large is). The claim follows.

It turns out that this approach can also be used as an alternate way to construct the Gowers–Host-Kra seminorms in ergodic theory, which has the feature that it does not explicitly require any amenability on the group (or separability on the underlying measure space), though, as pointed out to me in comments, even uncountable abelian groups are amenable in the sense of possessing an invariant mean, even if they do not have a F{\o}lner sequence.

Given an arbitrary additive group , define a *-system* to be a probability space (not necessarily separable or standard Borel), together with a collection of invertible, measure-preserving maps, such that is the identity and (modulo null sets) for all . This then gives isomorphisms for by setting . From the above abstract ergodic theorem, we see that

in the strong topology of for any , where is the collection of measurable sets that are essentially -invariant in the sense that modulo null sets for all , and is the conditional expectation of with respect to .

In a similar spirit, we have

Theorem 4 (Convergence of Gowers-Host-Kra seminorms)Let be a -system for some additive group . Let be a natural number, and for every , let , which for simplicity we take to be real-valued. Then the expressionconverges, where we write , and we are using the product direct set on to define the convergence . In particular, for , the limit

converges.

We prove this theorem below the fold. It implies a number of other known descriptions of the Gowers-Host-Kra seminorms , for instance that

for , while from the ergodic theorem we have

This definition also manifestly demonstrates the cube symmetries of the Host-Kra measures on , defined via duality by requiring that

In a subsequent blog post I hope to present a more detailed study of the norm and its relationship with eigenfunctions and the Kronecker factor, without assuming any amenability on or any separability or topological structure on .

I’ve just finished writing the first draft of my third book coming out of the 2010 blog posts, namely “Higher order Fourier analysis“, which was based primarily on my graduate course in the topic, though it also contains material from some additional posts related to linear and higher order Fourier analysis on the blog. It is available online here. As usual, comments and corrections are welcome. There is also a stub page for the book, which at present does not contain much more than the above link.

Tanja Eisner and I have just uploaded to the arXiv our paper “Large values of the Gowers-Host-Kra seminorms“, submitted to Journal d’Analyse Mathematique. This paper is concerned with the properties of three closely related families of (semi)norms, indexed by a positive integer :

- The
*Gowers uniformity norms*of a (bounded, measurable, compactly supported) function taking values on a locally compact abelian group , equipped with a Haar measure ; - The Gowers uniformity norms of a function on a discrete interval ; and
- The Gowers-Host-Kra seminorms of a function on an ergodic measure-preserving system .

These norms have been discussed in depth in previous blog posts, so I will just quickly review the definition of the first norm here (the other two (semi)norms are defined similarly). The norm is defined recursively by setting

and

where . Equivalently, one has

Informally, the Gowers uniformity norm measures the extent to which (the phase of ) behaves like a polynomial of degree less than . Indeed, if and is compact with normalised Haar measure , it is not difficult to show that is at most , with equality if and only if takes the form almost everywhere, where is a polynomial of degree less than (which means that for all ).

Our first result is to show that this result is robust, uniformly over all choices of group :

Theorem 1 (-near extremisers)Let be a compact abelian group with normalised Haar measure , and let be such that and for some and . Then there exists a polynomial of degree at most such that , where is bounded by a quantity that goes to zero as for fixed .

The quantity can be described effectively (it is of polynomial size in ), but we did not seek to optimise it here. This result was already known in the case of vector spaces over a fixed finite field (where it is essentially equivalent to the assertion that the property of being a polynomial of degree at most is locally testable); the extension to general groups turns out to fairly routine. The basic idea is to use the recursive structure of the Gowers norms, which tells us in particular that if is close to one, then is close to one for most , which by induction implies that is close to for some polynomials of degree at most and for most . (Actually, it is not difficult to use cocycle equations such as (when ) to upgrade “for most ” to “for all “.) To finish the job, one would like to express the as derivatives of a polynomial of degree at most . This turns out to be equivalent to requiring that the obey the cocycle equation

where is the translate of by . (In the paper, the sign conventions are reversed, so that , in order to be compatible with ergodic theory notation, but this makes no substantial difference to the arguments or results.) However, one does not quite get this right away; instead, by using some separation properties of polynomials, one can show the weaker statement that

where the are small real constants. To eliminate these constants, one exploits the trivial cohomology of the real line. From (1) one soon concludes that the obey the -cocycle equation

and an averaging argument then shows that is a -coboundary in the sense that

for some small scalar depending on . Subtracting from then gives the claim.

Similar results and arguments also hold for the and norms, which we will not detail here.

Dimensional analysis reveals that the norm is not actually the most natural norm with which to compare the norms against. An application of Young’s convolution inequality in fact reveals that one has the inequality

where is the critical exponent , without any compactness or normalisation hypothesis on the group and the Haar measure . This allows us to extend the norm to all of . There is then a stronger inverse theorem available:

Theorem 2 (-near extremisers)Let be a locally compact abelian group, and let be such that and for some and . Then there exists a coset of a compact open subgroup of , and a polynomial of degree at most such that .

Conversely, it is not difficult to show that equality in (2) is attained when takes the form as above. The main idea of proof is to use an inverse theorem for Young’s inequality due to Fournier to reduce matters to the case that was already established. An analogous result is also obtained for the norm on an ergodic system; but for technical reasons, the methods do not seem to apply easily to the norm. (This norm is essentially equivalent to the norm up to constants, with comparable to , but when working with near-extremisers, norms that are only equivalent up to constants can have quite different near-extremal behaviour.)

In the case when is a Euclidean group , it is possible to use the sharp Young inequality of Beckner and of Brascamp-Lieb to improve (2) somewhat. For instance, when , one has

with equality attained if and only if is a gaussian modulated by a quadratic polynomial phase. This additional gain of allows one to pinpoint the threshold for the previous near-extremiser results in the case of norms. For instance, by using the Host-Kra machinery of characteristic factors for the norm, combined with an explicit and concrete analysis of the -step nilsystems generated by that machinery, we can show that

whenever is a *totally* ergodic system and is orthogonal to all linear and quadratic eigenfunctions (which would otherwise form immediate counterexamples to the above inequality), with the factor being best possible. We can also establish analogous results for the and norms (using the inverse theorem of Ben Green and myself, in place of the Host-Kra machinery), although it is not clear to us whether the threshold remains best possible in this case.

In Notes 5, we saw that the Gowers uniformity norms on vector spaces in high characteristic were controlled by classical polynomial phases .

Now we study the analogous situation on cyclic groups . Here, there is an unexpected surprise: the polynomial phases (classical or otherwise) are no longer sufficient to control the Gowers norms once exceeds . To resolve this problem, one must enlarge the space of polynomials to a larger class. It turns out that there are at least three closely related options for this class: the *local polynomials*, the *bracket polynomials*, and the *nilsequences*. Each of the three classes has its own strengths and weaknesses, but in my opinion the nilsequences seem to be the most natural class, due to the rich algebraic and dynamical structure coming from the nilpotent Lie group undergirding such sequences. For reasons of space we shall focus primarily on the nilsequence viewpoint here.

Traditionally, nilsequences have been defined in terms of linear orbits on nilmanifolds ; however, in recent years it has been realised that it is convenient for technical reasons (particularly for the quantitative “single-scale” theory) to generalise this setup to that of *polynomial* orbits , and this is the perspective we will take here.

A polynomial phase on a finite abelian group is formed by starting with a polynomial to the unit circle, and then composing it with the exponential function . To create a nilsequence , we generalise this construction by starting with a polynomial into a *nilmanifold* , and then composing this with a Lipschitz function . (The Lipschitz regularity class is convenient for minor technical reasons, but one could also use other regularity classes here if desired.) These classes of sequences certainly include the polynomial phases, but are somewhat more general; for instance, they *almost* include *bracket polynomial* phases such as . (The “almost” here is because the relevant functions involved are only piecewise Lipschitz rather than Lipschitz, but this is primarily a technical issue and one should view bracket polynomial phases as “morally” being nilsequences.)

In these notes we set out the basic theory for these nilsequences, including their equidistribution theory (which generalises the equidistribution theory of polynomial flows on tori from Notes 1) and show that they are indeed obstructions to the Gowers norm being small. This leads to the *inverse conjecture for the Gowers norms* that shows that the Gowers norms on cyclic groups are indeed controlled by these sequences.

In Notes 3, we saw that the number of additive patterns in a given set was (in principle, at least) controlled by *the Gowers uniformity norms* of functions associated to that set.

Such norms can be defined on any finite additive group (and also on some other types of domains, though we will not discuss this point here). In particular, they can be defined on the finite-dimensional vector spaces over a finite field .

In this case, the Gowers norms are closely tied to the space of polynomials of degree at most . Indeed, as noted in Exercise 20 of Notes 4, a function of norm has norm equal to if and only if for some ; thus polynomials solve the “ inverse problem” for the trivial inequality . They are also a crucial component of the solution to the “ inverse problem” and “ inverse problem”. For the former, we will soon show:

Proposition 1 ( inverse theorem for )Let be such that and for some . Then there exists such that , where is a constant depending only on .

Thus, for the Gowers norm to be almost completely saturated, one must be very close to a polynomial. The converse assertion is easily established:

Exercise 1 (Converse to inverse theorem for )If and for some , then , where is a constant depending only on .

In the world, one no longer expects to be close to a polynomial. Instead, one expects to *correlate* with a polynomial. Indeed, one has

Lemma 2 (Converse to the inverse theorem for )If and are such that , where , then .

*Proof:* From the definition of the norm (equation (18) from Notes 3), the monotonicity of the Gowers norms (Exercise 19 of Notes 3), and the polynomial phase modulation invariance of the Gowers norms (Exercise 21 of Notes 3), one has

and the claim follows.

In the high characteristic case at least, this can be reversed:

Theorem 3 ( inverse theorem for )Suppose that . If is such that and , then there exists such that .

This result is sometimes referred to as the *inverse conjecture for the Gowers norm* (in high, but bounded, characteristic). For small , the claim is easy:

Exercise 2Verify the cases of this theorem. (Hint:to verify the case, use the Fourier-analytic identities and , where is the space of all homomorphisms from to , and are the Fourier coefficients of .)

This conjecture for larger values of are more difficult to establish. The case of the theorem was established by Ben Green and myself in the high characteristic case ; the low characteristic case was independently and simultaneously established by Samorodnitsky. The cases in the high characteristic case was established in two stages, firstly using a modification of the Furstenberg correspondence principle, due to Ziegler and myself. to convert the problem to an ergodic theory counterpart, and then using a modification of the methods of Host-Kra and Ziegler to solve that counterpart, as done in this paper of Bergelson, Ziegler, and myself.

The situation with the low characteristic case in general is still unclear. In the high characteristic case, we saw from Notes 4 that one could replace the space of non-classical polynomials in the above conjecture with the essentially equivalent space of classical polynomials . However, as we shall see below, this turns out not to be the case in certain low characteristic cases (a fact first observed by Lovett, Meshulam, and Samorodnitsky, and independently by Ben Green and myself), for instance if and ; this is ultimately due to the existence in those cases of non-classical polynomials which exhibit no significant correlation with classical polynomials of equal or lesser degree. This distinction between classical and non-classical polynomials appears to be a rather non-trivial obstruction to understanding the low characteristic setting; it may be necessary to obtain a more complete theory of non-classical polynomials in order to fully settle this issue.

The inverse conjecture has a number of consequences. For instance, it can be used to establish the analogue of Szemerédi’s theorem in this setting:

Theorem 4 (Szemerédi’s theorem for finite fields)Let be a finite field, let , and let be such that . If is sufficiently large depending on , then contains an (affine) line for some with .

Exercise 3Use Theorem 4 to establish the following generalisation: with the notation as above, if and is sufficiently large depending on , then contains an affine -dimensional subspace.

We will prove this theorem in two different ways, one using a density increment method, and the other using an energy increment method. We discuss some other applications below the fold.

A (complex, semi-definite) inner product space is a complex vector space equipped with a sesquilinear form which is conjugate symmetric, in the sense that for all , and non-negative in the sense that for all . By inspecting the non-negativity of for complex numbers , one obtains the Cauchy-Schwarz inequality

if one then defines , one then quickly concludes the triangle inequality

which then soon implies that is a semi-norm on . If we make the additional assumption that the inner product is positive definite, i.e. that whenever is non-zero, then this semi-norm becomes a norm. If is complete with respect to the metric induced by this norm, then is called a Hilbert space.

The above material is extremely standard, and can be found in any graduate real analysis course; I myself covered it here. But what is perhaps less well known (except inside the fields of additive combinatorics and ergodic theory) is that the above theory of classical Hilbert spaces is just the first case of a hierarchy of *higher order Hilbert spaces*, in which the binary inner product is replaced with a -ary inner product that obeys an appropriate generalisation of the conjugate symmetry, sesquilinearity, and positive semi-definiteness axioms. Such inner products then obey a higher order Cauchy-Schwarz inequality, known as the *Cauchy-Schwarz-Gowers* inequality, and then also obey a triangle inequality and become semi-norms (or norms, if the inner product was non-degenerate). Examples of such norms and spaces include the *Gowers uniformity norms* , the *Gowers box norms* , and the *Gowers-Host-Kra seminorms* ; a more elementary example are the family of Lebesgue spaces when the exponent is a power of two. They play a central role in modern additive combinatorics and to certain aspects of ergodic theory, particularly those relating to Szemerédi’s theorem (or its ergodic counterpart, the Furstenberg multiple recurrence theorem); they also arise in the regularity theory of hypergraphs (which is not unrelated to the other two topics).

A simple example to keep in mind here is the order two Hilbert space on a measure space , where the inner product takes the form

In this brief note I would like to set out the abstract theory of such higher order Hilbert spaces. This is not new material, being already implicit in the breakthrough papers of Gowers and Host-Kra, but I just wanted to emphasise the fact that the material is abstract, and is not particularly tied to any explicit choice of norm so long as a certain axiom are satisfied. (Also, I wanted to write things down so that I would not have to reconstruct this formalism again in the future.) Unfortunately, the notation is quite heavy and the abstract axiom is a little strange; it may be that there is a better way to formulate things. In this particular case it does seem that a concrete approach is significantly clearer, but abstraction is at least possible.

Note: the discussion below is likely to be comprehensible only to readers who already have some exposure to the Gowers norms.

In the previous lecture notes, we used (linear) Fourier analysis to control the number of three-term arithmetic progressions in a given set . The power of the Fourier transform for this problem ultimately stemmed from the identity

for any cyclic group and any subset of that group (analogues of this identity also exist for other finite abelian groups, and to a lesser extent to non-abelian groups also, although that is not the focus of my current discussion). As it turns out, linear Fourier analysis is not able to discern higher order patterns, such as arithmetic progressions of length four; we give some demonstrations of this below the fold, taking advantage of the polynomial recurrence theory from Notes 1.

The main objective of this course is to introduce the (still nascent) theory of *higher order Fourier analysis*, which is capable of studying higher order patterns. The full theory is still rather complicated (at least, at our present level of understanding). However, one aspect of the theory is relatively simple, namely that we can largely reduce the study of arbitrary additive patterns to the study of a single type of additive pattern, namely the *parallelopipeds*

Thus for instance, for one has the *line segments*

for one has the *parallelograms*

for one has the *parallelopipeds*

These patterns are particularly pleasant to handle, thanks to the large number of symmetries available on the discrete cube . For instance, whereas establishing the presence of arbitrarily long arithmetic progressions in dense sets is quite difficult (Szemerédi’s theorem), establishing arbitrarily high-dimensional parallelopipeds is much easier:

Exercise 1Let be such that for some . If is sufficiently large depending on , show that there exists an integer such that . (Hint:obtain upper and lower bounds on the set .)

Exercise 2 (Hilbert cube lemma)Let be such that for some , and let be an integer. Show that if is sufficiently large depending on , then contains a parallelopiped of the form (2), with positive integers. (Hint:use the previous exercise and induction.) Conclude that if has positive upper density, then it contains infinitely many such parallelopipeds for each .

Exercise 3Show that if is an integer, and is sufficiently large depending on , then for any parallelopiped (2) in the integers , there exists , not all zero, such that . (Hint: pigeonhole the in the residue classes modulo .) Use this to conclude that if is the set of all integers such that for all integers , then is a set of positive upper density (and also positive lower density) which does not contain any infinite parallelopipeds (thus one cannot take in the Hilbert cube lemma).

The standard way to control the parallelogram patterns (and thus, all other (finite complexity) linear patterns) are the *Gowers uniformity norms*

with a function on a finite abelian group , and is the complex conjugation operator; analogues of this norm also exist for group-like objects such as the progression , and also for measure-preserving systems (where they are known as the *Gowers-Host-Kra uniformity seminorms*, see this paper of Host-Kra for more discussion). In this set of notes we will focus on the basic properties of these norms; the deepest fact about them, known as the *inverse conjecture* for these norms, will be discussed in later notes.

Ben Green, and I have just uploaded to the arXiv a short (six-page) paper “Yet another proof of Szemeredi’s theorem“, submitted to the 70th birthday conference proceedings for Endre Szemerédi. In this paper we put in print a folklore observation, namely that the inverse conjecture for the Gowers norm, together with the density increment argument, easily implies Szemerédi’s famous theorem on arithmetic progressions. This is unsurprising, given that Gowers’ proof of Szemerédi’s theorem proceeds through a weaker version of the inverse conjecture and a density increment argument, and also given that it is possible to derive Szemerédi’s theorem from knowledge of the characteristic factor for multiple recurrence (the ergodic theory analogue of the inverse conjecture, first established by Host and Kra), as was done by Bergelson, Leibman, and Lesigne (and also implicitly in the earlier paper of Bergelson, Host, and Kra); but to our knowledge the exact derivation of Szemerédi’s theorem from the inverse conjecture was not in the literature. Ordinarily this type of folklore might be considered too trifling (and too well known among experts in the field) to publish; but we felt that the venue of the Szemerédi birthday conference provided a natural venue for this particular observation.

The key point is that one can show (by an elementary argument relying primarily an induction on dimension argument and the Weyl recurrence theorem, i.e. that given any real and any integer , that the expression gets arbitrarily close to an integer) that given a (polynomial) nilsequence , one can subdivide any long arithmetic progression (such as ) into a number of medium-sized progressions, where the nilsequence is nearly constant on each progression. As a consequence of this and the inverse conjecture for the Gowers norm, if a set has no arithmetic progressions, then it must have an elevated density on a subprogression; iterating this observation as per the usual density-increment argument as introduced long ago by Roth, one obtains the claim. (This is very close to the scheme of Gowers’ proof.)

Technically, one might call this the shortest proof of Szemerédi’s theorem in the literature (and would be something like the sixteenth such genuinely distinct proof, by our count), but that would be cheating quite a bit, primarily due to the fact that it assumes the inverse conjecture for the Gowers norm, our current proof of which is checking in at about 100 pages…

## Recent Comments