As mentioned in the previous two posts, Ben Green, Tamar Ziegler, and myself proved the following inverse theorem for the Gowers norms:

Theorem 1 (Inverse theorem for Gowers norms) Let and be integers, and let . Suppose that is a function supported on such that Then there exists a filtered nilmanifold of degree and complexity , a polynomial sequence , and a Lipschitz function of Lipschitz constant such that

There is a higher dimensional generalisation, which first appeared explicitly (in a more general form) in this preprint of Szegedy (which used a slightly different argument than the one of Ben, Tammy, and myself; see also this previous preprint of Szegedy with related results):

Theorem 2 (Inverse theorem for multidimensional Gowers norms) Let and be integers, and let . Suppose that is a function supported on such that Then there exists a filtered nilmanifold of degree and complexity , a polynomial sequence , and a Lipschitz function of Lipschitz constant such that

The case of this theorem was recently used by Wenbo Sun. One can replace the polynomial sequence with a linear sequence if desired by using a lifting trick (essentially due to Furstenberg, but which appears explicitly in Appendix C of my paper with Ben and Tammy).

In this post I would like to record a very neat and simple observation of Ben Green and Nikos Frantzikinakis, that uses the tool of Freiman isomorphisms to derive Theorem 2 as a corollary. Namely, consider the linear map defined by

that is to say is the digit string base that has digits . This map is a linear map from to a subset of of density . Furthermore it has the following “Freiman isomorphism” property: if lie in with in the image set of for all , then there exist (unique) lifts such that

and

for all . Indeed, the injectivity of on uniquely determines the sum for each , and one can use base arithmetic to verify that the alternating sum of these sums on any -facet of the cube vanishes, which gives the claim. (In the language of additive combinatorics, the point is that is a Freiman isomorphism of order (say) on .)
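To make the digit-string map concrete, here is a minimal Python sketch that brute-forces the lifting property for parallelograms (the lowest-order case) on a small box. The base `M` is a parameter of the sketch; the value `10 * N` used in the usage note is an assumption chosen only to be comfortably large enough to prevent carries between digits, and the function names `phi` and `check_freiman` are hypothetical.

```python
from itertools import product

def phi(v, M):
    """Read the tuple v = (n_1, ..., n_d) as a digit string in base M."""
    return sum(n * M ** i for i, n in enumerate(v))

def check_freiman(N, d, M):
    """Brute-force check of the parallelogram lifting property on {1..N}^d."""
    box = list(product(range(1, N + 1), repeat=d))
    inverse = {phi(v, M): v for v in box}
    if len(inverse) != len(box):  # phi must be injective on the box
        return False
    for a, b, c in product(box, repeat=3):
        # Think of n = phi(a), n + h1 = phi(b), n + h2 = phi(c).
        top = phi(b, M) + phi(c, M) - phi(a, M)  # this is n + h1 + h2
        if top in inverse:
            # The fourth vertex must lift to the coordinatewise b + c - a.
            lifted = tuple(y + z - x for x, y, z in zip(a, b, c))
            if inverse[top] != lifted:
                return False
    return True
```

With `N = 3` and `d = 2`, the check passes for `M = 30` but fails for `M = 3`, where a carry between digits lets a parallelogram in the image close up without being the image of a parallelogram in the box.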

Now let be the function defined by setting whenever , with vanishing outside of . If obeys (1), then from the above Freiman isomorphism property we have

Applying the one-dimensional inverse theorem (Theorem 1), with reduced by a factor of and replaced by , this implies the existence of a filtered nilmanifold of degree and complexity , a polynomial sequence , and a Lipschitz function of Lipschitz constant such that

which by the Freiman isomorphism property again implies that

But the map is clearly a polynomial map from to (the composition of two polynomial maps is polynomial, see e.g. Appendix B of my paper with Ben and Tammy), and the claim follows.

Remark 3 This trick appears to be largely restricted to the case of boundedly generated groups such as ; I do not see any easy way to deduce an inverse theorem for, say, from the -inverse theorem by this method.

Remark 4 By combining this argument with the one in the previous post, one can obtain a weak ergodic inverse theorem for -actions. Interestingly, the Freiman isomorphism argument appears to be difficult to implement directly in the ergodic category; in particular, there does not appear to be an obvious direct way to derive the Host-Kra inverse theorem for actions (a result first obtained in the PhD thesis of Griesmer) from the counterpart for actions.


As mentioned in the previous post, Ben Green, Tamar Ziegler, and myself proved the following inverse theorem for the Gowers norms:

Theorem 1 (Inverse theorem for Gowers norms) Let and be integers, and let . Suppose that is a function supported on such that Then there exists a filtered nilmanifold of degree and complexity , a polynomial sequence , and a Lipschitz function of Lipschitz constant such that

This result was conjectured earlier by Ben Green and myself; this conjecture was strongly motivated by an analogous inverse theorem in ergodic theory by Host and Kra, which we formulate here in a form designed to resemble Theorem 1 as closely as possible:

Theorem 2 (Inverse theorem for Gowers-Host-Kra seminorms) Let be an integer, and let be an ergodic, countably generated measure-preserving system. Suppose that one has for all non-zero (all spaces are real-valued in this post). Then is an inverse limit (in the category of measure-preserving systems, up to almost everywhere equivalence) of ergodic degree nilsystems, that is to say systems of the form for some degree filtered nilmanifold and a group element that acts ergodically on .

It is a natural question to ask if there is any logical relationship between the two theorems. In the finite field category, one can deduce the combinatorial inverse theorem from the ergodic inverse theorem by a variant of the Furstenberg correspondence principle, as worked out by Tamar Ziegler and myself; however, in the current context of -actions, the connection is less clear.

One can split Theorem 2 into two components:

Theorem 3 (Weak inverse theorem for Gowers-Host-Kra seminorms) Let be an integer, and let be an ergodic, countably generated measure-preserving system. Suppose that one has for all non-zero , where . Then is a factor of an inverse limit of ergodic degree nilsystems.

Theorem 4 (Pro-nilsystems closed under factors) Let be an integer. Then any factor of an inverse limit of ergodic degree nilsystems is again an inverse limit of ergodic degree nilsystems.

Indeed, it is clear that Theorem 2 implies both Theorem 3 and Theorem 4, and conversely that the two latter theorems jointly imply the former. Theorem 4 is, in principle, purely a fact about nilsystems, and should have an independent proof, but this is not known; the only known proofs go through the full machinery needed to prove Theorem 2 (or the closely related theorem of Ziegler). (However, the fact that a factor of a nilsystem is again a nilsystem was established previously by Parry.)

The purpose of this post is to record a partial implication in the reverse direction to the correspondence principle:

As mentioned at the start of the post, a fair amount of familiarity with the area is presumed here, and some routine steps will be presented with only a fairly brief explanation.

To show that is a factor of another system up to almost everywhere equivalence, it suffices to obtain a unital algebra homomorphism from to that intertwines with , and which is measure-preserving (or more precisely, integral-preserving). On the other hand, by hypothesis, is generated (as a von Neumann algebra) by the dual functions

for , where

indeed we may restrict to a countable sequence that is dense in in the (say) topology, together with their shifts. To obtain such a factor representation, it thus suffices to find a “model” associated to each dual function in such a fashion that

for all and , and all polynomials . Of course it suffices to do so for those polynomials with rational coefficients (so now there are only a countable number of constraints to consider).

We may normalise all the to take values in . For any , we can find a scale such that

If we then define the exceptional set

then has measure at most (say), and so the function is absolutely integrable. By the maximal ergodic theorem, we thus see that for almost every , there exists a finite such that

for all and all . Informally, we thus have the approximation

for “most” .

Next, we observe from the Cauchy-Schwarz-Gowers inequality that for almost every , the dual function is anti-uniform in the sense that

for any function . By the usual structure theorems (e.g. Theorem 1.2 of this paper of Ben Green and myself) this shows that for almost every and every , there exists a degree nilsequence of complexity such that

(say). (Sketch of proof: standard structure theorems give a decomposition of the form

where is a nilsequence as above, is small in norm, and is very small in norm; has small inner product with , , and , and thus with itself, and so and are both small in , giving the claim.)

For each , let denote the set of all such that there exists a degree nilsequence (depending on ) of complexity such that

From the Hardy-Littlewood maximal inequality (and the measure-preserving nature of ) we see that has measure . This implies that the functions

are uniformly bounded in as , which by Fatou’s lemma implies that

is also absolutely integrable. In particular, for almost every , we have

for some finite , which implies that

for an infinite sequence of (the exact choice of sequence depends on ); in particular, there is a such that for all in this sequence, one has

for all and all . Thus

for all in this sequence, all , and all ; combining with (2) we see (for almost every ) that

and thus for all , all , and all we have

where the limit is along the sequence.

For given , there are only finitely many possibilities for the nilmanifold , so by the usual diagonalisation argument we may pass to a subsequence of and assume that does not depend on for any . By Arzelà-Ascoli we may similarly assume that the Lipschitz function converges uniformly to , so we now have

along the remaining subsequence for all , all , and all .

By repeatedly breaking the coefficients of the polynomial sequence into fractional parts and integer parts, and absorbing the latter in , we may assume that these coefficients are bounded. Thus, by Bolzano-Weierstrass and refining the sequence of further, we may assume that converges locally uniformly in as goes to infinity to a polynomial sequence , for every . We thus have (for almost every ) that

for all , all , and all . Henceforth we shall cease to keep control of the complexity of or .
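The step of splitting coefficients into integer and fractional parts is easiest to see in an abelian toy model, which is an assumption of this sketch and not the general nilmanifold setting: take the circle, with the integers as the lattice, and write a polynomial in the binomial basis. Subtracting the integer part of each coefficient changes the polynomial only by an integer-valued polynomial, which the lattice absorbs. In the non-abelian case one would presumably work in Mal'cev coordinates and repeat this degree by degree. The names `poly_value` and `reduce_coefficients` are hypothetical.

```python
from fractions import Fraction
from math import comb

def poly_value(coeffs, n):
    """p(n) = sum_j coeffs[j] * binom(n, j), for an integer n >= 0."""
    return sum(a * comb(n, j) for j, a in enumerate(coeffs))

def reduce_coefficients(coeffs):
    """Replace each coefficient by its fractional part in [0, 1)."""
    return [a % 1 for a in coeffs]

# Example: coefficients 7/3, -5/2, 11/4 reduce to 1/3, 1/2, 3/4.
coeffs = [Fraction(7, 3), Fraction(-5, 2), Fraction(11, 4)]
bounded = reduce_coefficients(coeffs)
```

The reduced coefficients lie in a bounded set, which is what makes the subsequent Bolzano-Weierstrass step available, while the values of the polynomial modulo one are unchanged.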

We can lift the polynomial sequence up to a linear sequence (enlarging as necessary), thus

for all , all , and some . By replacing various nilsystems with Cartesian powers, we may assume that the nilsystems are increasing in and in the sense that the nilsystem for is a factor of that for or , with the origin mapping to the origin. Then, by restricting to the orbit of the origin, we may assume that all the nilsystems are ergodic (and thus also uniquely ergodic, by the special properties of nilsystems). The nilsystems then have an ergodic inverse limit with an origin , and each function lifts up to a continuous function on , with . Thus

From the triangle inequality we see in particular that

for all and all , which by unique ergodicity of the nilsystems implies that

Thus the sequence is Cauchy in and tends to some limit .

If is generic for (which is true for almost every ), we conclude from (4) and unique ergodicity of nilsystems that

for , which on taking limits as gives

A similar argument gives (1) for almost every , for each choice of . Since one only needs to verify a countable number of these conditions, we can find an for which all the (1) hold simultaneously, and the claim follows.

Remark 6 In order to use the combinatorial inverse theorem to prove the full ergodic inverse theorem (and not just the weak version), it appears that one needs an “algorithmic” or “measurable” version of the combinatorial inverse theorem, in which the nilsequence produced by the inverse theorem can be generated in a suitable “algorithmic” sense from the original function . In the setting of the inverse theorem over finite fields, a result in this direction was established by Tulsiani and Wolf (building upon a well-known paper of Goldreich and Levin handling the case). It is thus reasonable to expect that a similarly algorithmic version of the combinatorial inverse conjecture is true for higher Gowers uniformity norms, though this has not yet been achieved in the literature to my knowledge.


Theorem 1 (Discrete inverse theorem for Gowers norms) Let and be integers, and let . Suppose that is a function supported on such that

For the definitions of “filtered nilmanifold”, “degree”, “complexity”, and “polynomial sequence”, see the paper of Ben, Tammy, and myself. (I should caution the reader that this blog post will presume a fair amount of familiarity with this subfield of additive combinatorics.) This result has a number of applications, for instance to establishing asymptotics for linear equations in the primes, but this will not be the focus of discussion here.
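For readers who want to experiment, the hypothesis of the discrete inverse theorem is a lower bound on a Gowers-type average of the function over parallelepipeds inside a discrete interval. The following Python sketch computes such an average by brute force. The normalisation by `N ** (s + 2)`, the dictionary representation of the function, and the name `gowers_average` are assumptions of this sketch, not something fixed by the text above.

```python
from itertools import product

def gowers_average(f, N, s):
    """Normalised order-(s+1) Gowers-type average of f supported on {1,...,N}.

    f is a dict from integers to real values; missing keys count as 0.
    """
    k = s + 1
    total = 0.0
    shifts = range(-(N - 1), N)  # a larger |h| pushes some vertex outside 1..N
    for n in range(1, N + 1):
        for hs in product(shifts, repeat=k):
            term = 1.0
            for omega in product((0, 1), repeat=k):
                x = n + sum(w * h for w, h in zip(omega, hs))
                term *= f.get(x, 0.0)
                if term == 0.0:
                    break
            total += term
    return total / N ** (s + 2)

# Indicator of {1, 2} with s = 1: six parallelograms fit, so the average is 6/8.
value = gowers_average({1: 1.0, 2: 1.0}, 2, 1)
```

The cost grows like `N ** (s + 2)`, so this is only meant for building intuition on very small examples.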

The purpose of this post is to record the observation that this “discrete” inverse theorem, together with an equidistribution theorem for nilsequences that Ben and I worked out in a separate paper, implies a continuous version:

Theorem 2 (Continuous inverse theorem for Gowers norms) Let be an integer, and let . Suppose that is a measurable function supported on such that Then there exists a filtered nilmanifold of degree and complexity , a (smooth) polynomial sequence , and a Lipschitz function of Lipschitz constant such that

The interval can be easily replaced with any other fixed interval by a change of variables. A key point here is that the bounds are completely uniform in the choice of . Note though that the coefficients of can be arbitrarily large (and this is necessary, as can be seen just by considering functions of the form for some arbitrarily large frequency ).

It is likely that one could prove Theorem 2 by carefully going through the proof of Theorem 1 and replacing all instances of with (and making appropriate modifications to the argument to accommodate this). However, the proof of Theorem 1 is quite lengthy. Here, we shall proceed by the usual limiting process of viewing the continuous interval as a limit of the discrete interval as . However, there will be some problems taking the limit due to a failure of compactness, specifically with regard to the coefficients of the polynomial sequence produced by Theorem 1, after normalising these coefficients by . Fortunately, a factorisation theorem from a paper of Ben Green and myself resolves this problem by splitting into a “smooth” part which does enjoy good compactness properties, as well as “totally equidistributed” and “periodic” parts which can be eliminated using the measurability (and thus, approximate smoothness) of .

We now prove Theorem 2. Firstly observe from Hölder’s inequality that the Gowers norm expression in the left-hand side of (1) is continuous in in the topology. As such, it suffices to prove the theorem for a dense class of , such as the Lipschitz-continuous , so long as the bounds remain uniform in . Thus, we may assume without loss of generality that is Lipschitz continuous.

Now let be a large integer (which will eventually be sent to infinity along a subsequence). As is Lipschitz continuous, the integral in (1) is certainly Riemann integrable, and so for sufficiently large (where we allow “sufficiently large” to depend on the Lipschitz constant) we will have

(say). Applying Theorem 1, we can thus find for sufficiently large , a filtered nilmanifold of degree and complexity , a polynomial sequence , and a Lipschitz function of Lipschitz constant such that

Now we prepare to take limits as , passing to subsequences as necessary. Using Mal’cev bases, one can easily check that there are only finitely many filtered nilmanifolds of a fixed degree and complexity, hence by passing to a subsequence of we may assume that is independent of . The Lipschitz functions are now equicontinuous on a fixed compact domain , so by the Arzelà-Ascoli theorem and further passage to a subsequence we may assume that converges uniformly to a Lipschitz function of Lipschitz constant . In particular (passing to a further subsequence as necessary) we have

We have removed a lot of the dependencies of the nilsequence on ; however, there is still a serious lack of compactness in the remaining dependency of the polynomial sequence on . Fortunately, we can argue as follows. Let be large quantities (depending on , and the Lipschitz constant of ) to be chosen later. Applying the factorisation theorem for polynomial sequences (see Theorem 1.19 of this paper of Ben Green and myself), we may find for each in the current subsequence, an integer , a rational subgroup of whose associated filtered nilmanifold has structure constants that are -rational with respect to the Mal’cev basis of , and a decomposition

where

- is a polynomial sequence which is -smooth;
- is a polynomial sequence which is totally -equidistributed in ; and
- is a polynomial sequence which is -rational, and is periodic with period at most .

See the above referenced paper for a definition of all the terminology used here.

Once again we can make a lot of the data here independent of by passing to a subsequence. Firstly, takes only finitely many values so by passing to a subsequence we may assume that is independent of . Then the number of rational subgroups with -rational structure constants is also finite, so by passing to a further subsequence we may take independent of , so is also independent of . Up to right multiplication by polynomial sequences from to (which do not affect the value of ), there are only finitely many -rational polynomial sequences that are periodic with period at most , so we may take independent of . Finally, using coordinates one can write where is a continuous polynomial sequence whose coefficients are bounded uniformly in . By Bolzano-Weierstrass, we may assume on passing to a subsequence that converges locally uniformly to a limit , which is again a continuous polynomial sequence. Thus, on passing to a further subsequence, we have

Let be the period of . By the pigeonhole principle (and again passing to a subsequence) we may find a residue class independent of such that

Because is totally -equidistributed in , and is -rational, the conjugate is totally -equidistributed in , where and ; see Section 2 of this paper of Ben and myself for a derivation of this fact. From this, we have the approximation

for any and any fixed interval , where is the length of and the integral is with respect to Haar measure, and are sufficiently large depending on . Using Riemann integration, we see that the left-hand side of (2) is thus of the form

for sufficiently large if are sufficiently large depending on , and the Lipschitz constant of , and so (writing ) we have

if are sufficiently large depending on , and the Lipschitz constant of . If we let be a continuous polynomial sequence that is equidistributed in (which will happen as soon as the sequence is equidistributed with respect to the abelianisation of , by an old result of Leon Green), then a similar argument shows that

and thus there exists such that

Setting , we obtain the claim.

I thank Ben Green for helpful conversations that inspired this post.


Theorem 1 (Szemerédi’s theorem) Let be a positive integer, and let be a function with for some , where we use the averaging notation , , etc. Then for we have for some depending only on .

The equivalence is basically thanks to an averaging argument of Varnavides; see for instance Chapter 11 of my book with Van Vu or this previous blog post for a discussion. We have removed the cases as they are trivial and somewhat degenerate.
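As a concrete illustration of the averaging notation, the following Python sketch computes the average of a product of a function along all arithmetic progressions of a given length in a finite cyclic group, including the degenerate progressions with zero step. It assumes that the quantity bounded from below in Theorem 1 is an average of this shape; the list representation of the function and the name `ap_average` are hypothetical.

```python
def ap_average(f, k):
    """Average of f(x) f(x+r) ... f(x+(k-1)r) over all x, r in Z/NZ, N = len(f)."""
    N = len(f)
    total = 0.0
    for x in range(N):
        for r in range(N):
            term = 1.0
            for j in range(k):
                term *= f[(x + j * r) % N]
            total += term
    return total / N ** 2

# A constant function of density 1/2 on Z/7Z has 3-term average (1/2)^3.
constant_case = ap_average([0.5] * 7, 3)
# The indicator of a single point of Z/5Z only sees the step r = 0.
point_case = ap_average([1.0, 0.0, 0.0, 0.0, 0.0], 3)
```

The second example shows why some care is needed with the degenerate progressions: a single point contributes an average of size `1 / N ** 2`, which tends to zero, so it does not contradict a lower bound that depends only on the density and the length.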

There are now many proofs of this theorem. Some time ago, I took an ergodic-theoretic proof of Furstenberg and converted it to a purely finitary proof of the theorem. The argument used some simplifying innovations that had been developed since the original work of Furstenberg (in particular, deployment of the Gowers uniformity norms, as well as a “dual” norm that I called the uniformly almost periodic norm, and an emphasis on van der Waerden’s theorem for handling the “compact extension” component of the argument). But the proof was still quite messy. However, as discussed in this previous blog post, messy finitary proofs can often be cleaned up using nonstandard analysis. Thus, there should be a nonstandard version of the Furstenberg ergodic theory argument that is relatively clean. I decided (after some encouragement from Ben Green and Isaac Goldbring) to write down most of the details of this argument in this blog post, though for the sake of brevity I will skim rather quickly over arguments that were already discussed at length in other blog posts. In particular, I will presume familiarity with nonstandard analysis (in particular, the notion of a standard part of a bounded real number, and the Loeb measure construction); see for instance this previous blog post for a discussion.

By routine “compactness and contradiction” arguments (as discussed in this previous post), Theorem 1 can be deduced from the following nonstandard variant:

Theorem 2 Let be a nonstandard positive integer, let be the nonstandard cyclic group , and let be an internal function with . Then for any standard , Here of course the averaging notation is interpreted internally.

Indeed, if Theorem 1 failed, one could create a sequence of functions of density at least for some fixed , and a fixed such that

taking ultralimits one can then soon obtain a counterexample to Theorem 2.

It remains to prove Theorem 2. Henceforth is a fixed nonstandard positive integer, and . By the Loeb measure construction (discussed in this previous blog post), one can give the structure of a probability space (the *Loeb space* of ), such that every internal subset of is (Loeb) measurable with

which implies that any bounded internal function has standard part which is (Loeb) measurable with

Conversely, a countable saturation argument shows that any function in is equal almost everywhere to the standard part of a bounded internal function.

From Hölder’s inequality we see that the -linear form

vanishes if one of the has standard part vanishing almost everywhere. As such, we can (by abuse of notation) extend this -linear form to functions that are elements of , rather than bounded internal functions. With this convention, we see that Theorem 2 is equivalent to the following assertion.

Theorem 3 For any non-negative with , one has for any standard ,

The next step is to introduce the *Gowers-Host-Kra uniformity seminorms* , defined for by the formula

where is any bounded internal function whose standard part equals almost everywhere. From Hölder’s inequality one can see that the exact choice of does not matter, so that this seminorm is well-defined. (It is indeed a seminorm, but we will not need this fact here.)

We have the following application of the van der Corput inequality:

Theorem 4 (Generalised von Neumann theorem) Let be standard. For any with for some , one has

This estimate is proven in numerous places in the literature (e.g. Lemma 11.4 of my book with Van Vu, or Exercise 23 of this blog post) and will not be repeated here. In particular, from multilinearity we see that

Dual to the Gowers norms are the uniformly almost periodic norms . Let us first define the internal version of these norms. We define to be the space of constant internal functions , with internal norm . Once is defined for some , we define to be the internal normed vector space of internal functions for which there exists a nonstandard real number , an internally finite non-empty set , an internal family of internal functions bounded in magnitude by one for each , and an internal family of internal functions in the unit ball of such that one has the representation

for all , where is the shift of by . The internal infimum of all such is then the norm of . This gives each of the the structure of an internal shift-invariant Banach algebra; see Section 5 of . The norms also control the supremum norm:

In particular, if we write for the space of standard parts of internal functions of bounded norm in , then is an (external) Banach algebra contained (as a real vector space) in . For , we can then define a factor of to be the probability space , where is the subalgebra of consisting of those sets such that lies in the closure of . This is easily seen to be a shift-invariant -algebra, and so is a factor.

We have the following key *characteristic factor* relationship:

Theorem 5 Let with . Then .

In fact the converse implication is true also (making the *universal characteristic factor* for the seminorm), but we will not need this direction of the implication.

*Proof:* Suppose for contradiction that ; we can normalise . Writing for some bounded internal , we then see that has a non-zero inner product with , where the *dual function* for is the bounded internal function

From the easily verified identity

and a routine induction, we see that lies in the unit ball of , and so is measurable with respect to . By hypothesis this implies that is orthogonal to , a contradiction.
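The role of the dual function can be checked numerically in a finite model. The Python sketch below works on a finite cyclic group (an assumption standing in for the nonstandard one) with real-valued functions, and verifies two facts: the inner product of a function with its dual function is the corresponding power of its Gowers norm, and the dual function satisfies a recursion of the shape "average over shifts of a shift of the function times a lower-order dual function". The indexing of the orders and the names `shift`, `dual`, `dual_recursive` and `gowers_power` are assumptions of this sketch and may differ from the conventions of the post.

```python
from itertools import product

def shift(f, h):
    """T^h f(x) = f(x + h) on Z/NZ."""
    N = len(f)
    return [f[(x + h) % N] for x in range(N)]

def dual(f, d):
    """D_d f(x): average over h in (Z/NZ)^d of the product of f over the
    vertices x + omega.h of the cube with omega nonzero."""
    N = len(f)
    out = []
    for x in range(N):
        total = 0.0
        for hs in product(range(N), repeat=d):
            term = 1.0
            for omega in product((0, 1), repeat=d):
                if any(omega):
                    term *= f[(x + sum(w * h for w, h in zip(omega, hs))) % N]
            total += term
        out.append(total / N ** d)
    return out

def dual_recursive(f, d):
    """The same function via D_d f = E_h [ T^h f * D_{d-1}(f * T^h f) ], D_0 = 1."""
    N = len(f)
    if d == 0:
        return [1.0] * N
    out = [0.0] * N
    for h in range(N):
        fh = shift(f, h)
        inner = dual_recursive([a * b for a, b in zip(f, fh)], d - 1)
        for x in range(N):
            out[x] += fh[x] * inner[x] / N
    return out

def gowers_power(f, d):
    """The 2^d-th power of the order-d Gowers norm, computed as <f, D_d f>."""
    return sum(a * b for a, b in zip(f, dual(f, d))) / len(f)
```

For example, with `d = 1` the dual function is the constant mean of the function, so `gowers_power(f, 1)` is the square of the mean; the recursion is what drives the induction placing the dual function in the unit ball of the almost periodic norm.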

In view of the above theorem and (1), we may replace by without affecting the average in Theorem 3. Thus that theorem is equivalent to the following.

Theorem 6 Let and be standard. Then for any non-negative with , one has

We only apply this theorem in the case and , but for inductive purposes it is convenient to decouple the two parameters.

We prove Theorem 6 by induction on (allowing to be arbitrary). When , the claim is obvious for any because all functions in are essentially constant. Now suppose that and that the claim has already been proven for .

Let be a nonnegative function whose mean is positive; we may normalise to take values in . Let be standard, and let be a sufficiently small standard quantity depending on to be chosen later (one could for instance take , but we will not attempt to optimise in ). As is -measurable, one can find an internal function with and bounded norm such that . (Note though that while the norm of is bounded, this bound could be extremely large compared to , , .)

Set . We define the relative inner product for by the formula

and the relative norm

This gives the structure of a (pre-)Hilbert module over , as discussed in this previous blog post.

A crucial point is that the function is *relatively almost periodic* over the previous characteristic factor , in the following sense.

Proposition 7 (Relative almost periodicity) There exists a standard natural number and functions in the unit ball of with the following “relative total boundedness” property: for any , there exists a -measurable function such that almost everywhere (where is short-hand for ).

*Proof:* This will be a relative version of the standard analysis fact that integral operators on finite measure spaces with bounded kernel are in the Hilbert-Schmidt class, and thus compact.

By construction, there exists an internally finite non-empty set , an internal collection of internal functions that are uniformly bounded in , and an internal collection of internal functions that are uniformly bounded in , such that

for all . Note in particular that the all lie in a bounded subset of , and the all lie in a bounded subset of .

We give the -algebra generated from the standard parts of bounded internal functions such that the standard parts of all lie in a bounded subset of ; this gives a probability space that extends the product measure of and . We define an operator as follows. If , then is the standard part of some bounded internal function . We then define by the formula

This can easily be seen to not depend on the choice of , and defines a -linear operator (embedding into both and in the obvious fashion). Note that lies in the range of applied to a function in the unit ball of .

Now we claim that this operator is *relatively Hilbert-Schmidt* over , in the sense that there exists a finite bound such that

for all finite collections of functions that are relatively orthonormal over in the sense that

and

for all and . Indeed, the left-hand side of (4) may be expanded first as

for some sequence in with , and then as

where we use Loeb measure on and is the function , and are lifted up to in the obvious fashion. By Cauchy-Schwarz and the boundedness of , we can bound this by

But the are relatively orthonormal over (this reflects the relative orthogonality of and over ), so that

and the claim follows from the hypotheses on .

Using the relative spectral theorem for relative Hilbert-Schmidt operators (see Corollary 17 of this blog post), we may thus find relatively orthonormal systems in and respectively over and a non-increasing sequence of non-negative coefficients (the relative singular values) with almost everywhere, such that we have the spectral decomposition

with the sum converging in . (If were standard Borel spaces, one could deduce this theorem from the usual spectral theorem for Hilbert-Schmidt operators using disintegration. Loeb spaces are certainly not standard Borel, but as discussed in the linked blog post above, one can adapt the *proof* of the spectral theorem to the relative setting without using the device of disintegration.)

Since and the are decreasing, one can find an such that almost everywhere for all . For in the unit ball of , this lets one approximate by the finite rank operator to within almost everywhere in norm. If one rounds to the nearest multiple of for each , and lets be the collection of linear combinations of the form with a multiple of , we obtain the claim.

We return to the proof of (2). Since and , we have

if is small enough. In particular there is a -measurable set of measure at least such that on . Since

we see from Markov’s inequality (for small enough ) that there is a -measurable subset of of measure at least such that

for the relative norm. In particular we have

Let be a sufficiently large standard natural number (depending on and the quantity from Proposition 7; in fact it will essentially be a van der Waerden number of these inputs) to be chosen later. Applying the induction hypothesis, we have

In particular, there is a standard , such that for in a subset of of measure at least , we have

or equivalently that the set

has measure at least .

Let be as above, and let be the functions from Proposition 7. Then for , we can find a measurable function such that

almost everywhere on , hence by (5) we have

almost everywhere on . From this and the relative Hölder inequality, we see that

a.e. on whenever .

Now, for large enough, we see from van der Waerden’s theorem that there exist measurable such that

almost everywhere in , and hence in (this can be seen by partitioning into finitely many pieces, with each of the constant on each of these pieces). For that choice of we have

and

and thus

almost everywhere on . But from (6) one has

a.e. on , so from Hölder’s inequality we have (for sufficiently small) that

From non-negativity of , this implies that

which on integrating in gives

Averaging in , we conclude that

Shifting by , we conclude that

Dilating by (and noting that the map is at most -to-one on ), we conclude that

and (2) follows.
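The van der Waerden step in the argument above only uses the qualitative statement that any colouring of a long enough interval with boundedly many colours contains a monochromatic progression of the required length. The smallest nontrivial instance can be verified by brute force: every 2-colouring of 9 consecutive integers contains a monochromatic 3-term progression, while 8 integers do not suffice. The Python sketch below checks this; it is a toy illustration with hypothetical function names, not the relative (measurable) version used in the proof.

```python
from itertools import product

def has_mono_ap(colouring, k):
    """True if the colouring of positions 0..n-1 has a monochromatic k-term AP."""
    n = len(colouring)
    for a in range(n):
        for r in range(1, n):
            if a + (k - 1) * r >= n:
                break  # larger steps from this start also leave the interval
            if len({colouring[a + j * r] for j in range(k)}) == 1:
                return True
    return False

def every_colouring_has_ap(n, colours, k):
    """Exhaustively test all colourings of an interval of length n."""
    return all(has_mono_ap(c, k) for c in product(range(colours), repeat=n))
```

The colouring `(0, 0, 1, 1, 0, 0, 1, 1)` of eight points is one that avoids monochromatic 3-term progressions, which is why the threshold is 9 and not 8.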


- Every positive integer has a prime factorisation
into (not necessarily distinct) primes , which is unique up to rearrangement. Taking logarithms, we obtain a partition

of .

- (Prime number theorem) A randomly selected integer of size will be prime with probability when is large.
- If is a randomly selected large integer of size , and is a randomly selected prime factor of (with each index being chosen with probability ), then is approximately uniformly distributed between and . (See Proposition 9 of this previous blog post.)
- The set of real numbers arising from the prime factorisation of a large random number converges (away from the origin, and in a suitable weak sense) to the Poisson-Dirichlet process in the limit . (See the previously mentioned blog post for a definition of the Poisson-Dirichlet process, and a proof of this claim.)
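The first fact in this list is easy to make concrete. The following Python sketch factorises an integer by trial division and returns the logarithms of its prime factors, which form a partition of the logarithm of the integer. The function names `prime_factors` and `log_partition` are hypothetical, and trial division is chosen only for simplicity.

```python
import math

def prime_factors(n):
    """Prime factors of n >= 1 with multiplicity, in non-decreasing order."""
    factors = []
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors.append(p)
            n //= p
        p += 1
    if n > 1:
        factors.append(n)  # whatever remains is a single prime
    return factors

def log_partition(n):
    """The pieces log p_1, ..., log p_k, which sum to log n."""
    return [math.log(p) for p in prime_factors(n)]
```

For instance, `prime_factors(360)` returns `[2, 2, 2, 3, 3, 5]`, and the entries of `log_partition(360)` sum to `math.log(360)` up to rounding error.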

Now for the facts about the cycle decomposition of large permutations:

- Every permutation has a cycle decomposition
into disjoint cycles , which is unique up to rearrangement, and where we count each fixed point of as a cycle of length . If is the length of the cycle , we obtain a partition

of .

- (Prime number theorem for permutations) A randomly selected permutation of will be an -cycle with probability exactly . (This was noted in this previous blog post.)
- If is a random permutation in , and is a randomly selected cycle of (with each being selected with probability ), then is exactly uniformly distributed on . (See Proposition 8 of this blog post.)
- The set of real numbers arising from the cycle decomposition of a random permutation converges (in a suitable sense) to the Poisson-Dirichlet process in the limit . (Again, see this previous blog post for details.)
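The two exact statements in this list can be checked by brute force for small permutations. The following is only a sketch: it enumerates all permutations of a small set, and it reads "selecting a cycle with probability proportional to its length" as "taking the cycle through a uniformly chosen point" (by symmetry, the cycle through the point 0). The function names are my own.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations


def cycle_lengths(perm):
    """Sorted cycle lengths of a permutation given as a tuple, where
    perm[i] is the image of i (fixed points count as cycles of length 1)."""
    seen, lengths = set(), []
    for start in range(len(perm)):
        if start in seen:
            continue
        length, x = 0, start
        while x not in seen:
            seen.add(x)
            x = perm[x]
            length += 1
        lengths.append(length)
    return sorted(lengths)


def n_cycle_probability(n):
    """Exact probability that a uniform permutation of n points is an n-cycle."""
    perms = list(permutations(range(n)))
    hits = sum(1 for p in perms if cycle_lengths(p) == [n])
    return Fraction(hits, len(perms))


def length_of_cycle_through(perm, point=0):
    """Length of the cycle of perm that contains the given point."""
    length, x = 1, perm[point]
    while x != point:
        x = perm[x]
        length += 1
    return length


def cycle_length_law(n):
    """Exact law of the length of the cycle through the point 0."""
    perms = list(permutations(range(n)))
    counts = Counter(length_of_cycle_through(p) for p in perms)
    return {k: Fraction(c, len(perms)) for k, c in counts.items()}


if __name__ == "__main__":
    print(n_cycle_probability(5))
    print(cycle_length_law(5))
```

Exact rational arithmetic is used so that the comparison with the claimed probabilities is an equality rather than a floating-point approximation.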

See this previous blog post (or the aforementioned article of Granville, or the Notices article of Arratia, Barbour, and Tavaré) for further exploration of the analogy between prime factorisation of integers and cycle decomposition of permutations.

There is however something unsatisfying about the analogy, in that it is not clear *why* there should be such a kinship between integer prime factorisation and permutation cycle decomposition. It turns out that the situation is clarified if one uses another fundamental analogy in number theory, namely the analogy between integers and polynomials over a finite field , discussed for instance in this previous post; this is the simplest case of the more general function field analogy between number fields and function fields. Just as we restrict attention to positive integers when talking about prime factorisation, it will be reasonable to restrict attention to monic polynomials . We then have another analogous list of facts, proven very similarly to the corresponding list of facts for the integers:

- Every monic polynomial has a factorisation
into irreducible monic polynomials , which is unique up to rearrangement. Taking degrees, we obtain a partition

of .

- (Prime number theorem for polynomials) A randomly selected monic polynomial of degree will be irreducible with probability when is fixed and is large.
- If is a random monic polynomial of degree , and is a random irreducible factor of (with each selected with probability ), then is approximately uniformly distributed in when is fixed and is large.
- The set of real numbers arising from the factorisation of a randomly selected polynomial of degree converges (in a suitable sense) to the Poisson-Dirichlet process when is fixed and is large.

The above list of facts addressed the *large limit* of the polynomial ring , where the order of the field is held fixed, but the degrees of the polynomials go to infinity. This is the limit that is most closely analogous to the integers . However, there is another interesting asymptotic limit of polynomial rings to consider, namely the *large limit* where it is now the *degree* that is held fixed, but the order of the field goes to infinity. Actually to simplify the exposition we will use the slightly more restrictive limit where the *characteristic* of the field goes to infinity (again keeping the degree fixed), although all of the results proven below for the large limit turn out to be true as well in the large limit.

The large (or large ) limit is technically a different limit than the large limit, but in practice the asymptotic statistics of the two limits often agree quite closely. For instance, here is the prime number theorem in the large limit:

Theorem 1 (Prime number theorem) The probability that a random monic polynomial of degree is irreducible is in the limit where is fixed and the characteristic goes to infinity.

*Proof:* There are monic polynomials of degree . If is irreducible, then the zeroes of are distinct and lie in the finite field , but do not lie in any proper subfield of that field. Conversely, every element of that does not lie in a proper subfield is the root of a unique monic polynomial in of degree (the minimal polynomial of ). Since the union of all the proper subfields of has size , the total number of irreducible polynomials of degree is thus , and the claim follows.

Remark 2 The above argument and inclusion-exclusion in fact give the well-known exact formula for the number of irreducible monic polynomials of degree .
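For readers who want to experiment, here is a small sketch of the exact count, using the standard Möbius inversion form of the formula: the number of monic irreducible polynomials of degree n over a field of order q is the sum of mu(d) q^(n/d) over divisors d of n, divided by n. The helper names `mobius` and `num_irreducible` are mine, and `q` is assumed to be a prime power.

```python
def mobius(n):
    """Moebius function mu(n), computed by trial division."""
    result, d = 1, 2
    while d * d <= n:
        if n % d == 0:
            n //= d
            if n % d == 0:   # d^2 divided the original n
                return 0
            result = -result
        d += 1
    if n > 1:                # one prime factor left over
        result = -result
    return result


def num_irreducible(q, n):
    """Number of monic irreducible polynomials of degree n over F_q
    (q is assumed to be a prime power)."""
    total = sum(mobius(d) * q ** (n // d)
                for d in range(1, n + 1) if n % d == 0)
    return total // n


if __name__ == "__main__":
    q = 101
    for n in range(1, 7):
        count = num_irreducible(q, n)
        # The last column is n times the density of irreducibles;
        # Theorem 1 predicts that it tends to 1 as q grows.
        print(n, count, n * count / q ** n)
```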

Now we can give a precise connection between the cycle distribution of a random permutation, and (the large limit of) the irreducible factorisation of a polynomial, giving a (somewhat indirect, but still connected) link between permutation cycle decomposition and integer factorisation:

Theorem 3 The partition of a random monic polynomial of degree converges in distribution to the partition of a random permutation of length , in the limit where is fixed and the characteristic goes to infinity.

We can quickly prove this theorem as follows. We first need a basic fact:

Lemma 4 (Most polynomials square-free in large limit) A random monic polynomial of degree will be square-free with probability when is fixed and (or ) goes to infinity. In a similar spirit, two randomly selected monic polynomials of degree will be coprime with probability if are fixed and or goes to infinity.

*Proof:* For any polynomial of degree , the probability that is divisible by is at most . Summing over all polynomials of degree , and using the union bound, we see that the probability that is *not* squarefree is at most , giving the first claim. For the second, observe from the first claim (and the fact that has only a bounded number of factors) that is squarefree with probability , giving the claim.
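The first claim of the lemma can be tested by brute force over a small prime field. The sketch below is not the argument of the proof: it simply lists every product g·g·h of the right degree and counts what is left. It assumes `p` is prime (so that arithmetic mod `p` is field arithmetic) and stores a polynomial as a tuple of coefficients with the constant term first; all names are my own. For degree at least 2 the count should agree with the standard exact value, a proportion 1 - 1/p of squarefree polynomials.

```python
from fractions import Fraction
from itertools import product


def monic_polys(p, degree):
    """All monic polynomials of the given degree over F_p, as coefficient
    tuples with the constant term first."""
    for low in product(range(p), repeat=degree):
        yield low + (1,)


def poly_mul(a, b, p):
    """Product of two polynomials with coefficients reduced mod p."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % p
    return tuple(out)


def squarefree_fraction(p, n):
    """Exact proportion of squarefree polynomials among the monic
    polynomials of degree n over F_p (p prime)."""
    non_squarefree = set()
    for k in range(1, n // 2 + 1):
        for g in monic_polys(p, k):
            g_squared = poly_mul(g, g, p)
            for h in monic_polys(p, n - 2 * k):
                non_squarefree.add(poly_mul(g_squared, h, p))
    return Fraction(p ** n - len(non_squarefree), p ** n)


if __name__ == "__main__":
    for p in (2, 3, 5, 7):
        print(p, squarefree_fraction(p, 4))
```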

Now we can prove the theorem. Elementary combinatorics tells us that the probability of a random permutation consisting of cycles of length for , where are nonnegative integers with , is precisely

since there are ways to write a given tuple of cycles in cycle notation in nondecreasing order of length, and ways to select the labels for the cycle notation. On the other hand, by Theorem 1 (and using Lemma 4 to isolate the small number of cases involving repeated factors) the number of monic polynomials of degree that are the product of irreducible polynomials of degree is

which simplifies to

and the claim follows.
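As a numerical sanity check on Theorem 3, one can compare the two sides directly. The sketch below assumes the standard forms of the two counts: a cycle type with a_j cycles of length j has probability equal to the product over j of 1/(j^(a_j) a_j!), and the number of squarefree monic polynomials with a_j irreducible factors of degree j is the product over j of the binomial coefficient "number of irreducibles of degree j, choose a_j". The function names, and the choice of the prime q = 1009, are mine.

```python
from fractions import Fraction
from math import comb, factorial


def mobius(n):
    """Moebius function mu(n), computed by trial division."""
    result, d = 1, 2
    while d * d <= n:
        if n % d == 0:
            n //= d
            if n % d == 0:
                return 0
            result = -result
        d += 1
    return -result if n > 1 else result


def num_irreducible(q, n):
    """Number of monic irreducible polynomials of degree n over F_q."""
    return sum(mobius(d) * q ** (n // d)
               for d in range(1, n + 1) if n % d == 0) // n


def partitions(n, max_part=None):
    """Yield the partitions of n as non-increasing tuples."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def cycle_type_probability(partition):
    """Probability that a uniform permutation has the given cycle type."""
    prob = Fraction(1)
    for j in set(partition):
        a = partition.count(j)
        prob /= j ** a * factorial(a)
    return prob


def squarefree_type_probability(q, partition):
    """Probability that a uniform monic polynomial of degree sum(partition)
    over F_q is squarefree with irreducible factors of these degrees."""
    count = 1
    for j in set(partition):
        count *= comb(num_irreducible(q, j), partition.count(j))
    return Fraction(count, q ** sum(partition))


if __name__ == "__main__":
    q = 1009
    for part in partitions(4):
        print(part, float(cycle_type_probability(part)),
              float(squarefree_type_probability(q, part)))
```

The two columns printed at the end should approach each other as q grows with the degree held fixed.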

This was a fairly short calculation, but it still doesn’t quite explain *why* there is such a link between the cycle decomposition of permutations and the factorisation of a polynomial. One immediate thought might be to try to link the multiplication structure of permutations in with the multiplication structure of polynomials; however, these structures are too dissimilar to set up a convincing analogy. For instance, the multiplication law on polynomials is abelian and non-invertible, whilst the multiplication law on is (extremely) non-abelian but invertible. Also, the multiplication of a degree and a degree polynomial is a degree polynomial, whereas the group multiplication law on permutations does not take a permutation in and a permutation in and return a permutation in .

I recently found (after some discussions with Ben Green) what I feel to be a satisfying conceptual (as opposed to computational) explanation of this link, which I will place below the fold.

To put cycle decomposition of permutations and factorisation of polynomials on an equal footing, we generalise the notion of a permutation to the notion of a *partial permutation* on a fixed (but possibly infinite) domain , which consists of a finite non-empty subset of the set , together with a bijection on ; I’ll call the *support* of the partial permutation. We say that a partial permutation is of *size* if the support is of cardinality , and denote this size as . And now we can introduce a multiplication law on partial permutations that is much closer to that of polynomials: if two partial permutations on the same domain have disjoint supports , then we can form their disjoint union , supported on , to be the bijection on that agrees with on and with on . Note that this is a commutative and associative operation (where it is defined), and the disjoint union of a partial permutation of size and a partial permutation of size is a partial permutation of size , so this operation is much closer in behaviour to the multiplication law on polynomials than the group law on . There is the defect that the disjoint union operation is sometimes undefined (when the two partial permutations have overlapping support); but in the asymptotic regime where the size is fixed and the set is extremely large, this will be very rare (compare with Lemma 4).
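To make the definition concrete, here is a minimal sketch in which a partial permutation is modelled as a Python dictionary sending each point of its support to its image. The representation and the function names are my own choices for illustration, and an undefined disjoint union is signalled by returning `None`.

```python
def size(sigma):
    """Size of a partial permutation: the cardinality of its support."""
    return len(sigma)


def disjoint_union(sigma, tau):
    """Disjoint union of two partial permutations, or None when the
    supports overlap (in which case the operation is undefined)."""
    if sigma.keys() & tau.keys():
        return None
    return {**sigma, **tau}


def cycle_partition(sigma):
    """Sorted list of cycle lengths of a partial permutation; these sum
    to its size."""
    seen, lengths = set(), []
    for start in sigma:
        if start in seen:
            continue
        length, x = 0, start
        while x not in seen:
            seen.add(x)
            x = sigma[x]
            length += 1
        lengths.append(length)
    return sorted(lengths)


if __name__ == "__main__":
    sigma = {1: 2, 2: 1}            # a partial 2-cycle supported on {1, 2}
    tau = {5: 7, 7: 9, 9: 5}        # a partial 3-cycle supported on {5, 7, 9}
    union = disjoint_union(sigma, tau)
    print(size(union), cycle_partition(union))
    print(disjoint_union(sigma, {2: 2}))   # overlapping supports
```

Sizes add under this operation and the operation is commutative, which is the behaviour one wants from an analogue of polynomial multiplication.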

Note that a partial permutation is irreducible with respect to disjoint union if and only if it is a cycle on its support, and every partial permutation has a decomposition into such partial cycles, unique up to permutations. If one then selects some set of partial cycles on the domain to serve as “generalised primes”, then one can define (in the spirit of Beurling integers) the set of “generalised integers”, defined as those partial permutations that are the disjoint union of partial cycles in . If one lets denote the set of generalised integers of size , one can (assuming that this set is non-empty and finite) select a partial permutation uniformly at random from , and consider the partition of arising from the decomposition into generalised primes.

We can now embed both the cycle decomposition for (complete) permutations and the factorisation of polynomials into this common framework. We begin with the cycle decomposition for permutations. Let be a large natural number, and set the domain to be the set . We define to be the set of *all* partial cycles on of size , and let be the union of the , that is to say the set of *all* partial cycles on (of arbitrary size). Then is of course the set of all partial permutations on , and is the set of all partial permutations on of size . To generate an element of uniformly at random for , one simply has to randomly select an -element subset of , and then form a random permutation of the elements of . From this, it is obvious that the partition of coming from a randomly chosen element of has exactly the same distribution as the partition of coming from a randomly chosen element of , as long as is at least as large as of course.

Now we embed the factorisation of polynomials into the same framework. The domain is now taken to be the algebraic closure of , or equivalently the direct limit of the finite fields (with the obvious inclusion maps). This domain has a fundamental bijection on it, the Frobenius map , which is a field automorphism that has as its fixed points. We define to be the set of partial permutations on formed by restricting the Frobenius map to a finite Frobenius-invariant set. It is easy to see that the irreducible Frobenius-invariant sets (that is to say, the orbits of ) arise from taking an element of together with all of its Galois conjugates, and so if we define to be the set of restrictions of Frobenius to a single such Galois orbit, then are the generalised integers to the generalised primes in the sense above. Next, observe that, when the characteristic is sufficiently large, every squarefree monic polynomial of degree generates a generalised integer of size , namely the restriction of the Frobenius map to the roots of (which will be necessarily distinct when the characteristic is large and is squarefree). This generalised integer will be a generalised prime precisely when is irreducible. Conversely, every generalised integer of size generates a squarefree monic polynomial in , namely the product of as ranges over the support of the integer. This product is clearly monic, squarefree, and Frobenius-invariant, thus it lies in . Thus we may identify with the monic squarefree polynomials of of degree . With this identification, the (now partially defined) multiplication operation on monic squarefree polynomials coincides exactly with the disjoint union operation on partial permutations. As such, we see that the partition associated to a randomly chosen squarefree monic polynomial of degree has exactly the same distribution as the partition associated to a randomly chosen generalised integer of size . 
By Lemma 4, one can drop the condition of being squarefree while only distorting the distribution by .
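The orbits of the Frobenius map can be computed without implementing finite field arithmetic, under one standard assumption that is not used in the text above: the multiplicative group of the field with p^n elements is cyclic, so after fixing a generator the Frobenius map acts on exponents by k ↦ pk mod (p^n - 1). The sketch below counts orbit sizes this way; the function name is mine. The number of orbits of size d should equal the number of monic irreducible polynomials of degree d over the prime field, for each d dividing n.

```python
from collections import Counter


def frobenius_orbit_sizes(p, n):
    """Counter mapping orbit size -> number of orbits of x -> x^p acting
    on the field with p^n elements (p prime). Nonzero elements are
    represented by their exponent with respect to a fixed generator of
    the cyclic multiplicative group."""
    order = p ** n - 1
    seen = set()
    sizes = [1]                      # the zero element is a fixed point
    for k in range(order):
        if k in seen:
            continue
        orbit_size, j = 0, k
        while j not in seen:
            seen.add(j)
            j = (j * p) % order      # Frobenius multiplies exponents by p
            orbit_size += 1
        sizes.append(orbit_size)
    return Counter(sizes)


if __name__ == "__main__":
    # Over F_2 there are 2 linear, 1 quadratic and 3 quartic irreducibles.
    print(frobenius_orbit_sizes(2, 4))
```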

Now that we have placed cycle decomposition of permutations and factorisation of polynomials into the same framework, we can explain Theorem 3 as a consequence of the following *universality* result for generalised prime factorisations:

Theorem 5 (Universality) Let be collections of generalised primes and integers respectively on a domain , all of which depend on some asymptotic parameter that goes to infinity. Suppose that for any fixed and going to infinity, the sets are non-empty with cardinalities obeying the asymptotic

Also, suppose that only of the pairs have overlapping supports (informally, this means that is defined with probability ). Then, for fixed and going to infinity, the distribution of the partition of a random generalised integer from is universal in the limit; that is to say, the limiting distribution does not depend on the precise choice of .

Note that when consists of all the partial permutations of size on we have

while when consists of the monic squarefree polynomials of degree in then from Lemma 4 we also have

so in both cases the first hypothesis (1) is satisfied. The second hypothesis is easy to verify in the former case and follows from Lemma 4 in the latter case. Thus, Theorem 5 gives Theorem 3 as a corollary.

Remark 6 An alternate way to interpret Theorem 3 is as an equidistribution theorem: if one randomly labels the zeroes of a random degree polynomial as , then the resulting permutation on induced by the Frobenius map is asymptotically equidistributed in the large (or large ) limit. This is the simplest case of a much more general (and deeper) result known as the Deligne equidistribution theorem, discussed for instance in this survey of Kowalski. See also this paper of Church, Ellenberg, and Farb concerning more precise asymptotics for the number of squarefree polynomials with a given cycle decomposition of Frobenius.

It remains to prove Theorem 5. The key is to establish an abstract form of the prime number theorem in this setting.

Theorem 7 (Prime number theorem) Let the hypotheses be as in Theorem 5. Then for fixed and , the density of in is . In particular, the asymptotic density is universal (it does not depend on the choice of ).

*Proof:* Let (this may only be defined for sufficiently large depending on ); our task is to show that for each fixed .

Consider the set of pairs where is an element of and is an element of the support of . Clearly, the number of such pairs is . On the other hand, given such a pair , there is a unique factorisation , where is the generalised prime in the decomposition of that contains in its support, and is formed from the remaining components of . has some size , has the complementary size and has disjoint support from , and has to be one of the elements of the support of . Conversely, if one selects , then selects a generalised prime , and a generalised integer with disjoint support from , and an element in the support of , we recover the pair . Using the hypotheses of Theorem 5, we thus obtain the double counting identity

and thus for every fixed , and so for fixed as claimed.
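In the model case of partial permutations on a finite set of N points, the double counting can be carried out exactly, with no error terms. The sketch below assumes the elementary counts N!/(N-n)! for partial permutations of size n and C(N,m)·(m-1)! for partial cycles of size m; a pair (prime, integer) with disjoint supports then corresponds to a partial cycle of size m together with a partial permutation of size n-m on the remaining N-m points. The function names are mine.

```python
from fractions import Fraction
from math import comb, factorial, perm


def num_partial_perms(N, n):
    """Number of partial permutations of size n on an N-point domain."""
    return perm(N, n)                    # N! / (N - n)!


def num_partial_cycles(N, m):
    """Number of partial cycles of size m on an N-point domain."""
    return comb(N, m) * factorial(m - 1)


def double_count(N, n):
    """Both sides of the count of pairs (a, x), where a is a partial
    permutation of size n and x is a point of its support."""
    lhs = n * num_partial_perms(N, n)
    rhs = sum(m * num_partial_cycles(N, m) * num_partial_perms(N - m, n - m)
              for m in range(1, n + 1))
    return lhs, rhs


def prime_density(N, m):
    """Density of partial cycles among partial permutations of size m."""
    return Fraction(num_partial_cycles(N, m), num_partial_perms(N, m))


if __name__ == "__main__":
    print(double_count(50, 6))
    print([prime_density(50, m) for m in range(1, 6)])
```

In this model the density of generalised primes of size m comes out to exactly 1/m, which is consistent with the asymptotic density in Theorem 7.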

Remark 8 One could cast this argument in a language more reminiscent of analytic number theory by forming generating series of and and treating these series as analogous to a zeta function and its log-derivative (in close analogy to what is done with Beurling primes), but we will not do so here.

We can now finish the proof of Theorem 5. To show asymptotic universality of the partition of a random generalised integer , we may assume inductively that asymptotic universality has already been shown for all smaller choices of . To generate a uniformly random generalised integer of size , we can repeat the process used to prove Theorem 7. It of course suffices to generate a uniformly random pair , where is a generalised integer of size and is an element of the support of , since on dropping we would obtain a uniformly drawn .

To obtain the pair , we first select uniformly at random, then select a generalised prime randomly from and a generalised integer randomly from (independently of once is fixed). Finally, we select uniformly at random from the support of , and set . The pair is certainly a pair of the required form, but this random variable is not quite uniformly distributed amongst all such pairs. However, by repeating the calculations in the proof of Theorem 7 (and in particular relying on the conclusion ), we see that this distribution is within of the uniform distribution in total variation norm. Thus, the distribution of the cycle partition of a uniformly chosen lies within in total variation of the distribution of the cycle partition of a chosen by the above recipe. However, the cycle partition of is simply the union (with multiplicity) of with the cycle partition of . As the latter was already assumed to be asymptotically universal, we conclude that the former is also, as required.
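The recipe suggests an idealised recursion for the limiting law in which all the error terms are set to zero: choose the size of the first component uniformly between 1 and n, then recurse on what remains. The sketch below computes the resulting law on partitions exactly and compares it with the cycle type law of a uniform permutation, taking for granted the standard formula for the latter (the product over j of 1/(j^(a_j) a_j!)). The function names are mine, and the recursion is a model of the argument rather than a transcription of it.

```python
from collections import defaultdict
from fractions import Fraction
from math import factorial


def recursive_partition_law(n):
    """Law of the partition of n obtained by choosing the first part m
    uniformly from 1..n and recursing on n - m."""
    if n == 0:
        return {(): Fraction(1)}
    law = defaultdict(Fraction)
    for m in range(1, n + 1):
        for rest, prob in recursive_partition_law(n - m).items():
            key = tuple(sorted(rest + (m,), reverse=True))
            law[key] += prob / n
    return dict(law)


def cycle_type_probability(partition):
    """Probability that a uniform permutation has the given cycle type."""
    prob = Fraction(1)
    for j in set(partition):
        a = partition.count(j)
        prob /= j ** a * factorial(a)
    return prob


if __name__ == "__main__":
    for part, prob in sorted(recursive_partition_law(4).items()):
        print(part, prob, cycle_type_probability(part))
```

The agreement of the two columns reflects the fact that the recursion forgets everything about the generalised primes except the uniform choice of the first size, which is the sense in which the limit is universal.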

Remark 9 The above analysis helps explain why one could not easily link permutation cycle decomposition with integer factorisation – to produce permutations here with the right asymptotics we needed both the large limit and the Frobenius map, both of which are available in the function field setting but not in the number field setting.

Filed under: expository, math.CO, math.NT, math.PR Tagged: finite fields, permutations, prime number theorem

Let me discuss the former question first. Gromov’s theorem tells us that if a finite subset of a group exhibits polynomial growth in the sense that grows polynomially in , then the group generated by is virtually nilpotent (the converse direction is also true, and is relatively easy to establish). This theorem has been strengthened a number of times over the years. For instance, a few years ago, I proved with Shalom that the condition that grew polynomially in could be replaced by for a *single* , as long as was sufficiently large depending on (in fact we gave a fairly explicit quantitative bound on how large needed to be). A little more recently, with Breuillard and Green, the condition was weakened to , that is to say it sufficed to have polynomial *relative* growth at a finite scale. In fact, the latter paper gave more information on in this case; roughly speaking, it showed (at least in the case when was a symmetric neighbourhood of the identity) that was “commensurate” with a very structured object known as a *coset nilprogression*. This can then be used to establish further control on . For instance, it was recently shown by Breuillard and Tointon (again in the symmetric case) that if for a single that was sufficiently large depending on , then all the for have a doubling constant bounded by a bound depending only on , thus for all .

In this paper we are able to refine this analysis a bit further; under the same hypotheses, we can show an estimate of the form

for all and some piecewise linear, continuous, non-decreasing function with , where the error is bounded by a constant depending only on , and where has at most pieces, each of which has a slope that is a natural number of size . To put it another way, the function for behaves (up to multiplicative constants) like a piecewise polynomial function, where the degree of the function and number of pieces is bounded by a constant depending on .

One could ask whether the function has any convexity or concavity properties. It turns out that it can exhibit either convex or concave behaviour (or a combination of both). For instance, if is contained in a large finite group, then will eventually plateau to a constant, exhibiting concave behaviour. On the other hand, in nilpotent groups one can see convex behaviour; for instance, in the Heisenberg group , if one sets to be a set of matrices of the form for some large (abusing the notation somewhat), then grows cubically for but then grows quartically for .

To prove this proposition, it turns out (after using a somewhat difficult inverse theorem proven previously by Breuillard, Green, and myself) that one has to analyse the volume growth of nilprogressions . In the “infinitely proper” case where there are no unexpected relations between the generators of the nilprogression, one can lift everything to a simply connected Lie group (where one can take logarithms and exploit the Baker-Campbell-Hausdorff formula heavily), eventually describing with fair accuracy by a certain convex polytope with vertices depending polynomially on , which implies that depends polynomially on up to constants. If one is not in the “infinitely proper” case, then at some point the nilprogression develops a “collision”, but then one can use this collision to show (after some work) that the dimension of the “Lie model” of has dropped by at least one from the dimension of (the notion of a Lie model being developed in the previously mentioned paper of Breuillard, Green, and myself), so that this sort of collision can only occur a bounded number of times, with essentially polynomial volume growth behaviour between these collisions.

The arguments also give a precise description of the location of a set for which grows polynomially in . In the symmetric case, what ends up happening is that becomes commensurate to a “coset nilprogression” of bounded rank and nilpotency class, whilst is “virtually” contained in a scaled down version of that nilprogression. What “virtually” means is a little complicated; roughly speaking, it means that there is a set of bounded cardinality such that for all . Conversely, if is virtually contained in , then is commensurate to (and more generally, is commensurate to for any natural number ), giving quite a (qualitatively) precise description of in terms of coset nilprogressions.

The main tool used to prove these results is the structure theorem for approximate groups established by Breuillard, Green, and myself, which roughly speaking asserts that approximate groups are always commensurate with coset nilprogressions. A key additional trick is a pigeonholing argument of Sanders, which in this context is the assertion that if is comparable to , then there is an between and such that is very close in size to (up to a relative error of ). It is this fact, together with the comparability of to a coset nilprogression , that allows us (after some combinatorial argument) to virtually place inside .

Similar arguments apply when discussing iterated convolutions of (symmetric) probability measures on a (discrete) group , rather than combinatorial powers of a finite set. Here, the analogue of volume is given by the negative power of the norm of (thought of as a non-negative function on of total mass 1). One can also work with other norms here than , but this norm has some minor technical conveniences (and other measures of the “spread” of end up being more or less equivalent for our purposes). There is an analogous structure theorem that asserts that if spreads at most polynomially in , then is “commensurate” with the uniform probability distribution on a coset progression , and itself is largely concentrated near . The factor of here is the familiar scaling factor in random walks that arises for instance in the central limit theorem. The proof of (the precise version of) this statement proceeds similarly to the combinatorial case, using pigeonholing to locate a scale where has almost the same norm as .

A special case of this theory occurs when is the uniform probability measure on elements of and their inverses. The probability measure is then the distribution of a random product , where each is equal to one of or its inverse , selected at random with drawn uniformly from with replacement. This is very close to the Littlewood-Offord situation of random products where each is equal to or selected independently at random (thus is now fixed to equal rather than being randomly drawn from ). In the case when is abelian, it turns out that a little bit of Fourier analysis shows that these two random walks have “comparable” distributions in a certain sense. As a consequence, the results in this paper can be used to recover an essentially optimal abelian inverse Littlewood-Offord theorem of Nguyen and Vu. In the nonabelian case, the only Littlewood-Offord theorem I am aware of is a recent result of Tiep and Vu for matrix groups, but in this case I do not know how to relate the above two random walks to each other, and so we can only obtain an analogue of the Tiep-Vu results for the symmetrised random walk instead of the ordered random walk .

Filed under: math.CO, math.PR, paper Tagged: additive combinatorics, Gromov's theorem

I may as well take this opportunity to upload some slides of my own talks on this subject: here are my slides on small and large gaps between the primes that I gave at the “Latinos in the Mathematical Sciences” back in April, and here are my slides on the Polymath project for the Schock Prize symposium last October. (I also gave an abridged version of the latter talk at an AAAS Symposium in February, as well as the Breakthrough Symposium from last November.)

Filed under: math.NT, talk Tagged: polymath8

Even if is not normal in , it turns out that the conjugation map *approximately* preserves , if is bounded. To quantify this, let us call two subgroups *-commensurate* for some if one has

Proposition 1 Let be groups, with finite index . Then for every , the groups and are -commensurate, in fact

*Proof:* One can partition into left translates of , as well as left translates of . Combining the partitions, we see that can be partitioned into at most non-empty sets of the form . Each of these sets is easily seen to be a left translate of the subgroup , thus . Since

and , we obtain the claim.
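The proposition can be illustrated on a small example. The formulas in the statement are not reproduced here, so the sketch below works under my reading of it: if a subgroup has index k in a group, then each conjugate of the subgroup by an element of the group meets the subgroup in a subgroup of index at most k. The example takes the symmetric group on 4 points and the stabiliser of a point (index 4); permutations are stored as tuples, and all names are my own.

```python
from itertools import permutations


def compose(s, t):
    """Composition of permutations as tuples: (s*t)(i) = s(t(i))."""
    return tuple(s[i] for i in t)


def inverse(s):
    """Inverse of a permutation given as a tuple."""
    inv = [0] * len(s)
    for i, image in enumerate(s):
        inv[image] = i
    return tuple(inv)


def conjugate_subgroup(g, H):
    """The conjugate g H g^{-1} of a subgroup H given as a set of tuples."""
    g_inv = inverse(g)
    return {compose(compose(g, h), g_inv) for h in H}


def max_conjugate_index(G, H):
    """Largest index of H ∩ gHg^{-1} inside H as g ranges over G."""
    return max(len(H) // len(H & conjugate_subgroup(g, H)) for g in G)


if __name__ == "__main__":
    G = list(permutations(range(4)))          # the symmetric group S_4
    H = {s for s in G if s[0] == 0}           # stabiliser of the point 0
    print(len(G) // len(H), max_conjugate_index(G, H))
```

Here the conjugate of the stabiliser of 0 by g is the stabiliser of g(0), so the intersection has index 1 or 3 in the stabiliser, within the bound of 4.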

One can replace the inclusion by commensurability, at the cost of some worsening of the constants:

Corollary 2 Let be -commensurate subgroups of . Then for every , the groups and are -commensurate.

*Proof:* Applying the previous proposition with replaced by , we see that for every , and are -commensurate. Since and have index at most in and respectively, the claim follows.

It turns out that a similar phenomenon holds for the more general concept of an *approximate group*, and gives a “classification” of all the approximate groups containing a given approximate group as a “bounded index approximate subgroup”. Recall that a -approximate group in a group for some is a symmetric subset of containing the identity, such that the product set can be covered by at most left translates of (and thus also right translates, by the symmetry of ). For simplicity we will restrict attention to finite approximate groups so that we can use their cardinality as a measure of size. We call two finite approximate groups *-commensurate* if one has

note that this is consistent with the previous notion of commensurability for genuine groups.

Theorem 3 Let be a group, and let be real numbers. Let be a finite -approximate group, and let be a symmetric subset of that contains .

- (i) If is a -approximate group with , then one has for some set of cardinality at most . Furthermore, for each , the approximate groups and are -commensurate.
- (ii) Conversely, if for some set of cardinality at most , and and are -commensurate for all , then , and is a -approximate group.

Informally, the assertion that is an approximate group containing as a “bounded index approximate subgroup” is equivalent to being covered by a bounded number of shifts of , where approximately normalises in the sense that and have large intersection. Thus, to classify all such , the problem essentially reduces to that of classifying those that approximately normalise .

To prove the theorem, we recall some standard lemmas from arithmetic combinatorics, which are the foundation stones of the “Ruzsa calculus” that we will use to establish our results:

Lemma 4 (Ruzsa covering lemma) If and are finite non-empty subsets of , then one has for some set with cardinality .

*Proof:* We take to be a subset of with the property that the translates are disjoint in , and such that is maximal with respect to set inclusion. The required properties of are then easily verified.
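The construction in this proof is effectively a greedy algorithm, and it is easy to run. The sketch below works in the integers under addition rather than in a general (possibly non-abelian) group, purely to keep the code short; in additive notation, the conclusion it checks is that A is contained in X + B - B with |X| |B| at most |A + B|. The function names are my own.

```python
def ruzsa_cover(A, B):
    """Greedily build a subset X of A, maximal with respect to inclusion,
    such that the translates x + B for x in X are pairwise disjoint."""
    X, covered = [], set()
    for a in sorted(A):
        translate = {a + b for b in B}
        if covered.isdisjoint(translate):
            X.append(a)
            covered |= translate
    return X


def sumset(A, B):
    """The sumset A + B."""
    return {a + b for a in A for b in B}


def is_covered(A, B, X):
    """Check that every element of A lies in X + B - B."""
    reach = {x + b1 - b2 for x in X for b1 in B for b2 in B}
    return set(A) <= reach


if __name__ == "__main__":
    A = set(range(0, 60, 3)) | {100, 101}
    B = {0, 1, 5}
    X = ruzsa_cover(A, B)
    print(len(X), len(sumset(A, B)) / len(B), is_covered(A, B, X))
```

Maximality is what forces the covering: an element of A left out of X has a translate that meets some x + B, which places it in x + B - B.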

Lemma 5 (Ruzsa triangle inequality) If are finite non-empty subsets of , then

*Proof:* If is an element of with and , then from the identity we see that can be written as the product of an element of and an element of in at least distinct ways. The claim follows.
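Since the displayed inequality is not reproduced above, the following sketch assumes the usual form of the Ruzsa triangle inequality for finite non-empty subsets A, B, C of a group, namely |A C^{-1}| |B| ≤ |A B^{-1}| |B C^{-1}|, which matches the factorisation used in the proof. It tests the inequality on random subsets of the (non-abelian) symmetric group on 4 points, with permutations stored as tuples; all function names are mine.

```python
import random
from itertools import permutations


def compose(s, t):
    """Composition of permutations as tuples: (s*t)(i) = s(t(i))."""
    return tuple(s[i] for i in t)


def inverse(s):
    """Inverse of a permutation given as a tuple."""
    inv = [0] * len(s)
    for i, image in enumerate(s):
        inv[image] = i
    return tuple(inv)


def product_set(A, B):
    """The product set A.B = {ab : a in A, b in B}."""
    return {compose(a, b) for a in A for b in B}


def inverse_set(A):
    """The set of inverses of the elements of A."""
    return {inverse(a) for a in A}


def triangle_holds(A, B, C):
    """Check |A C^-1| |B| <= |A B^-1| |B C^-1|."""
    lhs = len(product_set(A, inverse_set(C))) * len(B)
    rhs = (len(product_set(A, inverse_set(B)))
           * len(product_set(B, inverse_set(C))))
    return lhs <= rhs


if __name__ == "__main__":
    random.seed(0)
    G = list(permutations(range(4)))
    trials = [[random.sample(G, random.randint(1, 8)) for _ in range(3)]
              for _ in range(200)]
    print(all(triangle_holds(A, B, C) for A, B, C in trials))
```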

Now we can prove (i). By the Ruzsa covering lemma, can be covered by at most

left-translates of , and hence by at most left-translates of , thus for some . Since only intersects if , we may assume that

and hence for any

By the Ruzsa covering lemma again, this implies that can be covered by at most left-translates of , and hence by at most left-translates of . By the pigeonhole principle, there thus exists a group element with

Since

and

the claim follows.

Now we prove (ii). Clearly

Now we control the size of . We have

From the Ruzsa triangle inequality and symmetry we have

and thus

By the Ruzsa covering lemma, this implies that is covered by at most left-translates of , hence by at most left-translates of . Since , the claim follows.

We now establish some auxiliary propositions about commensurability of approximate groups. The first claim is that commensurability is approximately transitive:

Proposition 6 Let be a -approximate group, be a -approximate group, and be a -approximate group. If and are -commensurate, and and are -commensurate, then and are -commensurate.

*Proof:* From two applications of the Ruzsa triangle inequality we have

By the Ruzsa covering lemma, we may thus cover by at most left-translates of , and hence by left-translates of . By the pigeonhole principle, there thus exists a group element such that

and so by arguing as in the proof of part (i) of the theorem we have

and similarly

and the claim follows.

The next proposition asserts that the union and (modified) intersection of two commensurate approximate groups is again an approximate group:

Proposition 7 Let be a -approximate group, be a -approximate group, and suppose that and are -commensurate. Then is a -approximate subgroup, and is a -approximate subgroup.

Using this proposition, one may obtain a variant of the previous theorem where the containment is replaced by commensurability; we leave the details to the interested reader.

*Proof:* We begin with . Clearly is symmetric and contains the identity. We have . The set is already covered by left translates of , and hence of ; similarly is covered by left translates of . As for , we observe from the Ruzsa triangle inequality that

and hence by the Ruzsa covering lemma, is covered by at most left translates of , and hence by left translates of , and hence of . Similarly is covered by at most left translates of . The claim follows.

Now we consider . Again, this is clearly symmetric and contains the identity. Repeating the previous arguments, we see that is covered by at most left-translates of , and hence there exists a group element with

Now observe that

and so by the Ruzsa covering lemma, can be covered by at most left-translates of . But this latter set is (as observed previously) contained in , and the claim follows.

Filed under: expository, math.CO, math.GR Tagged: additive combinatorics, approximate groups, Ruzsa calculus

and so one can conjecture that one has

when is even, and

when is odd. This is obvious in the even case since is a polynomial of degree , but I struggled for a while with the odd case before finding a slick three-line proof. (I was first trying to prove the weaker statement that was non-negative, but for some strange reason I was only able to establish this by working out the derivative exactly, rather than by using more analytic methods, such as convexity arguments.) I thought other readers might like the challenge (and also I’d like to see some other proofs), so rather than post my own proof immediately, I’ll see if anyone would like to supply their own proofs or thoughts in the comments. Also I am curious to know if this identity is connected to any other existing piece of mathematics.

Filed under: math.CA, question

is not known to be bounded for any to , although it is conjectured to be so when and . (For well below , one can use additive combinatorics constructions to demonstrate unboundedness; see this paper of Demeter.) One can approach this problem by considering the truncated trilinear Hilbert transforms

for . It is not difficult to show that the boundedness of is equivalent to the boundedness of with bounds that are uniform in and . On the other hand, from Minkowski’s inequality and Hölder’s inequality one can easily obtain the *non-uniform* bound of for . The main result of this paper is a slight improvement of this trivial bound to as .

Roughly speaking, the way this gain is established is as follows. First there are some standard time-frequency type reductions to reduce to the task of obtaining some non-trivial cancellation on a single “tree”. Using a “generalised von Neumann theorem”, we show that such cancellation will happen if (a discretised version of) one or more of the functions (or a dual function that it is convenient to test against) is small in the Gowers norm. However, the arithmetic regularity lemma alluded to earlier allows one to represent an arbitrary function , up to a small error, as the sum of such a “Gowers uniform” function, plus a structured function (or more precisely, an *irrational virtual nilsequence*). This effectively reduces the problem to that of establishing some cancellation in a single tree in the case when all functions involved are irrational virtual nilsequences. At this point, the contribution of each component of the tree can be estimated using the “counting lemma” from my paper with Ben. The main term in the asymptotics is a certain integral over a nilmanifold, but because the kernel in the trilinear Hilbert transform is odd, it turns out that this integral vanishes, giving the required cancellation.
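The trivial non-uniform bound can be illustrated numerically. Minkowski's inequality reduces matters to the L^1 norm of the truncated kernel, so here is a minimal sketch assuming that the truncation restricts the kernel 1/t to the range r ≤ |t| ≤ R (the symbols `r` and `R` for the two truncation scales, and the function name `truncated_kernel_l1`, are my own labels). Under that assumption the L^1 norm is exactly 2 log(R/r), which grows logarithmically in the ratio of the scales.

```python
import math

def truncated_kernel_l1(r, R, steps=200_000):
    """Midpoint-rule approximation of the integral of 1/|t| over r <= |t| <= R.

    The kernel 1/|t| is even, so we integrate over [r, R] and double.
    """
    h = (R - r) / steps
    half = sum(1.0 / (r + (k + 0.5) * h) for k in range(steps)) * h
    return 2.0 * half

# Compare the numerical value with the closed form 2 * log(R / r).
for r, R in [(1.0, 10.0), (1.0, 100.0), (0.5, 50.0)]:
    print(r, R, truncated_kernel_l1(r, R), 2.0 * math.log(R / r))
```

This only checks the size of the kernel; the gain over the trivial bound described above comes from cancellation that an absolute-value estimate of this kind cannot see.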

The same argument works for higher order Hilbert transforms (and one can also replace the coefficients in these transforms with other rational constants). However, because the quantitative bounds in the arithmetic regularity and counting lemmas are so poor, it does not seem likely that one can use these methods to remove the logarithmic growth in entirely, and some additional ideas will be needed to resolve the full conjecture.

Filed under: math.CA, math.CO, paper Tagged: arithmetic regularity lemma, multilinear Hilbert transform