You are currently browsing the monthly archive for May 2022.

Let be a finite set of order ; in applications will be typically something like a finite abelian group, such as the cyclic group . Let us define a *-bounded function* to be a function such that for all . There are many seminorms of interest that one places on functions that are bounded by on -bounded functions, such as the Gowers uniformity seminorms for (which are genuine norms for ). All seminorms in this post will be implicitly assumed to obey this property.

In additive combinatorics, a significant role is played by *inverse theorems*, which abstractly take the following form for certain choices of seminorm , some parameters , and some class of -bounded functions:

Theorem 1 (Inverse theorem template)If is a -bounded function with , then there exists such that , where denotes the usual inner product

Informally, one should think of as being somewhat small but fixed independently of , as being somewhat smaller but depending only on (and on the seminorm), and as representing the “structured functions” for these choices of parameters. There is some flexibility in exactly how to choose the class of structured functions, but intuitively an inverse theorem should become more powerful when this class is small. Accordingly, let us define the *-entropy* of the seminorm to be the least cardinality of for which such an inverse theorem holds. Seminorms with low entropy are ones for which inverse theorems can be expected to be a useful tool. This concept arose in some discussions I had with Ben Green many years ago, but never appeared in print, so I decided to record some observations we had on this concept here on this blog.

Lebesgue norms for have exponentially large entropy (and so inverse theorems are not expected to be useful in this case):

Proposition 2 ( norm has exponentially large inverse entropy)Let and . Then the -entropy of is at most . Conversely, for any , the -entropy of is at least for some absolute constant .

*Proof:* If is -bounded with , then we have

Now suppose that there is an -inverse theorem for some of cardinality . If we let be a random sign function (so the are independent random variables taking values in with equal probability), then there is a random such that

and hence by the pigeonhole principle there is a deterministic such that On the other hand, from the Hoeffding inequality one has for some absolute constant , hence as claimed.Most seminorms of interest in additive combinatorics, such as the Gowers uniformity norms, are bounded by some finite norm thanks to Hölder’s inequality, so from the above proposition and the obvious monotonicity properties of entropy, we conclude that all Gowers norms on finite abelian groups have at most exponential inverse theorem entropy. But we can do significantly better than this:

- For the seminorm , one can simply take to consist of the constant function , and the -entropy is clearly equal to for any .
- For the norm, the standard Fourier-analytic inverse theorem asserts that if then for some Fourier character . Thus the -entropy is at most .
- For the norm on cyclic groups for , the inverse theorem proved by Green, Ziegler, and myself gives an -inverse theorem for some and consisting of nilsequences for some filtered nilmanifold of degree in a finite collection of cardinality , some polynomial sequence (which was subsequently observed by Candela-Sisask (see also Manners) that one can choose to be -periodic), and some Lipschitz function of Lipschitz norm . By the Arzela-Ascoli theorem, the number of possible (up to uniform errors of size at most , say) is . By standard arguments one can also ensure that the coefficients of the polynomial are , and then by periodicity there are only such polynomials. As a consequence, the -entropy is of polynomial size (a fact that seems to have first been implicitly observed in Lemma 6.2 of this paper of Frantzikinakis; thanks to Ben Green for this reference). One can obtain more precise dependence on using the quantitative version of this inverse theorem due to Manners; back of the envelope calculations using Section 5 of that paper suggest to me that one can take to be polynomial in and the entropy to be of the order , or alternatively one can reduce the entropy to at the cost of degrading to .
- If one replaces the cyclic group by a vector space over some fixed finite field of prime order (so that ), then the inverse theorem of Ziegler and myself (available in both high and low characteristic) allows one to obtain an -inverse theorem for some and the collection of non-classical degree polynomial phases from to , which one can normalize to equal at the origin, and then by the classification of such polynomials one can calculate that the entropy is of quasipolynomial size in . By using the recent work of Gowers and Milicevic, one can make the dependence on here more precise, but we will not perform these calcualtions here.
- For the norm on an arbitrary finite abelian group, the recent inverse theorem of Jamneshan and myself gives (after some calculations) a bound of the polynomial form on the -entropy for some , which one can improve slightly to if one degrades to , where is the maximal order of an element of , and is the rank (the number of elements needed to generate ). This bound is polynomial in in the cyclic group case and quasipolynomial in general.

For general finite abelian groups , we do not yet have an inverse theorem of comparable power to the ones mentioned above that give polynomial or quasipolynomial upper bounds on the entropy. However, there is a cheap argument that at least gives some subexponential bounds:

Proposition 3 (Cheap subexponential bound)Let and , and suppose that is a finite abelian group of order for some sufficiently large . Then the -complexity of is at most .

*Proof:* (Sketch) We use a standard random sampling argument, of the type used for instance by Croot-Sisask or Briet-Gopi (thanks to Ben Green for this latter reference). We can assume that for some sufficiently large , since otherwise the claim follows from Proposition 2.

Let be a random subset of with the events being iid with probability to be chosen later, conditioned to the event . Let be a -bounded function. By a standard second moment calculation, we see that with probability at least , we have

Thus, by the triangle inequality, if we choose for some sufficiently large , then for any -bounded with , one has with probability at least that We can write the left-hand side as where is the randomly sampled dual function Unfortunately, is not -bounded in general, but we have and the right-hand side can be shown to be on the average, so we can condition on the event that the right-hand side is without significant loss in falure probability.If we then let be rounded to the nearest Gaussian integer multiple of in the unit disk, one has from the triangle inequality that

where is the discretised randomly sampled dual function For any given , there are at most places where can be non-zero, and in those places there are possible values for . Thus, if we let be the collection of all possible associated to a given , the cardinality of this set is , and for any with , we have with probability at least .Now we remove the failure probability by independent resampling. By rounding to the nearest Gaussian integer multiple of in the unit disk for a sufficiently small , one can find a family of cardinality consisting of -bounded functions of norm at least such that for every -bounded with there exists such that

Now, let be independent samples of for some to be chosen later. By the preceding discussion, we see that with probability at least , we have for any given , so by the union bound, if we choose for a large enough , we can find such that for*all*, and hence y the triangle inequality Taking to be the union of the (applying some truncation and rescaling to these -bounded functions to make them -bounded, and then -bounded), we obtain the claim.

One way to obtain *lower* bounds on the inverse theorem entropy is to produce a collection of almost orthogonal functions with large norm. More precisely:

Proposition 4Let be a seminorm, let , and suppose that one has a collection of -bounded functions such that for all , one has for all but at most choices of for all distinct . Then the -entropy of is at least .

*Proof:* Suppose we have an -inverse theorem with some family . Then for each there is such that . By the pigeonhole principle, there is thus such that for all in a subset of of cardinality at least :

Thus for instance:

- For the norm, one can take to be the family of linear exponential phases with and , and obtain a linear lower bound of for the -entropy, thus matching the upper bound of up to constants when is fixed.
- For the norm, a similar calculation using polynomial phases of degree , combined with the Weyl sum estimates, gives a lower bound of for the -entropy for any fixed ; by considering nilsequences as well, together with nilsequence equidistribution theory, one can replace the exponent here by some quantity that goes to infinity as , though I have not attempted to calculate the exact rate.
- For the norm, another similar calculation using polynomial phases of degree should give a lower bound of for the -entropy, though I have not fully performed the calculation.

We close with one final example. Suppose is a product of two sets of cardinality , and we consider the Gowers box norm

One possible choice of class here are the indicators of “rectangles” with , (cf. this previous blog post on cut norms). By standard calculations, one can use this class to show that the -entropy of is , and a variant of the proof of the second part of Proposition 2 shows that this is the correct order of growth in . In contrast, a modification of Proposition 3 only gives an upper bound of the form (the bottleneck is ensuring that the randomly sampled dual functions stay bounded in ), which shows that while this cheap bound is not optimal, it can still broadly give the correct “type” of bound (specifically, intermediate growth between polynomial and exponential).
In orthodox first-order logic, variables and expressions are only allowed to take one value at a time; a variable , for instance, is not allowed to equal and simultaneously. We will call such variables *completely specified*. If one really wants to deal with multiple values of objects simultaneously, one is encouraged to use the language of set theory and/or logical quantifiers to do so.

However, the ability to allow expressions to become only partially specified is undeniably convenient, and also rather intuitive. A classic example here is that of the quadratic formula:

Strictly speaking, the expression is not well-formed according to the grammar of first-order logic; one should instead use something like

or

or

in order to strictly adhere to this grammar. But none of these three reformulations are as compact or as conceptually clear as the original one. In a similar spirit, a mathematical English sentence such as

is also not a first-order sentence; one would instead have to write something like or instead. These reformulations are not all that hard to decipher, but they do have the aesthetically displeasing effect of cluttering an argument with temporary variables such as which are used once and then discarded.Another example of partially specified notation is the innocuous notation. For instance, the assertion

when written formally using first-order logic, would become something like which is not exactly an elegant reformulation. Similarly with statements such as orBelow the fold I’ll try to assign a formal meaning to partially specified expressions such as (1), for instance allowing one to condense (2), (3), (4) to just

When combined with another common (but often implicit) extension of first-order logic, namely the ability to reason using*ambient parameters*, we become able to formally introduce

*asymptotic notation*such as the big-O notation or the little-o notation . We will explain how to do this at the end of this post.

## Recent Comments