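In the modern theory of additive combinatorics, a large role is played by the *Gowers uniformity norms* $\|f\|_{U^k(G)}$, where $k \geq 1$, $G = (G,+)$ is a finite abelian group, and $f: G \to \mathbb{C}$ is a function (one can also consider these norms in finite approximate groups such as $[N] = \{1,\dots,N\}$ instead of finite groups, but we will focus on the group case here for simplicity). These norms can be defined by the formula

$$\|f\|_{U^k(G)} := \left( \mathbf{E}_{x,h_1,\dots,h_k \in G} \Delta_{h_1} \dots \Delta_{h_k} f(x) \right)^{1/2^k}$$

where we use the averaging notation

$$\mathbf{E}_{x \in A} f(x) := \frac{1}{|A|} \sum_{x \in A} f(x)$$

for any non-empty finite set $A$ (with $|A|$ denoting the cardinality of $A$), and $\Delta_h$ is the multiplicative discrete derivative operator

$$\Delta_h f(x) := f(x+h) \overline{f(x)}.$$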
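
To make the definition concrete, here is a minimal brute-force sketch for the cyclic group $\mathbb{Z}/N\mathbb{Z}$, using the standard conventions $\Delta_h f(x) = f(x+h)\overline{f(x)}$ and uniform averaging. The function name `gowers_norm` and the list representation of $f$ are illustrative assumptions, not part of the post, and the cost grows like $N^{k+1} 2^k$, so it is only meant for tiny examples.

```python
import cmath
from itertools import product


def gowers_norm(f, k):
    """Brute-force U^k norm of f on Z/N, where N = len(f) and k >= 1."""
    n = len(f)
    total = 0j
    for x in range(n):
        for hs in product(range(n), repeat=k):
            # Delta_{h_1} ... Delta_{h_k} f(x) is a product over all subsets S
            # of the shifts of f(x + sum of shifts in S), conjugated when the
            # number of omitted shifts, k - |S|, is odd.
            term = 1 + 0j
            for mask in product((0, 1), repeat=k):
                shift = sum(h for h, bit in zip(hs, mask) if bit)
                value = f[(x + shift) % n]
                if (k - sum(mask)) % 2 == 1:
                    value = value.conjugate()
                term *= value
            total += term
    average = total / n ** (k + 1)
    # The average is real and non-negative in exact arithmetic; clamp away
    # rounding noise before taking the 2^k-th root.
    return max(average.real, 0.0) ** (1.0 / 2 ** k)


# Example inputs on Z/5: a linear phase and a quadratic phase.
linear_phase = [cmath.exp(2j * cmath.pi * x / 5) for x in range(5)]
quadratic_phase = [cmath.exp(2j * cmath.pi * x * x / 5) for x in range(5)]
```

A linear phase should be invisible to $U^1$ but have full $U^2$ norm, while a quadratic phase has small $U^2$ norm and full $U^3$ norm; calling `gowers_norm` on the two example lists with $k = 1, 2, 3$ is a quick way to see this hierarchy.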

One reason why these norms play an important role is that they control various multilinear averages. We give two examples here:

We establish these claims a little later in this post.
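
Control of this kind can be sanity-checked numerically in the simplest case. The sketch below assumes the standard generalised von Neumann inequality for three-term progressions in $\mathbb{Z}/N\mathbb{Z}$ with $N$ odd, namely $|\mathbf{E}_{x,h} f_0(x) f_1(x+h) f_2(x+2h)| \leq \min_i \|f_i\|_{U^2}$ for $1$-bounded $f_i$. The helper names, the choice $N = 7$ and the random test functions are all illustrative assumptions.

```python
import cmath
import random


def u2_norm(f):
    """Brute-force U^2 norm on Z/N, N = len(f)."""
    n = len(f)
    total = 0j
    for x in range(n):
        for h1 in range(n):
            for h2 in range(n):
                total += (f[(x + h1 + h2) % n]
                          * f[(x + h1) % n].conjugate()
                          * f[(x + h2) % n].conjugate()
                          * f[x])
    return max((total / n ** 3).real, 0.0) ** 0.25


def three_term_average(f0, f1, f2):
    """E_{x,h} f0(x) f1(x+h) f2(x+2h) on Z/N."""
    n = len(f0)
    total = 0j
    for x in range(n):
        for h in range(n):
            total += f0[x] * f1[(x + h) % n] * f2[(x + 2 * h) % n]
    return total / n ** 2


def random_unit_phases(n, rng):
    """A 1-bounded test function with values on the unit circle."""
    return [cmath.exp(2j * cmath.pi * rng.random()) for _ in range(n)]


rng = random.Random(0)
n = 7  # odd, so that h -> 2h is a bijection of Z/N
f0, f1, f2 = (random_unit_phases(n, rng) for _ in range(3))
lhs = abs(three_term_average(f0, f1, f2))
rhs = min(u2_norm(f) for f in (f0, f1, f2))
print(lhs, "<=", rhs)
```

The check passes for any choice of $1$-bounded inputs, since the inequality follows from a short Fourier-analytic argument; the random functions are only there to exercise the code.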

In some more recent literature (e.g., this paper of Conlon, Fox, and Zhao), the role of Gowers norms has been replaced by (generalisations of) the *cut norm*, a concept originating from graph theory. In this blog post, it will be convenient to define these cut norms in the language of probability theory (using boldface to denote random variables).

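Definition 2 (Cut norm) Let $\mathbf{X}_1,\dots,\mathbf{X}_k,\mathbf{Y}_1,\dots,\mathbf{Y}_l$ be independent random variables with $k,l \geq 0$; to avoid minor technicalities we assume that these random variables are discrete and take values in a finite set. Given a random variable $\mathbf{F} = F(\mathbf{X}_1,\dots,\mathbf{X}_k,\mathbf{Y}_1,\dots,\mathbf{Y}_l)$ of these independent random variables, we define the *cut norm*

$$\|\mathbf{F}\|_{\mathrm{CUT}(\mathbf{X}_1,\dots,\mathbf{X}_k;\mathbf{Y}_1,\dots,\mathbf{Y}_l)} := \sup |\mathbf{E} \mathbf{F} \mathbf{B}_1 \dots \mathbf{B}_k|$$

where the supremum ranges over all choices $\mathbf{B}_1,\dots,\mathbf{B}_k$ of random variables $\mathbf{B}_i = B_i(\mathbf{X}_1,\dots,\mathbf{X}_k,\mathbf{Y}_1,\dots,\mathbf{Y}_l)$ that are $1$-bounded (thus $|\mathbf{B}_i| \leq 1$ surely), and such that $\mathbf{B}_i$ does not depend on $\mathbf{X}_i$.

If $l = 0$, we abbreviate $\|\mathbf{F}\|_{\mathrm{CUT}(\mathbf{X}_1,\dots,\mathbf{X}_k;)}$ as $\|\mathbf{F}\|_{\mathrm{CUT}(\mathbf{X}_1,\dots,\mathbf{X}_k)}$.

Strictly speaking, the cut norm is only a cut semi-norm when $k = 0, 1$, but we will abuse notation by referring to it as a norm nevertheless.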

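Example 3 If $G = (A, B, E)$ is a bipartite graph, and $\mathbf{a}$, $\mathbf{b}$ are independent random variables chosen uniformly from $A, B$ respectively, then

$$\|1_E(\mathbf{a},\mathbf{b})\|_{\mathrm{CUT}(\mathbf{a},\mathbf{b})} = \sup_{f,g} |\mathbf{E}_{a \in A, b \in B} 1_E(a,b) f(a) g(b)|$$

where the supremum ranges over all $1$-bounded functions $f: A \to \mathbb{C}$, $g: B \to \mathbb{C}$. The right hand side is essentially the cut norm of the graph $G$, as defined for instance by Frieze and Kannan.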
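
For very small bipartite examples the supremum can be computed by brute force. The sketch below is a simplified variant and not the post's exact definition: it assumes real-valued multipliers $f, g$ taking values in $[-1,1]$ (for a real matrix this is comparable to the complex supremum up to an absolute constant), and it accepts any real matrix, for instance an adjacency matrix minus its edge density. The name `real_cut_norm` is illustrative.

```python
from itertools import product


def real_cut_norm(matrix):
    """max over f, g in {-1, +1} of |E_{a,b} M[a][b] f(a) g(b)|.

    The pairing is bilinear in (f, g), so the supremum over the cube
    [-1, 1]^A x [-1, 1]^B is attained at sign vectors.
    """
    rows, cols = len(matrix), len(matrix[0])
    best = 0.0
    for signs in product((-1, 1), repeat=rows):
        # For a fixed f, the best g(b) is the sign of sum_a M[a][b] f(a),
        # which turns each column sum into its absolute value.
        value = sum(
            abs(sum(signs[a] * matrix[a][b] for a in range(rows)))
            for b in range(cols)
        )
        best = max(best, value)
    return best / (rows * cols)


# A perfect matching on 2 + 2 vertices, and the same graph with its edge
# density 1/2 subtracted.
matching = [[1, 0], [0, 1]]
balanced = [[0.5, -0.5], [-0.5, 0.5]]
print(real_cut_norm(matching), real_cut_norm(balanced))
```

The enumeration is exponential in the number of rows, so this is only a way to experiment with the definition on toy inputs.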

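The cut norm is basically an expectation when $k = 0, 1$:

Example 4 If $k = 0$, we see from definition that

$$\|\mathbf{F}\|_{\mathrm{CUT}(;\mathbf{Y}_1,\dots,\mathbf{Y}_l)} = |\mathbf{E} \mathbf{F}|.$$

If $k = 1$, one easily checks that

$$\|\mathbf{F}\|_{\mathrm{CUT}(\mathbf{X}_1;\mathbf{Y}_1,\dots,\mathbf{Y}_l)} = \mathbf{E} |\mathbf{E}(\mathbf{F} | \mathbf{Y}_1,\dots,\mathbf{Y}_l)|$$

where $\mathbf{E}(\mathbf{F} | \mathbf{Y}_1,\dots,\mathbf{Y}_l)$ is the conditional expectation of $\mathbf{F}$ to the $\sigma$-algebra generated by all the variables other than $\mathbf{X}_1$, i.e., the $\sigma$-algebra generated by $\mathbf{Y}_1,\dots,\mathbf{Y}_l$. In particular, if $\mathbf{X}_1,\mathbf{Y}_1,\dots,\mathbf{Y}_l$ are independent random variables drawn uniformly from $A, B_1,\dots,B_l$ respectively, then

$$\|F(\mathbf{X}_1;\mathbf{Y}_1,\dots,\mathbf{Y}_l)\|_{\mathrm{CUT}(\mathbf{X}_1;\mathbf{Y}_1,\dots,\mathbf{Y}_l)} = \mathbf{E}_{b_1 \in B_1,\dots,b_l \in B_l} |\mathbf{E}_{a \in A} F(a;b_1,\dots,b_l)|.$$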
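
The case of one cut variable and one remaining variable can be checked directly. The sketch below assumes a finite table `table[a][b]` of values of the function, with both variables uniform; it computes the average over $b$ of the absolute value of the average over $a$, and compares it with the pairing against the multiplier that cancels the phase of each column mean. All function names are hypothetical.

```python
def cut_norm_k1(table):
    """E_b |E_a F(a, b)| for table[a][b] = F(a, b), with a and b uniform."""
    rows, cols = len(table), len(table[0])
    column_means = [
        sum(table[a][b] for a in range(rows)) / rows for b in range(cols)
    ]
    return sum(abs(m) for m in column_means) / cols


def pairing_with_multiplier(table, multiplier):
    """|E_{a,b} F(a, b) B(b)| for a multiplier B that does not depend on a."""
    rows, cols = len(table), len(table[0])
    total = sum(
        table[a][b] * multiplier[b] for a in range(rows) for b in range(cols)
    )
    return abs(total) / (rows * cols)


def optimal_multiplier(table):
    """B(b) = conjugate phase of E_a F(a, b); this B is 1-bounded."""
    rows, cols = len(table), len(table[0])
    result = []
    for b in range(cols):
        mean = sum(table[a][b] for a in range(rows)) / rows
        result.append(mean.conjugate() / abs(mean) if abs(mean) > 0 else 1)
    return result


table = [[1, 1j], [1, -1j], [-1, 1j]]
best = pairing_with_multiplier(table, optimal_multiplier(table))
print(cut_norm_k1(table), best)
```

Any other $1$-bounded multiplier depending only on $b$ gives a pairing no larger than `cut_norm_k1(table)`, by the triangle inequality applied column by column.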

Here are some basic properties of the cut norm:

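Lemma 5 (Basic properties of cut norm) Let $\mathbf{X}_1,\dots,\mathbf{X}_k,\mathbf{Y}_1,\dots,\mathbf{Y}_l$ be independent discrete random variables, and $\mathbf{F} = F(\mathbf{X}_1,\dots,\mathbf{X}_k,\mathbf{Y}_1,\dots,\mathbf{Y}_l)$ a function of these variables.

- (i) (Permutation invariance) The cut norm $\|\mathbf{F}\|_{\mathrm{CUT}(\mathbf{X}_1,\dots,\mathbf{X}_k;\mathbf{Y}_1,\dots,\mathbf{Y}_l)}$ is invariant with respect to permutations of the $\mathbf{X}_1,\dots,\mathbf{X}_k$, or permutations of the $\mathbf{Y}_1,\dots,\mathbf{Y}_l$.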
- (ii) (Conditioning) One has
where on the right-hand side we view, for each realisation of , as a function of the random variables alone, thus the right-hand side may be expanded as

- (iii) (Monotonicity) If , we have
- (iv) (Multiplicative invariances) If is a -bounded function that does not depend on one of the , then
In particular, if we additionally assume , then

- (v) (Cauchy-Schwarz) If , one has
where is a copy of that is independent of and is the random variable

- (vi) (Averaging) If and , where is another random variable independent of , and is a random variable depending on both and , then

*Proof:* The claims (i), (ii) are clear from expanding out all the definitions. The claim (iii) also easily follows from the definitions (the left-hand side involves a supremum over a more general class of multipliers , while the right-hand side omits the multiplier), as does (iv) (the multiplier can be absorbed into one of the multipliers in the definition of the cut norm). The claim (vi) follows by expanding out the definitions, and observing that all of the terms in the supremum appearing in the left-hand side also appear as terms in the supremum on the right-hand side. It remains to prove (v). By definition, the left-hand side is the supremum over all quantities of the form

where the are -bounded functions of that do not depend on . We average out in the direction (that is, we condition out the variables ), and pull out the factor (which does not depend on ), to write this as

which by Cauchy-Schwarz is bounded by

which can be expanded using the copy as

Expanding

and noting that each is -bounded and independent of for , we obtain the claim.

Now we can relate the cut norm to Gowers uniformity norms:

Lemma 6 Let be a finite abelian group, let be independent random variables uniformly drawn from for some , and let . Then

If is additionally assumed to be $1$-bounded, we have the converse inequalities

*Proof:* Applying Lemma 5(v) times, we can bound

where are independent copies of that are also independent of . The expression inside the norm can also be written as

so by Example 4 one can write (6) as

which after some change of variables simplifies to

which by Cauchy-Schwarz is bounded by

which one can rearrange as

giving (2). A similar argument bounds

by

which gives (3).

For (4), we can reverse the above steps and expand as

which we can write as

for some -bounded function . This can in turn be expanded as

for some -bounded functions that do not depend on . By Example 4, this can be written as

which by several applications of Lemma 5(iii) and then Lemma 5(iv) can be bounded by

giving (4). A similar argument gives (5).

Now we can prove Proposition 1. We begin with part (i). By permutation we may assume , then by translation we may assume . Replacing by and by , we can write the left-hand side of (1) as

where

is a -bounded function that does not depend on . Taking to be independent random variables drawn uniformly from , the left-hand side of (1) can then be written as

which by Example 4 is bounded in magnitude by

After many applications of Lemma 5(iii), (iv), this is bounded by

By Lemma 5(ii) we may drop the variable, and then the claim follows from Lemma 6.

For part (ii), we replace by and by to write the left-hand side as

the point here is that the first factor does not involve , the second factor does not involve , and the third factor has no quadratic terms in . Letting be independent variables drawn uniformly from , we can use Example 4 to bound this in magnitude by

which by Lemma 5(i),(iii),(iv) is bounded by

and then by Lemma 5(v) we may bound this by

which by Example 4 is

Now the expression inside the expectation is the product of four factors, each of which is or applied to an affine form where depends on and is one of , , , . With probability , the four different values of are distinct, and then by part (i) we have

When they are not distinct, we can instead bound this quantity by . Taking expectations in , we obtain the claim.

The analogue of the inverse theorem for cut norms is the following claim (which I learned from Ben Green):

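Lemma 7 ($U^2$-type inverse theorem) Let $\mathbf{x}, \mathbf{y}$ be independent random variables drawn from a finite abelian group $G$, and let $f: G \to \mathbb{C}$ be $1$-bounded. Then we have

where $\hat G$ is the group of homomorphisms $\xi: x \mapsto \xi \cdot x$ from $G$ to $\mathbb{R}/\mathbb{Z}$, and $e(\theta) := e^{2\pi i \theta}$.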

*Proof:* Suppose first that for some , then by definition

for some -bounded . By Fourier expansion, the left-hand side is also

where . From Plancherel’s theorem we have

hence by Hölder’s inequality one has for some , and hence

Conversely, suppose (7) holds. Then there is such that

which on substitution and Example 4 implies

The term splits into the product of a factor not depending on , and a factor not depending on . Applying Lemma 5(iii), (iv) we conclude that

The claim follows.
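
The converse direction rests on the fact that a character of a sum splits into a factor depending only on the first variable and a factor depending only on the second. A minimal numerical sketch on $\mathbb{Z}/N\mathbb{Z}$ follows; the normalisation $\mathbf{E}_x f(x) e(-\xi x/N)$ of the Fourier coefficient, the sign convention and the function names are assumptions made for illustration.

```python
import cmath


def e(theta):
    """The standard character e(theta) = exp(2 pi i theta)."""
    return cmath.exp(2j * cmath.pi * theta)


def fourier_coefficient(f, xi):
    """E_x f(x) e(-xi x / N) on Z/N, N = len(f)."""
    n = len(f)
    return sum(f[x] * e(-xi * x / n) for x in range(n)) / n


def cut_pairing_with_character(f, xi):
    """E_{x,y} f(x+y) B_1(y) B_2(x) with B_1(y) = e(-xi y / N), which does
    not depend on x, and B_2(x) = e(-xi x / N), which does not depend on y."""
    n = len(f)
    total = 0j
    for x in range(n):
        for y in range(n):
            total += f[(x + y) % n] * e(-xi * y / n) * e(-xi * x / n)
    return total / n ** 2


# A function on Z/8 that correlates with the frequency xi = 3.
f = [0.5 * e(3 * x / 8) + 0.5 * e(x * x / 8) for x in range(8)]
for xi in range(8):
    gap = abs(cut_pairing_with_character(f, xi) - fourier_coefficient(f, xi))
    assert gap < 1e-9
```

Since both multipliers are $1$-bounded, each Fourier coefficient is an admissible pairing in the definition of the cut norm of $f(\mathbf{x}+\mathbf{y})$, so a large Fourier coefficient forces a large cut norm.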

The higher order inverse theorems are much less trivial (and the optimal quantitative bounds are not currently known). However, there is a useful *degree lowering* argument, due to Peluse and Prendiville, that can allow one to lower the order of a uniformity norm in some cases. We give a simple version of this argument here:

Lemma 8 (Degree lowering argument, special case) Let be a finite abelian group, let be a non-empty finite set, and let be a function of the form for some $1$-bounded functions indexed by . Suppose that

for some and . Then one of the following claims holds (with implied constants allowed to depend on ):

- (i) (Degree lowering) one has .
- (ii) (Non-zero frequency) There exist and non-zero such that

There are more sophisticated versions of this argument in which the frequency is “minor arc” rather than “zero frequency”, and then the Gowers norms are localised to suitable large arithmetic progressions; this is implicit in the above-mentioned paper of Peluse and Prendiville.

*Proof:* One can write

and hence we conclude that

for a set of tuples of density . Applying Lemma 6 and Lemma 7, we see that for each such tuple, there exists such that

where is drawn uniformly from .

Let us adopt the convention that vanishes for not in , then from Lemma 5(ii) we have

where are independent random variables drawn uniformly from and also independent of . By repeated application of Lemma 5(iii) we then have

Expanding out and using Lemma 5(iv) repeatedly we conclude that

From definition of we then have

By Lemma 5(vi), we see that the left-hand side is less than

where is drawn uniformly from , independently of . By repeated application of Lemma 5(i), (v), we conclude that

where are independent copies of that are also independent of , . By Lemma 5(ii) and Example 4 we conclude that

with probability .

The left-hand side can be rewritten as

where is the additive version of , thus

Translating , we can simplify this a little to

If the frequency is ever non-vanishing in the event (9) then conclusion (ii) applies. We conclude that

with probability . In particular, by the pigeonhole principle, there exist such that

with probability . Expanding this out, we obtain a representation of the form

holding with probability , where the are functions that do not depend on the coordinate. From (8) we conclude that

for of the tuples . Thus by Lemma 5(ii)

By repeated application of Lemma 5(iii) we then have

and then by repeated application of Lemma 5(iv)

and then the conclusion (i) follows from Lemma 6.

As an application of degree lowering, we give an inverse theorem for the average in Proposition 1(ii), first established by Bourgain-Chang and later reproved by Peluse (by different methods from those given here):

Proposition 9 Let be a cyclic group of prime order. Suppose that one has $1$-bounded functions such that

for some . Then either , or one has

We remark that a modification of the arguments below also gives .

*Proof:* The left-hand side of (10) can be written as

where is the *dual function*

By Cauchy-Schwarz one thus has

and hence by Proposition 1, we either have (in which case we are done) or

Writing with , we conclude that either , or that

for some and non-zero . The left-hand side can be rewritten as

where and . We can rewrite this in turn as

which is bounded by

where are independent random variables drawn uniformly from . Applying Lemma 5(v), we conclude that

However, a routine Gauss sum calculation reveals that the left-hand side is for some absolute constant because is non-zero, so that . The only remaining case to consider is when

Repeating the above arguments we then conclude that

and then

The left-hand side can be computed to equal , and the claim follows.
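
The Gauss sum step in this proof is an instance of square-root cancellation in quadratic exponential sums. The sketch below does not reproduce the exact average from the proof; it only checks the standard fact that, for an odd prime $p$ and $a \not\equiv 0 \pmod p$, the average of $e((ah^2+bh)/p)$ over $h \in \mathbb{Z}/p\mathbb{Z}$ has magnitude exactly $p^{-1/2}$. The function name and the choice $p = 11$ are illustrative.

```python
import cmath


def quadratic_exponential_average(a, b, p):
    """E_{h in Z/p} e((a h^2 + b h) / p)."""
    total = sum(
        cmath.exp(2j * cmath.pi * ((a * h * h + b * h) % p) / p)
        for h in range(p)
    )
    return total / p


p = 11
magnitudes = [
    abs(quadratic_exponential_average(a, b, p))
    for a in range(1, p)
    for b in range(p)
]
# Every entry should agree with p ** -0.5 up to rounding error.
print(min(magnitudes), max(magnitudes), p ** -0.5)
```

When the quadratic coefficient vanishes the behaviour is different: the average is $1$ if $b \equiv 0$ and $0$ otherwise, which is why the non-vanishing of the frequency matters in the argument.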

This argument was given for the cyclic group setting, but the argument can also be applied to the integers (see Peluse-Prendiville) and can also be used to establish an analogue over the reals (that was first obtained by Bourgain).

## 13 comments

9 March, 2020 at 5:09 am

Will Sawin

Lemma 6 implies, roughly speaking, that Gowers norms can be replaced with cut norms in any area where very precise bounds are not important. Is the post intended to suggest that it is often a good idea to do this, as the proofs will become more elegant? I’m not familiar enough with the more traditional proofs of these results to know off the top of my head if these ones are substantially simpler.

Is there a reason the application of example 4 and Lemma 5(iv) in the proof of Proposition 1 is not done using the definition?

There is a typo (“remainin gcase”) near the end.

9 March, 2020 at 7:53 am

Terence Tao

Thanks for the correction. I think both Gowers norms and cut norms have their place in additive combinatorics; they are nearly equivalent, so the advantages to preferring one over the other are slight, but the point I wanted to make here was that there was a relatively clean calculus for cut norms which can be used for complicated Cauchy-Schwarz + triangle inequality + change of variables type arguments, whereas the analogous computations for Gowers norms are somewhat more notationally awkward (in particular, the Gowers-Cauchy-Schwarz inequality has to be twisted a fair deal before it can be used as broadly as Lemma 5(v) is here). One drawback of the cut norm approach is that it tends to rely more heavily on all the functions involved being bounded, but the “densification” technology in the Conlon-Fox-Zhao paper mentioned in the blog post goes a long way to ameliorating that (at least in the important case where the functions involved are bounded by a pseudorandom measure).

There is also a potentially useful hierarchy of cut norms, analogous to the hierarchy of slice rank type concepts ranging from true slice rank to tensor rank, in which the additional factors are more constrained in their variable dependencies than just being independent of a single variable. (This is also related to the hierarchy of possible types of hypergraph regularity.) I don’t yet know of any compelling application of these more general cut norms though.

I’m not sure what you are referring to in your remark concerning the proof of Proposition 1.

9 March, 2020 at 3:55 pm

Will Sawin

Thanks, that’s very enlightening! It would be great to see if this calculus makes it easier to find difficult Cauchy-Schwarz+change-of-variables arguments in the future – these have always seemed to me like they come out of nowhere.

WRT the proof of Proposition 1: After “the left-hand side of (1) can then be written as” and the equation that immediately follows, I think you can write “which is less than or equal to because the supremum over a set is always greater than or equal to any element of the set” and then continue on from there.

But if the point is to demonstrate the effectiveness of the calculus you may want to prove things using the calculus rather than using the definitions.

9 March, 2020 at 8:28 pm

Terence Tao

Ah, I see what you are saying now. My preference is to work with a more “object oriented” approach by relying on the calculus more than the definitions. (Somewhat annoyingly, I still need to use the definition directly at one stage in the proof of Lemma 8, as I was not able to come up with a clean abstraction of that step that could be made part of the calculus, but presumably such an abstraction is possible.)

[Update: I added such an abstraction to Lemma 5.]

9 March, 2020 at 9:10 am

Anonymous

There seems to be a displaying problem in Lemma 8

[Corrected, thanks – T.]

9 March, 2020 at 10:33 am

Allan van Hulst

I have a latex rendering error for Lemma 8.

Screenshot: https://imgur.com/a/ueLaxpP

Rest of the latex-elements in your post render fine in my browser.

[Corrected, thanks – T.]

9 March, 2020 at 12:00 pm

Allan van Hulst

Some more:

finite sert A (below second centered formula from top)

are discrete take values in a finite set (Definition 2)

homomrphisms (Lemma 7)

[Corrected, thanks – T.]

9 March, 2020 at 1:22 pm

Anonymous

There is a typo “sert” in the line below the second displayed formula.

13 March, 2020 at 7:00 am

Anonymous

In proposition 1, (ii) there is a typo: 1-boudned instead of 1-bounded. Great post! I wonder what example applications there could be for cut norms (besides providing an alternative to Gowers’ norms)

[Corrected, thanks – T.]

30 March, 2020 at 7:53 am

Tao – Ruhollas Blog

[…] current article is inspired by this post of Terrence Tao. Let’s first remember some […]

30 March, 2020 at 8:15 am

majdoddin

I have proved a stronger version of Proposition 1(i):

The inequality (1) holds for the Gowers universal Norm of any order (not just ) on the right-hand side.

See its proof (and of some other inequalities for Gowers universal norms) here:

https://majdoddin.wordpress.com/2020/03/30/gowers_uniformity_norms/

30 March, 2020 at 9:05 am

Terence Tao

I think the function in your link is not well defined (the quantity is not a function purely of , but depends on and separately).

In general it is known that the Gowers exponent in Proposition 1(i) is optimal. A typical counterexample is that if and for some non-zero then even though have vanishing norms. More generally, for length correlations one can use suitably chosen polynomial phases of degree (that are large in norm, but small in lower norms) to demonstrate optimality.

3 August, 2020 at 8:05 pm

Pointwise ergodic theorems for non-conventional bilinear polynomial averages | What's new

[…] on the “degree lowering argument” of Peluse and Prendiville, which I discussed in this previous blog post. Crucially for our application, the estimates are very quantitative, with all bounds being […]