You are currently browsing the tag archive for the ‘Cauchy-Schwarz’ tag.

This is the final continuation of the online reading seminar of Zhang’s paper for the polymath8 project. (There are two other continuations; this previous post, which deals with the combinatorial aspects of the second part of Zhang’s paper, and this previous post, that covers the Type I and Type II sums.) The main purpose of this post is to present (and hopefully, to improve upon) the treatment of the final and most innovative of the key estimates in Zhang’s paper, namely the Type III estimate.

The main estimate was already stated as Theorem 17 in the previous post, but we quickly recall the relevant definitions here. As in other posts, we always take to be a parameter going off to infinity, with the usual asymptotic notation associated to this parameter.

Definition 1 (Coefficient sequences)Acoefficient sequenceis a finitely supported sequence that obeys the boundsfor all , where is the divisor function.

- (i) If is a coefficient sequence and is a primitive residue class, the (signed)
discrepancyof in the sequence is defined to be the quantity- (ii) A coefficient sequence is said to be
at scalefor some if it is supported on an interval of the form .- (iii) A coefficient sequence at scale is said to be
smoothif it takes the form for some smooth function supported on obeying the derivative boundsfor all fixed (note that the implied constant in the notation may depend on ).

For any , let denote the square-free numbers whose prime factors lie in . The main result of this post is then the following result of Zhang:

Theorem 2 (Type III estimate)Let be fixed quantities, and let be quantities such thatand

and

for some fixed . Let be coefficient sequences at scale respectively with smooth. Then for any we have

In fact we have the stronger “pointwise” estimate

(This is very slightly stronger than previously claimed, in that the condition has been dropped.)

It turns out that Zhang does not exploit any averaging of the factor, and matters reduce to the following:

Theorem 3 (Type III estimate without )Let be fixed, and let be quantities such thatand

and

for some fixed . Let be smooth coefficient sequences at scales respectively. Then we have

for all and some fixed .

Let us quickly see how Theorem 3 implies Theorem 2. To show (4), it suffices to establish the bound

for all , where denotes a quantity that is independent of (but can depend on other quantities such as ). The left-hand side can be rewritten as

From Theorem 3 we have

where the quantity does not depend on or . Inserting this asymptotic and using crude bounds on (see Lemma 8 of this previous post) we conclude (4) as required (after modifying slightly).

It remains to establish Theorem 3. This is done by a set of tools similar to that used to control the Type I and Type II sums:

- (i) completion of sums;
- (ii) the Weil conjectures and bounds on Ramanujan sums;
- (iii) factorisation of smooth moduli ;
- (iv) the Cauchy-Schwarz and triangle inequalities (Weyl differencing).

The specifics are slightly different though. For the Type I and Type II sums, it was the classical Weil bound on Kloosterman sums that were the key source of power saving; Ramanujan sums only played a minor role, controlling a secondary error term. For the Type III sums, one needs a significantly deeper consequence of the Weil conjectures, namely the estimate of Bombieri and Birch on a three-dimensional variant of a Kloosterman sum. Furthermore, the Ramanujan sums – which are a rare example of sums that actually exhibit *better* than square root cancellation, thus going beyond even what the Weil conjectures can offer – make a crucial appearance, when combined with the factorisation of the smooth modulus (this new argument is arguably the most original and interesting contribution of Zhang).

The following result is due independently to Furstenberg and to Sarkozy:

Theorem 1 (Furstenberg-Sarkozy theorem)Let , and suppose that is sufficiently large depending on . Then every subset of of density at least contains a pair for some natural numbers with .

This theorem is of course similar in spirit to results such as Roth’s theorem or Szemerédi’s theorem, in which the pattern is replaced by or for some fixed respectively. There are by now many proofs of this theorem (see this recent paper of Lyall for a survey), but most proofs involve some form of Fourier analysis (or spectral theory). This may be compared with the standard proof of Roth’s theorem, which combines some Fourier analysis with what is now known as the density increment argument.

A few years ago, Ben Green, Tamar Ziegler, and myself observed that it is possible to prove the Furstenberg-Sarkozy theorem by just using the Cauchy-Schwarz inequality (or van der Corput lemma) and the density increment argument, removing all invocations of Fourier analysis, and instead relying on Cauchy-Schwarz to linearise the quadratic shift . As such, this theorem can be considered as even more elementary than Roth’s theorem (and its proof can be viewed as a toy model for the proof of Roth’s theorem). We ended up not doing too much with this observation, so decided to share it here.

The first step is to use the density increment argument that goes back to Roth. For any , let denote the assertion that for sufficiently large, all sets of density at least contain a pair with non-zero. Note that is vacuously true for . We will show that for any , one has the implication

for some absolute constant . This implies that is true for any (as can be seen by considering the infimum of all for which holds), which gives Theorem 1.

It remains to establish the implication (1). Suppose for sake of contradiction that we can find for which holds (for some sufficiently small absolute constant ), but fails. Thus, we can find arbitrarily large , and subsets of of density at least , such that contains no patterns of the form with non-zero. In particular, we have

(The exact ranges of and are not too important here, and could be replaced by various other small powers of if desired.)

Let be the density of , so that . Observe that

and

If we thus set , then

In particular, for large enough,

On the other hand, one easily sees that

and hence by the Cauchy-Schwarz inequality

which we can rearrange as

Shifting by we obtain (again for large enough)

In particular, by the pigeonhole principle (and deleting the diagonal case , which we can do for large enough) we can find distinct such that

so in particular

If we set and shift by , we can simplify this (again for large enough) as

we have

for any , and thus

Averaging this with (2) we conclude that

In particular, by the pigeonhole principle we can find such that

or equivalently has density at least on the arithmetic progression , which has length and spacing , for some absolute constant . By partitioning this progression into subprogressions of spacing and length (plus an error set of size , we see from the pigeonhole principle that we can find a progression of length and spacing on which has density at least (and hence at least ) for some absolute constant . If we then apply the induction hypothesis to the set

we conclude (for large enough) that contains a pair for some natural numbers with non-zero. This implies that lie in , a contradiction, establishing the implication (1).

A more careful analysis of the above argument reveals a more quantitative version of Theorem 1: for (say), any subset of of density at least for some sufficiently large absolute constant contains a pair with non-zero. This is not the best bound known; a (difficult) result of Pintz, Steiger, and Szemeredi allows the density to be as low as . On the other hand, this already improves on the (simpler) Fourier-analytic argument of Green that works for densities at least (although the original argument of Sarkozy, which is a little more intricate, works up to ). In the other direction, a construction of Rusza gives a set of density without any pairs .

Remark 1A similar argument also applies with replaced by for fixed , because this sort of pattern is preserved by affine dilations into arithmetic progressions whose spacing is a power. By re-introducing Fourier analysis, one can also perform an argument of this type for where is the sum of two squares; see the above-mentioned paper of Green for details. However there seems to be some technical difficulty in extending it to patterns of the form for polynomials that consist of more than a single monomial (and with the normalisation , to avoid local obstructions), because one no longer has this preservation property.

This week I was in Columbus, Ohio, attending a conference on equidistribution on manifolds. I talked about my recent paper with Ben Green on the quantitative behaviour of polynomial sequences in nilmanifolds, which I have blogged about previously. During my talk (and inspired by the immediately preceding talk of Vitaly Bergelson), I stated explicitly for the first time a generalisation of the van der Corput trick which morally underlies our paper, though it is somewhat buried there as we specialised it to our application at hand (and also had to deal with various quantitative issues that made the presentation more complicated). After the talk, several people asked me for a more precise statement of this trick, so I am presenting it here, and as an application reproving an old theorem of Leon Green that gives a necessary and sufficient condition as to whether a linear sequence on a nilmanifold is equidistributed, which generalises the famous theorem of Weyl on equidistribution of polynomials.

UPDATE, Feb 2013: It has been pointed out to me by Pavel Zorin that this argument does not fully recover the theorem of Leon Green; to cover all cases, one needs the more complicated van der Corput argument in our paper.

In the previous lecture, we studied the recurrence properties of compact systems, which are systems in which all measurable functions exhibit almost periodicity – they almost return completely to themselves after repeated shifting. Now, we consider the opposite extreme of *mixing systems* – those in which all measurable functions (of mean zero) exhibit *mixing* – they become orthogonal to themselves after repeated shifting. (Actually, there are two different types of mixing, *strong mixing* and *weak mixing*, depending on whether the orthogonality occurs individually or on the average; it is the latter concept which is of more importance to the task of establishing the Furstenberg recurrence theorem.)

We shall see that for weakly mixing systems, averages such as can be computed very explicitly (in fact, this average converges to the constant ). More generally, we shall see that weakly mixing components of a system tend to average themselves out and thus become irrelevant when studying many types of ergodic averages. Our main tool here will be the humble Cauchy-Schwarz inequality, and in particular a certain consequence of it, known as the *van der Corput lemma*.

As one application of this theory, we will be able to establish Roth’s theorem (the k=3 case of Szemerédi’s theorem).

## Recent Comments