
We return to the study of the Riemann zeta function ${\zeta(s)}$, focusing now on the task of upper bounding the size of this function within the critical strip; as seen in Exercise 43 of Notes 2, such upper bounds can lead to zero-free regions for ${\zeta}$, which in turn lead to improved estimates for the error term in the prime number theorem.

In equation (21) of Notes 2 we obtained the somewhat crude estimate

$\displaystyle \zeta(s) = \sum_{n \leq x} \frac{1}{n^s} - \frac{x^{1-s}}{1-s} + O( \frac{|s|}{\sigma} \frac{1}{x^\sigma} ) \ \ \ \ \ (1)$

for any ${x > 0}$ and ${s = \sigma+it}$ with ${\sigma>0}$ and ${s \neq 1}$. Setting ${x=1}$, we obtained the crude estimate

$\displaystyle \zeta(s) = \frac{1}{s-1} + O( \frac{|s|}{\sigma} )$

in this region. In particular, if ${0 < \varepsilon \leq \sigma \ll 1}$ and ${|t| \gg 1}$ then we had ${\zeta(s) = O_\varepsilon( |t| )}$. Using the functional equation and the Hadamard three lines lemma, we can improve this to ${\zeta(s) \ll_\varepsilon |t|^{\frac{1-\sigma}{2}+\varepsilon}}$; see Supplement 3.
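As a rough numerical sanity check of (1), here is a minimal Python sketch that evaluates the main terms at ${s=2}$, where the value ${\zeta(2)=\pi^2/6}$ is known, and compares the discrepancy with the scale ${\frac{|s|}{\sigma} x^{-\sigma}}$ appearing in the error term. The function names `zeta_approx` and `error_scale` are illustrative choices, not notation from the notes.

```python
import math

def zeta_approx(s, x):
    """Main terms of (1): sum_{n <= x} n^{-s} - x^{1-s}/(1-s)."""
    partial_sum = sum(n ** (-s) for n in range(1, int(x) + 1))
    return partial_sum - x ** (1 - s) / (1 - s)

def error_scale(s, x):
    """The quantity (|s|/sigma) * x^{-sigma} inside the O() of (1)."""
    sigma = complex(s).real
    return abs(s) / sigma * x ** (-sigma)

# Compare against the known value zeta(2) = pi^2 / 6.
s, x = 2.0, 1000.0
discrepancy = abs(zeta_approx(s, x) - math.pi ** 2 / 6)
print(discrepancy, error_scale(s, x))
```

In this one instance the discrepancy sits below the error scale, which is consistent with (1) but of course says nothing about the implied constant in general.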

Now we seek better upper bounds on ${\zeta}$. We will reduce the problem to that of bounding certain exponential sums, in the spirit of Exercise 33 of Supplement 3:

Proposition 1 Let ${s = \sigma+it}$ with ${0 < \varepsilon \leq \sigma \ll 1}$ and ${|t| \gg 1}$. Then

$\displaystyle \zeta(s) \ll_\varepsilon \log(2+|t|) \sup_{1 \leq M \leq N \ll |t|} N^{1-\sigma} |\frac{1}{N} \sum_{N \leq n < N+M} e( -\frac{t}{2\pi} \log n)|$

where ${e(x) := e^{2\pi i x}}$.

Proof: We fix a smooth function ${\eta: {\bf R} \rightarrow {\bf C}}$ with ${\eta(t)=1}$ for ${t \leq -1}$ and ${\eta(t)=0}$ for ${t \geq 1}$, and allow implied constants to depend on ${\eta}$. Let ${s=\sigma+it}$ with ${\varepsilon \leq \sigma \ll 1}$. From Exercise 33 of Supplement 3, we have

$\displaystyle \zeta(s) = \sum_n \frac{1}{n^s} \eta( \log n - \log C|t| ) + O_\varepsilon( 1 )$

for some sufficiently large absolute constant ${C}$. By dyadic decomposition, we thus have

$\displaystyle \zeta(s) \ll_{\varepsilon} 1 + \log(2+|t|) \sup_{1 \leq N \ll |t|} |\sum_{N \leq n < 2N} \frac{1}{n^s} \eta( \log n - \log C|t| )|.$

We can absorb the first term in the second using the ${N=1}$ case of the supremum. Writing ${\frac{1}{n^s} \eta( \log n - \log C|t| ) = N^{-\sigma} e( - \frac{t}{2\pi} \log n ) F_N(n)}$, where

$\displaystyle F_N(n) := (N/n)^\sigma \eta(\log n - \log C|t| ),$

it thus suffices to show that

$\displaystyle \sum_{N \leq n < 2N} e(-\frac{t}{2\pi} \log n) F_N(n) \ll \sup_{1 \leq M \leq N} |\sum_{N \leq n < N+M} e(-\frac{t}{2\pi} \log n)|$

for each ${N}$. But from the fundamental theorem of calculus, the left-hand side can be written as

$\displaystyle F_N(2N) \sum_{N \leq n < 2N} e(-\frac{t}{2\pi} \log n)$

$\displaystyle - \int_0^{N} (\sum_{N \leq n < N+M} e(-\frac{t}{2\pi} \log n)) F'_N(N+M)\ dM$

and the claim then follows from the triangle inequality and a routine calculation. $\Box$

We are thus interested in getting good bounds on the sum ${\sum_{N \leq n < N+M} e( -\frac{t}{2\pi} \log n )}$. More generally, we consider normalised exponential sums of the form

$\displaystyle \frac{1}{N} \sum_{n \in I} e( f(n) ) \ \ \ \ \ (2)$

where ${I \subset {\bf R}}$ is an interval of length at most ${N}$ for some ${N \geq 1}$, and ${f: {\bf R} \rightarrow {\bf R}}$ is a smooth function. We will assume smoothness estimates of the form

$\displaystyle |f^{(j)}(x)| = \exp( O(j^2) ) \frac{T}{N^j} \ \ \ \ \ (3)$

for some ${T>0}$, all ${x \in I}$, and all ${j \geq 1}$, where ${f^{(j)}}$ is the ${j}$-fold derivative of ${f}$; in the case ${f(x) := -\frac{t}{2\pi} \log x}$, ${I \subset [N,2N]}$ of interest for the Riemann zeta function, we easily verify that these estimates hold with ${T := |t|}$. (One can consider exponential sums under more general hypotheses than (3), but the hypotheses here are adequate for our needs.) We do not bound the zeroth derivative ${f^{(0)}=f}$ of ${f}$ directly, but it would not be natural to do so in any event, since the magnitude of the sum (2) is unaffected if one adds an arbitrary constant to ${f(n)}$.
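To make the verification of (3) for the zeta phase concrete, here is a small Python sketch. It uses the explicit formula ${|f^{(j)}(x)| = \frac{|t|}{2\pi} \frac{(j-1)!}{x^j}}$ for ${f(x) = -\frac{t}{2\pi} \log x}$, and measures the size of ${\log( |f^{(j)}(x)| N^j / T )}$ relative to ${j^2}$ on a grid in ${[N,2N]}$. The names `log_ratio` and `worst_constant`, the grid, and the range of ${j}$ are assumptions made for illustration only.

```python
import math

def log_ratio(j, x, N):
    """log( |f^{(j)}(x)| / (T / N^j) ) for f(x) = -(t/2pi) log x and T = |t|.

    Uses |f^{(j)}(x)| = (|t|/2pi) * (j-1)! / x^j, so that t cancels;
    math.lgamma(j) equals log((j-1)!) and avoids overflow for large j.
    """
    return math.lgamma(j) - math.log(2 * math.pi) + j * math.log(N / x)

def worst_constant(N, max_j, samples=50):
    """Largest value of |log_ratio| / j^2 over 1 <= j <= max_j and a grid in [N, 2N]."""
    worst = 0.0
    for j in range(1, max_j + 1):
        for i in range(samples + 1):
            x = N * (1 + i / samples)
            worst = max(worst, abs(log_ratio(j, x, N)) / j ** 2)
    return worst

print(worst_constant(1000.0, 30))
```

On this grid the ratio stays bounded by a small constant multiple of ${j^2}$, in line with the ${\exp(O(j^2))}$ factor in (3); the quantity does not depend on ${t}$ at all, which is why ${T := |t|}$ is the natural normalisation.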

The trivial bound for (2) is

$\displaystyle \frac{1}{N} \sum_{n \in I} e(f(n)) \ll 1 \ \ \ \ \ (4)$

and we will seek to obtain significant improvements to this bound. Pseudorandomness heuristics predict a bound of ${O_\varepsilon(N^{-1/2+\varepsilon})}$ for (2) for any ${\varepsilon>0}$ if ${T = O(N^{O(1)})}$; this assertion (a special case of the exponent pair hypothesis) would have many consequences (for instance, inserting it into Proposition 1 soon yields the Lindelöf hypothesis), but is unfortunately quite far from resolution with known methods. However, we can obtain weaker gains of the form ${O(N^{-c_K})}$ when ${T \ll N^K}$, where ${c_K > 0}$ depends on ${K}$. We present two such results here, which perform well for small and large values of ${K}$ respectively:

Theorem 2 Let ${2 \leq N \ll T}$, let ${I}$ be an interval of length at most ${N}$, and let ${f: I \rightarrow {\bf R}}$ be a smooth function obeying (3) for all ${j \geq 1}$ and ${x \in I}$.

• (i) (van der Corput estimate) For any natural number ${k \geq 2}$, one has

$\displaystyle \frac{1}{N} \sum_{n \in I} e( f(n) ) \ll (\frac{T}{N^k})^{\frac{1}{2^k-2}} \log^{1/2} (2+T). \ \ \ \ \ (5)$

• (ii) (Vinogradov estimate) If ${k}$ is a natural number and ${T \leq N^{k}}$, then

$\displaystyle \frac{1}{N} \sum_{n \in I} e( f(n) ) \ll N^{-c/k^2} \ \ \ \ \ (6)$

for some absolute constant ${c>0}$.
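For readers who want to experiment, the following Python sketch computes the normalised sum (2) for the zeta phase ${f(x) = -\frac{t}{2\pi} \log x}$ on ${I = [N,2N)}$ and prints it next to the right-hand side of (5) with the implied constant dropped. The helper names (`normalised_sum`, `zeta_phase`, `vdc_rhs`) and the sample values of ${N}$ and ${t}$ are assumptions for illustration; since the implied constant is omitted, the printed numbers are only indicative.

```python
import cmath
import math

def e(x):
    """e(x) := exp(2 pi i x)."""
    return cmath.exp(2j * math.pi * x)

def normalised_sum(f, start, length, N):
    """(1/N) * sum of e(f(n)) over the integers n in [start, start + length)."""
    return sum(e(f(n)) for n in range(start, start + length)) / N

def zeta_phase(t):
    """The phase f(x) = -(t / 2 pi) log x relevant to zeta(sigma + it)."""
    return lambda x: -t / (2 * math.pi) * math.log(x)

def vdc_rhs(T, N, k):
    """Right-hand side of (5) without the implied constant."""
    return (T / N ** k) ** (1 / (2 ** k - 2)) * math.sqrt(math.log(2 + T))

N, t = 1000, 1.0e5
size = abs(normalised_sum(zeta_phase(t), N, N, N))
print(size, vdc_rhs(abs(t), N, 2), vdc_rhs(abs(t), N, 3))
```

The only bound this sketch can certify by itself is the trivial one (4); any comparison with (5) depends on the unspecified implied constant.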

The factor of ${\log^{1/2} (2+T)}$ can be removed by a more careful argument, but we will not need to do so here as we are willing to lose powers of ${\log T}$. The estimate (6) is superior to (5) when ${T \sim N^K}$ for ${K}$ large, since (after optimising in ${k}$) (5) gives a gain of the form ${N^{-c/2^{cK}}}$ over the trivial bound, while (6) gives ${N^{-c/K^2}}$. We have not attempted to obtain completely optimal estimates here, settling for a relatively simple presentation that still gives good bounds on ${\zeta}$, and there are a wide variety of additional exponential sum estimates beyond the ones given here; see Chapter 8 of Iwaniec-Kowalski, or Chapters 3-4 of Montgomery, for further discussion.
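To illustrate the optimisation in ${k}$ mentioned above: if ${T = N^K}$, then ignoring the logarithmic factor the bound (5) reads ${N^{-(k-K)/(2^k-2)}}$, so one wants to maximise the exponent ${\frac{k-K}{2^k-2}}$ over integers ${k \geq 2}$. The Python sketch below does this by brute force with exact rational arithmetic. The function names and the finite search window are assumptions for illustration, and the constant ${c}$ in (6) is unspecified, so only the shape of the two gains can be compared.

```python
from fractions import Fraction
import math

def van_der_corput_gain(K, k):
    """Exponent g with (5) reading N^{-g} (up to logs) when T = N^K."""
    return (k - K) / Fraction(2 ** k - 2)

def best_van_der_corput_gain(K, search_width=40):
    """Maximise the gain over integers k >= 2 with k > K in a finite window.

    The gain decays exponentially in k, so a modest window suffices.
    """
    K = Fraction(K)
    k_min = max(2, math.floor(K) + 1)
    candidates = range(k_min, k_min + search_width)
    best_k = max(candidates, key=lambda k: van_der_corput_gain(K, k))
    return best_k, van_der_corput_gain(K, best_k)

for K in (2, 5, 10):
    k, gain = best_van_der_corput_gain(K)
    print(K, k, gain, float(gain), 1 / K ** 2)
```

For ${K=2}$ the optimum is ${k=3}$ with gain ${1/6}$, matching the hint in Exercise 3(i); for ${K=10}$ the best gain is already far smaller than ${1/K^2}$, which is the regime where (6) is expected to win.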

We now briefly discuss the strategies of proof of Theorem 2. Both parts of the theorem proceed by treating ${f}$ like a polynomial of degree roughly ${k}$; in the case of (ii), this is done explicitly via Taylor expansion, whereas for (i) it is only at the level of analogy. Both parts of the theorem then try to “linearise” the phase to make it a linear function of the summands (actually in part (ii), it is necessary to introduce an additional variable and make the phase a bilinear function of the summands). The van der Corput estimate achieves this linearisation by squaring the exponential sum about ${k}$ times, which is why the gain is only exponentially small in ${k}$. The Vinogradov estimate achieves linearisation by raising the exponential sum to a significantly smaller power – on the order of ${k^2}$ – by using Hölder’s inequality in combination with the fact that the discrete curve ${\{ (n,n^2,\dots,n^k): n \in \{1,\dots,M\}\}}$ becomes roughly equidistributed in the box ${\{ (a_1,\dots,a_k): a_j = O( M^j ) \}}$ after taking the sumset of about ${k^2}$ copies of this curve. This latter fact has a precise formulation, known as the Vinogradov mean value theorem, and its proof is the most difficult part of the argument, relying on using a “${p}$-adic” version of this equidistribution to reduce the claim at a given scale ${M}$ to a smaller scale ${M/p}$ with ${p \sim M^{1/k}}$, and then proceeding by induction.
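The squaring step behind the van der Corput estimate rests on the identity ${|\sum_n e(f(n))|^2 = \sum_h \sum_n e(f(n+h)-f(n))}$, in which the differenced phase ${f(n+h)-f(n)}$ has one degree less than ${f}$ (for instance, it is linear in ${n}$ when ${f(n) = \alpha n^2}$). Here is a minimal Python sketch checking this identity numerically for a quadratic phase; the function names and the choice ${\alpha = \sqrt{2}}$ are illustrative assumptions, and this is only the first step of the argument, not the full estimate.

```python
import cmath
import math

def e(x):
    """e(x) := exp(2 pi i x)."""
    return cmath.exp(2j * math.pi * x)

def squared_sum(f, N):
    """|sum_{0 <= n < N} e(f(n))|^2 computed directly."""
    return abs(sum(e(f(n)) for n in range(N))) ** 2

def differenced_sum(f, N):
    """sum over shifts h of sum_n e(f(n+h) - f(n)), with n and n+h both in [0, N)."""
    total = 0
    for h in range(-(N - 1), N):
        lo, hi = max(0, -h), min(N, N - h)
        total += sum(e(f(n + h) - f(n)) for n in range(lo, hi))
    return total

alpha = math.sqrt(2)
quadratic = lambda n: alpha * n * n
print(squared_sum(quadratic, 50), differenced_sum(quadratic, 50))
```

For the quadratic phase each inner sum is a geometric series in ${n}$ with ratio ${e(2\alpha h)}$, which is what makes the differenced sums tractable.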

One can combine Theorem 2 with Proposition 1 to obtain various bounds on the Riemann zeta function:

Exercise 3 (Subconvexity bound)

• (i) Show that ${\zeta(\frac{1}{2}+it) \ll (1+|t|)^{1/6} \log^{O(1)}(1+|t|)}$ for all ${t \in {\bf R}}$. (Hint: use the ${k=3}$ case of the van der Corput estimate.)
• (ii) For any ${0 < \sigma < 1}$, show that ${\zeta(\sigma+it) \ll (1+|t|)^{\max( \frac{1-\sigma}{3}, \frac{1}{2} - \frac{2\sigma}{3}) + o(1)}}$ as ${|t| \rightarrow \infty}$.

Exercise 4 Let ${t}$ be such that ${|t| \geq 100}$, and let ${\sigma \geq 1/2}$.

• (i) (Littlewood bound) Use the van der Corput estimate to show that ${\zeta(\sigma+it) \ll \log^{O(1)} |t|}$ whenever ${\sigma \geq 1 - O( \frac{(\log\log |t|)^2}{\log |t|} )}$.
• (ii) (Vinogradov-Korobov bound) Use the Vinogradov estimate to show that ${\zeta(\sigma+it) \ll \log^{O(1)} |t|}$ whenever ${\sigma \geq 1 - O( \frac{(\log\log |t|)^{2/3}}{\log^{2/3} |t|} )}$.

As noted in Exercise 43 of Notes 2, the Vinogradov-Korobov bound leads to the zero-free region ${\{ \sigma+it: \sigma > 1 - c \frac{1}{(\log |t|)^{2/3} (\log\log |t|)^{1/3}}; |t| \geq 100 \}}$, which in turn leads to the prime number theorem with error term

$\displaystyle \sum_{n \leq x} \Lambda(n) = x + O\left( x \exp\left( - c \frac{\log^{3/5} x}{(\log\log x)^{1/5}} \right) \right)$

for ${x > 100}$. If one uses the weaker Littlewood bound instead, one obtains the narrower zero-free region

$\displaystyle \{ \sigma+it: \sigma > 1 - c \frac{\log\log|t|}{\log |t|}; |t| \geq 100 \}$

(which is only slightly wider than the classical zero-free region) and an error term

$\displaystyle \sum_{n \leq x} \Lambda(n) = x + O( x \exp( - c \sqrt{\log x \log\log x} ) )$

in the prime number theorem.

Exercise 5 (Vinogradov-Korobov in arithmetic progressions) Let ${\chi}$ be a non-principal character of modulus ${q}$.

• (i) (Vinogradov-Korobov bound) Use the Vinogradov estimate to show that ${L(\sigma+it,\chi) \ll \log^{O(1)}(q|t|)}$ whenever ${|t| \geq 100}$ and

$\displaystyle \sigma \geq 1 - O( \min( \frac{\log\log(q|t|)}{\log q}, \frac{(\log\log(q|t|))^{2/3}}{\log^{2/3} |t|} ) ).$

(Hint: use the Vinogradov estimate and a change of variables to control ${\sum_{n \in I: n = a\ (q)} \exp( -it \log n)}$ for various intervals ${I}$ of length at most ${N}$ and residue classes ${a\ (q)}$, in the regime ${N \geq q^2}$ (say). For ${N < q^2}$, do not try to capture any cancellation and just use the triangle inequality instead.)

• (ii) Obtain a zero-free region

$\displaystyle \{ \sigma+it: \sigma > 1 - c \min( \frac{1}{(\log |t|)^{2/3} (\log\log |t|)^{1/3}}, \frac{1}{\log q} ); |t| \geq 100 \}$

for ${L(s,\chi)}$, for some (effective) absolute constant ${c>0}$.

• (iii) Obtain the prime number theorem in arithmetic progressions with error term

$\displaystyle \sum_{n \leq x: n = a\ (q)} \Lambda(n) = \frac{x}{\phi(q)} + O\left( x \exp\left( - c_A \frac{\log^{3/5} x}{(\log\log x)^{1/5}} \right) \right)$

whenever ${x > 100}$, ${q \leq \log^A x}$, ${a\ (q)}$ is primitive, and ${c_A>0}$ depends (ineffectively) on ${A}$.

The following result is due independently to Furstenberg and to Sarkozy:

Theorem 1 (Furstenberg-Sarkozy theorem) Let ${\delta > 0}$, and suppose that ${N}$ is sufficiently large depending on ${\delta}$. Then every subset ${A}$ of ${[N] := \{1,\ldots,N\}}$ of density ${|A|/N}$ at least ${\delta}$ contains a pair ${n, n+r^2}$ for some natural numbers ${n, r}$ with ${r \neq 0}$.
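For very small ${N}$ one can explore the theorem by brute force. The following Python sketch computes the largest size of a subset of ${[N]}$ containing no pair ${n, n+r^2}$ with ${r \neq 0}$, using a simple backtracking search; the function names are illustrative, and the search is exponential in ${N}$, so it is only meant for tiny examples.

```python
import math

def is_nonzero_square(m):
    """True if m = r^2 for some integer r >= 1."""
    if m <= 0:
        return False
    r = math.isqrt(m)
    return r * r == m

def max_square_difference_free(N):
    """Largest size of a subset of {1, ..., N} with no two elements differing by a nonzero square."""
    best = 0

    def extend(chosen, next_candidate):
        nonlocal best
        best = max(best, len(chosen))
        for n in range(next_candidate, N + 1):
            # n may be added only if it differs from every chosen element by a non-square.
            if all(not is_nonzero_square(n - a) for a in chosen):
                chosen.append(n)
                extend(chosen, n + 1)
                chosen.pop()

    extend([], 1)
    return best

for N in range(1, 13):
    print(N, max_square_difference_free(N))
```

Theorem 1 asserts that the ratio of this maximal size to ${N}$ tends to zero as ${N \rightarrow \infty}$, although the decay is far too slow to be visible in such small experiments.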

This theorem is of course similar in spirit to results such as Roth’s theorem or Szemerédi’s theorem, in which the pattern ${n,n+r^2}$ is replaced by ${n,n+r,n+2r}$ or ${n,n+r,\ldots,n+(k-1)r}$ for some fixed ${k}$ respectively. There are by now many proofs of this theorem (see this recent paper of Lyall for a survey), but most proofs involve some form of Fourier analysis (or spectral theory). This may be compared with the standard proof of Roth’s theorem, which combines some Fourier analysis with what is now known as the density increment argument.

A few years ago, Ben Green, Tamar Ziegler, and myself observed that it is possible to prove the Furstenberg-Sarkozy theorem by just using the Cauchy-Schwarz inequality (or van der Corput lemma) and the density increment argument, removing all invocations of Fourier analysis, and instead relying on Cauchy-Schwarz to linearise the quadratic shift ${r^2}$. As such, this theorem can be considered as even more elementary than Roth’s theorem (and its proof can be viewed as a toy model for the proof of Roth’s theorem). We ended up not doing too much with this observation, so decided to share it here.

The first step is to use the density increment argument that goes back to Roth. For any ${\delta > 0}$, let ${P(\delta)}$ denote the assertion that for ${N}$ sufficiently large, all sets ${A \subset [N]}$ of density at least ${\delta}$ contain a pair ${n,n+r^2}$ with ${r}$ non-zero. Note that ${P(\delta)}$ is vacuously true for ${\delta > 1}$. We will show that for any ${0 < \delta_0 \leq 1}$, one has the implication

$\displaystyle P(\delta_0 + c \delta_0^3) \implies P(\delta_0) \ \ \ \ \ (1)$

for some absolute constant ${c>0}$. This implies that ${P(\delta)}$ is true for any ${\delta>0}$ (as can be seen by considering the infimum of all ${\delta>0}$ for which ${P(\delta)}$ holds), which gives Theorem 1.
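One way to get a feel for the implication (1) is to iterate the increment ${\delta \mapsto \delta + c\delta^3}$ and count how many steps are needed before the density exceeds ${1}$, at which point ${P(\delta)}$ is vacuous. The Python sketch below does this for a hypothetical value of ${c}$ (the actual absolute constant is not computed in the text), and the function name `increments_needed` is an illustrative choice.

```python
def increments_needed(delta0, c):
    """Number of steps delta -> delta + c * delta^3 needed to exceed density 1."""
    delta, steps = delta0, 0
    while delta <= 1:
        delta += c * delta ** 3
        steps += 1
    return steps

# c = 0.1 is a placeholder value, not the constant from the argument.
for delta0 in (0.5, 0.2, 0.1):
    print(delta0, increments_needed(delta0, 0.1))
```

Since every increment is at least ${c\delta_0^3}$, the count is at most ${(1-\delta_0)/(c\delta_0^3) + 1}$, and heuristically it grows like ${1/(c\delta_0^2)}$.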

It remains to establish the implication (1). Suppose for sake of contradiction that we can find ${0 < \delta_0 \leq 1}$ for which ${P(\delta_0+c\delta^3_0)}$ holds (for some sufficiently small absolute constant ${c>0}$), but ${P(\delta_0)}$ fails. Thus, we can find arbitrarily large ${N}$, and subsets ${A}$ of ${[N]}$ of density at least ${\delta_0}$, such that ${A}$ contains no patterns of the form ${n,n+r^2}$ with ${r}$ non-zero. In particular, we have

$\displaystyle \mathop{\bf E}_{n \in [N]} \mathop{\bf E}_{r \in [N^{1/3}]} \mathop{\bf E}_{h \in [N^{1/100}]} 1_A(n) 1_A(n+(r+h)^2) = 0.$

(The exact ranges of ${r}$ and ${h}$ are not too important here, and could be replaced by various other small powers of ${N}$ if desired.)

Let ${\delta := |A|/N}$ be the density of ${A}$, so that ${\delta_0 \leq \delta \leq 1}$. Observe that

$\displaystyle \mathop{\bf E}_{n \in [N]} \mathop{\bf E}_{r \in [N^{1/3}]} \mathop{\bf E}_{h \in [N^{1/100}]} 1_A(n) \delta 1_{[N]}(n+(r+h)^2) = \delta^2 + O(N^{-1/3})$

$\displaystyle \mathop{\bf E}_{n \in [N]} \mathop{\bf E}_{r \in [N^{1/3}]} \mathop{\bf E}_{h \in [N^{1/100}]} \delta 1_{[N]}(n) \delta 1_{[N]}(n+(r+h)^2) = \delta^2 + O(N^{-1/3})$

and

$\displaystyle \mathop{\bf E}_{n \in [N]} \mathop{\bf E}_{r \in [N^{1/3}]} \mathop{\bf E}_{h \in [N^{1/100}]} \delta 1_{[N]}(n) 1_A(n+(r+h)^2) = \delta^2 + O( N^{-1/3} ).$

If we thus set ${f := 1_A - \delta 1_{[N]}}$, then

$\displaystyle \mathop{\bf E}_{n \in [N]} \mathop{\bf E}_{r \in [N^{1/3}]} \mathop{\bf E}_{h \in [N^{1/100}]} f(n) f(n+(r+h)^2) = -\delta^2 + O( N^{-1/3} ).$

In particular, for ${N}$ large enough,

$\displaystyle \mathop{\bf E}_{n \in [N]} |f(n)| \mathop{\bf E}_{r \in [N^{1/3}]} |\mathop{\bf E}_{h \in [N^{1/100}]} f(n+(r+h)^2)| \gg \delta^2.$

On the other hand, one easily sees that

$\displaystyle \mathop{\bf E}_{n \in [N]} |f(n)|^2 = O(\delta)$

and hence by the Cauchy-Schwarz inequality

$\displaystyle \mathop{\bf E}_{n \in [N]} \mathop{\bf E}_{r \in [N^{1/3}]} |\mathop{\bf E}_{h \in [N^{1/100}]} f(n+(r+h)^2)|^2 \gg \delta^3$

which we can rearrange as

$\displaystyle |\mathop{\bf E}_{r \in [N^{1/3}]} \mathop{\bf E}_{h,h' \in [N^{1/100}]} \mathop{\bf E}_{n \in [N]} f(n+(r+h)^2) f(n+(r+h')^2)| \gg \delta^3.$

Shifting ${n}$ by ${(r+h)^2}$ we obtain (again for ${N}$ large enough)

$\displaystyle |\mathop{\bf E}_{r \in [N^{1/3}]} \mathop{\bf E}_{h,h' \in [N^{1/100}]} \mathop{\bf E}_{n \in [N]} f(n) f(n+(h'-h)(2r+h'+h))| \gg \delta^3.$

In particular, by the pigeonhole principle (and deleting the diagonal case ${h=h'}$, which we can do for ${N}$ large enough) we can find distinct ${h,h' \in [N^{1/100}]}$ such that

$\displaystyle |\mathop{\bf E}_{r \in [N^{1/3}]} \mathop{\bf E}_{n \in [N]} f(n) f(n+(h'-h)(2r+h'+h))| \gg \delta^3,$

so in particular

$\displaystyle \mathop{\bf E}_{n \in [N]} |\mathop{\bf E}_{r \in [N^{1/3}]} f(n+(h'-h)(2r+h'+h))| \gg \delta^3.$

If we set ${d := 2(h'-h)}$ and shift ${n}$ by ${(h'-h) (h'+h)}$, we can simplify this (again for ${N}$ large enough) as

$\displaystyle \mathop{\bf E}_{n \in [N]} |\mathop{\bf E}_{r \in [N^{1/3}]} f(n+dr)| \gg \delta^3. \ \ \ \ \ (2)$
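The algebra behind this linearisation is the identity ${(r+h')^2 - (r+h)^2 = (h'-h)(2r+h'+h)}$, together with the observation that ${n + (h'-h)(2r+h'+h) = (n + (h'-h)(h'+h)) + dr}$ with ${d = 2(h'-h)}$. A short Python sketch checking both forms on a small range of integers (the function names are illustrative):

```python
def quadratic_shift(n, r, h, h_prime):
    """n shifted by (r + h')^2 - (r + h)^2, the difference of the two quadratic shifts."""
    return n + (r + h_prime) ** 2 - (r + h) ** 2

def linear_shift(n, r, h, h_prime):
    """The same quantity written as a shifted base point plus d * r, with d = 2(h' - h)."""
    d = 2 * (h_prime - h)
    base = n + (h_prime - h) * (h_prime + h)
    return base + d * r

mismatches = [
    (n, r, h, hp)
    for n in range(1, 6)
    for r in range(1, 6)
    for h in range(1, 4)
    for hp in range(1, 4)
    if quadratic_shift(n, r, h, hp) != linear_shift(n, r, h, hp)
]
print(mismatches)
```

The point is that after one application of Cauchy-Schwarz the dependence on ${r}$ has become linear, so the averages in ${r}$ run over an arithmetic progression of spacing ${d}$.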

On the other hand, since

$\displaystyle \mathop{\bf E}_{n \in [N]} f(n) = 0$

we have

$\displaystyle \mathop{\bf E}_{n \in [N]} f(n+dr) = O( N^{-2/3+1/100})$

for any ${r \in [N^{1/3}]}$, and thus

$\displaystyle \mathop{\bf E}_{n \in [N]} \mathop{\bf E}_{r \in [N^{1/3}]} f(n+dr) = O( N^{-2/3+1/100}).$

Averaging this with (2) (and noting that ${\max(x,0) = \frac{x+|x|}{2}}$ for real ${x}$) we conclude that

$\displaystyle \mathop{\bf E}_{n \in [N]} \max( \mathop{\bf E}_{r \in [N^{1/3}]} f(n+dr), 0 ) \gg \delta^3.$

In particular, by the pigeonhole principle we can find ${n \in [N]}$ such that

$\displaystyle \mathop{\bf E}_{r \in [N^{1/3}]} f(n+dr) \gg \delta^3,$

or equivalently ${A}$ has density at least ${\delta+c'\delta^3}$ on the arithmetic progression ${\{ n+dr: r \in [N^{1/3}]\}}$, which has length ${\lfloor N^{1/3}\rfloor }$ and spacing ${d}$, for some absolute constant ${c'>0}$. By partitioning this progression into subprogressions of spacing ${d^2}$ and length ${\lfloor N^{1/4}\rfloor}$ (plus an error set of size ${O(N^{1/4})}$), we see from the pigeonhole principle that we can find a progression ${\{ n' + d^2 r': r' \in [N^{1/4}]\}}$ of length ${\lfloor N^{1/4}\rfloor}$ and spacing ${d^2}$ on which ${A}$ has density at least ${\delta + c\delta^3}$ (and hence at least ${\delta_0+c\delta_0^3}$) for some absolute constant ${c>0}$. If we then apply the induction hypothesis to the set

$\displaystyle A' := \{ r' \in [N^{1/4}]: n' + d^2 r' \in A \}$

we conclude (for ${N}$ large enough) that ${A'}$ contains a pair ${m, m+s^2}$ for some natural numbers ${m,s}$ with ${s}$ non-zero. This implies that ${(n'+d^2 m), (n'+d^2 m) + (|d|s)^2}$ lie in ${A}$, a contradiction, establishing the implication (1).
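The final lifting step can be made concrete with a small Python sketch: given a set ${A}$ and a progression of spacing ${d^2}$, it forms ${A'}$, looks for a pair ${m, m+s^2}$ in ${A'}$, and lifts it to a pair ${n, n+r^2}$ in ${A}$ with ${r = |d|s}$. The function names and the toy data are hypothetical, chosen only to illustrate why the spacing needs to be a perfect square.

```python
def restrict_to_progression(A, n_prime, d, length):
    """A' := { r' in [length] : n' + d^2 r' in A }."""
    return {r for r in range(1, length + 1) if n_prime + d * d * r in A}

def find_square_pair(S, length):
    """Return (m, s) with m and m + s^2 both in S and s >= 1, or None if there is none."""
    for m in sorted(S):
        s = 1
        while m + s * s <= length:
            if m + s * s in S:
                return m, s
            s += 1
    return None

def lift_pair(m, s, n_prime, d):
    """Lift the pair m, m + s^2 in A' to n, n + r^2 in A, where r = |d| s."""
    return n_prime + d * d * m, abs(d) * s

A = {7, 11, 23, 30}
A_prime = restrict_to_progression(A, n_prime=3, d=2, length=10)
m, s = find_square_pair(A_prime, 10)
n, r = lift_pair(m, s, n_prime=3, d=2)
print(A_prime, (m, s), (n, r), n in A and n + r * r in A)
```

Because ${d^2 s^2 = (|d|s)^2}$ is again a square, the pattern survives the dilation, which is exactly the preservation property discussed in Remark 1.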

A more careful analysis of the above argument reveals a more quantitative version of Theorem 1: for ${N \geq 100}$ (say), any subset of ${[N]}$ of density at least ${C/(\log\log N)^{1/2}}$ for some sufficiently large absolute constant ${C}$ contains a pair ${n,n+r^2}$ with ${r}$ non-zero. This is not the best bound known; a (difficult) result of Pintz, Steiger, and Szemerédi allows the density to be as low as ${C / (\log N)^{\frac{1}{4} \log\log\log\log N}}$. On the other hand, this already improves on the (simpler) Fourier-analytic argument of Green that works for densities at least ${C/(\log\log N)^{1/11}}$ (although the original argument of Sarkozy, which is a little more intricate, works up to ${C (\log\log N)^{2/3}/(\log N)^{1/3}}$). In the other direction, a construction of Ruzsa gives a set of density ${\frac{1}{65} N^{-0.267}}$ without any pairs ${n,n+r^2}$.

Remark 1 A similar argument also applies with ${n,n+r^2}$ replaced by ${n,n+r^k}$ for fixed ${k}$, because this sort of pattern is preserved by affine dilations ${r' \mapsto n'+d^k r'}$ into arithmetic progressions whose spacing ${d^k}$ is a ${k^{th}}$ power. By re-introducing Fourier analysis, one can also perform an argument of this type for ${n,n+d,n+2d}$ where ${d}$ is the sum of two squares; see the above-mentioned paper of Green for details. However there seems to be some technical difficulty in extending it to patterns of the form ${n,n+P(r)}$ for polynomials ${P}$ that consist of more than a single monomial (and with the normalisation ${P(0)=0}$, to avoid local obstructions), because one no longer has this preservation property.