You are currently browsing the tag archive for the ‘Xuancheng Shao’ tag.

Kaisa Matomäki, Xuancheng Shao, Joni Teräväinen, and myself have just uploaded to the arXiv our preprint “Higher uniformity of arithmetic functions in short intervals I. All intervals“. This paper investigates the higher order (Gowers) uniformity of standard arithmetic functions in analytic number theory (and specifically, the Möbius function {\mu}, the von Mangoldt function {\Lambda}, and the generalised divisor functions {d_k}) in short intervals {(X,X+H]}, where {X} is large and {H} lies in the range {X^{\theta+\varepsilon} \leq H \leq X^{1-\varepsilon}} for a fixed constant {0 < \theta < 1} (that one would like to be as small as possible). If we let {f} denote one of the functions {\mu, \Lambda, d_k}, then there is extensive literature on the estimation of short sums

\displaystyle  \sum_{X < n \leq X+H} f(n)

and some literature also on the estimation of exponential sums such as

\displaystyle  \sum_{X < n \leq X+H} f(n) e(-\alpha n)

for a real frequency {\alpha}, where {e(\theta) := e^{2\pi i \theta}}. For applications in the additive combinatorics of such functions {f}, it is also necessary to consider more general correlations, such as polynomial correlations

\displaystyle  \sum_{X < n \leq X+H} f(n) e(-P(n))

where {P: {\bf Z} \rightarrow {\bf R}} is a polynomial of some fixed degree, or more generally

\displaystyle  \sum_{X < n \leq X+H} f(n) \overline{F}(g(n) \Gamma)

where {G/\Gamma} is a nilmanifold of fixed degree and dimension (and with some control on structure constants), {g: {\bf Z} \rightarrow G} is a polynomial map, and {F: G/\Gamma \rightarrow {\bf C}} is a Lipschitz function (with some bound on the Lipschitz constant). Indeed, thanks to the inverse theorem for the Gowers uniformity norm, such correlations let one control the Gowers uniformity norm of {f} (possibly after subtracting off some renormalising factor) on such short intervals {(X,X+H]}, which can in turn be used to control other multilinear correlations involving such functions.

Traditionally, asymptotics for such sums are expressed in terms of a “main term” of some arithmetic nature, plus an error term that is estimated in magnitude. For instance, a sum such as {\sum_{X < n \leq X+H} \Lambda(n) e(-\alpha n)} would be approximated in terms of a main term that vanished (or is negligible) if {\alpha} is “minor arc”, but would be expressible in terms of something like a Ramanujan sum if {\alpha} was “major arc”, together with an error term. We found it convenient to cancel off such main terms by subtracting an approximant {f^\sharp} from each of the arithmetic functions {f} and then getting upper bounds on remainder correlations such as

\displaystyle  |\sum_{X < n \leq X+H} (f(n)-f^\sharp(n)) \overline{F}(g(n) \Gamma)| \ \ \ \ \ (1)

(actually for technical reasons we also allow the {n} variable to be restricted further to a subprogression of {(X,X+H]}, but let us ignore this minor extension for this discussion). There is some flexibility in how to choose these approximants, but we eventually found it convenient to use the following choices.

  • For the Möbius function {\mu}, we simply set {\mu^\sharp = 0}, as per the Möbius pseudorandomness conjecture. (One could choose a more sophisticated approximant in the presence of a Siegel zero, as I did with Joni in this recent paper, but we do not do so here.)
  • For the von Mangoldt function {\Lambda}, we eventually went with the Cramér-Granville approximant {\Lambda^\sharp(n) = \frac{W}{\phi(W)} 1_{(n,W)=1}}, where {W = \prod_{p < R} p} and {R = \exp(\log^{1/10} X)}.
  • For the divisor functions {d_k}, we used a somewhat complicated-looking approximant {d_k^\sharp(n) = \sum_{m \leq X^{\frac{k-1}{5k}}} P_m(\log n)} for some explicit polynomials {P_m}, chosen so that {d_k^\sharp} and {d_k} have almost exactly the same sums along arithmetic progressions (see the paper for details).

The objective is then to obtain bounds on sums such as (1) that improve upon the “trivial bound” that one can get with the triangle inequality and standard number theory bounds such as the Brun-Titchmarsh inequality. For {\mu} and {\Lambda}, the Siegel-Walfisz theorem suggests that it is reasonable to expect error terms that have “strongly logarithmic savings” in the sense that they gain a factor of {O_A(\log^{-A} X)} over the trivial bound for any {A>0}; for {d_k}, the Dirichlet hyperbola method suggests instead that one has “power savings” in that one should gain a factor of {X^{-c_k}} over the trivial bound for some {c_k>0}. In the case of the Möbius function {\mu}, there is an additional trick (introduced by Matomäki and Teräväinen) that allows one to lower the exponent {\theta} somewhat at the cost of only obtaining “weakly logarithmic savings” of shape {\log^{-c} X} for some small {c>0}.

Our main estimates on sums of the form (1) work in the following ranges:

  • For {\theta=5/8}, one can obtain strongly logarithmic savings on (1) for {f=\mu,\Lambda}, and power savings for {f=d_k}.
  • For {\theta=3/5}, one can obtain weakly logarithmic savings for {f = \mu, d_k}.
  • For {\theta=5/9}, one can obtain power savings for {f=d_3}.
  • For {\theta=1/3}, one can obtain power savings for {f=d_2}.

Conjecturally, one should be able to obtain power savings in all cases, and lower {\theta} down to zero, but the ranges of exponents and savings given here seem to be the limit of current methods unless one assumes additional hypotheses, such as GRH. The {\theta=5/8} result for correlation against Fourier phases {e(\alpha n)} was established previously by Zhan, and the {\theta=3/5} result for such phases and {f=\mu} was established previously by by Matomäki and Teräväinen.

By combining these results with tools from additive combinatorics, one can obtain a number of applications:

  • Direct insertion of our bounds in the recent work of Kanigowski, Lemanczyk, and Radziwill on the prime number theorem on dynamical systems that are analytic skew products gives some improvements in the exponents there.
  • We can obtain a “short interval” version of a multiple ergodic theorem along primes established by Frantzikinakis-Host-Kra and Wooley-Ziegler, in which we average over intervals of the form {(X,X+H]} rather than {[1,X]}.
  • We can obtain a “short interval” version of the “linear equations in primes” asymptotics obtained by Ben Green, Tamar Ziegler, and myself in this sequence of papers, where the variables in these equations lie in short intervals {(X,X+H]} rather than long intervals such as {[1,X]}.

We now briefly discuss some of the ingredients of proof of our main results. The first step is standard, using combinatorial decompositions (based on the Heath-Brown identity and (for the {\theta=3/5} result) the Ramaré identity) to decompose {\mu(n), \Lambda(n), d_k(n)} into more tractable sums of the following types:

  • Type {I} sums, which are basically of the form {\sum_{m \leq A:m|n} \alpha(m)} for some weights {\alpha(m)} of controlled size and some cutoff {A} that is not too large;
  • Type {II} sums, which are basically of the form {\sum_{A_- \leq m \leq A_+:m|n} \alpha(m)\beta(n/m)} for some weights {\alpha(m)}, {\beta(n)} of controlled size and some cutoffs {A_-, A_+} that are not too close to {1} or to {X};
  • Type {I_2} sums, which are basically of the form {\sum_{m \leq A:m|n} \alpha(m) d_2(n/m)} for some weights {\alpha(m)} of controlled size and some cutoff {A} that is not too large.

The precise ranges of the cutoffs {A, A_-, A_+} depend on the choice of {\theta}; our methods fail once these cutoffs pass a certain threshold, and this is the reason for the exponents {\theta} being what they are in our main results.

The Type {I} sums involving nilsequences can be treated by methods similar to those in this previous paper of Ben Green and myself; the main innovations are in the treatment of the Type {II} and Type {I_2} sums.

For the Type {II} sums, one can split into the “abelian” case in which (after some Fourier decomposition) the nilsequence {F(g(n)\Gamma)} is basically of the form {e(P(n))}, and the “non-abelian” case in which {G} is non-abelian and {F} exhibits non-trivial oscillation in a central direction. In the abelian case we can adapt arguments of Matomaki and Shao, which uses Cauchy-Schwarz and the equidistribution properties of polynomials to obtain good bounds unless {e(P(n))} is “major arc” in the sense that it resembles (or “pretends to be”) {\chi(n) n^{it}} for some Dirichlet character {\chi} and some frequency {t}, but in this case one can use classical multiplicative methods to control the correlation. It turns out that the non-abelian case can be treated similarly. After applying Cauchy-Schwarz, one ends up analyzing the equidistribution of the four-variable polynomial sequence

\displaystyle  (n,m,n',m') \mapsto (g(nm)\Gamma, g(n'm)\Gamma, g(nm') \Gamma, g(n'm'\Gamma))

as {n,m,n',m'} range in various dyadic intervals. Using the known multidimensional equidistribution theory of polynomial maps in nilmanifolds, one can eventually show in the non-abelian case that this sequence either has enough equidistribution to give cancellation, or else the nilsequence involved can be replaced with one from a lower dimensional nilmanifold, in which case one can apply an induction hypothesis.

For the type {I_2} sum, a model sum to study is

\displaystyle  \sum_{X < n \leq X+H} d_2(n) e(\alpha n)

which one can expand as

\displaystyle  \sum_{n,m: X < nm \leq X+H} e(\alpha nm).

We experimented with a number of ways to treat this type of sum (including automorphic form methods, or methods based on the Voronoi formula or van der Corput’s inequality), but somewhat to our surprise, the most efficient approach was an elementary one, in which one uses the Dirichlet approximation theorem to decompose the hyperbolic region {\{ (n,m) \in {\bf N}^2: X < nm \leq X+H \}} into a number of arithmetic progressions, and then uses equidistribution theory to establish cancellation of sequences such as {e(\alpha nm)} on the majority of these progressions. As it turns out, this strategy works well in the regime {H > X^{1/3+\varepsilon}} unless the nilsequence involved is “major arc”, but the latter case is treatable by existing methods as discussed previously; this is why the {\theta} exponent for our {d_2} result can be as low as {1/3}.

In a sequel to this paper (currently in preparation), we will obtain analogous results for almost all intervals {(x,x+H]} with {x} in the range {[X,2X]}, in which we will be able to lower {\theta} all the way to {0}.

Kaisa Matomäki, Maksym Radziwill, Xuancheng Shao, Joni Teräväinen, and myself have just uploaded to the arXiv our preprint “Singmaster’s conjecture in the interior of Pascal’s triangle“. This paper leverages the theory of exponential sums over primes to make progress on a well known conjecture of Singmaster which asserts that any natural number larger than {1} appears at most a bounded number of times in Pascal’s triangle. That is to say, for any integer {t \geq 2}, there are at most {O(1)} solutions to the equation

\displaystyle  \binom{n}{m} = t \ \ \ \ \ (1)

with {1 \leq m < n}. Currently, the largest number of solutions that is known to be attainable is eight, with {t} equal to

\displaystyle  3003 = \binom{3003}{1} = \binom{78}{2} = \binom{15}{5} = \binom{14}{6} = \binom{14}{8} = \binom{15}{10}

\displaystyle = \binom{78}{76} = \binom{3003}{3002}.

Because of the symmetry {\binom{n}{m} = \binom{n}{n-m}} of Pascal’s triangle it is natural to restrict attention to the left half {1 \leq m \leq n/2} of the triangle.

Our main result settles this conjecture in the “interior” region of the triangle:

Theorem 1 (Singmaster’s conjecture in the interior of the triangle) If {0 < \varepsilon < 1} and {t} is sufficiently large depending on {\varepsilon}, there are at most two solutions to (1) in the region

\displaystyle  \exp( \log^{2/3+\varepsilon} n ) \leq m \leq n/2 \ \ \ \ \ (2)

and hence at most four in the region

\displaystyle  \exp( \log^{2/3+\varepsilon} n ) \leq m \leq n - \exp( \log^{2/3+\varepsilon} n ).

Also, there is at most one solution in the region

\displaystyle  \exp( \log^{2/3+\varepsilon} n ) \leq m \leq n/\exp(\log^{1-\varepsilon} n ).

To verify Singmaster’s conjecture in full, it thus suffices in view of this result to verify the conjecture in the boundary region

\displaystyle  2 \leq m < \exp(\log^{2/3+\varepsilon} n) \ \ \ \ \ (3)

(or equivalently {n - \exp(\log^{2/3+\varepsilon} n) < m \leq n}); we have deleted the {m=1} case as it of course automatically supplies exactly one solution to (1). It is in fact possible that for {t} sufficiently large there are no further collisions {\binom{n}{m} = \binom{n'}{m'}=t} for {(n,m), (n',m')} in the region (3), in which case there would never be more than eight solutions to (1) for sufficiently large {t}. This is latter claim known for bounded values of {m,m'} by Beukers, Shorey, and Tildeman, with the main tool used being Siegel’s theorem on integral points.

The upper bound of two here for the number of solutions in the region (2) is best possible, due to the infinite family of solutions to the equation

\displaystyle  \binom{n+1}{m+1} = \binom{n}{m+2} \ \ \ \ \ (4)

coming from {n = F_{2j+2} F_{2j+3}-1}, {m = F_{2j} F_{2j+3}-1} and {F_j} is the {j^{th}} Fibonacci number.

The appearance of the quantity {\exp( \log^{2/3+\varepsilon} n )} in Theorem 1 may be familiar to readers that are acquainted with Vinogradov’s bounds on exponential sums, which ends up being the main new ingredient in our arguments. In principle this threshold could be lowered if we had stronger bounds on exponential sums.

To try to control solutions to (1) we use a combination of “Archimedean” and “non-Archimedean” approaches. In the “Archimedean” approach (following earlier work of Kane on this problem) we view {n,m} primarily as real numbers rather than integers, and express (1) in terms of the Gamma function as

\displaystyle  \frac{\Gamma(n+1)}{\Gamma(m+1) \Gamma(n-m+1)} = t.

One can use this equation to solve for {n} in terms of {m,t} as

\displaystyle  n = f_t(m)

for a certain real analytic function {f_t} whose asymptotics are easily computable (for instance one has the asymptotic {f_t(m) \asymp m t^{1/m}}). One can then view the problem as one of trying to control the number of lattice points on the graph {\{ (m,f_t(m)): m \in {\bf R} \}}. Here we can take advantage of the fact that in the regime {m \leq f_t(m)/2} (which corresponds to working in the left half {m \leq n/2} of Pascal’s triangle), the function {f_t} can be shown to be convex, but not too convex, in the sense that one has both upper and lower bounds on the second derivative of {f_t} (in fact one can show that {f''_t(m) \asymp f_t(m) (\log t/m^2)^2}). This can be used to preclude the possibility of having a cluster of three or more nearby lattice points on the graph {\{ (m,f_t(m)): m \in {\bf R} \}}, basically because the area subtended by the triangle connecting three of these points would lie between {0} and {1/2}, contradicting Pick’s theorem. Developing these ideas, we were able to show

Proposition 2 Let {\varepsilon>0}, and suppose {t} is sufficiently large depending on {\varepsilon}. If {(m,n)} is a solution to (1) in the left half {m \leq n/2} of Pascal’s triangle, then there is at most one other solution {(m',n')} to this equation in the left half with

\displaystyle  |m-m'| + |n-n'| \ll \exp( (\log\log t)^{1-\varepsilon} ).

Again, the example of (4) shows that a cluster of two solutions is certainly possible; the convexity argument only kicks in once one has a cluster of three or more solutions.

To finish the proof of Theorem 1, one has to show that any two solutions {(m,n), (m',n')} to (1) in the region of interest must be close enough for the above proposition to apply. Here we switch to the “non-Archimedean” approach, in which we look at the {p}-adic valuations {\nu_p( \binom{n}{m} )} of the binomial coefficients, defined as the number of times a prime {p} divides {\binom{n}{m}}. From the fundamental theorem of arithmetic, a collision

\displaystyle  \binom{n}{m} = \binom{n'}{m'}

between binomial coefficients occurs if and only if one has agreement of valuations

\displaystyle  \nu_p( \binom{n}{m} ) = \nu_p( \binom{n'}{m'} ). \ \ \ \ \ (5)

From the Legendre formula

\displaystyle  \nu_p(n!) = \sum_{j=1}^\infty \lfloor \frac{n}{p^j} \rfloor

we can rewrite this latter identity (5) as

\displaystyle  \sum_{j=1}^\infty \{ \frac{m}{p^j} \} + \{ \frac{n-m}{p^j} \} - \{ \frac{n}{p^j} \} = \sum_{j=1}^\infty \{ \frac{m'}{p^j} \} + \{ \frac{n'-m'}{p^j} \} - \{ \frac{n'}{p^j} \}, \ \ \ \ \ (6)

where {\{x\} := x - \lfloor x\rfloor} denotes the fractional part of {x}. (These sums are not truly infinite, because the summands vanish once {p^j} is larger than {\max(n,n')}.)

A key idea in our approach is to view this condition (6) statistically, for instance by viewing {p} as a prime drawn randomly from an interval such as {[P, P + P \log^{-100} P]} for some suitably chosen scale parameter {P}, so that the two sides of (6) now become random variables. It then becomes advantageous to compare correlations between these two random variables and some additional test random variable. For instance, if {n} and {n'} are far apart from each other, then one would expect the left-hand side of (6) to have a higher correlation with the fractional part {\{ \frac{n}{p}\}}, since this term shows up in the summation on the left-hand side but not the right. Similarly if {m} and {m'} are far apart from each other (although there are some annoying cases one has to treat separately when there is some “unexpected commensurability”, for instance if {n'-m'} is a rational multiple of {m} where the rational has bounded numerator and denominator). In order to execute this strategy, it turns out (after some standard Fourier expansion) that one needs to get good control on exponential sums such as

\displaystyle  \sum_{P \leq p \leq P + P\log^{-100} P} e( \frac{N}{p} + \frac{M}{p^j} )

for various choices of parameters {P, N, M, j}, where {e(\theta) := e^{2\pi i \theta}}. Fortunately, the methods of Vinogradov (which more generally can handle sums such as {\sum_{n \in I} e(f(n))} and {\sum_{p \in I} e(f(p))} for various analytic functions {f}) can give useful bounds on such sums as long as {N} and {M} are not too large compared to {P}; more specifically, Vinogradov’s estimates are non-trivial in the regime {N,M \ll \exp( \log^{3/2-\varepsilon} P )}, and this ultimately leads to a distance bound

\displaystyle  m' - m \ll_\varepsilon \exp( \log^{2/3 +\varepsilon}(n+n') )

between any colliding pair {(n,m), (n',m')} in the left half of Pascal’s triangle, as well as the variant bound

\displaystyle  n' - n \ll_\varepsilon \exp( \log^{2/3 +\varepsilon}(n+n') )

under the additional assumption

\displaystyle  m', m \geq \exp( \log^{2/3 +\varepsilon}(n+n') ).

Comparing these bounds with Proposition 2 and using some basic estimates about the function {f_t}, we can conclude Theorem 1.

A modification of the arguments also gives similar results for the equation

\displaystyle  (n)_m = t \ \ \ \ \ (7)

where {(n)_m := n (n-1) \dots (n-m+1)} is the falling factorial:

Theorem 3 If {0 < \varepsilon < 1} and {t} is sufficiently large depending on {\varepsilon}, there are at most two solutions to (7) in the region

\displaystyle  \exp( \log^{2/3+\varepsilon} n ) \leq m < n. \ \ \ \ \ (8)

Again the upper bound of two is best possible, thanks to identities such as

\displaystyle  (a^2-a)_{a^2-2a} = (a^2-a-1)_{a^2-2a+1}.