A large portion of analytic number theory is concerned with the distribution of number-theoretic sets such as the primes, or quadratic residues in a certain modulus. At a local level (e.g. on a short interval $[x,x+y]$), the behaviour of these sets may be quite irregular. However, in many cases one can understand the global behaviour of such sets on very large intervals (e.g. $[1,x]$) with reasonable accuracy (particularly if one assumes powerful additional conjectures, such as the Riemann hypothesis and its generalisations). For instance, in the case of the primes, we have the prime number theorem, which asserts that the number of primes in a large interval $[1,x]$ is asymptotically equal to $x/\log x$; in the case of quadratic residues modulo a prime $p$, it is clear that there are exactly $(p-1)/2$ such residues in $[1,p]$. With elementary arguments, one can also count statistics such as the number of pairs of consecutive quadratic residues; and with the aid of deeper tools such as the Weil sum estimates, one can count more complex patterns in these residues also (e.g. $k$-point correlations).
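The two elementary counts for quadratic residues mentioned above are easy to check numerically. The following is a minimal Python sketch, assuming an odd prime modulus and using Euler's criterion to test for residues; the function names are illustrative choices and do not come from the original discussion.

```python
def is_quadratic_residue(a: int, p: int) -> bool:
    """Euler's criterion: for an odd prime p and a not divisible by p,
    a is a quadratic residue mod p iff a^((p-1)/2) = 1 (mod p)."""
    return pow(a, (p - 1) // 2, p) == 1


def count_residues(p: int) -> int:
    """Number of quadratic residues in [1, p-1]."""
    return sum(is_quadratic_residue(a, p) for a in range(1, p))


def count_consecutive_pairs(p: int) -> int:
    """Number of n in [1, p-2] with both n and n+1 quadratic residues."""
    return sum(
        is_quadratic_residue(n, p) and is_quadratic_residue(n + 1, p)
        for n in range(1, p - 1)
    )


if __name__ == "__main__":
    for p in (7, 13, 101, 1009):  # small odd primes, chosen arbitrarily
        print(p, count_residues(p), count_consecutive_pairs(p))
```

The first count always comes out as $(p-1)/2$, and the second is close to $p/4$, which is what one would expect if being a residue at $n$ and at $n+1$ behaved like independent events.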
One is often interested in converting this sort of “global” information on long intervals into “local” information on short intervals. If one is interested in the behaviour on a generic or average short interval, then the question is still essentially a global one, basically because one can view a long interval as an average of a long sequence of short intervals. (This does not mean that the problem is automatically easy, because not every global statistic about, say, the primes is understood. For instance, we do not know how to rigorously establish the conjectured asymptotic for the number of twin primes $n, n+2$ in a long interval $[1,N]$, and so we do not fully understand the local distribution of the primes in a typical short interval $[n, n+\log n]$.)
However, suppose that instead of understanding the average-case behaviour of short intervals, one wants to control the worst-case behaviour of such intervals (i.e. to establish bounds that hold for all short intervals, rather than most short intervals). Then it becomes substantially harder to convert global information to local information. In many cases one encounters a “square root barrier”, in which global information at scale $x$ (e.g. statistics on $[1,x]$) cannot be used to say anything non-trivial about a fixed (and possibly worst-case) short interval at scales $x^{1/2}$ or below. (Here we ignore factors of $\log x$ for simplicity.) The basic reason for this is that even randomly distributed sets in $[1,x]$ (which are basically the most uniform type of global distribution one could hope for) exhibit random fluctuations of size $x^{1/2}$ or so in their global statistics (as can be seen for instance from the central limit theorem). Because of this, one could take a random (or pseudorandom) subset of $[1,x]$ and delete all the elements in a short interval of length $o(x^{1/2})$, without anything suspicious showing up on the global statistics level; the edited set still has essentially the same global statistics as the original set. On the other hand, the worst-case behaviour of this set on a short interval has been drastically altered.
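The deletion experiment just described can be simulated directly. The sketch below is only an illustration under stated assumptions: the random set includes each integer with probability 1/2, the seed, the scale `x` and the deleted length are arbitrary choices, and the helper names are hypothetical.

```python
import math
import random


def random_subset(x: int, rng: random.Random) -> list[int]:
    """Include each n in [1, x] independently with probability 1/2."""
    return [n for n in range(1, x + 1) if rng.random() < 0.5]


def delete_interval(s: list[int], start: int, length: int) -> list[int]:
    """Remove every element of s lying in [start, start + length)."""
    return [n for n in s if not (start <= n < start + length)]


def largest_gap(s: list[int]) -> int:
    """Largest difference between consecutive elements of a sorted list."""
    return max(b - a for a, b in zip(s, s[1:]))


if __name__ == "__main__":
    rng = random.Random(0)   # fixed seed, only for reproducibility
    x = 10**6
    length = 200             # well below sqrt(x) = 1000
    full = random_subset(x, rng)
    edited = delete_interval(full, x // 2, length)

    # Standard deviation of the size of a density-1/2 random subset of [1, x].
    typical_fluctuation = math.sqrt(x) / 2
    print("elements removed:   ", len(full) - len(edited))
    print("typical fluctuation:", typical_fluctuation)
    print("largest gap before: ", largest_gap(full))
    print("largest gap after:  ", largest_gap(edited))
```

The number of deleted elements is smaller than the typical fluctuation of the total count, so the global count cannot detect the edit, whereas the largest gap (a worst-case local statistic) jumps to at least the deleted length.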
One stark example of this arises when trying to control the largest gap between consecutive prime numbers in a large interval $[x,2x]$. There are convincing heuristics that suggest that this largest gap is of size $O(\log^2 x)$ (Cramér’s conjecture). But even assuming the Riemann hypothesis, the best upper bound on this gap is only of size $O(x^{1/2} \log x)$, basically because of this square root barrier. This particular instance of the square root barrier is a significant obstruction to the current polymath project “Finding primes”.
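To get a feel for how far apart the heuristic and the conditional bound are, one can tabulate the largest prime gap in $[x,2x]$ for small $x$ next to $\log^2 x$ and $x^{1/2}\log x$. This is a rough numerical sketch only: the implied constants are ignored, the values of `x` are arbitrary, and the function names are illustrative.

```python
import math


def primes_up_to(n: int) -> list[int]:
    """Sieve of Eratosthenes: all primes <= n."""
    if n < 2:
        return []
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            # Cross out the multiples of i, starting from i*i.
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i, flag in enumerate(sieve) if flag]


def largest_prime_gap(x: int) -> int:
    """Largest gap between consecutive primes that both lie in [x, 2x].

    Assumes x is large enough for [x, 2x] to contain at least two primes.
    """
    ps = [p for p in primes_up_to(2 * x) if p >= x]
    return max(b - a for a, b in zip(ps, ps[1:]))


if __name__ == "__main__":
    for x in (10**3, 10**4, 10**5, 10**6):
        gap = largest_prime_gap(x)
        print(x, gap, round(math.log(x) ** 2, 1), round(math.sqrt(x) * math.log(x), 1))
```

Such a table is of course no evidence about the asymptotic regime; it only illustrates the size of the two quantities being compared.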
On the other hand, in some cases one can use additional tricks to get past the square root barrier. The key point is that many number-theoretic sequences have special structure that distinguishes them from being exactly like random sets. For instance, quadratic residues have the basic but fundamental property that the product of two quadratic residues is again a quadratic residue. One way to use this sort of structure is to amplify bad behaviour in a single short interval into bad behaviour across many short intervals. Because of this amplification, one can sometimes get new worst-case bounds by tapping the average-case bounds.
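The multiplicative structure of the quadratic residues, which a random set would not have, can be verified by brute force for any small prime. The following is a minimal sketch, assuming an odd prime modulus; the helper names are hypothetical.

```python
def quadratic_residues(p: int) -> set[int]:
    """Nonzero squares modulo an odd prime p."""
    return {(a * a) % p for a in range(1, p)}


def closed_under_multiplication(p: int) -> bool:
    """Check that the product of any two quadratic residues is a residue."""
    residues = quadratic_residues(p)
    return all((a * b) % p in residues for a in residues for b in residues)


def legendre(a: int, p: int) -> int:
    """Legendre symbol via Euler's criterion: +1, -1, or 0 if p divides a."""
    value = pow(a, (p - 1) // 2, p)
    return -1 if value == p - 1 else value


if __name__ == "__main__":
    p = 31  # an arbitrary small odd prime
    print(sorted(quadratic_residues(p)))
    print(closed_under_multiplication(p))
    # Complete multiplicativity of the Legendre symbol on [1, p-1].
    print(all(
        legendre(a * b, p) == legendre(a, p) * legendre(b, p)
        for a in range(1, p) for b in range(1, p)
    ))
```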
In this post I would like to indicate a classical example of this type of amplification trick, namely Burgess’s bound on short character sums. To narrow the discussion, I would like to focus primarily on the following classical problem:
Problem 1 What are the best bounds one can place on the first quadratic non-residue $n_p$ in the interval $[1,p-1]$ for a large prime $p$?
(The first quadratic residue is, of course, $1$; the more interesting problem is the first quadratic non-residue.)
Probabilistic heuristics (presuming that each non-square integer has a 50-50 chance of being a quadratic residue) suggest that the first quadratic non-residue $n_p$ should have size $O(\log p)$, and indeed Vinogradov conjectured that $n_p = O_\varepsilon(p^\varepsilon)$ for any $\varepsilon > 0$. Using the Pólya-Vinogradov inequality, one can get the bound $n_p = O(\sqrt{p} \log p)$ (and can improve it to $n_p = O(\sqrt{p})$ using smoothed sums); combining this with a sieve theory argument (exploiting the multiplicative nature of quadratic residues) one can boost this to $n_p = O(p^{\frac{1}{2\sqrt{e}}} \log^2 p)$. Inserting Burgess’s amplification trick one can boost this to $n_p = O_\varepsilon(p^{\frac{1}{4\sqrt{e}}+\varepsilon})$ for any $\varepsilon > 0$. Apart from refinements to the $\varepsilon$ factor, this bound has stood for five decades as the “world record” for this problem, which is a testament to the difficulty in breaching the square root barrier.
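For small primes the first quadratic non-residue can simply be computed, which makes it easy to compare against the sizes $\log p$ and $\sqrt{p}$ appearing in the heuristics and bounds above. This is a brute-force sketch, assuming an odd prime $p$ and using Euler's criterion; the function names and the search range are illustrative choices.

```python
import math


def is_quadratic_residue(a: int, p: int) -> bool:
    """Euler's criterion for an odd prime p and a not divisible by p."""
    return pow(a, (p - 1) // 2, p) == 1


def least_nonresidue(p: int) -> int:
    """Smallest n in [1, p-1] that is not a quadratic residue mod p."""
    for n in range(2, p):  # 1 is always a residue, so start at 2
        if not is_quadratic_residue(n, p):
            return n
    raise ValueError("p must be an odd prime")


def is_prime(n: int) -> bool:
    """Trial division; adequate for the small range used here."""
    return n >= 2 and all(n % d for d in range(2, math.isqrt(n) + 1))


if __name__ == "__main__":
    record = 0
    for p in range(3, 20000, 2):
        if is_prime(p):
            n_p = least_nonresidue(p)
            if n_p > record:  # print only primes that set a new record
                record = n_p
                print(p, n_p, round(math.log(p), 2), round(math.sqrt(p), 2))
```

Printing only the record-setting primes keeps the output short while still showing the worst cases in the range, which is the quantity the bounds are about.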
Note: in order not to obscure the presentation with technical details, I will be using asymptotic notation in a somewhat informal manner.