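The Riemann zeta function $\zeta(s)$ is defined in the region $\mathrm{Re}(s) > 1$ by the absolutely convergent series
$$\zeta(s) = \sum_{n=1}^\infty \frac{1}{n^s} = 1 + \frac{1}{2^s} + \frac{1}{3^s} + \ldots. \qquad (1)$$
In particular, $\zeta(2) = \pi^2/6$, and thus
$$\sum_{n=1}^\infty \frac{1}{n^2} = 1 + \frac{1}{4} + \frac{1}{9} + \ldots = \frac{\pi^2}{6}. \qquad (2)$$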
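For $\mathrm{Re}(s) \leq 1$, the series on the right-hand side of (1) is no longer absolutely convergent, or even conditionally convergent. Nevertheless, the $\zeta$ function can be extended to this region (with a pole at $s=1$) by analytic continuation. For instance, it can be shown that after analytic continuation, one has $\zeta(0) = -1/2$, $\zeta(-1) = -1/12$, and $\zeta(-2) = 0$, and more generally
$$\zeta(-s) = -\frac{B_{s+1}}{s+1} \qquad (3)$$
for $s = 1, 2, \ldots$, where $B_n$ are the Bernoulli numbers. If one formally applies (1) at these values of $s$, one obtains the somewhat bizarre formulae
$$\sum_{n=1}^\infty 1 = 1 + 1 + 1 + \ldots = -\frac{1}{2} \qquad (4)$$
$$\sum_{n=1}^\infty n = 1 + 2 + 3 + \ldots = -\frac{1}{12} \qquad (5)$$
$$\sum_{n=1}^\infty n^2 = 1 + 4 + 9 + \ldots = 0 \qquad (6)$$
and
$$\sum_{n=1}^\infty n^s = 1 + 2^s + 3^s + \ldots = -\frac{B_{s+1}}{s+1}. \qquad (7)$$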
Clearly, these formulae do not make sense if one stays within the traditional way to evaluate infinite series, and so it seems that one is forced to use the somewhat unintuitive analytic continuation interpretation of such sums to make these formulae rigorous. But as it stands, the formulae look “wrong” for several reasons. Most obviously, the summands on the left are all positive, but the right-hand sides can be zero or negative. A little more subtly, the identities do not appear to be consistent with each other. For instance, if one adds (4) to (5), one obtains
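$$\sum_{n=1}^\infty (n+1) = 2 + 3 + 4 + \ldots = -\frac{7}{12} \qquad (8)$$
whereas if one subtracts $1$ from (5) one obtains instead
$$\sum_{n=2}^\infty n = 0 + 2 + 3 + 4 + \ldots = -\frac{13}{12} \qquad (9)$$
and the two equations seem inconsistent with each other.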
However, it is possible to interpret (4), (5), (6) by purely real-variable methods, without recourse to complex analysis methods such as analytic continuation, thus giving an “elementary” interpretation of these sums that only requires undergraduate calculus; we will later also explain how this interpretation deals with the apparent inconsistencies pointed out above.
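To see this, let us first consider a convergent sum such as (2). The classical interpretation of this formula is the assertion that the partial sums
$$\sum_{n=1}^N \frac{1}{n^2} = 1 + \frac{1}{4} + \frac{1}{9} + \ldots + \frac{1}{N^2}$$
converge to $\pi^2/6$ as $N \to \infty$, or in other words that
$$\sum_{n=1}^N \frac{1}{n^2} = \frac{\pi^2}{6} + o(1),$$
where $o(1)$ denotes a quantity that goes to zero as $N \to \infty$. Actually, by using the integral test estimate
$$\sum_{n=N+1}^\infty \frac{1}{n^2} \leq \int_N^\infty \frac{dx}{x^2} = \frac{1}{N}$$
we have the sharper result
$$\sum_{n=1}^N \frac{1}{n^2} = \frac{\pi^2}{6} + O\left(\frac{1}{N}\right).$$
Thus we can view $\frac{\pi^2}{6}$ as the leading coefficient of the asymptotic expansion of the partial sums of $\sum_{n=1}^\infty 1/n^2$.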
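One can then try to inspect the partial sums of the expressions in (4), (5), (6), but the coefficients bear no obvious relationship to the right-hand sides:
$$\sum_{n=1}^N 1 = N$$
$$\sum_{n=1}^N n = \frac{1}{2} N^2 + \frac{1}{2} N$$
$$\sum_{n=1}^N n^2 = \frac{1}{3} N^3 + \frac{1}{2} N^2 + \frac{1}{6} N.$$
For (7), the classical Faulhaber formula (or Bernoulli formula) gives
$$\sum_{n=1}^N n^s = \frac{1}{s+1} \sum_{j=0}^s \binom{s+1}{j} B_j N^{s+1-j} = \frac{1}{s+1} N^{s+1} + \frac{1}{2} N^s + \frac{s}{12} N^{s-1} + \ldots + B_s N \qquad (10)$$
for $s \geq 2$, which has a vague resemblance to (7), but again the connection is not particularly clear.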
The problem here is the discrete nature of the partial sum
$$\sum_{n=1}^N n^s = \sum_{n \leq N} n^s,$$
which (if $N$ is viewed as a real number) has jump discontinuities at each positive integer value of $N$. These discontinuities yield various artefacts when trying to approximate this sum by a polynomial in $N$. (These artefacts also occur in (2), but happen in that case to be obscured in the error term $O(1/N)$; but for the divergent sums (4), (5), (6), (7), they are large enough to cause real trouble.)
However, these issues can be resolved by replacing the abruptly truncated partial sums $\sum_{n=1}^N n^s$ with smoothed sums $\sum_{n=1}^\infty \eta(n/N) n^s$, where $\eta: \mathbf{R}^+ \to \mathbf{R}$ is a cutoff function, or more precisely a compactly supported bounded function that equals $1$ at $0$. The case when $\eta$ is the indicator function $1_{[0,1]}$ then corresponds to the traditional partial sums, with all the attendant discretisation artefacts; but if one chooses a smoother cutoff, then these artefacts begin to disappear (or at least become lower order), and the true asymptotic expansion becomes more manifest.
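Note that smoothing does not affect the asymptotic value of sums that were already absolutely convergent, thanks to the dominated convergence theorem. For instance, we have
$$\sum_{n=1}^\infty \eta(n/N) \frac{1}{n^2} = \frac{\pi^2}{6} + o(1)$$
whenever $\eta$ is a cutoff function (since $\eta(n/N) \to 1$ pointwise as $N \to \infty$ and is uniformly bounded). If $\eta$ is equal to $1$ on a neighbourhood of the origin, then the integral test argument recovers the $O(1/N)$ decay rate:
$$\sum_{n=1}^\infty \eta(n/N) \frac{1}{n^2} = \frac{\pi^2}{6} + O\left(\frac{1}{N}\right).$$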
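As a quick numerical sanity check of this claim, here is a minimal sketch in Python. The particular cutoff `eta` (equal to $1$ on $[0,1/2]$, vanishing on $[1,\infty)$, with a smooth transition in between) is an assumption made purely for illustration and is not a choice made in the text; any cutoff that equals $1$ near the origin should behave similarly.

```python
import math

def eta(x):
    """Illustrative smooth cutoff (an assumption, not from the text):
    1 on [0, 1/2], 0 on [1, inf), smooth transition in between."""
    if x <= 0.5:
        return 1.0
    if x >= 1.0:
        return 0.0
    t = 2.0 * x - 1.0                # t runs over (0, 1) across the transition
    a = math.exp(-1.0 / t)           # flat to infinite order as t -> 0+
    b = math.exp(-1.0 / (1.0 - t))   # flat to infinite order as t -> 1-
    return b / (a + b)

def smoothed_basel(N):
    """Smoothed sum of 1/n^2; eta(n/N) vanishes once n >= N, so the sum is finite."""
    return math.fsum(eta(n / N) / n ** 2 for n in range(1, N))

target = math.pi ** 2 / 6
for N in (10, 100, 1000, 10000):
    gap = target - smoothed_basel(N)
    # N * gap should stay bounded, consistent with the O(1/N) decay rate
    print(N, gap, N * gap)
```

With this cutoff the gap is at most the tail $\sum_{n > N/2} 1/n^2$, which the integral test bounds by $2/(N-1)$.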
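However, smoothing can greatly improve the convergence properties of a divergent sum. The simplest example is Grandi’s series
$$\sum_{n=1}^\infty (-1)^{n-1} = 1 - 1 + 1 - \ldots.$$
The partial sums
$$\sum_{n=1}^N (-1)^{n-1} = \frac{1}{2} + \frac{1}{2} (-1)^{N-1}$$
oscillate between $1$ and $0$, and so this series is not conditionally convergent (and certainly not absolutely convergent). However, if one performs analytic continuation on the series
$$\sum_{n=1}^\infty \frac{(-1)^{n-1}}{n^s} = 1 - \frac{1}{2^s} + \frac{1}{3^s} - \ldots$$
and sets $s = 0$, one obtains a formal value of $1/2$ for this series. This value can also be obtained by smooth summation. Indeed, for any cutoff function $\eta$, we can regroup
$$\sum_{n=1}^\infty (-1)^{n-1} \eta(n/N) = \frac{\eta(1/N)}{2} + \sum_{m=1}^\infty \frac{\eta((2m-1)/N) - 2\eta(2m/N) + \eta((2m+1)/N)}{2}.$$
If $\eta$ is twice continuously differentiable (i.e. $\eta \in C^2$), then from Taylor expansion we see that the summand has size $O(1/N^2)$, and also (from the compact support of $\eta$) is only non-zero when $m = O(N)$. Since $\eta(1/N) = 1 + O(1/N)$, this leads to the asymptotic
$$\sum_{n=1}^\infty (-1)^{n-1} \eta(n/N) = \frac{1}{2} + O\left(\frac{1}{N}\right)$$
and so we recover the value of $1/2$ as the leading term of the asymptotic expansion.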
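The smoothed version of Grandi’s series is easy to experiment with numerically. The following Python sketch is only an illustration under stated assumptions: the cutoff `eta` is a hypothetical choice (smooth, equal to $1$ on $[0,1/2]$ and vanishing on $[1,\infty)$), not one prescribed in the text.

```python
import math

def eta(x):
    """Illustrative smooth cutoff (an assumption, not from the text):
    1 on [0, 1/2], 0 on [1, inf), smooth transition in between."""
    if x <= 0.5:
        return 1.0
    if x >= 1.0:
        return 0.0
    t = 2.0 * x - 1.0
    a = math.exp(-1.0 / t)
    b = math.exp(-1.0 / (1.0 - t))
    return b / (a + b)

def sharp_grandi(N):
    """Classical partial sum 1 - 1 + 1 - ... with N terms: oscillates between 1 and 0."""
    return sum((-1) ** (n - 1) for n in range(1, N + 1))

def smoothed_grandi(N):
    """Smoothed sum of (-1)^(n-1) eta(n/N); only the terms with n < N are non-zero."""
    return math.fsum((-1) ** (n - 1) * eta(n / N) for n in range(1, N))

for N in (10, 11, 100, 101, 1000, 1001):
    print(N, sharp_grandi(N), smoothed_grandi(N))
```

For small $N$ the smoothed sums still visibly depend on $N$, but they settle down near $1/2$ quickly, while the sharply truncated sums keep oscillating.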
Exercise 1 Show that if $\eta$ is merely once continuously differentiable (i.e. $\eta \in C^1$), then we have a similar asymptotic, but with an error term of $o(1)$ instead of $O(1/N)$. This is an instance of a more general principle that smoother cutoffs lead to better error terms, though the improvement sometimes stops after some degree of regularity.
Remark 1 The most famous instance of smoothed summation is Cesàro summation, which corresponds to the cutoff function $\eta(x) := (1-x)_+$. Unsurprisingly, when Cesàro summation is applied to Grandi’s series, one again recovers the value of $1/2$.
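If we now revisit the divergent series (4), (5), (6), (7) with smooth summation in mind, we finally begin to see the origin of the right-hand sides. Indeed, for any fixed smooth cutoff function $\eta$, we will shortly show that
$$\sum_{n=1}^\infty \eta(n/N) = -\frac{1}{2} + C_{\eta,0} N + O\left(\frac{1}{N}\right) \qquad (11)$$
$$\sum_{n=1}^\infty n \eta(n/N) = -\frac{1}{12} + C_{\eta,1} N^2 + O\left(\frac{1}{N}\right) \qquad (12)$$
$$\sum_{n=1}^\infty n^2 \eta(n/N) = C_{\eta,2} N^3 + O\left(\frac{1}{N}\right) \qquad (13)$$
and more generally
$$\sum_{n=1}^\infty n^s \eta(n/N) = -\frac{B_{s+1}}{s+1} + C_{\eta,s} N^{s+1} + O\left(\frac{1}{N}\right) \qquad (14)$$
for any fixed $s = 1, 2, 3, \ldots$, where $C_{\eta,s}$ is the Archimedean factor
$$C_{\eta,s} := \int_0^\infty x^s \eta(x)\ dx \qquad (15)$$
(which is also essentially the Mellin transform of $\eta$). Thus we see that the values (4), (5), (6), (7) obtained by analytic continuation are nothing more than the constant terms of the asymptotic expansion of the smoothed partial sums. This is not a coincidence; we will explain the equivalence of these two interpretations of such sums (in the model case when the analytic continuation has only finitely many poles and does not grow too fast at infinity) below the fold.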
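One can watch these constant terms emerge numerically by subtracting the main term $C_{\eta,s} N^{s+1}$ from the smoothed sum. The Python sketch below does this for $s = 0, 1, 2$. It rests on several assumptions of our own: the cutoff `eta` is an illustrative choice (equal to $1$ on $[0,1/2]$, zero on $[1,\infty)$), and the Archimedean factor is computed with a hand-rolled composite Simpson rule (`simpson` and `archimedean_factor` are hypothetical helper names, not anything from the text).

```python
import math

def eta(x):
    """Illustrative smooth cutoff (an assumption, not from the text):
    1 on [0, 1/2], 0 on [1, inf), smooth transition in between."""
    if x <= 0.5:
        return 1.0
    if x >= 1.0:
        return 0.0
    t = 2.0 * x - 1.0
    a = math.exp(-1.0 / t)
    b = math.exp(-1.0 / (1.0 - t))
    return b / (a + b)

def simpson(f, a, b, m=4000):
    """Composite Simpson rule on [a, b] with m (even) subintervals."""
    h = (b - a) / m
    terms = [f(a), f(b)]
    terms += [4.0 * f(a + (2 * i - 1) * h) for i in range(1, m // 2 + 1)]
    terms += [2.0 * f(a + 2 * i * h) for i in range(1, m // 2)]
    return math.fsum(terms) * h / 3.0

def archimedean_factor(s):
    """C_{eta,s} = int_0^1 x^s eta(x) dx; the part on [0, 1/2], where eta = 1, is exact."""
    exact_part = 0.5 ** (s + 1) / (s + 1)
    return exact_part + simpson(lambda x: x ** s * eta(x), 0.5, 1.0)

def constant_term(s, N):
    """Smoothed sum of n^s eta(n/N) minus the main term C_{eta,s} N^(s+1)."""
    smoothed = math.fsum(n ** s * eta(n / N) for n in range(1, N))
    return smoothed - archimedean_factor(s) * N ** (s + 1)

# Compare with the values -1/2, -1/12, 0 appearing in (11), (12), (13)
for s, predicted in ((0, -1 / 2), (1, -1 / 12), (2, 0.0)):
    print(s, constant_term(s, 200), predicted)
```

Because the main term is of size $N^{s+1}$, the Archimedean factor has to be computed to high relative accuracy for the subtraction to be meaningful; this is why the sketch integrates the flat part of the cutoff exactly and uses compensated summation (`math.fsum`).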
This interpretation clears up the apparent inconsistencies alluded to earlier. For instance, the sum $\sum_{n=1}^\infty n = 1 + 2 + 3 + \ldots$ consists only of non-negative terms, as do its smoothed partial sums $\sum_{n=1}^\infty n \eta(n/N)$ (if $\eta$ is non-negative). Comparing this with (12), we see that this forces the highest-order term $C_{\eta,1} N^2$ to be non-negative (as indeed it is), but does not prohibit the lower-order constant term $-\frac{1}{12}$ from being negative (which of course it is).
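Similarly, if we add together (12) and (11) we obtain
$$\sum_{n=1}^\infty (n+1) \eta(n/N) = -\frac{7}{12} + C_{\eta,1} N^2 + C_{\eta,0} N + O\left(\frac{1}{N}\right) \qquad (16)$$
while if we subtract the $n=1$ term $\eta(1/N) = 1 + O(1/N)$ from (12) we obtain
$$\sum_{n=2}^\infty n \eta(n/N) = -\frac{13}{12} + C_{\eta,1} N^2 + O\left(\frac{1}{N}\right). \qquad (17)$$
These two asymptotics are not inconsistent with each other; indeed, if we shift the index of summation in (17), we can write
$$\sum_{n=2}^\infty n \eta(n/N) = \sum_{n=1}^\infty (n+1) \eta((n+1)/N) \qquad (18)$$
and so we now see that the discrepancy between the two sums in (8), (9) comes from the shifting of the cutoff $\eta(n/N)$, which is invisible in the formal expressions in (8), (9) but becomes manifestly present in the smoothed sum formulation.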
Exercise 2 By Taylor expanding $\eta((n+1)/N)$ and using (11), (18) show that (16) and (17) are indeed consistent with each other, and in particular one can deduce the latter from the former.
— 1. Smoothed asymptotics —
We now prove (11), (12), (13), (14). We will prove the first few asymptotics by ad hoc methods, but then switch to the systematic method of the Euler-Maclaurin formula to establish the general case.
For sake of argument we shall assume that the smooth cutoff $\eta: \mathbf{R}^+ \to \mathbf{R}$ is supported in the interval $[0,1]$ (the general case is similar, and can also be deduced from this case by redefining the $N$ parameter). Thus the sum $\sum_{n=1}^\infty \eta(n/N) n^s$ is now only non-trivial in the range $n \leq N$.
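To establish (11), we shall exploit the trapezoidal rule. For any smooth function $f: \mathbf{R} \to \mathbf{R}$ and any interval $[n,n+1]$, we see from Taylor expansion that
$$f(n+\theta) = f(n) + \theta f'(n) + O(\|f\|_{\dot C^2})$$
for any $0 \leq \theta \leq 1$, where $\|f\|_{\dot C^2} := \sup_{x \in \mathbf{R}} |f''(x)|$. In particular we have
$$f(n+1) = f(n) + f'(n) + O(\|f\|_{\dot C^2})$$
and
$$\int_n^{n+1} f(x)\ dx = f(n) + \frac{1}{2} f'(n) + O(\|f\|_{\dot C^2});$$
eliminating $f'(n)$, we conclude that
$$\int_n^{n+1} f(x)\ dx = \frac{1}{2} f(n) + \frac{1}{2} f(n+1) + O(\|f\|_{\dot C^2}).$$
Summing in $n$, we conclude the trapezoidal rule
$$\int_0^N f(x)\ dx = \frac{1}{2} f(0) + f(1) + \ldots + f(N-1) + \frac{1}{2} f(N) + O(N \|f\|_{\dot C^2}).$$
We apply this with $f(x) := \eta(x/N)$, which has a $\dot C^2$ norm of $O(1/N^2)$ from the chain rule, and conclude that
$$\int_0^N \eta(x/N)\ dx = \frac{1}{2} + \sum_{n=1}^\infty \eta(n/N) + O\left(\frac{1}{N}\right).$$
But from (15) and a change of variables, the left-hand side is just $N C_{\eta,0}$. This gives (11).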
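The same argument does not quite work with (12); one would like to now set $f(x) := x \eta(x/N)$, but the $\dot C^2$ norm is now too large ($O(1/N)$ instead of $O(1/N^2)$). To get around this we have to refine the trapezoidal rule by performing the more precise Taylor expansion
$$f(n+\theta) = f(n) + \theta f'(n) + \frac{1}{2} \theta^2 f''(n) + O(\|f\|_{\dot C^3})$$
where $\|f\|_{\dot C^3} := \sup_{x \in \mathbf{R}} |f'''(x)|$. Now we have
$$f(n+1) = f(n) + f'(n) + \frac{1}{2} f''(n) + O(\|f\|_{\dot C^3})$$
and
$$\int_n^{n+1} f(x)\ dx = f(n) + \frac{1}{2} f'(n) + \frac{1}{6} f''(n) + O(\|f\|_{\dot C^3}).$$
We cannot simultaneously eliminate both $f'(n)$ and $f''(n)$. However, using the additional Taylor expansion
$$f'(n+1) = f'(n) + f''(n) + O(\|f\|_{\dot C^3})$$
one obtains
$$\int_n^{n+1} f(x)\ dx = \frac{1}{2} f(n) + \frac{1}{2} f(n+1) + \frac{1}{12} (f'(n) - f'(n+1)) + O(\|f\|_{\dot C^3})$$
and thus on summing in $n$, and assuming that $f$ vanishes to second order at $N$, one has (by telescoping series)
$$\int_0^N f(x)\ dx = \frac{1}{2} f(0) + \frac{1}{12} f'(0) + \sum_{n=1}^N f(n) + O(N \|f\|_{\dot C^3}).$$
We apply this with $f(x) := x \eta(x/N)$. After a few applications of the chain rule and product rule, we see that $\|f\|_{\dot C^3} = O(1/N^2)$; also, $f(0) = 0$, $f'(0) = 1$, and $\int_0^N f(x)\ dx = N^2 C_{\eta,1}$. This gives (12).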
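The proof of (13) is similar. With a fourth order Taylor expansion, the above arguments give
$$f(n+1) = f(n) + f'(n) + \frac{1}{2} f''(n) + \frac{1}{6} f'''(n) + O(\|f\|_{\dot C^4}),$$
$$\int_n^{n+1} f(x)\ dx = f(n) + \frac{1}{2} f'(n) + \frac{1}{6} f''(n) + \frac{1}{24} f'''(n) + O(\|f\|_{\dot C^4})$$
and
$$f'(n+1) = f'(n) + f''(n) + \frac{1}{2} f'''(n) + O(\|f\|_{\dot C^4}).$$
Here we have a minor miracle (equivalent to the vanishing of the third Bernoulli number $B_3$) that the $f'''$ term is automatically eliminated when we eliminate the $f''$ term, yielding
$$\int_n^{n+1} f(x)\ dx = \frac{1}{2} f(n) + \frac{1}{2} f(n+1) + \frac{1}{12} (f'(n) - f'(n+1)) + O(\|f\|_{\dot C^4})$$
and thus
$$\int_0^N f(x)\ dx = \frac{1}{2} f(0) + \frac{1}{12} f'(0) + \sum_{n=1}^N f(n) + O(N \|f\|_{\dot C^4}).$$
With $f(x) := x^2 \eta(x/N)$, the left-hand side is $N^3 C_{\eta,2}$, the first two terms on the right-hand side vanish, and the $\dot C^4$ norm is $O(1/N^2)$, giving (13).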
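Now we do the general case (14). We define the Bernoulli numbers $B_0, B_1, \ldots$ recursively by the formula
$$\sum_{k=0}^{s-1} \binom{s}{k} B_k = s \qquad (19)$$
for all $s = 1, 2, \ldots$, or equivalently
$$B_{s-1} := 1 - \frac{s-1}{2} B_{s-2} - \frac{(s-1)(s-2)}{3!} B_{s-3} - \ldots - \frac{1}{s} B_0.$$
The first few values of $B_s$ can then be computed:
$$B_0 = 1; \quad B_1 = 1/2; \quad B_2 = 1/6; \quad B_3 = 0; \quad B_4 = -1/30; \quad \ldots.$$
From (19) we see that
$$\sum_{k=0}^\infty \frac{B_k}{k!} [P^{(k)}(1) - P^{(k)}(0)] = P'(1) \qquad (20)$$
for any polynomial $P$ (with $P^{(k)}$ being the $k$-fold derivative of $P$); indeed, (19) is precisely this identity with $P(x) := x^s$, and the general case then follows by linearity.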
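The recursion for the Bernoulli numbers is easy to run in exact rational arithmetic. The sketch below is a minimal illustration in Python: `bernoulli` and `faulhaber` are hypothetical helper names of our own, and the code assumes the convention used in this post (so $B_1 = +1/2$, not $-1/2$). It also checks Faulhaber's formula (10) against a brute-force sum and prints the values $-B_{s+1}/(s+1)$ appearing in (3).

```python
from fractions import Fraction
from math import comb

def bernoulli(count):
    """Return [B_0, ..., B_{count-1}] from the recursion
    sum_{k=0}^{s-1} C(s, k) B_k = s, which gives B_1 = +1/2."""
    B = []
    for s in range(1, count + 1):
        known = sum(comb(s, k) * B[k] for k in range(s - 1))
        # Solve the recursion at level s for its last unknown, B_{s-1}
        B.append((Fraction(s) - known) / s)
    return B

def faulhaber(s, N, B):
    """Right-hand side of Faulhaber's formula for 1^s + 2^s + ... + N^s."""
    total = sum(comb(s + 1, j) * B[j] * N ** (s + 1 - j) for j in range(s + 1))
    return total / (s + 1)

B = bernoulli(8)
print([str(b) for b in B])

# Compare Faulhaber's formula with a brute-force sum for s = 3, N = 10
print(faulhaber(3, 10, B), sum(n ** 3 for n in range(1, 11)))

# The values -B_{s+1}/(s+1) for s = 1, 2, 3, i.e. zeta(-1), zeta(-2), zeta(-3)
print([str(-B[s + 1] / (s + 1)) for s in range(1, 4)])
```

Using `Fraction` rather than floating point avoids any rounding questions, which matters here because the Bernoulli numbers alternate in sign and the recursion involves cancellation.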
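As (20) holds for all polynomials, it also holds for all formal power series (if we ignore convergence issues). If we then replace $P$ by the formal power series
$$P(x) := e^{tx} = \sum_{k=0}^\infty t^k \frac{x^k}{k!}$$
we conclude the formal power series (in $t$) identity
$$\sum_{k=0}^\infty \frac{B_k}{k!} t^k (e^t - 1) = t e^t$$
leading to the familiar generating function
$$\sum_{k=0}^\infty \frac{B_k}{k!} t^k = \frac{t e^t}{e^t - 1}$$
for the Bernoulli numbers.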
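If we apply (20) with $P$ equal to the antiderivative of another polynomial $Q$, we conclude that
$$\int_0^1 Q(x)\ dx + \frac{1}{2} (Q(1) - Q(0)) + \sum_{k=2}^\infty \frac{B_k}{k!} [Q^{(k-1)}(1) - Q^{(k-1)}(0)] = Q(1)$$
which we rearrange as the identity
$$\int_0^1 Q(x)\ dx = \frac{1}{2} (Q(0) + Q(1)) - \sum_{k=2}^\infty \frac{B_k}{k!} [Q^{(k-1)}(1) - Q^{(k-1)}(0)]$$
which can be viewed as a precise version of the trapezoidal rule in the polynomial case. Note that if $Q$ has degree $d$, then only the summands with $2 \leq k \leq d$ can be non-vanishing.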
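Now let $f: \mathbf{R} \to \mathbf{R}$ be a smooth function, and write $\|f\|_{\dot C^{s+2}} := \sup_{x \in \mathbf{R}} |f^{(s+2)}(x)|$. We have a Taylor expansion
$$f(x) = Q(x) + O(\|f\|_{\dot C^{s+2}})$$
for $0 \leq x \leq 1$ and some polynomial $Q$ of degree at most $s+1$; also
$$f^{(k-1)}(x) = Q^{(k-1)}(x) + O(\|f\|_{\dot C^{s+2}})$$
for $0 \leq x \leq 1$ and $k \leq s+2$. We conclude that
$$\int_0^1 f(x)\ dx = \frac{1}{2} (f(0) + f(1)) - \sum_{k=2}^{s+1} \frac{B_k}{k!} [f^{(k-1)}(1) - f^{(k-1)}(0)] + O(\|f\|_{\dot C^{s+2}}).$$
Translating this by an arbitrary integer $n$ (which does not affect the $\dot C^{s+2}$ norm), we obtain
$$\int_n^{n+1} f(x)\ dx = \frac{1}{2} (f(n) + f(n+1)) - \sum_{k=2}^{s+1} \frac{B_k}{k!} [f^{(k-1)}(n+1) - f^{(k-1)}(n)] + O(\|f\|_{\dot C^{s+2}}). \qquad (21)$$
Summing the telescoping series, and assuming that $f$ vanishes to a sufficiently high order at $N$, we conclude the Euler-Maclaurin formula
$$\int_0^N f(x)\ dx = \frac{1}{2} f(0) + \sum_{n=1}^N f(n) + \sum_{k=2}^{s+1} \frac{B_k}{k!} f^{(k-1)}(0) + O(N \|f\|_{\dot C^{s+2}}). \qquad (22)$$
We apply this with $f(x) := x^s \eta(x/N)$. The left-hand side is $C_{\eta,s} N^{s+1}$. All the terms in the sum vanish except for the $k = s+1$ term, which is $\frac{B_{s+1}}{s+1}$. Finally, from many applications of the product rule and chain rule (or by viewing $f(x) = N^s \eta_s(x/N)$ where $\eta_s$ is the smooth function $\eta_s(x) := x^s \eta(x)$) we see that $\|f\|_{\dot C^{s+2}} = O(1/N^2)$, and the claim (14) follows.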
Remark 2 By using a higher regularity norm than the $\dot C^{s+2}$ norm, we see that the error term $O(1/N)$ can in fact be improved to $O(1/N^B)$ for any fixed $B > 0$, if $\eta$ is sufficiently smooth.
Exercise 3 Use (21) to derive Faulhaber’s formula (10). Note how the presence of boundary terms at $N$ causes the right-hand side of (10) to be quite different from the right-hand side of (14); thus we see how non-smooth partial summation creates artefacts that can completely obscure the smoothed asymptotics.
— 2. Connection with analytic continuation —
Now we connect the interpretation of divergent series as the constant term of smoothed partial sum asymptotics, with the more traditional interpretation via analytic continuation. For sake of concreteness we shall just discuss the situation with the Riemann zeta function series $\sum_{n=1}^\infty \frac{1}{n^s}$, though the connection extends to far more general series than just this one.
In the previous section, we have computed asymptotics for the partial sums
$$\sum_{n=1}^\infty \frac{1}{n^s} \eta(n/N)$$
when $s$ is a negative integer. A key point (which was somewhat glossed over in the above analysis) was that the function $x \mapsto x^{-s} \eta(x)$ was smooth, even at the origin; this was implicitly used to bound various $\dot C^k$ norms in the error terms.
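Now suppose that $s$ is a complex number with $\mathrm{Re}(s) < 1$, which is not necessarily a negative integer. Then $x \mapsto x^{-s} \eta(x)$ becomes singular at the origin, and the above asymptotic analysis is not directly applicable. However, if one instead considers the telescoped partial sum
$$\sum_{n=1}^\infty \frac{1}{n^s} \eta(n/N) - \sum_{n=1}^\infty \frac{1}{n^s} \eta(2n/N),$$
with $\eta$ equal to $1$ near the origin, then by applying (22) to the function $f(x) := x^{-s} \eta(x/N) - x^{-s} \eta(2x/N)$ (which vanishes near the origin, and is now smooth everywhere), we soon obtain the asymptotic
$$\sum_{n=1}^\infty \frac{1}{n^s} \eta(n/N) - \sum_{n=1}^\infty \frac{1}{n^s} \eta(2n/N) = C_{\eta,-s} (N^{1-s} - (N/2)^{1-s}) + O\left(\frac{1}{N}\right). \qquad (23)$$
Applying this with $N$ equal to a power of two and summing the telescoping series, one concludes that
$$\sum_{n=1}^\infty \frac{1}{n^s} \eta(n/N) = \zeta(s) + C_{\eta,-s} N^{1-s} + O\left(\frac{1}{N}\right) \qquad (24)$$
for some complex number $\zeta(s)$ which is basically the sum of the various $O(1/N)$ terms appearing in (23). By modifying the above arguments, it is not difficult to extend this asymptotic to other numbers than powers of two, and to show that $\zeta(s)$ is independent of the choice of cutoff $\eta$.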
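From (24) we have
$$\zeta(s) = \lim_{N \to \infty} \left( \sum_{n=1}^\infty \frac{1}{n^s} \eta(n/N) - C_{\eta,-s} N^{1-s} \right),$$
which can be viewed as a definition of $\zeta$ in the region $\mathrm{Re}(s) < 1$. For instance, from (14), we have now proven (3) with this definition of $\zeta$. However it is difficult to compute $\zeta(s)$ exactly for most other values of $s$.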
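While exact evaluation is hard, this real-variable definition is perfectly usable for numerical approximation. The Python sketch below evaluates the expression inside the limit for real $s < 1$ at a fixed large $N$. Everything in it beyond the formula itself is an assumption made for illustration: the cutoff `eta` (equal to $1$ on $[0,1/2]$, zero on $[1,\infty)$), the composite Simpson rule used for $C_{\eta,-s}$, the default $N = 400$, and the helper names.

```python
import math

def eta(x):
    """Illustrative smooth cutoff (an assumption, not from the text):
    1 on [0, 1/2], 0 on [1, inf), smooth transition in between."""
    if x <= 0.5:
        return 1.0
    if x >= 1.0:
        return 0.0
    t = 2.0 * x - 1.0
    a = math.exp(-1.0 / t)
    b = math.exp(-1.0 / (1.0 - t))
    return b / (a + b)

def simpson(f, a, b, m=4000):
    """Composite Simpson rule on [a, b] with m (even) subintervals."""
    h = (b - a) / m
    terms = [f(a), f(b)]
    terms += [4.0 * f(a + (2 * i - 1) * h) for i in range(1, m // 2 + 1)]
    terms += [2.0 * f(a + 2 * i * h) for i in range(1, m // 2)]
    return math.fsum(terms) * h / 3.0

def zeta_smoothed(s, N=400):
    """Approximate zeta(s) for real s < 1 by
    sum_n n^(-s) eta(n/N) - C_{eta,-s} N^(1-s)."""
    smoothed = math.fsum(n ** (-s) * eta(n / N) for n in range(1, N))
    # C_{eta,-s}: exact on [0, 1/2] (where eta = 1), Simpson on [1/2, 1]
    c = 0.5 ** (1 - s) / (1 - s) + simpson(lambda x: x ** (-s) * eta(x), 0.5, 1.0)
    return smoothed - c * N ** (1 - s)

print(zeta_smoothed(0.5))    # compare with the known value zeta(1/2) = -1.4603545...
print(zeta_smoothed(0.0))    # compare with zeta(0) = -1/2
print(zeta_smoothed(-1.0))   # compare with zeta(-1) = -1/12
```

Splitting the integral for $C_{\eta,-s}$ at $1/2$ sidesteps the singularity of $x^{-s}$ at the origin when $0 < s < 1$, since the cutoff is identically $1$ there and that piece can be integrated in closed form.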
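For each fixed $N$, it is not hard to see that the expression $\sum_{n=1}^\infty \frac{1}{n^s} \eta(n/N) - C_{\eta,-s} N^{1-s}$ is complex analytic in $s$. Also, by a closer inspection of the error terms in the Euler-Maclaurin formula analysis, it is not difficult to show that for $s$ in any compact region of $\{ s \in \mathbf{C}: \mathrm{Re}(s) < 1 \}$, these expressions converge uniformly as $N \to \infty$. Applying Morera’s theorem, we conclude that our definition of $\zeta(s)$ is complex analytic in the region $\{ s \in \mathbf{C}: \mathrm{Re}(s) < 1 \}$.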
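We still have to connect this definition with the traditional definition (1) of the zeta function on the other half of the complex plane. To do this, we observe that
$$C_{\eta,-s} N^{1-s} = \int_0^\infty x^{-s} \eta(x/N)\ dx = \int_1^\infty x^{-s} \eta(x/N)\ dx - \frac{1}{s-1}$$
for $N$ large enough. Thus we have
$$\zeta(s) = \frac{1}{s-1} + \lim_{N \to \infty} \left( \sum_{n=1}^\infty \frac{1}{n^s} \eta(n/N) - \int_1^\infty x^{-s} \eta(x/N)\ dx \right)$$
for $\mathrm{Re}(s) < 1$. The point of doing this is that this definition also makes sense in the region $\mathrm{Re}(s) > 1$ (due to the absolute convergence of the sum $\sum_{n=1}^\infty \frac{1}{n^s}$ and integral $\int_1^\infty x^{-s}\ dx$). By using the trapezoidal rule, one also sees that this definition makes sense in the region $\mathrm{Re}(s) > 0$, with locally uniform convergence there also. So we in fact have a globally complex analytic definition of $\zeta(s) - \frac{1}{s-1}$, and thus a meromorphic definition of $\zeta(s)$ on the complex plane. Note also that this definition gives the asymptotic
$$\zeta(s) = \frac{1}{s-1} + \gamma + O(|s-1|) \qquad (25)$$
near $s = 1$, where $\gamma = 0.577\ldots$ is Euler’s constant.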
We have thus seen that asymptotics on smoothed partial sums of $\frac{1}{n^s}$ give rise to the familiar meromorphic properties of the Riemann zeta function $\zeta(s)$. It turns out that by combining the tools of Fourier analysis and complex analysis, one can reverse this procedure and deduce the asymptotics of $\sum_{n=1}^\infty \frac{1}{n^s} \eta(n/N)$ from the meromorphic properties of the zeta function.
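Let’s see how. Fix a complex number $s$ with $\mathrm{Re}(s) < 1$, and a smooth cutoff function $\eta: \mathbf{R}^+ \to \mathbf{R}$ which equals one near the origin, and consider the expression
$$\sum_{n=1}^\infty \frac{1}{n^s} \eta(n/N) \qquad (26)$$
where $N$ is a large number. We let $A > 0$ be a large number, and rewrite this as
$$N^A \sum_{n=1}^\infty \frac{1}{n^{s+A}} f_A(\log(n/N))$$
where
$$f_A(x) := e^{Ax} \eta(e^x).$$
The function $f_A$ is in the Schwartz class. By the Fourier inversion formula, it has a Fourier representation
$$f_A(x) = \frac{1}{2\pi} \int_{\mathbf{R}} \hat f_A(t) e^{-ixt}\ dt$$
where
$$\hat f_A(t) := \int_{\mathbf{R}} f_A(x) e^{ixt}\ dx$$
and so (26) can be rewritten as
$$\frac{N^A}{2\pi} \sum_{n=1}^\infty \frac{1}{n^{s+A}} \int_{\mathbf{R}} \hat f_A(t) (n/N)^{-it}\ dt.$$
The function $\hat f_A$ is also Schwartz. If $A$ is large enough, we may then interchange the integral and sum and use (1) to rewrite (26) as
$$\frac{1}{2\pi} \int_{\mathbf{R}} \hat f_A(t) N^{A+it} \zeta(s+A+it)\ dt.$$
Now we have
$$\hat f_A(t) = \int_{\mathbf{R}} e^{(A+it)x} \eta(e^x)\ dx;$$
integrating by parts (which is justified when $A$ is large enough) we have
$$\hat f_A(t) = -\frac{1}{A+it} F(A+it)$$
where
$$F(A+it) := \int_{\mathbf{R}} e^{(A+it+1)x} \eta'(e^x)\ dx = \int_0^\infty y^{A+it} \eta'(y)\ dy.$$
We can thus write (26) as a contour integral
$$\frac{-1}{2\pi i} \int_{s+A-i\infty}^{s+A+i\infty} \zeta(z) \frac{N^{z-s} F(z-s)}{z-s}\ dz.$$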
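Note that $\eta'$ is compactly supported away from zero, which makes $F(A+it)$ an entire function of $A+it$, which is uniformly bounded whenever $A$ is bounded. Furthermore, from repeated integration by parts we see that $F(A+it)$ is rapidly decreasing as $|t| \to \infty$, uniformly for $A$ in a compact set. Meanwhile, standard estimates show that $\zeta(\sigma+it)$ is of polynomial growth in $t$ for $\sigma$ in a compact set. Finally, the meromorphic function $\zeta(z) \frac{N^{z-s} F(z-s)}{z-s}$ has a simple pole at $z = 1$ (with residue $\frac{N^{1-s} F(1-s)}{1-s}$) and at $z = s$ (with residue $\zeta(s) F(0)$). Applying the residue theorem, we can write (26) as
$$\frac{-1}{2\pi i} \int_{s-B-i\infty}^{s-B+i\infty} \zeta(z) \frac{N^{z-s} F(z-s)}{z-s}\ dz - \frac{N^{1-s} F(1-s)}{1-s} - \zeta(s) F(0)$$
for any $B > 0$. Using the various bounds on $\zeta$ and $F$, we see that the integral is $O(N^{-B})$. From integration by parts we have
$$F(0) = -1$$
and
$$F(1-s) = -(1-s) \int_0^\infty y^{-s} \eta(y)\ dy = -(1-s) C_{\eta,-s}$$
and thus we have
$$\sum_{n=1}^\infty \frac{1}{n^s} \eta(n/N) = \zeta(s) + C_{\eta,-s} N^{1-s} + O(N^{-B})$$
for any $B > 0$, which is (14) (with the refined error term indicated in Remark 2).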
The above argument reveals that the simple pole of $\zeta(s)$ at $s = 1$ is directly connected to the $C_{\eta,-s} N^{1-s}$ term in the asymptotics of the smoothed partial sums. More generally, if a Dirichlet series
$$D(s) = \sum_{n=1}^\infty \frac{a_n}{n^s}$$
has a meromorphic continuation to the entire complex plane, and does not grow too fast at infinity, then one (heuristically at least) has the asymptotic
$$\sum_{n=1}^\infty \frac{a_n}{n^s} \eta(n/N) = D(s) + \sum_\rho C_{\eta,\rho-s-1} r_\rho N^{\rho-s} + \ldots$$
where $\rho$ ranges over the poles of $D$, and $r_\rho$ are the residues at those poles. For instance, one has the famous explicit formula
$$\sum_{n=1}^\infty \Lambda(n) \eta(n/N) = C_{\eta,0} N - \sum_\rho C_{\eta,\rho-1} N^\rho + \ldots$$
where $\Lambda$ is the von Mangoldt function, $\rho$ are the non-trivial zeroes of the Riemann zeta function (counting multiplicity, if any), and $\ldots$ is an error term (basically arising from the trivial zeroes of zeta); this ultimately reflects the fact that the Dirichlet series
$$\sum_{n=1}^\infty \frac{\Lambda(n)}{n^s} = -\frac{\zeta'(s)}{\zeta(s)}$$
has a simple pole at $s = 1$ (with residue $+1$) and simple poles at every zero of the zeta function with residue $-1$ (weighted again by multiplicity, though it is not believed that multiple zeroes actually exist).
The link between poles of the zeta function (and its relatives) and asymptotics of (smoothed) partial sums of arithmetical functions can be used to compare elementary methods in analytic number theory with complex methods. Roughly speaking, elementary methods are based on leading term asymptotics of partial sums of arithmetical functions, and are mostly based on exploiting the simple pole of $\zeta(s)$ at $s = 1$ (and the lack of a simple zero of Dirichlet $L$-functions at $s = 1$); in contrast, complex methods also take full advantage of the zeroes of $\zeta$ and Dirichlet $L$-functions (or the lack thereof) in the entire complex plane, as well as the functional equation (which, in terms of smoothed partial sums, manifests itself through the Poisson summation formula). Indeed, using the above correspondences it is not hard to see that the prime number theorem (for instance) is equivalent to the lack of zeroes of the Riemann zeta function on the line $\mathrm{Re}(s) = 1$.
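With this dictionary between elementary methods and complex methods, the Dirichlet hyperbola method in elementary analytic number theory corresponds to analysing the behaviour of poles and residues when multiplying together two Dirichlet series. For instance, by using the formula (11) and the hyperbola method, together with the asymptotic
$$\sum_{n=1}^\infty \frac{1}{n} \eta(n/N) = \int_1^\infty \eta(x/N) \frac{dx}{x} + \gamma + O\left(\frac{1}{N}\right)$$
which can be obtained from the trapezoidal rule and the definition of $\gamma$, one can obtain the asymptotic
$$\sum_{n=1}^\infty \tau(n) \eta(n/N) = \int_1^\infty \log x\ \eta(x/N)\ dx + 2 \gamma C_{\eta,0} N + O(\sqrt{N})$$
where $\tau(n) := \sum_{d|n} 1$ is the divisor function (and in fact one can improve the $O(\sqrt{N})$ bound substantially by being more careful); this corresponds to the fact that the Dirichlet series
$$\sum_{n=1}^\infty \frac{\tau(n)}{n^s} = \zeta(s)^2$$
has a double pole at $s = 1$ with expansion
$$\zeta(s)^2 = \frac{1}{(s-1)^2} + 2 \gamma \frac{1}{s-1} + O(1)$$
and no other poles, which of course follows by multiplying (25) with itself.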
Remark 3 In the literature, elementary methods in analytic number theory often use sharply truncated sums rather than smoothed sums. However, as indicated earlier, the error terms tend to be slightly better when working with smoothed sums (although not much gain is obtained in this manner when dealing with sums of functions that are sensitive to the primes, such as $\Lambda$, as the terms arising from the zeroes of the zeta function tend to dominate any saving in this regard).

23 comments
10 April, 2010 at 5:01 pm
Allen Knutson
Do you understand how Planck’s black-body radiation formula “should” be related to Euler-Maclaurin? I don’t think I’m even asking the question correctly, but confident you can deal with that part.
10 April, 2010 at 8:44 pm
Terence Tao
Ah, I remember discussing these sorts of questions back in grad school :-)
Well, ostensibly the most visible connection between the two is that the expression $\frac{1}{e^x - 1}$ appears in both. OK, let’s try deconstructing this, starting with Euler-Maclaurin. One way to formally derive it (which I didn’t emphasise above, as I wanted to keep things fairly elementary, and also rigorous) is to start with the Taylor expansion, which one can write as $f(x+1) = e^D f(x)$, where $D = \frac{d}{dx}$ is the derivative operator. In particular, formally, $\sum_{n=0}^\infty f(n) = \sum_{n=0}^\infty e^{nD} f(0) = \frac{1}{1 - e^D} f(0)$. Using the power series expansion $\frac{t e^t}{e^t - 1} = \sum_{k=0}^\infty \frac{B_k}{k!} t^k$ mentioned in the post and using Taylor expansion and the fundamental theorem of calculus (to invert $D$) one can then quickly obtain a formal derivation of the Euler-Maclaurin formula.
Now for black-body radiation. The key calculation here involves the expected particle number of a boson gas at frequency $\nu$. Ignoring vacuum energy issues, an $n$-particle state here should have energy $n h \nu$. At temperature $T$, the proportion of the state with particle number $n$ is thus something like $\frac{1}{Z} e^{-nh\nu/kT}$, where $Z = \sum_{n=0}^\infty e^{-nh\nu/kT}$ is the partition function at this frequency (ignoring state multiplicity arising from spin). Summing the geometric-like series, we see that the expected particle number $\frac{1}{Z} \sum_{n=0}^\infty n e^{-nh\nu/kT}$ is then something like $\frac{1}{e^{h\nu/kT} - 1}$, which then appears in the black-body radiation law. (In three dimensions, one then gets two extra powers of $\nu$ coming from polar coordinates as the frequency vector ranges over $\mathbf{R}^3$ or a discretised version thereof.)
In one dimension, the frequency $\nu$ is associated to the operator $\frac{1}{2\pi i} D$ (this is just reflecting the relationship $D e^{2\pi i \nu x} = 2\pi i \nu e^{2\pi i \nu x}$). So the energy is associated to $\frac{h}{2\pi i} D$ (this reflects the relation $E = h\nu$) and so the partition function formally resembles something like $\sum_{n=0}^\infty e^{-n h D / 2\pi i k T}$. So I guess this is a sort of Wick rotation of the summation $\sum_{n=0}^\infty e^{nD}$ that appears in Euler-Maclaurin. The expected particle number $\frac{1}{e^{h\nu/kT} - 1}$ can be viewed as a variant of the partition function. I guess if one is working in the low frequency regime then one can then Taylor expand in $D$ and see a series with Bernoulli number coefficients appear.
To summarise: the low-frequency expansion of the black body radiation formula can be derived from a variant of a higher-dimensional Wick-rotated version of Euler-Maclaurin. It’s sort of a six-degrees-of-separation thing, but I guess there is some slight connection.
10 April, 2010 at 11:43 pm
Anonymous
Should there be an expository tag? [Added, thanks - T.]
11 April, 2010 at 12:04 am
Américo Tavares
Small typo between (2) and (3)
[Corrected, thanks - T.]
13 April, 2010 at 10:53 pm
Bryan Jacobs
typo is back!
[Corrected (again), thanks. - T.]
12 April, 2010 at 6:31 am
Mark Meckes
Typo: after “integral test estimate” the first integral sign should be a summation.
[Corrected, thanks - T.]
12 April, 2010 at 8:57 pm
Jamal
YOU ARE GREAT! How can I learn mathematics by hard byhard
13 April, 2010 at 12:17 am
Bo Jacoby
Thanks!
Shouldn’t formula (8)



read
to avoid confusion with
[Corrected, thanks - T.]
The expression


should be understood as
Otherwise the expression depend on N.
Likewise the upper summation limit in formulas 11-12-13-14-16-17-18 should be N rather than \infty.
The formula
\displaystyle \sum_{n=1}^\infty (-1)^{n-1} \eta(n/N) = \frac{1}{2} + \sum_{m=1}^\infty \frac{\eta((2m-1)/N) - 2\eta(2m/N) + \eta((2m+1)/N)}{2}.
must be split into two lines in order to display correctly.
13 April, 2010 at 12:43 am
Bo Jacoby
I regret my second comment above. Now I got it.
15 April, 2010 at 11:14 am
Javier
Pf. Tao,
I hope you do not shoot messengers bearing bad news!
I noticed equation 19 is incorrect. The series in equation (19) is equal to zero (I find it incredible no one has made a comment about this). Therefore, the recursion formula you suggest is incorrect. (I tested the recursion for the first few Bernoulli numbers, and I also checked in wolfram website in http://mathworld.wolfram.com/BernoulliNumber.html eq. #34). Therefore, the basis of your development in this blog seems suspect to me since equation 20 has no basis for polynomials unless you replace the first derivative evaluated at one to evaluation at zero for P(x;s) = x^s. It seem interesting to me that yet equation 20 is valid for P(x;t) = exp(xt). I think this has to do with the fact that P(0;t) is identical to one.
Sincerely,
Javier
15 April, 2010 at 1:04 pm
Terence Tao
Dear Javier,
I believe you are using the alternate definition of the Bernoulli numbers, in which $B_1$ is set to -1/2 rather than 1/2. There does not appear to be a consensus on which one is the canonical definition; both definitions have their own (minor) advantages. For instance, with the $B_1 = +1/2$ convention, the formula (7) can be extended to the case s=0.
15 April, 2010 at 2:11 pm
Javier
Dear Pf. Tao,
I must admit I do not have a math degree, but I love mathematics because of its beauty. I am a test engineer by trade and therefore like to test the mathematical ideas I read. I thought B_1 = -1/2 was the “original” Bernoulli number ;-) Do you have a reference for the “extra crispy” definition of the Bernoulli, B_1 = 1/2?
I think I understand your article’s motivation: you are trying to create meaning to non-convergent series by applying a smoothing function to give a reasonable values to functions with integral representations (which can be analytically continued in the complex domain), thereby using the “asymptotic series” as a quick way to calculate values of these functions, such as the Riemann Zeta function (RZ). Is this correct? If this is the case, could this “approximation” method give an advantage in calculating the zeros of RZ ? I do not know, but my intuition tells me no.
Again thanks for the clarification and not shooting the messenger :-)
Javier
28 May, 2010 at 10:00 am
Divergent sums and the class number formula « Secret Blogging Seminar
[...] why Orde’s argument works. Specifically, I am going to use an idea which I learned from Terry Tao’s blog: Arguments about divergent sums are often really arguments about the constant term in asymptotic [...]
29 May, 2010 at 12:03 am
Jeremy Williams
Before learning of Bernoulli Numbers, I myself have derived a closed formula for the summation of consecutive integers raised to the i power. Within this formula I derived another closed formula for computing Bernoulli Numbers.
After this, I derived a much more efficient formula for generating the summation of consecutive integers raised to the i power. If you are interested in looking at my work (in pdf format) please reply to this post, and I will email it to you.
29 May, 2010 at 12:05 am
Jeremy Williams
by the way, all of my computations and derivations give B(1) = +1/2
18 August, 2010 at 5:25 pm
Dan Christensen
In the discussion of Grandi’s series, in the displayed equation after
“Indeed, for any cutoff function $\eta$, we can regroup”
shouldn’t the leading term $\frac{1}{2}$ actually be $\frac{\eta(1/N)}{2}$?
[Corrected, thanks - T.]
3 September, 2010 at 2:29 pm
tereta
I found my question,thanks”
19 September, 2010 at 4:54 am
Max Atkin
Dear Prof. Tao,
I found this article a very interesting and fun read. I was wondering if there exists a more precise statement of the following (which appears just before you mention the “explicit formula” );
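“More generally, if a Dirichlet series $D(s) = \sum_{n=1}^\infty \frac{a_n}{n^s}$
has a meromorphic continuation to the entire complex plane, and does not grow too fast at infinity, then one (heuristically at least) has the asymptotic
$$\sum_{n=1}^\infty \frac{a_n}{n^s} \eta(n/N) = D(s) + \sum_\rho C_{\eta,\rho-s-1} r_\rho N^{\rho-s} + \ldots$$
where $\rho$ ranges over the poles of $D$, and $r_\rho$ are the residues at those poles.”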
In particular, I would be interested to know what qualifications are hiding in “heuristically at least” and “does not grow too fast at infinity” – is there a theorem that states this somewhere?
Thanks,
Max
19 September, 2010 at 8:17 am
Terence Tao
Well, I do not know of a precise reference, but most texts in analytic number theory would have some results along these lines, perhaps specialised to specific Dirichlet series such as the one for 1/zeta(s) or zeta’(s)/zeta(s). But the basic idea is simply to mimic the discussion given previously for the $\zeta(s)$ case to whatever Dirichlet series one is studying.
21 September, 2010 at 1:55 pm
Max Atkin
Thank you for taking the time to reply. It sounds as if this is something that is approached on a case by case basis rather than proving a set of properties that a_n must satisfy in order for an “explicit formula” to exist. Is this because the latter approach is too hard?
21 September, 2010 at 2:50 pm
Terence Tao
It’s more an issue of non-uniqueness. There are many different choices for what hypotheses to place here, and various conclusions one could reach (involving different hypotheses on the cutoff $\eta$, different bounds on the error term, etc.). [This is common to many principles in analysis; they are not easily formalised by a "one-size-fits-all" theorem, in contrast to the more algebraic portions of mathematics, but instead need to be tailored to each separate application.]
Also, in most number-theoretic applications, the Dirichlet series in question obeys a functional equation, which can be used to strengthen the asymptotic to a significantly greater extent than can be done for a generic Dirichlet series. So it isn’t all that worthwhile to write down a specific instantiation of the heuristic formula until it is actually needed, other than as a useful exercise to test if one understands how such formulae would be derived.
6 October, 2010 at 8:49 am
Yet Another Article to Read During Break | What's Up
[...] From Terry Tao’s blog. [...]
23 July, 2011 at 7:40 pm
Erdos’ divisor bound « What’s new
[...] sharper bounds available by using tools such as the Euler-Maclaurin formula (see this blog post). Exponentiating such asymptotics, incidentally, leads to one of the standard proofs of [...]