If $\lambda > 0$, a Poisson random variable $\mathbf{Poisson}(\lambda)$ with mean $\lambda$ is a random variable taking values in the natural numbers with probability distribution
$$\mathbf{P}(\mathbf{Poisson}(\lambda) = k) = e^{-\lambda} \frac{\lambda^k}{k!}, \qquad k = 0, 1, 2, \dots$$

One is often interested in bounding upper tail probabilities $\mathbf{P}(\mathbf{Poisson}(\lambda) \geq \lambda(1+u))$ for $u > 0$, or lower tail probabilities $\mathbf{P}(\mathbf{Poisson}(\lambda) \leq \lambda(1+u))$ for $-1 < u < 0$. A standard tool for this is Bennett’s inequality:

**Proposition 1 (Bennett’s inequality).** One has
$$\mathbf{P}(\mathbf{Poisson}(\lambda) \geq \lambda(1+u)) \leq e^{-\lambda h(u)}$$
for $u > 0$ and
$$\mathbf{P}(\mathbf{Poisson}(\lambda) \leq \lambda(1+u)) \leq e^{-\lambda h(u)}$$
for $-1 < u < 0$, where
$$h(u) := (1+u)\log(1+u) - u.$$

From the Taylor expansion $h(u) = \frac{u^2}{2} + O(|u|^3)$ for $|u| \leq 1$ we conclude Gaussian type tail bounds in the regime $u = o(1)$ (and in particular when $u = O(1/\sqrt{\lambda})$), in the spirit of the Chernoff, Bernstein, and Hoeffding inequalities. But in the regime where $u$ is large and positive one obtains a slight gain over these other classical bounds (of $\exp(-\lambda u \log u)$ type, rather than $\exp(-c\lambda u)$), since $h(u)$ grows like $u \log u$ rather than linearly in $u$.
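
As a quick numerical sanity check of Proposition 1 (an illustrative sketch; the helper routines and the parameter grid below are my own choices, not from the text), one can compare the exact Poisson upper tail against the Bennett bound $e^{-\lambda h(u)}$:

```python
import math

def poisson_pmf(lam: float, k: int) -> float:
    """P(Poisson(lam) = k), computed in log-space for numerical stability."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

def poisson_upper_tail(lam: float, kmin: int) -> float:
    """P(Poisson(lam) >= kmin) by direct summation; terms decay fast once k > lam."""
    total, k = 0.0, kmin
    while True:
        t = poisson_pmf(lam, k)
        total += t
        if t < 1e-17 * total:
            return total
        k += 1

def bennett_bound(lam: float, u: float) -> float:
    """Bennett: P(Poisson(lam) >= lam*(1+u)) <= exp(-lam * h(u))."""
    h = (1 + u) * math.log(1 + u) - u
    return math.exp(-lam * h)

# the exact tail never exceeds the Bennett bound on this grid
for lam in (1.0, 5.0, 50.0):
    for u in (0.1, 1.0, 4.0):
        tail = poisson_upper_tail(lam, math.ceil(lam * (1 + u)))
        assert tail <= bennett_bound(lam, u)
```

Here `poisson_upper_tail` sums the series directly, which is adequate for these parameter ranges; for serious use one would reach for a library routine such as `scipy.stats.poisson.sf`.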

*Proof:* We use the exponential moment method. For any $t > 0$, we have from Markov’s inequality that
$$\mathbf{P}(\mathbf{Poisson}(\lambda) \geq \lambda(1+u)) \leq e^{-t\lambda(1+u)}\, \mathbf{E} e^{t \mathbf{Poisson}(\lambda)} = e^{-t\lambda(1+u)} e^{\lambda(e^t - 1)},$$
since the moment generating function of $\mathbf{Poisson}(\lambda)$ is $\mathbf{E} e^{t \mathbf{Poisson}(\lambda)} = e^{\lambda(e^t-1)}$. Setting $t = \log(1+u)$ (the optimal choice) gives the upper tail bound; the lower tail bound is proven similarly, now taking $t = \log(1+u) < 0$ for $-1 < u < 0$. $\Box$

**Remark 2.** Bennett’s inequality also applies for (suitably normalized) sums of bounded independent random variables. In some cases there are direct comparison inequalities available to relate those variables to the Poisson case. For instance, suppose $X = X_1 + \dots + X_n$ is the sum of independent Boolean variables $X_1,\dots,X_n \in \{0,1\}$ of total mean $\sum_{i=1}^n \mathbf{E} X_i = \mu$ and with $\mathbf{P}(X_i = 1) \leq \varepsilon$ for all $i$ and some $0 < \varepsilon < 1$. Then for any natural number $k$, we have
$$\mathbf{P}(X \geq k) \leq \mathbf{P}\left(\mathbf{Poisson}\left(\frac{\mu}{1-\varepsilon}\right) \geq k\right).$$
As such, for $\varepsilon$ small, one can efficiently control the tail probabilities of $X$ in terms of the tail probability of a Poisson random variable of mean close to $\mu$; this is of course very closely related to the well known fact that the Poisson distribution emerges as the limit of sums of many independent Boolean variables, each of which is non-zero with small probability. See this paper of Bentkus and this paper of Pinelis for some further useful (and less obvious) comparison inequalities of this type.
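
This comparison can be checked numerically in the binomial special case (a sketch under my reading of the inequality, namely $\mathbf{P}(X \geq k) \leq \mathbf{P}(\mathbf{Poisson}(\mu/(1-\varepsilon)) \geq k)$, which follows from the stochastic domination of $\mathrm{Bernoulli}(p)$ by $\mathrm{Poisson}(\log\frac{1}{1-p})$; the parameters are illustrative):

```python
import math

def binom_upper_tail(n: int, p: float, k: int) -> float:
    """P(Bin(n, p) >= k), exact summation."""
    return sum(math.comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k, n + 1))

def poisson_upper_tail(lam: float, kmin: int) -> float:
    """P(Poisson(lam) >= kmin) by direct summation in log-space."""
    total, k = 0.0, kmin
    while True:
        t = math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))
        total += t
        if t < 1e-17 * total:
            return total
        k += 1

n, p = 20, 0.05       # X = sum of 20 independent Bernoulli(0.05) variables
mu, eps = n * p, p    # total mean mu = 1.0, individual means at most eps
for k in range(1, n + 1):
    # binomial tail dominated by the tail of a Poisson of slightly larger mean
    assert binom_upper_tail(n, p, k) <= poisson_upper_tail(mu / (1 - eps), k)
```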

In this note I wanted to record the observation that one can improve the Bennett bound by a small polynomial factor once one leaves the Gaussian regime $u = O(1/\sqrt{\lambda})$, in particular gaining a factor of $(\lambda u)^{-1/2}$ when $u \gg 1$. This observation is not difficult and is implicitly in the literature (one can extract it for instance from the much more general results of this paper of Talagrand, and the basic idea already appears in this paper of Glynn), but I was not able to find a clean version of this statement in the literature, so I am placing it here on my blog. (But if a reader knows of a reference that basically contains the bound below, I would be happy to know of it.)

**Proposition 3 (Improved Bennett’s inequality).** One has
$$\mathbf{P}(\mathbf{Poisson}(\lambda) \geq \lambda(1+u)) \ll \frac{e^{-\lambda h(u)}}{1 + (\lambda \min(u, u^2))^{1/2}}$$
for $u > 0$ and
$$\mathbf{P}(\mathbf{Poisson}(\lambda) \leq \lambda(1+u)) \ll \frac{e^{-\lambda h(u)}}{1 + (\lambda(1+u))^{1/2} |u|}$$
for $-1 < u < 0$.
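
As a numerical illustration of the upper tail bound (a sketch: the constant 3 standing in for the implied constant, and the parameter grid, are my own empirical choices, not part of the proposition):

```python
import math

def h(u: float) -> float:
    """Bennett rate function h(u) = (1+u)log(1+u) - u."""
    return (1 + u) * math.log(1 + u) - u

def poisson_upper_tail(lam: float, kmin: int) -> float:
    """P(Poisson(lam) >= kmin) by direct summation in log-space."""
    total, k = 0.0, kmin
    while True:
        t = math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))
        total += t
        if t < 1e-17 * total:
            return total
        k += 1

def improved_bennett(lam: float, u: float) -> float:
    """Upper-tail bound of Proposition 3, without the implied constant."""
    return math.exp(-lam * h(u)) / (1 + math.sqrt(lam * min(u, u * u)))

for lam in (2.0, 10.0, 100.0):
    for u in (0.5, 1.0, 4.0):
        tail = poisson_upper_tail(lam, math.ceil(lam * (1 + u)))
        assert tail <= 3 * improved_bennett(lam, u)               # Proposition 3
        assert improved_bennett(lam, u) <= math.exp(-lam * h(u))  # refines Bennett
```

On this grid the ratio of the exact tail to the improved bound in fact stays below 1, so a constant of 3 is comfortable.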

*Proof:* We begin with the first inequality. We may assume that $\lambda \min(u, u^2) \geq 1$, since otherwise the claim follows from the usual Bennett inequality. We expand out the left-hand side as
$$\sum_{k \geq \lambda(1+u)} e^{-\lambda} \frac{\lambda^k}{k!}.$$
Writing $k_0 := \lceil \lambda(1+u) \rceil$, one has $\frac{\lambda^k}{k!} \leq \frac{\lambda^{k_0}}{k_0!} (1+u)^{-(k-k_0)}$ for $k \geq k_0$, so the sum is dominated by a geometric series and is at most
$$e^{-\lambda} \frac{\lambda^{k_0}}{k_0!} \frac{1+u}{u}.$$
By the Stirling approximation $k_0! \gg k_0^{1/2} (k_0/e)^{k_0}$, this is
$$\ll \frac{e^{-\lambda h(u)}}{(\lambda(1+u))^{1/2}} \frac{1+u}{u} \ll \frac{e^{-\lambda h(u)}}{(\lambda \min(u, u^2))^{1/2}},$$
giving the claim.

Now we turn to the second inequality. As before we may assume that $\lambda u^2 \geq 1$. We first dispose of a degenerate case in which $\lambda(1+u) < 1$. Here the left-hand side is just
$$\mathbf{P}(\mathbf{Poisson}(\lambda) = 0) = e^{-\lambda}$$

and the right-hand side is comparable to
$$\frac{e^{\lambda u} (1+u)^{-\lambda(1+u)}}{1 + (\lambda(1+u))^{1/2} |u|}.$$
Since $\log(1+u)$ is negative and $\lambda(1+u) < 1$, we see that the right-hand side is $\gg e^{\lambda u} \geq e^{-\lambda}$, and the estimate holds in this case.

It remains to consider the regime where $\lambda(1+u) \geq 1$ and $\lambda u^2 \geq 1$. The left-hand side expands as
$$\sum_{0 \leq k \leq \lambda(1+u)} e^{-\lambda} \frac{\lambda^k}{k!}.$$

The sum is dominated by the first term (that of maximal $k$) times a geometric series $\sum_{j=0}^\infty (1+u)^j = \frac{1}{|u|}$. The maximal $k$ is comparable to $\lambda(1+u)$, so we can bound the left-hand side by
$$\ll \frac{1}{|u|}\, e^{-\lambda} \frac{\lambda^{k_0}}{k_0!},$$
where $k_0 := \lfloor \lambda(1+u) \rfloor$. Using the Stirling approximation as before we can bound this by
$$\ll \frac{1}{|u|} \frac{e^{-\lambda h(u)}}{(\lambda(1+u))^{1/2}},$$
which simplifies to the desired bound $\ll \frac{e^{-\lambda h(u)}}{1 + (\lambda(1+u))^{1/2}|u|}$ after a routine calculation. $\Box$

The same analysis can be reversed to show that the bounds given above are basically sharp up to constants, at least when $\lambda \min(|u|, u^2)$ (and hence the gain over Bennett) is large.
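
This sharpness can be checked numerically (an illustrative sketch for the upper tail; the window $(0.05, 3)$ for the ratio is an empirical choice, not a proven constant):

```python
import math

def h(u: float) -> float:
    """Bennett rate function h(u) = (1+u)log(1+u) - u."""
    return (1 + u) * math.log(1 + u) - u

def poisson_upper_tail(lam: float, kmin: int) -> float:
    """P(Poisson(lam) >= kmin) by direct summation in log-space."""
    total, k = 0.0, kmin
    while True:
        t = math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))
        total += t
        if t < 1e-17 * total:
            return total
        k += 1

def ratio(lam: float, u: float) -> float:
    """Exact upper tail divided by the Proposition 3 upper-tail bound."""
    bound = math.exp(-lam * h(u)) / (1 + math.sqrt(lam * min(u, u * u)))
    return poisson_upper_tail(lam, math.ceil(lam * (1 + u))) / bound

# the ratio stays bounded away from 0 and infinity: sharp up to constants
for lam in (50.0, 200.0):
    for u in (1.0, 3.0):
        assert 0.05 < ratio(lam, u) < 3.0
```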

## 12 comments


13 December, 2022 at 11:54 am

Anonymous: Is there a known characterization of the class of distributions for which Bennett’s inequality is sharp?

14 December, 2022 at 10:37 am

Terence Tao: I’m pretty sure Bennett’s inequality is almost never going to be completely sharp – the step in the proof where Markov’s inequality is applied is always going to lose something (since the Poisson random variable – or whatever other variable one is applying the inequality to – is not going to be a step function random variable). The main advantage of the Bennett inequality is that it has a clean statement and is reasonably close to optimal, but the point is that there is still a little bit of room for improvement. (There are several other improved versions of Bennett’s inequality in the literature, though so far I have not been able to use those to obtain a proof of Proposition 3 that was shorter than the direct calculation provided here.)

15 December, 2022 at 12:56 pm

Anonymous: Is the exponent of the gain factor the best possible?

16 December, 2022 at 9:08 am

Terence Tao: Yes (staying in the regime $u \geq 1$ for simplicity). For instance, suppose that $k = \lambda(1+u)$ is an integer. Then
$$\mathbf{P}(\mathbf{Poisson}(\lambda) \geq \lambda(1+u)) \geq e^{-\lambda} \frac{\lambda^k}{k!} \gg \frac{e^{-\lambda h(u)}}{(\lambda(1+u))^{1/2}} \gg \frac{e^{-\lambda h(u)}}{(\lambda u)^{1/2}}$$
by the Stirling approximation, so the gain of $(\lambda u)^{-1/2}$ over the Bennett bound cannot be improved.

13 December, 2022 at 7:39 pm

Xiaohui Chen: Dear Prof. Tao,

I am not sure if there is an exact reference containing your sharpened bound beyond the standard Bennett’s inequality. However, your bound reminds me of the connection to the standard Gaussian tail probability bound $\mathbf{P}(N(0,1) \geq t) \leq \frac{1}{t\sqrt{2\pi}} e^{-t^2/2}$ for $t > 0$, which leverages an integration by parts argument that can be generalized to the Poisson distribution.

Let $X \sim \mathbf{Poisson}(\lambda)$ be a random variable; my key observation is to note that we can write
$$\mathbf{P}(X = k) = \frac{\lambda}{k} \mathbf{P}(X = k-1)$$
for any $k \geq 1$.

Using summation by parts (where the Poisson tail goes to zero super-exponentially fast), we can write the tail probability as
$$\mathbf{P}(X \geq k) = \frac{\lambda}{k} \mathbf{P}(X \geq k-1) - \lambda \sum_{j \geq k} \frac{\mathbf{P}(X \geq j)}{j(j+1)}.$$

Note that the second term on the RHS of the last expression is negative, so we can drop it to yield the following bound:
$$\mathbf{P}(X \geq k) \leq \frac{\lambda}{k} \mathbf{P}(X \geq k-1).$$

Using the above relation again, we have
$$\mathbf{P}(X \geq k) \leq \frac{\lambda}{k} \left( \mathbf{P}(X = k-1) + \mathbf{P}(X \geq k) \right).$$

Then, rearranging (for $k > \lambda$),
$$\mathbf{P}(X \geq k) \leq \frac{\lambda}{k - \lambda} \mathbf{P}(X = k-1) = \frac{k}{k - \lambda} \mathbf{P}(X = k).$$

Now, running the same Stirling's approximation as in your argument to simplify the RHS of the last inequality, we obtain your upper bound in Proposition 3.

A similar lower bound can be deduced with this argument.

I think we can run the summation by parts argument to get even higher-order refinements of the standard Bennett’s inequality, with improvement by a factor of $O(\mathrm{poly}(\lambda))$ in the non-Gaussian regime.

Thus, I believe your bounds can be interpreted as a discrete version of the Mills ratio in the standard Gaussian distribution case.
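
The final step of this chain is easy to test numerically; here is a quick check (my paraphrase: the bound in the form $\mathbf{P}(X \geq k) \leq \frac{k}{k-\lambda}\,\mathbf{P}(X = k)$ for $k > \lambda$, with an illustrative parameter grid):

```python
import math

def poisson_pmf(lam: float, k: int) -> float:
    """P(Poisson(lam) = k), computed in log-space for numerical stability."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

def poisson_upper_tail(lam: float, kmin: int) -> float:
    """P(Poisson(lam) >= kmin) by direct summation."""
    total, k = 0.0, kmin
    while True:
        t = poisson_pmf(lam, k)
        total += t
        if t < 1e-17 * total:
            return total
        k += 1

# discrete Mills-ratio bound: P(X >= k) <= k/(k - lam) * P(X = k) for k > lam
for lam in (1.0, 10.0, 100.0):
    for k in (int(lam) + 1, 2 * int(lam) + 1, 5 * int(lam) + 1):
        assert poisson_upper_tail(lam, k) <= k / (k - lam) * poisson_pmf(lam, k)
```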

14 December, 2022 at 7:05 am

Upper bounds on left and right tails of Poisson probability (pingback): […] Terence Tao published a blog post on bounds for the Poisson probability distribution. Specifically, he wrote about Bennett’s […]

14 December, 2022 at 7:57 am

Anonymous: Tao’s ego is too big. Stop doing self-promotion and start doing relevant mathematics.

22 December, 2022 at 3:52 am

Bert Zwart: Thanks for this. I would like to point towards other bounds for the Poisson distribution, capitalizing upon the connection with the Lambert W function. We also noticed a connection with a conjecture of Ramanujan (settled by Flajolet in 1995). https://www.cambridge.org/core/journals/advances-in-applied-probability/article/gaussian-expansions-and-bounds-for-the-poisson-distribution-applied-to-the-erlang-b-formula/76DB4F08E5A5DE90D85A90E9D0788DA7

22 December, 2022 at 4:09 pm

Terence Tao: Thanks for the reference! Somehow we did not locate it during our own literature search. In principle these sharp asymptotics should recover the cruder upper bounds in Proposition 3, although actually doing the calculations for this seems to be about as complicated as the proof of Proposition 3 already provided in the blog post…

12 January, 2023 at 11:31 pm

Alexander: It seems that something similar can also be done for the binomial distribution: https://ieeexplore.ieee.org/abstract/document/9546799

5 February, 2023 at 8:47 am

Arman: Prof, you’re amazing, just wanted to let you know)

19 February, 2023 at 2:36 pm

Theo: Hi Terry, thanks for the post; it’s very useful. I believe the RHS above “Thus the sum is dominated…” should be $\frac{1}{1+u}\frac{\lambda^k}{k!}$ instead of $\frac{1}{1+u}\frac{\lambda^{k+1}}{(k+1)!}$. Thanks again!

[Corrected, thanks – T.]