If ${\lambda > 0}$, a Poisson random variable ${\mathbf{Poisson}(\lambda)}$ with mean ${\lambda}$ is a random variable taking values in the natural numbers with probability distribution

$\displaystyle {\bf P}( \mathbf{Poisson}(\lambda) = k ) = e^{-\lambda} \frac{\lambda^k}{k!}.$
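As a quick numerical sanity check (an illustration added here, not part of the original argument), one can tabulate this distribution directly and confirm that it sums to one and has mean ${\lambda}$:

```python
import math

def poisson_pmf(lam, k):
    """P(Poisson(lam) = k) = e^{-lam} * lam^k / k!."""
    return math.exp(-lam) * lam**k / math.factorial(k)

lam = 5.0
# Truncate the series at k = 100; for lam = 5 the remainder is negligible
# (and factorials beyond ~170 would overflow a float anyway).
probs = [poisson_pmf(lam, k) for k in range(100)]
total = sum(probs)
mean = sum(k * p for k, p in enumerate(probs))
print(total, mean)  # close to 1 and to lam respectively
```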
Proposition 1 (Bennett’s inequality) One has

$\displaystyle {\bf P}( \mathbf{Poisson}(\lambda) \geq \lambda(1+u) ) \leq \exp( -\lambda h(u) )$

for ${u \geq 0}$ and

$\displaystyle {\bf P}( \mathbf{Poisson}(\lambda) \leq \lambda(1+u) ) \leq \exp( -\lambda h(u) )$

for ${-1 < u \leq 0}$, where

$\displaystyle h(u) := (1+u) \log(1+u) - u.$
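One can verify Bennett's bound numerically by summing the Poisson tail directly; the sketch below (an added illustration, with an arbitrary choice of ${\lambda = 10}$, ${u = 3/2}$) checks the upper-tail inequality:

```python
import math

def h(u):
    return (1 + u) * math.log(1 + u) - u

def poisson_upper_tail(lam, threshold, kmax=500):
    """P(Poisson(lam) >= threshold), by direct summation of the pmf."""
    p = math.exp(-lam)  # P(X = 0)
    tail = p if threshold <= 0 else 0.0
    for k in range(1, kmax):
        p *= lam / k    # P(X = k) from P(X = k - 1)
        if k >= threshold:
            tail += p
    return tail

lam, u = 10.0, 1.5
tail = poisson_upper_tail(lam, lam * (1 + u))
bennett = math.exp(-lam * h(u))
print(tail <= bennett)  # True
```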
From the Taylor expansion $h(u) = \frac{u^2}{2} + O(u^3)$ for ${u = O(1)}$ we conclude Gaussian type tail bounds in the regime ${u = o(1)}$ (and in particular when ${u = O(1/\sqrt{\lambda})}$), in the spirit of the Chernoff, Bernstein, and Hoeffding inequalities; but in the regime where ${u}$ is large and positive one obtains a slight gain over these other classical bounds (of $\exp(-\lambda u \log u)$ type, rather than $\exp(-\lambda u)$).
Proof: We use the exponential moment method. For any ${t \geq 0}$, we have from Markov’s inequality that

$\displaystyle {\bf P}( \mathbf{Poisson}(\lambda) \geq \lambda(1+u) ) \leq e^{-t \lambda(1+u)} {\bf E} e^{t \mathbf{Poisson}(\lambda)} = e^{-t\lambda(1+u)} e^{\lambda(e^t - 1)},$

using the standard formula ${\bf E} e^{t \mathbf{Poisson}(\lambda)} = e^{\lambda(e^t-1)}$ for the moment generating function. Optimizing in ${t}$ by setting ${t = \log(1+u)}$ gives the first claim; the second claim is proven similarly, now taking ${t \leq 0}$. $\Box$
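The optimization step in the exponential moment method can be probed numerically; the following sketch (an added illustration, with arbitrary parameters ${\lambda = 4}$, ${u = 2}$) scans the Chernoff bound over a grid of ${t}$ and confirms that the minimum sits at ${t = \log(1+u)}$, where the bound equals $\exp(-\lambda h(u))$:

```python
import math

lam, u = 4.0, 2.0

def chernoff_bound(t):
    """e^{-t*lam*(1+u)} * E[e^{t*Poisson(lam)}], using the mgf e^{lam(e^t - 1)}."""
    return math.exp(-t * lam * (1 + u) + lam * (math.exp(t) - 1))

ts = [i / 1000 for i in range(1, 3000)]
best_t = min(ts, key=chernoff_bound)
h = (1 + u) * math.log(1 + u) - u
print(best_t, math.log(1 + u), chernoff_bound(best_t), math.exp(-lam * h))
```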
Remark 2 Bennett’s inequality also applies for (suitably normalized) sums of bounded independent random variables. In some cases there are direct comparison inequalities available to relate those variables to the Poisson case. For instance, suppose $S = X_1 + \dots + X_n$ is the sum of independent Boolean variables $X_1,\dots,X_n \in \{0,1\}$ of total mean $\sum_{i=1}^n {\bf E} X_i = \lambda$ and with ${\bf P}(X_i = 1) \leq \varepsilon$ for some ${0 < \varepsilon < 1}$. Then for any natural number ${k}$, we have

$\displaystyle {\bf P}( S = k ) \leq \frac{1}{(1-\varepsilon)^k} {\bf P}( \mathbf{Poisson}(\lambda) = k ).$

As such, for ${\varepsilon}$ small, one can efficiently control the tail probabilities of ${S}$ in terms of the tail probability of a Poisson random variable of mean close to ${\lambda}$; this is of course very closely related to the well known fact that the Poisson distribution emerges as the limit of sums of many independent Boolean variables, each of which is non-zero with small probability. See this paper of Bentkus and this paper of Pinelis for some further useful (and less obvious) comparison inequalities of this type.
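The comparison between Bernoulli sums and the Poisson distribution can be tested exactly for small ${n}$ by convolving the Bernoulli laws; the sketch below (an added illustration, using ${n = 40}$ identical variables with ${\varepsilon = 0.05}$, so ${\lambda = 2}$) checks the pointwise inequality ${\bf P}(S = k) \leq (1-\varepsilon)^{-k} {\bf P}(\mathbf{Poisson}(\lambda) = k)$:

```python
import math

def bernoulli_sum_pmf(ps):
    """Exact pmf of S = X_1 + ... + X_n with independent X_i ~ Bernoulli(ps[i])."""
    dist = [1.0]
    for p in ps:
        new = [0.0] * (len(dist) + 1)
        for k, q in enumerate(dist):
            new[k] += q * (1 - p)      # X_i = 0
            new[k + 1] += q * p        # X_i = 1
        dist = new
    return dist

ps = [0.05] * 40
lam = sum(ps)          # total mean lambda = 2
eps = max(ps)
pmf_S = bernoulli_sum_pmf(ps)
ok = all(
    pmf_S[k] <= math.exp(-lam) * lam**k / math.factorial(k) / (1 - eps)**k + 1e-15
    for k in range(len(pmf_S))
)
print(ok)  # True
```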
In this note I wanted to record the observation that one can improve the Bennett bound by a small polynomial factor once one leaves the Gaussian regime ${u = O(1/\sqrt{\lambda})}$, in particular gaining a factor of ${1/\sqrt{\lambda u}}$ when ${u}$ is large. This observation is not difficult and is implicit in the literature (one can extract it for instance from the much more general results of this paper of Talagrand, and the basic idea already appears in this paper of Glynn), but I was not able to find a clean version of this statement in the literature, so I am placing it here on my blog. (But if a reader knows of a reference that basically contains the bound below, I would be happy to know of it.)
Proposition 3 (Improved Bennett’s inequality) One has

$\displaystyle {\bf P}( \mathbf{Poisson}(\lambda) \geq \lambda(1+u) ) \ll \frac{\exp( -\lambda h(u) )}{\max(1, \lambda \min(u, u^2))^{1/2}}$

for ${u > 0}$ and

$\displaystyle {\bf P}( \mathbf{Poisson}(\lambda) \leq \lambda(1+u) ) \ll \frac{\exp( -\lambda h(u) )}{\max(1, \lambda u^2)^{1/2}}$

for ${-1 < u < 0}$.
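The shape of the improved upper-tail bound can be probed numerically. The sketch below (an added illustration; the grid of ${\lambda}$ and ${u}$ values is an arbitrary choice, kept small enough to avoid floating-point underflow) computes the ratio of the exact tail to $\exp(-\lambda h(u)) / \sqrt{\max(1, \lambda \min(u,u^2))}$ and checks that it stays bounded, consistent with the ${\ll}$ notation:

```python
import math

def h(u):
    return (1 + u) * math.log(1 + u) - u

def upper_tail(lam, threshold, kmax=2000):
    """P(Poisson(lam) >= threshold), by direct summation of the pmf."""
    p = math.exp(-lam)
    tail = p if threshold <= 0 else 0.0
    for k in range(1, kmax):
        p *= lam / k
        if k >= threshold:
            tail += p
    return tail

# Ratio of the exact tail to the improved bound (without its implied
# constant); it should remain O(1) as lam and u vary.
ratios = []
for lam in [5.0, 20.0, 80.0]:
    for u in [0.5, 1.0, 2.0, 3.0]:
        gain = math.sqrt(max(1.0, lam * min(u, u * u)))
        ratios.append(upper_tail(lam, lam * (1 + u)) * gain / math.exp(-lam * h(u)))
print(max(ratios))
```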
Proof: We begin with the first inequality. We may assume that $\lambda \min(u, u^2) \geq 1$, since otherwise the claim follows from the usual Bennett inequality. We expand out the left-hand side as

$\displaystyle e^{-\lambda} \sum_{k \geq \lambda(1+u)} \frac{\lambda^k}{k!}.$
Now we turn to the second inequality. As before we may assume that $\lambda u^2 \geq 1$. We first dispose of a degenerate case in which $\lambda(1+u) < 1$. Here the left-hand side is just

$\displaystyle {\bf P}( \mathbf{Poisson}(\lambda) = 0 ) = e^{-\lambda}.$
It remains to consider the regime where $\lambda(1+u) \geq 1$ and $\lambda u^2 \geq 1$. The left-hand side expands as

$\displaystyle e^{-\lambda} \sum_{0 \leq k \leq \lambda(1+u)} \frac{\lambda^k}{k!}.$
The same analysis can be reversed to show that the bounds given above are basically sharp up to constants, at least when ${\lambda}$ (and $\lambda \min(u,u^2)$) are large.
12 comments
13 December, 2022 at 11:54 am
Anonymous
Is there a known characterization of the class of distributions for which Bennett’s inequality is sharp?
14 December, 2022 at 10:37 am
Terence Tao
I’m pretty sure Bennett’s inequality is almost never going to be completely sharp – the step in the proof where Markov’s inequality is applied is always going to lose something (since the Poisson random variable – or whatever other variable one is applying the inequality to – is not going to be a step function random variable). The main advantage of the Bennett inequality is that it has a clean statement and is reasonably close to optimal, but the point is that there is still a little bit of room for improvement. (There are several other improved versions of Bennett’s inequality in the literature, though so far I have not been able to use those to obtain a proof of Proposition 3 that was shorter than the direct calculation provided here.)
15 December, 2022 at 12:56 pm
Anonymous
Is the exponent ${1/2}$ of the gain factor $\frac{1}{\max(1, \lambda \min(u,u^2))^{1/2}}$ the best possible?
16 December, 2022 at 9:08 am
Terence Tao
Yes (staying in the regime ${u \geq 1}$ for simplicity). For instance, suppose that ${k = \lambda(1+u)}$ is an integer. Then

$\displaystyle {\bf P}( \mathbf{Poisson}(\lambda) \geq \lambda(1+u) ) \geq {\bf P}( \mathbf{Poisson}(\lambda) = k ) = e^{-\lambda} \frac{\lambda^k}{k!} \gg \frac{\exp(-\lambda h(u))}{\sqrt{\lambda(1+u)}}$

by the Stirling approximation.
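[This Stirling computation is easy to confirm numerically; a small sketch (an added illustration, with the arbitrary choice ${\lambda = 100}$, ${u = 2}$, so ${k = 300}$), comparing log-probabilities to avoid underflow:]

```python
import math

def h(u):
    return (1 + u) * math.log(1 + u) - u

lam, u = 100.0, 2.0
k = int(lam * (1 + u))   # k = 300, an integer as in the comment
# log P(Poisson(lam) = k) = -lam + k*log(lam) - log(k!)
log_pmf = -lam + k * math.log(lam) - math.lgamma(k + 1)
# Stirling predicts pmf ~ exp(-lam*h(u)) / sqrt(2*pi*lam*(1+u)).
predicted = -lam * h(u) - 0.5 * math.log(2 * math.pi * lam * (1 + u))
print(log_pmf, predicted)  # nearly equal
```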
13 December, 2022 at 7:39 pm
Xiaohui Chen
Dear Prof. Tao,
I am not sure if there is an exact reference containing your sharpened version of the standard Bennett’s inequality. However, your bound reminds me of the connection to the standard Gaussian tail probability bound $1 - \Phi(t) \leq \frac{\phi(t)}{t}$ for ${t > 0}$, which leverages an integration by parts argument that can be generalized to the Poisson distribution.
Let $X$ be a $\mathrm{Poisson}(\lambda)$ random variable; my key observation is to note that we can write ${\bf P}(X = k+1) = \frac{\lambda}{k+1} {\bf P}(X = k)$ for any ${k \geq 0}$.
Using summation by parts (where the Poisson tail goes to zero super-exponentially fast), we can write the tail probability as a main term plus a remainder. Note that the second term on the RHS of the last expression is negative, so we can drop it to yield an upper bound. Using the above relation again, and then running the same Stirling approximation as in your argument to simplify the RHS of the last inequality, we obtain your upper bound in Proposition 3.
A similar lower bound can be deduced with this argument.
I think we can run the summation by parts argument to get even higher-order refinements of the standard Bennett’s inequality, with improvements by a factor of $O(\mathrm{poly}(\lambda))$ in the non-Gaussian regime.
Thus, I believe your bounds can be interpreted as a discrete version of the Mills ratio in the standard Gaussian distribution case.
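[The Gaussian tail bound invoked in this comment is easy to check numerically; a minimal sketch, added here as an illustration only:]

```python
import math

def gauss_tail(t):
    """1 - Phi(t) for the standard normal, computed via erfc."""
    return 0.5 * math.erfc(t / math.sqrt(2))

def mills_bound(t):
    """phi(t) / t, the integration-by-parts upper bound on the tail."""
    return math.exp(-t * t / 2) / (math.sqrt(2 * math.pi) * t)

ok = all(gauss_tail(t) <= mills_bound(t) for t in [0.5, 1, 2, 4, 8])
print(ok)  # True
```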
14 December, 2022 at 7:05 am
Upper bounds on left and right tails of Poisson probability
[…] Terence Tao published a blog post on bounds for the Poisson probability distribution. Specifically, he wrote about Bennett’s […]
14 December, 2022 at 7:57 am
Anonymous
Tao’s ego is too big. Stop doing self-promotion and start doing relevant mathematics.
22 December, 2022 at 3:52 am
Bert Zwart
Thanks for this. I would like to point towards other bounds for the Poisson distribution, capitalizing upon the connection with the Lambert W function. We also noticed a connection with a conjecture of Ramanujan (settled by Flajolet in 1995). https://www.cambridge.org/core/journals/advances-in-applied-probability/article/gaussian-expansions-and-bounds-for-the-poisson-distribution-applied-to-the-erlang-b-formula/76DB4F08E5A5DE90D85A90E9D0788DA7
22 December, 2022 at 4:09 pm
Terence Tao
Thanks for the reference! Somehow we did not locate it during our own literature search. In principle these sharp asymptotics should recover the cruder upper bounds in Proposition 3, although actually doing the calculations for this seems to be about as complicated as the proof of Proposition 3 already provided in the blog post…
12 January, 2023 at 11:31 pm
Alexander
It seems that something similar can also be done for the binomial distribution https://ieeexplore.ieee.org/abstract/document/9546799
5 February, 2023 at 8:47 am
Arman
Prof, you’re amazing, just wanted to let you know)
19 February, 2023 at 2:36 pm
Theo
Hi Terry, thanks for the post; it’s very useful. I believe the RHS above “Thus the sum is dominated…” should be \frac{1}{1+u}\frac{\lambda^k}{k!} instead of \frac{1}{1+u}\frac{\lambda^{k+1}}{(k+1)!}. Thanks again!
[Corrected, thanks – T.]