## Deviations from the mean of a sum of independent, non-identically-distributed Bernoulli variables, continued

### October 29, 2011

*Continuing the investigation initiated here, and applying the same notation.*

**Proposition 2:**

For every *l*, *l ≥ ES*, *P(S ≥ l)* is maximized when *q_{1} = ··· = q_{n – m_{z}} = q = ES / (n – m_{z})* and *q_{n – m_{z} + 1} = ··· = q_{n} = 0*, for some *m_{z}*.

That is, in the terminology of Proposition 1, in all non-trivial cases *m_{o} = 0*, and thus the probability of deviation is maximized by a Binomial variable (rather than a shifted Binomial variable).

**Proof:**

Let *ES < l*. Let the parameters *q_{1}, …, q_{n}* be as described in Proposition 1, with *m_{o} > 0*. Let *S’ = S – B_{1} – B_{m_{o} + 1} = S – 1 – B_{m_{o} + 1}*. Then *S’ – m_{o} + 1* is a Binomial variable with parameters *n’ = n – m_{o} – m_{z} – 1* and *q*, whose expectation is *ES – m_{o} – q = (ES – m_{o}) n’ / (n’ + 1)*.

The density of a Binomial variable with parameters *n’* and *q* is unimodal, with a mode less than or equal to *max((n’ + 1) q – 1, 0)*. Therefore, the density of *S’* is unimodal, with a mode less than or equal to the maximum of *m_{o} – 1* and

*m_{o} – 1 + (ES – m_{o}) (n’ / (n’ + 1)) ((n’ + 1) / n’) – 1 = m_{o} – 1 + (ES – m_{o}) – 1 = ES – 2*.

Since by assumption *ES < l* (and therefore *m_{o} < l*), the mode of the distribution of *S’* is less than or equal to *l – 2*. Thus *P(S’ = l – 2) > P(S’ = l – 1)*, and therefore, following the argument in the proof of Proposition 1, *P(S” ≥ l) > P(S ≥ l)*, where *S” = S’ + B’_{1} + B’_{m_{o} + 1}*, and *B’_{1}* and *B’_{m_{o} + 1}* are independent Bernoulli variables, both with parameter *1/2*. ¤
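As a sanity check (not part of the proof), Proposition 2 can be probed numerically: compute *P(S ≥ l)* exactly by convolving the Bernoulli distributions, then compare the best configuration of the form predicted by the proposition against random parameter vectors with the same expectation. This is a sketch; the helper names and the particular values *n = 6*, *ES = 2.4*, *l = 4* are my own choices, not from the original.

```python
import random

def pmf_of_sum(qs):
    """Exact distribution of a sum of independent Bernoulli(q_i) variables,
    computed by iterated convolution (dynamic programming)."""
    pmf = [1.0]
    for q in qs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, p in enumerate(pmf):
            nxt[k] += p * (1.0 - q)      # the new variable contributes 0
            nxt[k + 1] += p * q          # the new variable contributes 1
        pmf = nxt
    return pmf

def tail(qs, l):
    """P(S >= l) for S = sum of independent Bernoulli(q_i)."""
    return sum(pmf_of_sum(qs)[l:])

def best_binomial_tail(n, es, l):
    """Best value over the configurations of Proposition 2:
    n - m_z parameters equal to es / (n - m_z), the remaining m_z equal to 0."""
    best = 0.0
    for m_z in range(n):
        m = n - m_z
        if es <= m:                      # need q = es / m <= 1
            best = max(best, tail([es / m] * m, l))
    return best

def random_config(n, es, rng):
    """A random parameter vector in [0, 1]^n with sum exactly es."""
    while True:
        w = [rng.random() for _ in range(n)]
        q = [es * x / sum(w) for x in w]
        if max(q) <= 1.0:
            return q

rng = random.Random(0)
n, es, l = 6, 2.4, 4                     # l >= ES, as Proposition 2 requires
best = best_binomial_tail(n, es, l)
worst_gap = min(best - tail(random_config(n, es, rng), l) for _ in range(500))
print(best, worst_gap)                   # worst_gap should be >= 0 if the proposition holds
```

If Proposition 2 holds, no random configuration should beat the best zero-padded Binomial configuration, so `worst_gap` stays non-negative.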

A similar argument proves

**Proposition 3:**

For every *l*, *l > ES + 1*, *P(S ≥ l)* is maximized when *q_{1} = ··· = q_{n} = ES / n*.

That is, unless *l ≤ ceil(ES)*, *P(S ≥ l)* is maximized by a Binomial variable with parameters *n* and *ES / n*. It turns out that the special case that motivated this investigation is representative of only a rather limited domain: in most cases, the same Binomial variable that maximizes the variance of the family under consideration also maximizes the probability of deviation.
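Proposition 3 lends itself to the same kind of numerical spot check (again a sketch, with helpers and the values *n = 6*, *ES = 2.4*, *l = 4* chosen by me): for *l > ES + 1*, the all-equal configuration should beat both the zero-padded configurations and random configurations with the same mean.

```python
import random

def pmf_of_sum(qs):
    """Exact distribution of a sum of independent Bernoulli(q_i) variables."""
    pmf = [1.0]
    for q in qs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, p in enumerate(pmf):
            nxt[k] += p * (1.0 - q)
            nxt[k + 1] += p * q
        pmf = nxt
    return pmf

def tail(qs, l):
    """P(S >= l) for S = sum of independent Bernoulli(q_i)."""
    return sum(pmf_of_sum(qs)[l:])

rng = random.Random(1)
n, es, l = 6, 2.4, 4                       # l > ES + 1 holds: 4 > 3.4
binom = tail([es / n] * n, l)              # q_1 = ... = q_n = ES / n

# Zero-padded alternatives (m_z > 0 in the notation of Proposition 2);
# the block size m must satisfy es / m <= 1, hence m >= 3 here.
padded = [tail([es / m] * m, l) for m in range(3, n)]

# Random parameter vectors with the same expectation.
def random_config():
    while True:
        w = [rng.random() for _ in range(n)]
        q = [es * x / sum(w) for x in w]
        if max(q) <= 1.0:
            return q

randoms = [tail(random_config(), l) for _ in range(500)]
print(binom >= max(padded), binom >= max(randoms))   # both should be True
```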

Finally, the domain is limited, but it is not empty.

**Proposition 4:**

For *l + 2 + 1 / l – (1 + 1 / l)^{l + 1} < ES*, *P(S_{l} ≥ l) > P(S_{l + 1} ≥ l)*, where *S_{l}* is a Binomial variable with parameters *l* and *ES / l*, and *S_{l + 1}* is a Binomial variable with parameters *l + 1* and *ES / (l + 1)*.

**Proof:**

By direct comparison of *(ES / l)^{l}* and *(ES / (l + 1))^{l + 1} + (l + 1) (ES / (l + 1))^{l} (1 – ES / (l + 1))*. ¤

A numerical investigation shows that, indeed, in the range *l – 1 < ES < l* the probability of deviation is maximized by Binomial variables with parameters *n* and *ES / n* for some finite *n*. The diagrams below illustrate the situation (generated for *l = 10*):

The solid thin lines correspond to a sum of 10 IID Bernoulli variables, while the dashed and dotted lines correspond to sums of 11 and 20 variables, respectively. The thick solid lines correspond to a Poisson variable – the limiting distribution as the number of IID Bernoulli variables in the sum increases indefinitely.
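Both Proposition 4 and the behavior shown in the diagrams can be reproduced with a short exact computation (a sketch: the parameter choice *ES = 9.5*, which lies above the threshold for *l = 10*, and all helper names are mine):

```python
import math

def binom_tail(n, p, l):
    """P(X >= l) for X ~ Binomial(n, p), computed exactly."""
    return sum(math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(l, n + 1))

l, es = 10, 9.5

# The two sides compared in the proof of Proposition 4:
lhs = (es / l) ** l                                # P(S_l >= l)
p = es / (l + 1)
rhs = p ** (l + 1) + (l + 1) * p**l * (1 - p)      # P(S_{l+1} >= l)
print(lhs > rhs)                                   # True: ES is above the threshold

# P(Binomial(n, ES/n) >= l) for growing n, and the Poisson limit:
tails = {n: binom_tail(n, es / n, l) for n in (10, 11, 20)}
poisson_tail = 1.0 - sum(math.exp(-es) * es**k / math.factorial(k)
                         for k in range(l))
print(tails, poisson_tail)   # the tails decrease with n and exceed the Poisson value
```

For this choice of *ES*, the maximum over *n* is attained at the finite value *n = l = 10*, matching the solid thin line dominating the others in the diagrams.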

It is worth noting that the same line of argument can also be used to show that shifted Binomials maximize the density values of sums of independent, non-identically-distributed Bernoulli variables, i.e., *P(S = l)*.
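The closing remark can be checked numerically in the same brute-force style as above. This is a sketch under my reading of the claim — the configurations searched are *m_{o}* ones, a block of equal parameters, and *m_{z}* zeros, in the terminology of Proposition 1 — with *n = 6*, *ES = 2.7*, *l = 4* chosen arbitrarily:

```python
import random

def pmf_of_sum(qs):
    """Exact distribution of a sum of independent Bernoulli(q_i) variables."""
    pmf = [1.0]
    for q in qs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, p in enumerate(pmf):
            nxt[k] += p * (1.0 - q)
            nxt[k + 1] += p * q
        pmf = nxt
    return pmf

def density(qs, l):
    """P(S = l), with 0 when l exceeds the number of variables."""
    pmf = pmf_of_sum(qs)
    return pmf[l] if l < len(pmf) else 0.0

n, es, l = 6, 2.7, 4

# Shifted-Binomial configurations: m_o ones, a block of m equal q's, zeros elsewhere.
best = 0.0
for m_o in range(int(es) + 1):
    for m_z in range(n - m_o):
        m = n - m_o - m_z
        q = (es - m_o) / m
        if 0.0 <= q <= 1.0:
            best = max(best, density([1.0] * m_o + [q] * m, l))

rng = random.Random(2)
def random_config():
    while True:
        w = [rng.random() for _ in range(n)]
        q = [es * x / sum(w) for x in w]
        if max(q) <= 1.0:
            return q

gap = min(best - density(random_config(), l) for _ in range(500))
print(best, gap)             # gap should be >= 0 if the remark holds
```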