Let Bi, i = 1, …, n be independent Bernoulli variables with parameters qi, i = 1, …, n, respectively. Let S be their sum. For convenience, assume q1 ≥ q2 ≥ ··· ≥ qn. I wish to bound tightly from above the probability that S is greater or equal to some l, having the bound depend solely on ES = q1 + ··· + qn.

Clearly, if l ≤ ES, then the tightest bound is 1. This is attained by setting q1 = ··· = ql = 1.

This example shows that while the variance of S is maximized by setting qi = ES / n, i = 1, ···, n, at least for some values, l, P(S ≥ l) is maximized by having the Bi not identically distributed.

Proposition 1:

For every l, P(S ≥ l) is maximized when q1 = ··· = qmo = 1, qmo+1 = ··· = qn-mz = q, and qn-mz+1 = ··· = qn = 0, for some mo and mz, and for q = (ES – mo) / (n – mo – mz).

Proof:

Assume that 1 > qi > qj > 0. Let S’ = S – Bi – Bj. Then

P(S ≥ l) = P(S’ ≥ l) + p1 (qi + qj – qi qj) + p2 qi qj,

where p1 = P(S’ = l – 1) and p2 = P(S’ = l – 2). Thus, keeping qi + qj fixed, but varying the proportion between them, P(S ≥ l) is a linear function of qi qj. Unless p1 = p2, P(S ≥ l) will be increased by varying qi and qj – with a maximum either when they are equal or when one of them is zero or one.

Therefore, P(S ≥ l) cannot be at a maximum if there exist 1 > qi > qj > 0, unless p1 = p2. But in that case, the same probability can be achieved by setting the parameter of Bi to be q’i = 0 (if qi + qj < 1) or q’i = 1 (otherwise), and the parameter of Bj to be q’j = qi + qj – q’i. Therefore, in that case, there would exist a set of parameters, q’1, …, q’n, that would achieve that same probability but with fewer parameters that are not equal to zero or one. Thus, in the set of parameter settings maximizing P(S ≥ l), there exists a solution – namely the one which maximizes the number of parameters with extreme values (zeros and ones) – in which there is only one non-extreme value. ¤

The next step is to investigate which specific parameter setting correspond to various combinations of l and ES.