Expectations cannot be estimated
August 24, 2007
Expectations cannot be estimated, to any accuracy, with any sample size, unless stringent prior assumptions are made. With a sample of size n, there is a good chance (about 1/e^alpha, since (1 - alpha/n)^n is approximately e^-alpha) that no sample point lies in the top alpha/n of the distribution, meaning that the contribution of that part of the distribution to the expectation is not represented in the sample. Since that contribution could be arbitrarily large, there is no way to give a finite upper confidence bound (UCB) of level higher than 1-1/e^alpha. This is true for any alpha, however small, and 1-1/e^alpha tends to 0 as alpha does, showing that a finite UCB cannot be established at any level.
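The arithmetic behind this argument can be checked with a small sketch. The function names and the two-point distribution (a value big_value with probability alpha/n, and 0 otherwise) are illustrative assumptions, not anything taken from a particular data set; the point is only that the chance of missing the top slice stays near 1/e^alpha while the mean hidden in that slice can be made as large as one likes.

```python
import math


def prob_top_slice_missed(n: int, alpha: float) -> float:
    """Probability that none of n i.i.d. draws lands in the top alpha/n of the distribution."""
    return (1.0 - alpha / n) ** n


def hidden_mean(n: int, alpha: float, big_value: float) -> float:
    """Mean of a hypothetical two-point distribution: big_value w.p. alpha/n, else 0."""
    return big_value * alpha / n


if __name__ == "__main__":
    n, alpha = 1000, 0.1
    # The exact miss probability and its approximation 1/e^alpha.
    print(prob_top_slice_missed(n, alpha), math.exp(-alpha))
    # The same all-zero sample is equally likely under every big_value,
    # yet the true mean differs by orders of magnitude.
    for big_value in (1e3, 1e6, 1e9):
        print(big_value, hidden_mean(n, alpha, big_value))
```

An all-zero sample carries no information about big_value, so no finite bound computed from it can hold for all three cases at once.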
The standard way around this problem is to claim that the central limit theorem implies that the mean of a sample is distributed approximately normally. This is clearly untrue in any rigorous sense, as the argument above shows, unless some distributional assumptions are made (a normal distribution, a bounded distribution, or some assumptions about the moments of the distribution). No such assumptions are usually explicitly made, although some nominal efforts to discard “outliers” are standard. In serious statistical circles, the authors and the readers are assumed to be adults who do not nitpick.
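To see how far the normal approximation can fall short, here is a minimal sketch that computes the exact coverage of the usual "mean plus or minus 1.96 standard errors" interval. It assumes the same hypothetical two-point distribution as a worst case (a large value M with probability alpha/n, and 0 otherwise); M scales out of the calculation, so only the count of large values, which is binomial, matters. The function name and the parameter values are illustrative choices.

```python
import math


def normal_interval_coverage(n: int, alpha: float, z: float = 1.96) -> float:
    """Exact coverage of the normal-approximation interval for the mean.

    Assumes a two-point distribution: value M with probability p = alpha/n,
    else 0. Requires n >= 2. With k large values in the sample, the sample
    mean is M*k/n and the standard error is M*sqrt(p_hat*(1-p_hat)/(n-1)).
    """
    p = alpha / n
    coverage = 0.0
    for k in range(n + 1):
        # Binomial probability of seeing exactly k large values.
        pmf = math.comb(n, k) * p**k * (1.0 - p) ** (n - k)
        p_hat = k / n
        half_width = z * math.sqrt(p_hat * (1.0 - p_hat) / (n - 1))
        # The interval covers the true mean iff |p_hat - p| <= half_width.
        if abs(p_hat - p) <= half_width:
            coverage += pmf
    return coverage


if __name__ == "__main__":
    n, alpha = 100, 1.0
    print("nominal level: 0.95")
    print("actual coverage:", normal_interval_coverage(n, alpha))
    print("ceiling 1 - (1 - alpha/n)^n:", 1.0 - (1.0 - alpha / n) ** n)
```

When the sample contains no large values, the interval collapses to the single point 0 and misses the true mean, so the coverage can never exceed 1 - (1 - alpha/n)^n, whatever the nominal level.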