January 29, 2008
[Owing to] the power of habit the simplest idea, if unfamiliar, has as great difficulty in making its way to the mind as a far more complicated one.
John Stuart Mill, Representative Government (1861)
January 27, 2008
The main discrepancy between findings in the IFHS paper and those of Burnham et al. is not in the total excess deaths, but in the specific category “violent deaths”. It is therefore of interest to examine whether the classification methods used in those papers to assign deaths to the “violent” category are identical, and whether any differences in the classifications could account for some of the different findings. I notice two points on which the papers’ methodologies of classification differ: One is that Burnham et al. examined death certificates, while IFHS did not. A second difference is that they use different categories for injuries. Burnham et al. use two “accident” categories. One of those is included in the “non-violent” section, the other in the “violent” section. IFHS has no “violent accident” category, and has two categories, “road accidents” and “unintentional injuries”, counting injuries within the “non-violent” classification.
January 23, 2008
Justifying the factor used to account for under-reporting of deaths
The IFHS sample is a very low mortality group. For the pre-invasion period of about 1.25 years, the group, which contains 61,636 individuals, experienced 204 deaths. For the post-invasion period (about 3.25 years), the group experienced 1,121 deaths. These translate to 2.65 deaths per 1000 person-years, and 5.6 deaths per 1000 person-years, respectively.
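The two crude rates can be reproduced with a few lines of arithmetic. This is a minimal sketch: it assumes every individual in the sample contributes the full length of each period, and the function name is my own.

```python
def crude_rate_per_1000(deaths, persons, years):
    """Crude mortality rate in deaths per 1000 person-years."""
    return 1000.0 * deaths / (persons * years)


SAMPLE_SIZE = 61_636  # individuals in the IFHS sample

# Assumption: everyone is counted for the whole period (1.25 and 3.25 years).
pre_rate = crude_rate_per_1000(204, SAMPLE_SIZE, 1.25)
post_rate = crude_rate_per_1000(1_121, SAMPLE_SIZE, 3.25)

print(f"pre-invasion:  {pre_rate:.2f} deaths per 1000 person-years")
print(f"post-invasion: {post_rate:.2f} deaths per 1000 person-years")
```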
The IFHS sample was not designed to be equally weighted to begin with (i.e., some people had a higher chance of being selected than others), and biases certainly increased after some clusters were dropped from the sample because they were considered too dangerous to access. Re-weighting (in some way – it is not clear whether the adjustment procedure for total mortality is the same one as that used for violent deaths) by the IFHS authors yields a significantly higher mortality rate for the pre-invasion period: 3.17 deaths per 1000 person-years (95% CI 2.70–3.75), and a somewhat higher rate for the post-invasion period: 6.01 deaths per 1000 person-years (95% CI 5.49–6.60).
The low mortality rate is probably due to a certain extent to a young population – I was unable to find the breakdown of the sample by age (neither in the IFHS paper, nor in the report). The mortality rate is much lower, however, than that of the sample of Burnham et al. – not only for the post-invasion period but also for the pre-invasion period. Burnham et al. estimated the pre-invasion mortality as 5.5 deaths per 1000 person-years (95% CI 4.3–7.1). Thus the decision of the IFHS authors to adjust their estimate upward seems justified.
The problem is, however, to find the appropriate adjustment factor, and to account for any uncertainty in that factor. The authors mention (p. 486), with reservation, the figure of 62% as the proportion of deaths being reported (i.e., 38% go unreported). They also mention (p. 487) modeling the proportion going unreported as a normal variable with mean 35% and “95% uncertainty range, 20 to 50”, which I take to mean that the standard deviation is (50-35) / 1.96 ≈ 7.7%. The ratio between the Burnham et al. and IFHS point estimates for the pre-war mortality rate is 5.5 / 3.17 = 1.74, which would correspond to 42% of the deaths going unreported.
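The arithmetic behind these figures can be sketched as follows. The variable names are mine, and reading the “95% uncertainty range” as mean ± 1.96 standard deviations is my interpretation rather than something the paper states. The last line, which expresses the implied proportion in standard deviations from the modeled mean, is an illustrative extra and not a calculation taken from either paper.

```python
Z_95 = 1.96  # two-sided 95% normal quantile

# IFHS model of the unreported proportion (my reading of p. 487).
mean_unreported = 0.35
upper_unreported = 0.50
sd_unreported = (upper_unreported - mean_unreported) / Z_95

# Pre-invasion point estimates, deaths per 1000 person-years.
burnham_pre = 5.5
ifhs_pre = 3.17

ratio = burnham_pre / ifhs_pre
# If the Burnham et al. rate were the true one, this share of deaths
# would have to be missing from the IFHS figure.
implied_unreported = 1.0 - 1.0 / ratio

# Illustrative only: distance of the implied share from the modeled mean.
z_score = (implied_unreported - mean_unreported) / sd_unreported

print(f"sd of unreported proportion: {sd_unreported:.3f}")
print(f"ratio of point estimates:    {ratio:.2f}")
print(f"implied unreported share:    {implied_unreported:.2f}")
print(f"standard deviations from the modeled mean: {z_score:.2f}")
```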
January 20, 2008
An explanation of the concept of effective sample size is here. This post applies this concept to a particular study.
Extrapolating from a sample taken in a reference area containing a relatively small number of deaths in order to estimate the number of deaths in the area containing the bulk of the deaths in Iraq is problematic even if we assume that the extrapolation factor is known precisely.
The reference area is the “three provinces that contributed more than 4% each to the total number of deaths reported for the period from March 2003 through June 2006”. By the design of the sample, there are no more than 180 clusters in these provinces (each province was sampled with 54 clusters, except for Nineveh, which was sampled with 72 clusters). Table 3 of the supplementary material of the paper shows the sample sizes in each governorate. Taking the largest 3 samples in the “High mortality governorates” section (Nineveh, Babylon and Basra) gives an upper bound on the total sample in the reference area of 14,891 people. Multiplying that bound by the mortality rate in the area (Table 2 of the supplementary material) – 0.83 deaths per 1000 person-years – and by the 3.33 years of the period gives an upper bound for the number of violent deaths in the sample in the reference area of 14,891 x 0.83 / 1000 x 3.33 ≈ 41.2.
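The bound can be checked with a short calculation. This is a sketch using only the figures quoted above; taking the period as 40 months (March 2003 through June 2006) is my reading of the 3.33 factor, and the constant names are my own.

```python
# Upper bound on the number of clusters: two provinces with 54 clusters
# each, plus Nineveh with 72.
max_clusters = 54 + 54 + 72

MAX_SAMPLE = 14_891           # people, upper bound for the reference area
VIOLENT_RATE_PER_1000 = 0.83  # violent deaths per 1000 person-years
YEARS = 40 / 12               # assumption: 40 months, about 3.33 years

max_violent_deaths = MAX_SAMPLE * VIOLENT_RATE_PER_1000 / 1000 * YEARS

print(f"clusters in reference area:   at most {max_clusters}")
print(f"violent deaths in the sample: at most {max_violent_deaths:.1f}")
```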
That is, over 70% of the deaths in the IFHS estimate – those in Baghdad, Anbar and the three reference governorates – are based on a sample containing about 40 deaths. The estimate of the number of deaths in those areas is generated simply by multiplying those 40 deaths by a factor and thus any uncertainty in the number 40 is directly translated into uncertainty in the estimator of the total number of deaths in the areas.
Reference to the large total number of clusters in the IFHS (almost 1000 clusters visited) is therefore misleading. The determining factor of the estimate is the data collected in a much smaller number of clusters – generating a small number of recorded deaths. The uncertainty in the estimate is correspondingly large.
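To give a rough sense of scale, here is a minimal sketch that treats the recorded count as a Poisson variable. The Poisson model is my assumption, not something taken from the IFHS paper, and it ignores the clustering of deaths, which would make the real uncertainty larger. Since the estimate is the count multiplied by a factor, the relative uncertainty of the count carries over unchanged to the estimate.

```python
import math


def poisson_relative_sd(expected_count):
    """Relative standard deviation (sd / mean) of a Poisson count."""
    return 1.0 / math.sqrt(expected_count)


def approx_95_interval(count):
    """Normal-approximation 95% interval for a Poisson count."""
    half_width = 1.96 * math.sqrt(count)
    return count - half_width, count + half_width


DEATHS_IN_REFERENCE_SAMPLE = 40  # approximate figure from the text

rel_sd = poisson_relative_sd(DEATHS_IN_REFERENCE_SAMPLE)
low, high = approx_95_interval(DEATHS_IN_REFERENCE_SAMPLE)

print(f"relative sd: {rel_sd:.1%}")
print(f"approximate 95% interval for the count: {low:.1f} to {high:.1f}")
```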
January 19, 2008
The IFHS surveyors did not visit all of the clusters in their sample. Those areas that were judged to be dangerous went unsurveyed. Most of the unsurveyed clusters were in Baghdad (31 out of 96) and in Anbar governorate (71 out of 108). A smaller number of clusters went unsurveyed in Nineveh (12 out of 72) and in Wasit (1 out of 54).
It appears that the missing clusters in Nineveh and Wasit were ignored. This has the potential of introducing significant bias into the estimate of mortality in those governorates. Removing the clusters in areas within a governorate that were considered dangerous turns the estimator into an estimator of the non-dangerous areas. It seems likely that the mortality in the non-dangerous areas only would be smaller than in the governorate as a whole. If the dangerous areas had seen a significant part of the deaths in the governorate, then removing them from the sample would bias the estimator significantly downward, and the more dangerous a governorate is, the more significant the bias is.
Indeed, as can be expected from such a differential bias, the death rates in the various governorates in the IFHS sample (before any adjustments) show reduced variation compared to the death rates in the two sources that the IFHS authors compare themselves to – Burnham et al. and Iraq Body Count (IBC).
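The mechanism can be illustrated with a toy calculation. All the numbers below are hypothetical and chosen only to show the direction of the effect; none of them come from the IFHS, Burnham et al. or IBC.

```python
def governorate_rates(dangerous_share, safe_rate, dangerous_rate):
    """Return (true rate, rate seen when only safe areas are surveyed).

    dangerous_share is the fraction of the population living in areas
    that the surveyors could not visit.
    """
    true_rate = (dangerous_share * dangerous_rate
                 + (1.0 - dangerous_share) * safe_rate)
    surveyed_rate = safe_rate  # dangerous clusters are simply dropped
    return true_rate, surveyed_rate


# HYPOTHETICAL governorates:
# (share of population in unvisited areas, safe-area rate, dangerous-area rate)
hypothetical = {
    "calm": (0.0, 1.0, 1.0),
    "mixed": (0.2, 1.5, 6.0),
    "violent": (0.5, 2.0, 8.0),
}

results = {name: governorate_rates(*params)
           for name, params in hypothetical.items()}

true_rates = [true for true, _ in results.values()]
surveyed_rates = [surveyed for _, surveyed in results.values()]
true_spread = max(true_rates) - min(true_rates)
surveyed_spread = max(surveyed_rates) - min(surveyed_rates)

for name, (true, surveyed) in results.items():
    print(f"{name:8s} true {true:.1f}  surveyed {surveyed:.1f}")
print(f"spread across governorates: true {true_spread:.1f}, "
      f"surveyed {surveyed_spread:.1f}")
```

In this toy setting the downward bias grows with the share of the population in unvisited areas, and the spread of rates across governorates shrinks, which is the pattern described above.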
January 18, 2008
According to the description of the sampling method of IFHS (both in the paper itself and in the supplementary material), 10 households were surveyed in each cluster, and there were (with few exceptions) 3 x 18 = 54 clusters per governorate. In such a set-up there should be no correlation between the number of people surveyed in each governorate and the size of the population in the governorate.
The chart below was generated using the data in table 2 of the supplementary material of the paper. Each of the 18 points corresponds to a governorate. The x-axis value is the population size in the governorate (calculated as the mean of the 5 values, for 5 different time points, given in the table). The y-axis value is the average sample per cluster in the governorate. The total sample size given in the table was divided by the actual number of clusters visited (54 for most, 65 for Baghdad, 60 for Nineveh, 37 for Al-Anbar and 53 for Wasit). A strong correlation between the two is evident. The correlation coefficient is 0.94. After removing the two outliers – Baghdad and Nineveh – the correlation coefficient is 0.72 (p-value 3×10^-5).
Assuming that the description of the sampling method is correct, it seems that the only way such a correlation could show up is if the size of the population in a governorate is strongly correlated with the average household size in the governorate. This is possible but seems unlikely a priori. Another surprising finding in such a case would be the sheer range of household size variation – ranging from less than 3.2 people per household in 3 of the smallest governorates to almost 20 people per household in Baghdad.
The possibility that the description of the sampling method is incorrect presents itself strongly.
January 17, 2008
Reviewing the IFHS study, I found 5 problems with the science of the study. I believe that taken together (but particularly the first three points, regarding the crucial role extrapolation plays in arriving at the estimates in the study, and regarding the ratio of under-reporting) those problems should be seen as grave. At the very least, they should be seen as putting the findings of the IFHS on equal or inferior footing to those of Burnham et al., rather than on superior footing due to the nominally large size of the sample in the IFHS.
January 17, 2008
The recent release of the IFHS study (via Deltoid) which put the number of Iraqis killed violently during the first 3 years and 4 months after the invasion at a mere 150,000 has generated the expected sigh of relief in the media (e.g., 1, 2, 3). Having previously been implicitly blamed for supporting an endeavor that generated 4 times as many violent deaths over the same period (Burnham et al., a.k.a. the second Lancet study), this new figure is celebrated as vindication.
Going over the pattern of media response to the IFHS study would be informative, but would produce unsurprising results. I therefore touch on only one point which also bears on the issue of any anti-IFHS bias by “Lancet supporters”.
Conveniently, the reports ignore the question of how many Iraqis died non-violently following the invasion as a result of the widespread devastation and breakdown of organization. This was helped to a large extent by the fact that the study itself, while giving an estimate for violent deaths, does not give an estimate for excess mortality – it merely gives pre-war and post-war mortality rate estimates. It is, however, quite easy to use those mortality rate estimates to generate an estimate for the excess mortality. Using those figures and applying the method used in the study to account for under-reporting, the estimate of excess deaths during the first 40 months after the invasion comes to around 400,000. This is not significantly different from the Lancet figure.
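A rough version of that calculation can be sketched as follows. The mortality rates are the re-weighted IFHS estimates (3.17 and 6.01 deaths per 1000 person-years). The population of 28 million is my own assumption, and dividing by a reported fraction of 0.65 (that is, 35% of deaths unreported) is a simplified stand-in for the study's adjustment procedure, so the result should be read as an order-of-magnitude check only.

```python
PRE_RATE = 3.17    # deaths per 1000 person-years, pre-invasion (IFHS)
POST_RATE = 6.01   # deaths per 1000 person-years, post-invasion (IFHS)
YEARS = 40 / 12    # first 40 months after the invasion

POPULATION = 28_000_000   # ASSUMPTION: rough population of Iraq
REPORTED_FRACTION = 0.65  # ASSUMPTION: 35% of deaths go unreported


def excess_deaths(pre_rate, post_rate, years, population, reported_fraction):
    """Excess deaths implied by the rise in the crude mortality rate."""
    excess_rate = (post_rate - pre_rate) / 1000.0  # per person-year
    recorded = excess_rate * years * population
    return recorded / reported_fraction


estimate = excess_deaths(PRE_RATE, POST_RATE, YEARS,
                         POPULATION, REPORTED_FRACTION)
print(f"estimated excess deaths: {estimate:,.0f}")
```

With a population of 30 million instead of 28 million the same calculation gives a somewhat higher figure, so the result is sensitive to that assumption but stays in the neighborhood of 400,000.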
The major disagreement between these two studies, therefore, is about what proportion of the excess deaths were violent, rather than about how many Iraqis died as a result of the invasion. The IFHS has it that only about 1/3 of the excess deaths are caused by violence – Burnham et al. put that figure at about 90%. In this disagreement, based on a priori considerations, it seems that the IFHS findings are more reasonable. It would be a miracle if in a country of 30 million, there could be enough violence to cause hundreds of thousands of violent deaths, and yet non-violent mortality would barely budge.
In my mind, it is the total number of excess deaths that is of interest when trying to decide what cost in lives is attributable to the invasion. It makes little difference to an Iraqi whether his child died when she was hit by a bullet or when she was poisoned by contaminated drinking water. I also expressed the view that the ratio of 10:1 violent to non-violent deaths estimated by Burnham et al. is problematic long before the IFHS was published.
I thus see no need to bash the IFHS study. It is however quite interesting to find that when examining the IFHS in a little detail (and it is really no more than a cursory examination that I undertook) several significant problematic points manifest themselves – I will enumerate them in an upcoming post (here). The fact that such problems exist is interesting for several reasons:
- The immediate interest is regarding the validity of the findings in the context of assessing the reality in Iraq and the impact of the decision to invade it.
- A second point of interest is the matter of how points of weakness are handled by various players (especially powerful players, such as corporate media and the government). When do such points of weakness get to be played up and seen as undermining the credibility of a study, and when do they get to be ignored or played down as mere nitpicking?
- An additional point is the fact that papers with such obvious weaknesses can pass the vaunted peer-review barrier – what does this imply about the process of peer-review and the politics of science?
- Finally, the broad epistemological issue – what can we know about what happens in other places? What do accounts, including scientific, establishment sanctioned accounts, teach us?
January 14, 2008
The question of producing a confidence interval by intersecting two or more confidence intervals calculated for the same unknown parameter, based on either the same data or different data, is occasionally of interest. This is a form of meta-analysis – combining information from different sources – which is attractive since it can be done using only a minimal amount of information from each of the data sources, namely the confidence intervals alone, and since the calculation is easy and the result has an intuitive appeal.
It turns out that intersecting confidence regions (and specifically intersecting confidence intervals) does indeed produce confidence regions. This can be seen as a special case of the confidence region arithmetic.
Let C1 and C2 be level 1-p1 and 1-p2 confidence regions for a parameter θ. The probability that C1 ∩ C2 does not contain θ is the probability that either C1 does not contain θ or that C2 does not contain θ. This is less than or equal to the sum of the probabilities of those two events – p1 + p2. Therefore, C1 ∩ C2 is a level 1-p1-p2 confidence region for θ. Thus, a smaller region is attained, but at a reduced confidence level.
Note that an assumption of independence was not made in this calculation. If independence is assumed, then the level of the intersection confidence region can be tightened up to (1-p1)(1-p2). However, if p1 and p2 are small, this does not make much of a difference.
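The two levels can be compared with a small sketch. The function names and the example intervals are hypothetical and serve only as illustration; the guaranteed level is floored at zero because 1-p1-p2 can be negative when p1 and p2 are large.

```python
def intersect_intervals(a, b):
    """Intersection of two closed intervals (low, high), or None if empty."""
    low = max(a[0], b[0])
    high = min(a[1], b[1])
    return (low, high) if low <= high else None


def guaranteed_level(p1, p2):
    """Level of the intersection with no independence assumption."""
    return max(0.0, 1.0 - p1 - p2)


def independent_level(p1, p2):
    """Level of the intersection when the two regions are independent."""
    return (1.0 - p1) * (1.0 - p2)


# Hypothetical example: two 95% intervals for the same parameter.
c1, c2 = (0.0, 10.0), (5.0, 15.0)
p1 = p2 = 0.05

print("intersection:", intersect_intervals(c1, c2))
print(f"level without independence: {guaranteed_level(p1, p2):.4f}")
print(f"level with independence:    {independent_level(p1, p2):.4f}")
```

For two 95% intervals the levels are 0.90 and 0.9025. The difference is p1 x p2, which is why it matters little when both are small.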
January 7, 2008
Democracy in large groups relies on delegation of power. Delegation is the formal concentration of significant political power in the hands of a small subset of members in such a way that it is difficult to strip those members of that power at least over a certain time period. It is an interesting theoretical question whether there may be a way to run a large group in a democratic manner without delegation (such as some pure form of direct democracy). However, it is a fact that all known forms of government that are democratic, or that claim to be democratic, rely on delegation.