p-rep and the myth of rational science

September 16, 2007

I have just found a short article by Killeen that appears to be a rejoinder to comments on his original p-rep paper. Unfortunately, I have not been able to find the comments to which Killeen is responding, but it appears from the rejoinder (the Error and Correction section) that the fundamental error I referred to was pointed out in one of them, by Doros and Geier. It also seems that another comment, by Macdonald, pointed out that Killeen’s claim that “the probability of replication” can be calculated without knowing the unknown parameter of the distribution cannot be true.

Somehow, p-rep survived these issues. Superficially, this seems incredible. The entire analysis and rationale that Killeen gave for p-rep in his original paper rest on an argument that was discovered to be wrong, and yet the ideas presented in that paper are still considered worth discussing, and have even gained official support from the publishing establishment.

On the immediate level, Killeen maneuvers around the errors by invoking a Bayesian argument, claiming his p-rep estimate is the result of a Bayesian analysis with flat priors. This flies in the face of the anti-Bayesian stance in Killeen’s original paper (for example, in the abstract, Killeen claims that he avoids the “Bayesian dilemma”). I guess the audience is expected to be either too ignorant or too polite to mention this about-turn.
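To make the flat-prior reading concrete, here is a minimal numerical sketch. It assumes the textbook setup, which is not spelled out in this post: a normally distributed effect estimate with known standard error, a flat prior on the true effect, and a replication of the same size as the original study. Under those assumptions the predictive probability that the replication has the same sign as the original is Φ(z/√2), where z is the original z-score, which is the form in which p-rep is usually quoted. The function name `p_rep_flat_prior` is my own.

```python
from math import sqrt
from statistics import NormalDist


def p_rep_flat_prior(p_one_tailed: float) -> float:
    """Predictive probability of a same-sign replication, given a flat prior.

    Assumes a normal estimate with known standard error and an equal-sized
    replication. The argument is the one-tailed p-value of the original study
    and must lie strictly between 0 and 1.
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # z-score of the original result, recovered from its one-tailed p-value.
    z = std_normal.inv_cdf(1.0 - p_one_tailed)
    # The replication estimate, given the original one, has twice the sampling
    # variance, so the z-score is shrunk by sqrt(2) before applying Phi.
    return std_normal.cdf(z / sqrt(2.0))


if __name__ == "__main__":
    for p in (0.5, 0.05, 0.025, 0.005):
        print(f"one-tailed p = {p:<6} p-rep = {p_rep_flat_prior(p):.3f}")
```

In this form the result is a fixed, monotone function of the p-value, and it is a probability only relative to the flat prior that was assumed.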

On a deeper level, however, the persistence of p-rep is another demonstration of the fact that science is primarily a political activity rather than a rational activity.

I cannot determine exactly which political constellation makes discarding p-rep inconvenient. An important factor is probably that only a small minority of the paper’s audience could detect or understand the error. It seems reasonable to speculate that we are witnessing the effect described in the tale of The Emperor’s New Clothes: Killeen’s original paper generated enough interest that officially dismissing it six months after it was published would have embarrassed some of the parties involved (the author, the reviewers and the publishers are obvious examples). It is therefore more convenient, at this point, to pretend that the errors found are of no consequence than to officially admit that such basic mistakes were published in a leading psychological journal and taken seriously by many.

I am not sure that this is enough to explain why Psychological Science (the same journal in which the 2005 paper appeared) took the extra step of making the publishing of p-rep official policy. It could be that the wheels were already in motion by the time the mistakes were discovered. It could be that whoever makes such decisions at the journal was not able to detect that the case for p-rep had been completely undermined, and that no one wanted to be the one to disclose such information.

Other scenarios may be possible, but it is quite clear that none of them would follow the ostensible rules of scientific conduct.


8 Responses to “p-rep and the myth of rational science”

  1. James Annan Says:

    This sounds rather similar to the situation in climate science.

  2. yoramgat Says:


    My personal experience and impression from talking to others is that the situation is common in academic circles, cutting across disciplines – ideas are not judged based on merit but based on political grounds.

    The politics of those matters are usually small-scale, personal, academic politics, rather than large-scale public policy politics. This seems to me to be true even in the case of your paper which does touch upon a matter that has public policy implications. (BTW, how did things turn out with that paper?)

    I believe that the way the academic system works (and in particular, the venerable peer review process) virtually guarantees that academic activity would be political. I do intend to post more about those matters – including my rather radical ideas for improving the situation.

  3. James Annan Says:

    At that point, I gave up on trying to get it published and just stuck it on the arxiv. I do sometimes think about resurrecting it but 5 rejections (in various formats) is a strong argument for just getting on with life.

    It does seem like the message it conveys is slowly permeating the climate science community anyway, so perhaps continuing the crusade at this point would be more about credit and status than science.

    The EGU has an interesting approach to some of its journals which may help to get past the usual “gatekeepers”, e.g. http://www.climate-of-the-past.net/

  4. yoramgat Says:

    Many researchers I speak with consider the method of peer review flawed, but most think it is the best there can be. A few think that it is better not to have any filtering at all, essentially relying on arxiv-like repositories only. These two approaches seem to be the only options being widely considered. The two-stage process of Climate of the Past is something of a hybrid, but it essentially falls on the same spectrum.

    What do you think would be a reasonable publication process?

  5. I have speculated about how one could make an abstract, simplified, idealised version of the “reputation game” of academics.

    I imagine something like this: In the initial position, all participants have equal credibility (or reputation, or whatever we should call it), but they also know that some of the other players are more worth listening to than others – some have novel ideas, some are cranks. Pretend we give each player a hundred, divisible “reputation credits” and ask them to distribute them among the other players any way they feel like.

    Now some will have ended up with more than others. The thing is, if you can assume that people who are skilled in their field are also better at evaluating other people’s skill, you should ask them to do it again (with their new reputation credit totals), and again, and again… until it hopefully stabilizes. Then the participants get to agree on the agenda, the terms, the research budget etc. with influence according to their share.

    I think this is a fair method, and a reasonable one as well, as long as people agree that

    1. Some are better than others in the field we are discussing

    2. The better ones are in turn better at recognising this skill.

    Moreover, I think that this abstract game I’ve sketched here is not too far away from how the sciences work, or try to work, and also how “fine” arts work (or fail to work, when the assumptions don’t hold, or the realisation of the critical process is poor).
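The game described in this comment can be sketched in a few lines. This is only an illustration under assumptions the comment does not state: each player's allocation is a fixed list of fractions that sums to 1 (with nothing given to oneself), the same fractions are reused in every round, and "stabilizes" means that no total changes by more than a small tolerance. The name `play_reputation_game` and the example numbers are hypothetical.

```python
def play_reputation_game(allocations, start_credits=100.0, tol=1e-9, max_rounds=10_000):
    """Repeatedly redistribute credits until the totals stop changing.

    allocations[i][j] is the fraction of player i's current credits that
    player i hands to player j. Each row is assumed to sum to 1.
    """
    n = len(allocations)
    credits = [start_credits] * n  # everyone starts with equal reputation
    for _ in range(max_rounds):
        new_credits = [0.0] * n
        for giver, row in enumerate(allocations):
            for receiver, share in enumerate(row):
                # A giver with more credits has more influence on the outcome.
                new_credits[receiver] += credits[giver] * share
        if max(abs(a - b) for a, b in zip(credits, new_credits)) < tol:
            return new_credits
        credits = new_credits
    return credits  # did not settle within max_rounds


if __name__ == "__main__":
    # Hypothetical three-player example: players 1 and 2 both rate player 0 highly.
    allocations = [
        [0.0, 0.5, 0.5],
        [0.9, 0.0, 0.1],
        [0.9, 0.1, 0.0],
    ]
    print([round(c, 2) for c in play_reputation_game(allocations)])
```

Stabilization is not guaranteed. If player 0 splits everything between players 1 and 2, and both hand everything back to player 0, the totals alternate between two states forever and the loop simply runs out of rounds.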

  6. yoramgat Says:


    Thanks for this thought.

    I agree that there is some validity to your model as a descriptive model – i.e., reputations (in science as in many other areas) are propagated through a network of connections whose strength depends on reputations.

    I disagree about the normative statement – I don’t think that this method tends to produce reputations that are highly correlated with objective quality.

    It is interesting that your model is reminiscent of the PageRank algorithm that supposedly propelled Google to fame and fortune when applied to ranking web pages. I wonder if anyone has tried to carry out a mathematical analysis of PageRank with the aim of establishing some normative (i.e., objective function optimizing) properties for this algorithm.

    I think this topic is wide and important enough to merit a thorough analysis, so I’ll try to collect my thoughts on this subject and write a post about it.

  7. […] In fact, I believe the authors make a better point, namely, one close to the one I am making (repeatedly) – that science, including normal, everyday science, is a political rather than a rational activity […]

  8. Yoram Gat Says:

    Much later: Psychological Science no longer recommends its use, but p-rep lives on:

    Science Watch: Why do you think your paper is highly cited?

    Peter Killeen: It provides a third alternative for inferential statistics—neither Frequentist nor standard Bayesian—but predictive: Inferences are to future research outcomes, not to parameters. That a brave editor recommended that it be used in his journal instead of Null-Hypothesis Statistical Tests (NHST) was crucial. The inevitable controversy pursuant to that decision has further raised its profile.

    Also, Killeen (or someone using his name) edits the p-rep Wikipedia entry.
