Pachter’s P-value Prize’s Post-Publication Peer-review Paradigm

Several weeks ago my Berkeley colleague Lior Pachter posted a challenge on his blog offering a prize for computing a p-value for a claim made in a 2004 Nature paper. While cheeky in its formulation, Pachter had an important point – he believed that a claim from this paper was based on faulty reasoning, and the p-value prize was a way of highlighting its deficiencies.

Although you might not expect the statistics behind a largely-forgotten claim from an 11 year old paper to attract significant attention, Pachter’s post has set of a remarkable discussion, with some 130 comments as of this writing, making it an incredibly interesting experiment in post-publication peer review. If you have time, you should read the post and the comments. They are many things, but above all they are educational – I learned more about how to analyze this kind of data, and about how people think about this kind of data, here than I have anywhere else.

And, as someone who believes that all peer review should be done post-publication, there’s also a lot we can learn from what’s happening on Pachter’s blog.

Pre vs Post Publication Peer Review

I would love to see the original reviews of this paper from Nature (maybe Manolis or Eric can post them), but it’s pretty clear that the 2 or 3 people who reviewed the paper either didn’t scrutinize the claim that is the subject of Pachter’s post, or they failed to recognize its flaws. In either case, the fact that such a claim got published in such a supposedly high-quality journal highlights one of the biggest lies in contemporary science: that pre-publication peer review serves to defend us from the publication of bad data, poor reasoning and incorrect statements.

After all, it’s not like this is an isolated example. One of the reasons that this post generated so much activity was that it touched a raw nerve among people in the evolutionary biology community who see this kind of thing – poor reasoning leading to exaggerated or incorrect claims – routinely in the scientific literature, including (or especially) in the journals that supposedly represent the best of the best in contemporary science (Science, for example, has had a string of high-profile papers that turned out to be completely bogus in recent years – c.f. arsenic DNA).

When discussing these failures, it’s common to blame the reviewers and editors. But they are far less the fault of the people involved than they are an intrinsic problem with pre-publication. Pre-publication review is carried out under severe time pressure by whomever the editors managed to get to agree to review the paper – and this is rarely the people who are most interested in the paper or the most-qualified to review it. Furthermore, journals like Nature, while surely interested in the accuracy of the science they publish, also ask reviewers to assess its significance, something that at best distracts from assessing the rigor of a work, and often is in conflict with it. Most reviewers take their job very seriously, bit it simply impossible for 2 or 3 somewhat randomly chosen people who read a paper at a fixed point in time and think about it for a few hours to identify and correct all of its flaws.

However – and this is the crux of the matter for me – despite the fact that pre-publication peer review simply can not live up to the task it is assigned, we pretend that it does. We not only promulgate the lie to the press and public that “peer reviewed” means “accurate and reliable”, we act like it is true ourselves. Despite the fact that an important claim in this paper is – as the discussion on the blog has pointed out – clearly wrong, there is no effective way to make this known to readers of the paper, who are unlikely to stumble across Pachter’s blog while reading Nature (although I posted a link to the discussion on PubMed Commons, which people will see if they find the paper when searching in PubMed). Worse, even though the analyses presented on the blog call into question one of the headline claims that got the paper into Nature in the first place, the paper will remain a Nature paper forever – its significance on the authors CVs unaffected by this reanalysis.

Imagine if there had been a more robust system for and tradition of post-publication peer review at the time this paper was published. Many people (including one of my graduate students) saw the flaws in this analysis immediately, and sent comments to Nature – the only visible form of post-publication review at the time. But they weren’t published, and concerns about this analysis would not be resurfaced for over a decade.

The comments on the blog are not trivial to digest. There are many threads, and the comments include those that are thorough and insightful with others that are jejune and puerile. But if you read even part of the thread you come away with a far deeper understanding of the paper, what it found and what aspects of it are right and wrong than you get from the paper itself. THIS is what peer review should look like – people who have chosen to read a paper spending time not only to record their impressions once, but to discuss it with a collection of equally interested colleagues to try and arrive at a better understanding of the truth.

The system is far from perfect. but from now on anytime I’m asked what I mean by post-publication peer review, I’ll point them to Lior’s blog.

One important question is why doesn’t this happen more often? A lot of people had clearly formed strong opinions about the Lander and Kellis paper long before Lior’s post went up. But they hadn’t shared them. Does someone have to write a pointed blog post every time they want to inspire its results to be reexamined by the community?

The problem is, obviously, that we simply don’t have a culture of doing this kind of thing. We all read papers all the time, but rarely share our thoughts with anyone outside of our immediate scientific world. Part of this is technological – there really isn’t a simple system tied to the literature on which we can all post comments on papers that we have read with the hope that someone else will see them. PubMed Commons is trying to do this, but not everyone has access. And other than they the systems are just not that good yet. But this will change. The bigger challenge is getting people to use it once good technology for post-publication peer review.

Developing a culture of post publication peer review

The biggest challenge is that this kind of reanalysis of published work just isn’t done – there simply is not a culture of post-publication peer review. We lack any incentives to push people to review papers when they read them and have opinions that they feel are worth sharing. Indeed, we have a variety of counterincentives. A lot of people ask me if Lior is nuts for criticizing other people’s work so publicly. To many scientists this “just isn’t done”. But the question we should be asking is not “Why does Lior do this?” but rather “Why don’t we all?”.

When we read a paper and recognize something bad or good about it, we should look at it as a duty to share it with our colleagues. This is what science is all about. Oddly, we feel responsible enough for the integrity of the scientific literature that we are willing to review papers that often do not interest us and which we would not have otherwise read, yet we don’t feel that way about the more important process of thinking about papers after they are published. Somehow we have to transfer this sense of responsibility from pre- to post- publication review.

An important aspect of this is credit. A good review is a creative intellectual work and should be treated as such. If people got some kind of credit for post-publication reviews, more people would be inclined to do them. There are lots of ideas out there for how to create currencies for comment, but I don’t really think this is something that can be easily engineered – it’s going to have to evolve organically as (I hope) more people engage in this kind of commentary. But it is worth noting that Lior has, arguably, achieved more notice for his blog, which is primarily a series of post-publication reviews, than he has for his science. Obviously this is not immediately convertible classical academic credit, but establishing a widespread reputation for the specific kind of intellectualism manifested on his blog, can not but help Lior’s academic standing. I hope that his blog inspires people to do the same.

Of course not everybody is a fan of Lior’s blog. Several people who I deeply respect have complained that his posts are too personal, and that they inspire a kind of mob mentality in comments in which the scientists whose work he writes about become targets. I don’t agree with the first concern, but do think there’s something to the second.

So long as we personalize our scientific achievements, attacks on them are going to feel personal. I know that every time I receive a negative review of a paper or grant, I feel like it is a personal attack. Of course I know that this generally isn’t true, and I subscribe to belief that the greatest respect you can show another scientist is to tell them when you think they’ve made a mistake or done something stupid. But, nonetheless, negative feedback still feels personal. And it inspires in most of us an instinctive desire to defend our work – and therefore our selves – from these “attacks”. I think the reason people feel like Lior’s blogs are attacks is that they put themselves into the shoes of the authors he is criticizing and feel attacked. But I think this is something we have to get over as scientists. If the critique is wrong, than by all means we should defend ourselves, but conversely we should be able to admit when we were wrong, to have a good discussion about what to do next, and move on, all the wiser for it.

However, as much as I would like us all to be thick skinned scholars about to take it and dish it out, reality is that this is not the case. Even when the comments are civil, I can see how having a few dozen people shredding your work publicly could make even the most thick skinned scientist feel like shit. And if the authors of the paper had not been famous, tenured scientist at MIT, the fear of negative ramifications from such a discussion could be terrifying. I don’t think this concern should lead to people feeling reluctant to jump into scientific discussions – even when they are critical of a particular work – but I do think we should exercise extreme care in how we say things. And rule #1 has to be to restrict comments to the science and not the authors. In this regard, I was probably one of the worse offenders in this case – jumping from a criticism of the analysis to a criticism of the authors’ response to the critique. I know them both personally and felt they would know my comments were in the spirit of advancing the conversation, but that’s not a good excuse. I will be very careful not to do that in the future under any circumstances.

Pachter’s P-value Prize’s Post-Publication Peer-review Paradigm

12 Comments