Contrary to what you read in Nature, Open Access has not caused the growth in science publishing

I wasn’t planning on spending my Thanksgiving delving into PubMed statistics to refute yet another bogus claim about open access publishing. But being a vegan, I didn’t really have much else to do anyway. So…

The newest Nature has an Op-Ed from Martijn Arns, a brain researcher in the Netherlands, with a title I couldn’t ignore: “Open access is tiring out peer reviewers”. He complains about poor quality (not negative) reviews he’s gotten on some recent papers, and asserts that this is due to an increase in the burden on peer reviewers with the rise of digital publishing. I agree with him – there are deep problems with the way we go about peer review – and his solution to the problem – implement post-publication peer review – is spot on.

But as the trolly title of the article (which I suspect was added by Nature and not by Arns) suggests, Arns argues that the increase in publishing volume, and therefore reviewer workload, is due to the rise of open access. It is, of course, true that open access publishing has been growing rapidly, and thus it might seem to some people that the growth in scientific publishing overall is due primarily to open access. But impressions can often be wrong, so I decided to look at some data.

I used PubMed to get data on the total number of papers published annually since 2000 and PubMed Central to determine the fraction of these articles that are open access. (The numbers aren’t perfect, since some journals have made their back content open access, slightly inflating the number of open access articles from early years. But this is a small effect.)

Year	PubMed	Open Access	Fraction OA
2000	529871	3438	0.006
2001	544778	4098	0.008
2002	562122	4553	0.008
2003	592435	5489	0.009
2004	637474	7532	0.012
2005	697839	11856	0.017
2006	744503	15065	0.020
2007	782502	20468	0.026
2008	831834	32576	0.039
2009	872229	59048	0.068
2010	935583	81854	0.087
2011	1010990	111910	0.111
2012	1073158	146561	0.137
2013	1130554	179871	0.159

You can see several things in these data. First, the number of papers in PubMed has increased dramatically, with more than 2.1 times as many papers published in 2013 as in 2000. At the same time, the fraction of articles in PubMed that are open access has increased even more dramatically, from basically nothing (0.6%) in 2000 to 15.9% in 2013.

Interestingly, in the last few years, the annual growth in open access papers has exceeded the annual growth in non-open access papers. The graph below plots, for each year, the annual growth in open access (the number of open access papers published in that year minus the number of open access papers published in the previous year) as a fraction of the annual growth in papers published (the total number of papers published in that year minus the total number of papers published in the previous year).

[Figure: OA as fraction of total new papers]

(Not sure what’s going on with 2009 – something is weird with that datapoint).
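The quantity plotted above can be recomputed directly from the table. What follows is a minimal Python sketch, not the script used to make the original graph: the counts are copied from the table, and the names `COUNTS` and `oa_share_of_growth` are made up for illustration.

```python
# Counts copied from the table above: year -> (PubMed total, open access).
COUNTS = {
    2000: (529871, 3438),
    2001: (544778, 4098),
    2002: (562122, 4553),
    2003: (592435, 5489),
    2004: (637474, 7532),
    2005: (697839, 11856),
    2006: (744503, 15065),
    2007: (782502, 20468),
    2008: (831834, 32576),
    2009: (872229, 59048),
    2010: (935583, 81854),
    2011: (1010990, 111910),
    2012: (1073158, 146561),
    2013: (1130554, 179871),
}


def oa_share_of_growth(counts):
    """Return {year: OA growth / total growth} for every year after the first."""
    years = sorted(counts)
    shares = {}
    for prev, cur in zip(years, years[1:]):
        total_growth = counts[cur][0] - counts[prev][0]  # new papers overall
        oa_growth = counts[cur][1] - counts[prev][1]     # new open access papers
        shares[cur] = oa_growth / total_growth
    return shares


if __name__ == "__main__":
    for year, share in oa_share_of_growth(COUNTS).items():
        print(f"{year}: {share:.3f}")
```

A share above 0.5 means open access grew by more papers than non-open access did that year. With these numbers that holds for 2009 (about 0.66), 2012 (about 0.56) and 2013 (about 0.58), but not for 2010 or 2011.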

But does this mean that open access is driving the increase in scientific publishing output? Of course not. Just because open access is capturing market share doesn’t mean that it is driving an overall increase in the size of the market. If it were, then you would expect the annual increase in the number of papers published to be increasing as open access has risen. But this is not the case. The number of papers published has been increasing at roughly 6% per year for the last 10 years, with no relationship to the number of open access papers published, which has been increasing steadily every year.
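The growth-rate claim can be checked against the PubMed column of the table. This is a rough sketch under the same assumptions as the table itself (raw PubMed counts, no correction for indexing changes); `PUBMED_TOTALS`, `annual_growth_rates` and `compound_annual_growth` are illustrative names, not anything from PubMed.

```python
# Total PubMed papers per year, copied from the table above.
PUBMED_TOTALS = {
    2000: 529871, 2001: 544778, 2002: 562122, 2003: 592435,
    2004: 637474, 2005: 697839, 2006: 744503, 2007: 782502,
    2008: 831834, 2009: 872229, 2010: 935583, 2011: 1010990,
    2012: 1073158, 2013: 1130554,
}


def annual_growth_rates(totals):
    """Return {year: fractional growth over the previous year}."""
    years = sorted(totals)
    return {
        cur: totals[cur] / totals[prev] - 1
        for prev, cur in zip(years, years[1:])
    }


def compound_annual_growth(totals, start, end):
    """Average yearly growth rate that takes totals[start] to totals[end]."""
    return (totals[end] / totals[start]) ** (1 / (end - start)) - 1


if __name__ == "__main__":
    for year, rate in annual_growth_rates(PUBMED_TOTALS).items():
        print(f"{year}: {rate:.1%}")
    overall = compound_annual_growth(PUBMED_TOTALS, 2003, 2013)
    print(f"2003-2013 compound rate: {overall:.1%}")
```

On these counts the year-over-year rate for 2004 through 2013 stays between about 5% and 9.5% with no upward trend, and the compound rate over that decade comes out between 6% and 7%, in line with the rough figure quoted above.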

Thus we are really looking at two essentially independent phenomena. There is clearly a rise in the total number of papers published. And open access has been capturing an increasing fraction of this growing market. But it is simply false to assert that open access has been driving this increase in scientific output.

That said, I agree completely with Arns that there is a big problem here. It IS a huge burden to try and find reviewers for all of these papers. And more importantly, this process has gotten slower and slower. The real failure of digital publishing is not that the number of papers has increased – it’s that the time it takes to publish papers has not improved at all. It’s ridiculous that we live in a world where it is possible to share information across the globe instantaneously, but that science as an enterprise has chosen to delay the sharing of new scientific knowledge for an average of 9 months as we go through a byzantine process of pre-publication peer review.

We need to move to post-publication peer review, as Arns suggests (and as I have written about previously). But in doing so we need to focus on the problem – peer review. And in order to do this, we have to shift away from paywalled, subscription journals, which depend entirely on pre-publication peer review to justify their existence. Thus, rather than mistakenly blaming open access for creating problems with peer review, we have to recognize that it is an important part of the solution.
