A Preprint Goes Viral. What Happens Next?

Cold Spring Harbor Laboratory’s bioRxiv preprint server hit a milestone these last few days with the posting of a preprint on the link between cell phone radiation and tumors in rats. CSHL and HighWire – which hosts the bioRxiv manuscript system and preprint server for CSHL – watched the traffic and commentary on the paper climb. In under 24 hours, the paper had ‘scored’ in the top 500 of over 5 million articles in terms of social media. In two days it had over 50,000 PDF downloads; four days after posting, it was at about 90,000 downloads and in the top 215. And this happened during a traditionally “slow news period” of a US and UK holiday. What did this accomplish?

I believe this is a watershed event that raises industry awareness that preprints are now in the workflows of authors, and that a preprint can create as large a wave of recognition and access as a journal publication can (no matter what your opinion of the science is).

For those who enjoy horse races, the article-level metrics and Altmetrics displays are impressive, along with 116 comments logged on bioRxiv within 5 days online:



Since the Royal Society meetings over a year ago on the future of scholarly communication, I have been saying that preprints are the next leg of digital transformation of scholarly communication – with online journals(1) and scholarly-specialized search engines(2) being the first two legs.  Industry identifiers such as DOI and ORCID, open access journals and data, and megajournals, are other legs. (This is a centipede we have here, with the number of legs not yet determined!)

This past week we could see the beginning of how all of this plays out in a single paper, a preprint. Because of the public-health-policy nature of the topic, we saw not only distribution, but virality. Virality – where recipients are themselves distributors – is well-enabled by the open access nature of preprints.

Previously, I had noted that research with public policy implications is advantaged by being placed in an open access venue, such as the OA journals that many societies have launched. Because preprints are open access, public-policy-related preprints are similarly advantaged. OA provides access to policy makers, beyond the usual research audience, and having the data and research-narrative accessible should lead to the possibility of better-grounded policy debate. We could see an OA-for-public-policy strategy at work in Science Advances with its launch issue containing articles on climate change and bias in faculty hiring.

Though the debate on the pros and cons of preprints in life sciences continues, it seems to be moving in only one direction(3).   The key “pro” has been to put distribution ahead of evaluation, and to allow experts in a domain to judge quality and significance for themselves, rather than have a few peer-reviewers judge for all the community, both expert and not.   The thinking is that by “flipping” the position of distribution and review, you give the experts access, while still subsequently providing the non-experts the advantage of peer-review. Today’s Review-Publish-Distribute model can be drawn like this:

Preprints “flip” the scholarly commons, so that the author enables distribution to “many interested experts” (as well as the potential mass audience), at which point editors from multiple journals can engage the authors to pull the content into their journals.  The flipped preprint workflow of Distribute-Publish-Review can be depicted like this:

In HighWire’s researcher interviews, we heard that when researchers are reading as experts, they judge quality and significance for themselves; but when they are reading outside their area of expertise, they rely on journals and brands. So the flipped preprints model makes sense for the expert in each of us. But we will likely continue to read journals as well! Both models co-exist because we all must read outside our areas of expertise.

Preprints change so many things that it is inconceivable to me that the life-science publishing ecosystem will be structurally unaltered if preprints are taken up en masse:

  • The control point for distribution is changed: the author decides when.
  • Editors may seek out material online in preprint servers, as they now do in scholarly meetings, rather than (only) by unsolicited submissions.
  • The debate about quality and interpretation will happen before journal publication (as well as after, of course).
  • In some fields, academic priority will be established by preprint posting.
  • Once priority goes with a preprint, then preprint servers will attenuate the fear of being “scooped”.
  • Some types of research results (confirmatory findings, e.g.) may shift from journals to preprint servers, which might in turn relieve some reviewer burden.
  • Indexing systems and repositories will need to incorporate preprints, or could fall out of place in the researchers’ (but not students’) workflow.
  • Evaluation systems will need to calibrate for preprints.
  • Our vocabulary will need to recognize the difference between a paper that is posted as a preprint vs. an article that is published in a journal – or will future generations call them both “published”?

If you can think of other shifts that preprints will bring about, please post a comment!

How fast can a preprint post?
This particular preprint went from submission to posting in under 5 hours. At bioRxiv, time-to-post varies from 4 to 48 hours, depending on volume, time of day and day of the week. On weekdays most will process in 6-18 hours, with longer times on weekends. So this particular paper was not an outlier, but was at the shorter end of the spectrum, according to Richard Sever, Assistant Director at Cold Spring Harbor Laboratory Press, and Co-founder of bioRxiv.

How long from post to publication?
This will vary hugely, of course, since it depends on when and where submission happens, and the editorial and revision process of the receiving journal.   The median at bioRxiv right now is 157 days – about 5 months – with a very large range, according to John Inglis, Executive Director and Publisher, Cold Spring Harbor Laboratory Press, and Co-founder of bioRxiv.

Will preprints establish priority in life sciences?
Here I will rely on Richard Sever’s sense of the scientific zeitgeist: “Yes – remember this was an explicit goal of arXiv. But important judgments will be made retrospectively – just because someone claims they have done something doesn’t mean the paper actually demonstrates this.” John Inglis comments further that this will likely vary by discipline for a while as things sort themselves out: “there is concern that someone can claim priority of discovery with a badly executed study that happens to reach a conclusion that subsequently turns out to be correct. I don’t know how this issue will be sorted out: perhaps it will just take time and eventual domination of a discipline by young scholars who have grown up with preprints.” Disciplines dominated by publications reporting on models and algorithms may already see priority established by a preprint, while those that report on experimental data and interpretation may take more time.

I believe that once priority is accepted as established by preprints, the other changes in the ecosystem mentioned above will be induced. Just as the introduction of “megajournals” caused shifting patterns in submission, preprints will cause shifts there and at other points across the ecosystem. It remains to be seen whether the sorting-out that physics went through during its 20+ years of preprint experience (“preprints are for priority, journals are for promotion”, for example, seems to be understood in the physics community) will foretell the same results in other sciences. Or will the later start for preprints in other sciences stand on the shoulders of 20 years of digital transformation, and so most likely have a different ending point? Preprints today start with the context of the well-established legs of the centipede I mentioned at the beginning of this piece, plus Web 2.0 and social media in the consumer space, plus the public interest in biomedical advances. The same storyline might have a different ending.

  1. Circa 1995 – 21 years ago – online journals were launched, by HighWire and by several large publishers.
  2. Scholarly search engine examples are PubMed – when it became freely available in the late 1990s – and Google Scholar – which launched in 2004.
  3. Berg, J. et al. “Preprints for the life sciences: The time is right for biologists to post their research findings onto preprint servers”. Science, v 352, p 899, 20 May 2016.
