COVID-19 has caused worldwide anxiety and thousands of deaths. Each day, new data and research are published. With the high level of activity and collaboration worldwide, what gets published today may be outdated the week after. How do researchers, medical professionals, policymakers, and the public keep up to date and informed?
At scite, we have created an easy way for anyone to see how a scientific article has been cited and, specifically, whether it has been supported or contradicted by subsequent research. We do this by analyzing millions of full-text publications, extracting the citation statements from them, and then classifying each statement as supporting or contradicting evidence.
Recently, to help the world make more sense of COVID-19 research, we applied this approach to COVID-19 papers and preprints.
To analyze research on COVID-19 and coronavirus in general, we identified relevant publications using the CORD-19 dataset, a list of papers and preprints compiled from a variety of publishers and databases. From this list, we were able to download 20,268 PDFs from publishers. After processing these documents, we found that 16,775 had citation statements (and references) we could extract, amounting to 1,266,672 citation statements in total. We then applied our deep learning model to classify each citation statement as supporting, contradicting, or mentioning, and added the results to our database to make them discoverable. We’ve also openly released all citation tallies, along with the citation statements from open documents, on Zenodo.
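To put those figures in proportion, here is a minimal Python sketch of the arithmetic, followed by the kind of per-article tally that a classified set of citation statements makes possible. The three counts come from the paragraph above; the `tally` helper and the example labels are hypothetical illustrations, not scite's actual code or data.

```python
from collections import Counter

# Counts reported in the text above.
pdfs_downloaded = 20_268
docs_with_citations = 16_775
citation_statements = 1_266_672

# Share of downloaded PDFs from which citation statements could be extracted.
extraction_rate = docs_with_citations / pdfs_downloaded
# Average number of citation statements per successfully processed document.
statements_per_doc = citation_statements / docs_with_citations

print(f"{extraction_rate:.1%} of downloaded PDFs yielded citation statements")
print(f"{statements_per_doc:.1f} citation statements per processed document")

LABELS = ("supporting", "contradicting", "mentioning")


def tally(labels):
    """Count how often each classification label occurs (hypothetical helper)."""
    counts = Counter(labels)
    return {label: counts.get(label, 0) for label in LABELS}


# Invented labels for the citation statements pointing at one article.
example_labels = ["mentioning", "supporting", "mentioning", "contradicting", "supporting"]
print(tally(example_labels))
```

Roughly 83% of the downloaded PDFs could be processed, with about 75 citation statements per document on average.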
To show the utility of scite when looking at COVID-19 papers, let’s take a single example report — “Early Transmission Dynamics in Wuhan, China, of Novel Coronavirus–Infected Pneumonia” — published in the New England Journal of Medicine.
Could the reported behavior of the virus in the Wuhan population be replicated elsewhere? Looking at the article on the publisher’s website, it’s unclear, and finding out would require reading every paper that cites it, a massive amount of work.
With scite, we’ve made it easy to see how this article has been cited (Figure 1). scite found that the article has been supported nine times.
To name just a few:
About four studies contradicted aspects of the article:
Aside from identifying citations that make a claim about an article, scite offers many other features. Users can search through all citation contexts, authors, and titles using the search field in the top right-hand corner (Figure 2). This can be useful, for example, if researchers want to find other modeling studies that use the epidemiological data reported in the NEJM paper. Users can easily find 31 citation statements featuring the word “model” in the title, in the citation context itself, or in the header of the section where the citation appears.
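The idea behind that kind of search can be sketched in a few lines of Python: keep every citation statement whose citing title, citation context, or section header contains the search term. This is only an illustration under assumed names. The `CitationStatement` record, the `search_statements` function, and the sample data are all invented here and do not reflect scite's actual search implementation.

```python
from dataclasses import dataclass


@dataclass
class CitationStatement:
    """Hypothetical record for one citation statement."""
    citing_title: str  # title of the paper that makes the citation
    context: str       # text surrounding the citation
    section: str       # header of the section containing the citation


def search_statements(statements, term):
    """Return statements where any of the three fields contains the term."""
    needle = term.casefold()
    return [
        s for s in statements
        if any(needle in field.casefold()
               for field in (s.citing_title, s.context, s.section))
    ]


# Invented sample data, not real papers.
sample = [
    CitationStatement("Hypothetical paper A: a transmission model",
                      "Early estimates were reported in [1].", "Introduction"),
    CitationStatement("Hypothetical paper B",
                      "We fit our model to the reported case counts [1].", "Methods"),
    CitationStatement("Hypothetical paper C",
                      "Early case data were reported in [1].", "Modeling approach"),
    CitationStatement("Hypothetical paper D",
                      "Symptoms were consistent with earlier reports [1].", "Discussion"),
]

matches = search_statements(sample, "model")
print(len(matches))
```

A plain substring match is used here for simplicity, so “model” also matches “modeling”. A real search engine would likely add tokenization and ranking on top.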
We believe the recent coronavirus outbreak is a global challenge that can only be tackled by collaboration. By continuously analyzing new COVID-19 papers, we hope to keep the scientific community up to date with Smart Citations — citations that display the context of the citation and describe whether the paper provides supporting or contradicting evidence — on research that can make a difference. We will update the Zenodo dataset regularly and widely share new versions as they become available.
To integrate more seamlessly into scientists’ workflows, scite offers a free plugin for Chrome and Firefox. With this plugin, users can see Smart Citations, along with citation counts from scite, on any web page that features a scientific paper.
Let us know about any missing features. Our team of developers is working day in and day out to make scite more useful.