The novel coronavirus disease, COVID-19, has sparked an outpouring of scientific research seeking to understand the virus, its spread, and best practices in prevention and treatment. If this international research effort is to be as swift and effective as possible, it will need to rely on the principles of open science. When researchers share data, code, and software and make their work as transparent as possible, other researchers can verify and expand upon that work, and public officials can make informed decisions. In this study, we analyzed 535 preprint articles related to COVID-19 against eight transparency criteria and recorded study location and funding information. We found that individual researchers have stepped up to help during this crisis, quickly tackling important public health questions, often without funding or support from outside organizations. However, most authors could improve their data sharing and scientific reporting practices. The contrast between researchers' commitment to doing important research and their reporting practices reveals underlying weaknesses in the research community's reporting habits, but not necessarily in its science.
Identifying and analyzing research articles for machine learning and natural language processing is a complicated task, because article organization and heading names are not consistent across publishers and journals. Understanding how the sections of research articles serve common functions, and which heading terms, or forms, are used to label them, is therefore a critical step in applying machine-learning techniques to data and information mining across a corpus of articles. To address this need, we developed and implemented an analysis of article heading form and function across 12 publishers, covering both research articles and non-research articles. Our aims were to: (1) identify each of the labelled sections used by research articles, define these sections based on their rhetorical function, and determine their frequency of use; (2) within the given dataset, determine all of the alternative labels used to identify these sections; and (3) determine whether these sections can be used to consistently determine (a) whether an article is a true research article or (b) whether an article is not a research article. Results indicated wide variability in the organization of research articles, with 24 common sections known by 186 different names both within and across publishing houses.
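As a rough illustration of the form-versus-function idea in the abstract above, the following minimal sketch maps heading labels (forms) onto canonical sections (functions) and applies a simple test for whether an article looks like a research article. The label map, the `REQUIRED_SECTIONS` rule, and all function names are hypothetical and chosen only for illustration; they are not the study's 24 sections, its 186 labels, or its classification method.

```python
import re

# Hypothetical label-to-section map for illustration only; the study's
# actual 24 sections and 186 labels are not reproduced here.
LABEL_TO_SECTION = {
    "introduction": "introduction",
    "background": "introduction",
    "methods": "methods",
    "materials and methods": "methods",
    "methodology": "methods",
    "results": "results",
    "findings": "results",
    "discussion": "discussion",
    "conclusion": "conclusion",
    "conclusions": "conclusion",
    "concluding remarks": "conclusion",
}

# Assumed rule: an article counts as a research article if it has both
# a methods section and a results section.
REQUIRED_SECTIONS = {"methods", "results"}


def normalize_heading(heading: str) -> str:
    """Lowercase a heading and strip numbering and punctuation."""
    text = heading.strip().lower()
    # Drop leading numbering such as "2.", "2.1", "3)" or "iv."
    text = re.sub(r"^(?:\d+(?:\.\d+)*[.)]?|[ivxlc]+[.)])\s+", "", text)
    text = text.replace("&", "and")
    text = re.sub(r"[^a-z\s]", "", text)
    return re.sub(r"\s+", " ", text).strip()


def canonical_sections(headings):
    """Return the canonical sections found, in first-seen order."""
    sections = []
    for heading in headings:
        section = LABEL_TO_SECTION.get(normalize_heading(heading))
        if section is not None and section not in sections:
            sections.append(section)
    return sections


def looks_like_research_article(headings):
    """Apply the assumed rule: all required sections must be present."""
    return REQUIRED_SECTIONS.issubset(canonical_sections(headings))
```

Under these assumptions, an article with the headings "Introduction", "Methodology", and "Findings" would pass the check, while one containing only "Editorial" and "Concluding Remarks" would not. A real pipeline would need a far larger label map and a rule validated against labelled articles.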
Peer Review
https://publons.com/publon/10.1162/qss_a_00135
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations: citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.