Background: Given the worldwide spread of the 2019 Novel Coronavirus (COVID-19), there is an urgent need to identify risk and protective factors and expose areas of insufficient understanding. Emerging tools, such as the Rapid Evidence Map (rEM), are being developed to systematically characterize large collections of scientific literature. We sought to generate an rEM of risk and protective factors to comprehensively inform areas that impact COVID-19 outcomes for different sub-populations, in order to better protect the public.

Methods: We developed a protocol that includes a study goal, study questions, a PECO statement, and a process for screening literature by combining semi-automated machine learning with the expertise of our review team. We applied this protocol to reports within the COVID-19 Open Research Dataset (CORD-19) that were published in early 2020. SWIFT-Active Screener was used to prioritize records according to pre-defined inclusion criteria. Relevant studies were categorized by risk and protective status; susceptibility category (Behavioral, Physiological, Demographic, and Environmental); and affected sub-populations. Using the tagged studies, we created an rEM for COVID-19 susceptibility that reveals: (1) current lines of evidence; (2) knowledge gaps; and (3) areas that may benefit from systematic review.

Results: We imported 4,330 titles and abstracts from CORD-19. After screening 3,521 of these to achieve an estimated recall of 99%, 217 relevant studies were identified. Most included studies concerned the impact of underlying comorbidities (Physiological), age and gender (Demographic), and social factors (Environmental) on COVID-19 outcomes. Among the relevant studies, older males with comorbidities were commonly reported to have the poorest outcomes. We noted a paucity of COVID-19 studies among children and susceptible sub-groups, including pregnant women, racial minorities, refugees/migrants, and healthcare workers, with few studies examining protective factors.

Conclusion: Using rEM analysis, we synthesized the recent body of evidence related to COVID-19 risk and protective factors. The results provide a comprehensive tool for rapidly elucidating COVID-19 susceptibility patterns and identifying resource-rich and resource-poor areas of research that may benefit from future investigation as the pandemic evolves.
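The tagging step behind an evidence map can be reduced to a cross-tabulation: each included study carries one or more category tags, and the map cells are tag counts. A minimal sketch of that idea follows; the study identifiers and tag assignments are invented for illustration and are not data from the review.

```python
# Sketch of the evidence-map tagging step: count how many included
# studies fall under each susceptibility category. Studies and tags
# below are hypothetical examples, not the review's actual data.
from collections import Counter

tagged_studies = [
    {"id": "study-01", "categories": ["Physiological", "Demographic"]},
    {"id": "study-02", "categories": ["Environmental"]},
    {"id": "study-03", "categories": ["Demographic"]},
]

# Flatten all tags and tally them; a study may contribute to
# several categories at once.
category_counts = Counter(
    cat for study in tagged_studies for cat in study["categories"]
)
print(category_counts.most_common())
```

The resulting counts, broken down further by sub-population and risk/protective status, are what the rEM visualizes to expose well-studied areas and gaps.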
Curated databases of scientific literature play an important role in helping researchers find relevant literature, but populating such databases is a labour-intensive and time-consuming process. One such database is the freely accessible COMET Core Outcome Set database, which was originally populated using manual screening in an annually updated systematic review. In order to reduce the workload and facilitate more timely updates, we are evaluating machine learning methods to reduce the number of references needing screening. In this study we evaluated a machine learning approach based on logistic regression to automatically rank the candidate articles. Data from the original systematic review and its first four review updates were used to train the model and evaluate performance. We estimated that using automatic screening would yield a workload reduction of at least 75% while keeping the number of missed references around 2%. We judged this to be an acceptable trade-off for this systematic review, and the method is now being used for the next round of the COMET database update.
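The ranking approach described above can be sketched in a few lines: train a logistic-regression classifier on already-screened references, then sort the unscreened candidates by predicted relevance so screeners work from the top of the list. The sketch below uses a plain bag-of-words model trained by gradient descent; all titles and labels are invented for illustration, and a production pipeline would use richer features and a proper stopping criterion.

```python
# Minimal sketch of screening prioritization with logistic regression.
# Titles and inclusion labels are hypothetical examples.
import math
from collections import Counter

def tokenize(text):
    return text.lower().split()

# Already-screened references: (title, label), label 1 = included.
screened = [
    ("core outcome set for rheumatoid arthritis trials", 1),
    ("development of a core outcome set for eczema research", 1),
    ("bridge maintenance cost analysis", 0),
    ("traffic flow optimization in urban networks", 0),
]

vocab = sorted({t for text, _ in screened for t in tokenize(text)})

def features(text):
    counts = Counter(tokenize(text))
    return [counts.get(t, 0) for t in vocab]

# Train logistic regression by plain gradient descent.
w = [0.0] * len(vocab)
b = 0.0
lr = 0.5
for _ in range(200):
    for text, y in screened:
        x = features(text)
        z = b + sum(wi * xi for wi, xi in zip(w, x))
        p = 1.0 / (1.0 + math.exp(-z))
        g = p - y  # gradient of the log loss w.r.t. z
        b -= lr * g
        w = [wi - lr * g * xi for wi, xi in zip(w, x)]

def score(text):
    """Predicted probability that a reference is relevant."""
    x = features(text)
    z = b + sum(wi * xi for wi, xi in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z))

# Unscreened candidates, ranked most-likely-relevant first.
candidates = [
    "highway pavement wear under heavy loads",
    "a core outcome set for chronic pain studies",
]
ranked = sorted(candidates, key=score, reverse=True)
print(ranked[0])
```

Screeners then read references in ranked order and stop once the estimated number of remaining relevant references is acceptably small, which is where the reported workload-reduction and missed-reference trade-off comes from.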
Systematic reviews are important in evidence-based medicine, but are expensive to produce. Automating or semi-automating the extraction of index test, target condition, and reference standard from articles has the potential to decrease the cost of conducting systematic reviews of diagnostic test accuracy, but relevant training data are not available. We created a distantly supervised dataset of approximately 90,000 sentences and had two experts manually annotate a small subset of around 1,000 sentences for evaluation. We evaluated the performance of BioBERT and logistic regression for ranking the sentences, and compared the performance of distant and direct supervision. Our results suggest that distant supervision can work as well as, or better than, direct supervision on this problem, and that distantly trained models can perform as well as, or better than, human annotators.
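The core idea of distant supervision here is that sentence labels can be generated automatically by matching entity names already recorded in existing structured reviews, rather than by expert annotation. The sketch below illustrates that labeling step with invented test names and sentences; the paper's actual heuristics and the downstream BioBERT/logistic-regression rankers are not shown.

```python
# Illustrative distant-supervision labeling: a sentence is marked
# positive if it mentions an index test already known from existing
# structured review data. Terms and sentences are hypothetical.
known_index_tests = {"d-dimer assay", "ct pulmonary angiography"}

sentences = [
    "Patients were evaluated with a D-dimer assay on admission.",
    "The study enrolled 120 participants over two years.",
    "CT pulmonary angiography served as the reference standard.",
]

def distant_label(sentence):
    """Noisy label: 1 if any known index-test name appears."""
    s = sentence.lower()
    return int(any(test in s for test in known_index_tests))

labels = [distant_label(s) for s in sentences]
print(labels)
```

The labels produced this way are noisy, but at the scale of ~90,000 sentences a model trained on them can match or exceed one trained on a much smaller directly annotated set, which is the trade-off the study evaluates.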