The COVID-19 pandemic has disrupted the seasonal patterns of several infectious diseases. Understanding when and where an outbreak may occur is vital for public health planning and response. Epidemic outbreaks are usually monitored through well-functioning surveillance systems. However, not all countries have such a system in place, or at least not for the pathogen in question. We used Google Trends search results for keywords related to respiratory syncytial virus (RSV) to identify outbreaks. We evaluated the strength of the correlation between clinical surveillance data and online search data using the Pearson correlation coefficient, and applied the Moving Epidemic Method (MEM) to identify country-specific epidemic thresholds. Additionally, we established pseudo-RSV surveillance systems that enable internal stakeholders to gain insight into the speed and risk of emerging RSV outbreaks in countries where disease surveillance is imprecise but Google Trends data are available. We observed strong correlations between RSV clinical surveillance data and Google Trends search results in several countries. When MEM was used to monitor an upcoming RSV outbreak, data from both systems yielded similar estimates of country-specific epidemic thresholds, start times, and durations. This study demonstrates the potential of leveraging online search data to monitor disease outbreaks in real time and to complement classical disease surveillance systems.
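As a rough illustration of the correlation step described above, the following Python sketch computes a Pearson correlation coefficient between a weekly clinical case series and a weekly search-interest series. The weekly values and the names `pearson_r`, `weekly_rsv_cases`, and `weekly_search_index` are hypothetical and invented for this example; the study's actual data, keywords, and processing are not given in the abstract.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products of deviations and the two sums of squares.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical weekly values, invented for illustration only.
weekly_rsv_cases = [12, 15, 30, 58, 90, 71, 40, 18]
weekly_search_index = [10, 14, 25, 50, 100, 80, 35, 20]

r = pearson_r(weekly_rsv_cases, weekly_search_index)
print(f"Pearson r = {r:.3f}")
```

A value of r close to 1 would indicate that search interest rises and falls together with reported cases. The sketch covers only the correlation step and does not implement MEM.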
Pneumonia is the leading communicable cause of death worldwide. Accurate prognostication of severity in patients with community-acquired pneumonia (CAP) allows better patient care and hospital management. The Pneumonia Severity Index (PSI) was developed in 1997 as a tool to guide clinical practice by stratifying the severity of patients with CAP. While the PSI has been evaluated against other clinical stratification tools, it has not been evaluated against multiple classic machine learning classifiers across various metrics on a large sample. In this paper, we evaluated and compared the prediction performance of nine classic machine learning classifiers with that of the PSI on 34,720 adult (age 18+) patient records collected from 749 hospitals in the United States from 2009 to 2018, using Receiver Operating Characteristic (ROC) Area Under the Curve (AUC) and Average Precision (Precision-Recall AUC, or PR AUC). Machine learning classifiers, such as Random Forest, provided a significant improvement (~29% in PR AUC and ~5% in ROC AUC) over the PSI and required only 7 input values (compared with the 20 parameters used in the PSI). There were also statistically significant differences (p<0.05) between Random Forest and the PSI across various races/ethnicities. Because of its ease of use, the PSI remains a very strong clinical decision tool, but machine learning classifiers can provide better prediction performance. Comparing prediction performance across multiple metrics, such as PR AUC, instead of ROC AUC alone, can provide additional insight.
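To illustrate why the two metrics named above can tell different stories, the following Python sketch computes ROC AUC and average precision from scratch on a small imbalanced example. The labels, scores, and function names are hypothetical and invented for this example; they are not the study's data, models, or evaluation code, and the average precision function assumes there are no tied scores.

```python
def roc_auc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision values at the rank of each positive (assumes no tied scores)."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            precision_sum += hits / rank  # precision among the top `rank` items
    if hits == 0:
        raise ValueError("need at least one positive label")
    return precision_sum / hits


# Hypothetical imbalanced example: 2 positives among 10 records.
labels = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]

auc = roc_auc(labels, scores)
ap = average_precision(labels, scores)
print(f"ROC AUC = {auc:.3f}, average precision = {ap:.3f}")
```

ROC AUC is judged against a chance level of 0.5 regardless of class balance, whereas average precision is judged against the prevalence of the positive class (0.2 in this toy example). This is one reason the two metrics can rank classifiers differently when the outcome is rare.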