Religious responses to COVID-19 as portrayed in a major news source raise the issue of conflict or cooperation between religious bodies and public health authorities. We compared articles in the New York Times relating to religion and COVID-19 with the COVID-19 statements posted on 63 faith-based organizations’ websites, and with the guidance documents published by the Centers for Disease Control and Prevention (CDC) and the World Health Organization (WHO) specifically for religious bodies. We used computational text analysis to identify and compare sentiments and topics in the three bodies of text. Sentiment analysis showed consistently positive values for the faith-based organizations’ texts throughout the period. The initially negative sentiment of the New York Times’ religion and COVID-19 coverage rose over the period and eventually converged with the consistently positive sentiment of the faith-based documents. In our topic modelling analysis, rank order and regression analysis showed that topic prevalence was similar in the faith-based and public health sources, and that both showed statistically significant differences from the New York Times. We conclude that there is evidence of both narratives and counter-narratives, and that these showed demonstrable shifts over time. Text analysis of public documents shows an alignment of the interests of public health and religious bodies, which can work to the benefit of communities if the parties are trusted and religious messages are consistent with public health communications.
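The abstract does not name the specific tools used, so the following is a minimal sketch of one way to compare topic prevalence across the three document collections, using scikit-learn’s LDA implementation; the sample documents, topic count, and source labels are hypothetical placeholders, not the study’s data or code.

```python
# Minimal sketch (not the study's code): fit one topic model over all three
# collections, then compare per-source topic prevalence, as the abstract describes.
# The documents below are hypothetical placeholders for the real corpora.
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

nyt_docs = [
    "churches suspend worship services as the virus spreads",
    "congregations move services online amid public health restrictions",
]
faith_docs = [
    "our congregation will follow public health guidance and gather online",
    "prayer and community support continue through the pandemic",
]
health_docs = [
    "guidance for communities of faith on gatherings hygiene and distancing",
    "faith based organizations can promote handwashing and safe gatherings",
]

corpus = nyt_docs + faith_docs + health_docs
labels = np.array(
    ["nyt"] * len(nyt_docs)
    + ["faith"] * len(faith_docs)
    + ["public_health"] * len(health_docs)
)

# A single model over the combined corpus keeps topics comparable across sources.
dtm = CountVectorizer(stop_words="english").fit_transform(corpus)
doc_topics = LatentDirichletAllocation(n_components=3, random_state=0).fit_transform(dtm)

# Mean topic proportion per source gives a prevalence ranking for each collection,
# which can then be compared by rank order or regression as in the abstract.
for source in ("nyt", "faith", "public_health"):
    prevalence = doc_topics[labels == source].mean(axis=0)
    print(source, np.argsort(prevalence)[::-1])
```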
Background
Fatal drug overdose surveillance informs prevention but is often delayed because of autopsy report processing and death certificate coding. Autopsy reports contain narrative text describing scene evidence and medical history (similar to preliminary death scene investigation reports) and may serve as early data sources for identifying fatal drug overdoses. To facilitate timely fatal overdose reporting, natural language processing was applied to narrative texts from autopsies.

Objective
This study aimed to develop a natural language processing–based model that predicts the likelihood that an autopsy report narrative describes an accidental or undetermined fatal drug overdose.

Methods
Autopsy reports for all manners of death (2019-2021) were obtained from the Tennessee Office of the State Chief Medical Examiner. The text was extracted from autopsy reports (PDFs) using optical character recognition. Three common narrative text sections were identified, concatenated, and preprocessed (bag-of-words) using term frequency–inverse document frequency scoring. Logistic regression, support vector machine (SVM), random forest, and gradient boosted tree classifiers were developed and validated. Models were trained and calibrated using autopsies from 2019 to 2020 and tested using those from 2021. Model discrimination was evaluated using the area under the receiver operating characteristic curve, precision, recall, F1-score, and F2-score (which prioritizes recall over precision). Calibration was performed using logistic regression (Platt scaling) and evaluated using the Spiegelhalter z test. Shapley additive explanations values were generated for models compatible with this method. In a post hoc subgroup analysis of the random forest classifier, model discrimination was evaluated by forensic center, race, age, sex, and education level.

Results
A total of 17,342 autopsies (n=5934, 34.22% cases) were used for model development and validation. The training set included 10,215 autopsies (n=3342, 32.72% cases), the calibration set included 538 autopsies (n=183, 34.01% cases), and the test set included 6589 autopsies (n=2409, 36.56% cases). The vocabulary set contained 4002 terms. All models showed excellent performance (area under the receiver operating characteristic curve ≥0.95, precision ≥0.94, recall ≥0.92, F1-score ≥0.94, and F2-score ≥0.92). The SVM and random forest classifiers achieved the highest F2-scores (0.948 and 0.947, respectively). The logistic regression and random forest were calibrated (P=.95 and P=.85, respectively), whereas the SVM and gradient boosted tree classifiers were miscalibrated (P=.03 and P<.001, respectively). “Fentanyl” and “accident” had the highest Shapley additive explanations values. Post hoc subgroup analyses revealed lower F2-scores for autopsies from forensic centers D and E. Lower F2-scores were also observed for the American Indian, Asian, ≤14 years, and ≥65 years subgroups, but larger sample sizes are needed to validate these findings.

Conclusions
The random forest classifier may be suitable for identifying potential accidental and undetermined fatal overdose autopsies. Further validation studies should be conducted to ensure early detection of accidental and undetermined fatal drug overdoses across all subgroups.
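As a concrete illustration of the pipeline outlined in the Methods (TF-IDF bag-of-words features, a random forest classifier, Platt-scaling calibration, and AUROC/F2 evaluation), here is a minimal scikit-learn sketch; the toy narratives, labels, and hyperparameters are hypothetical stand-ins, not the study’s data or code.

```python
# Minimal sketch (not the study's code): TF-IDF features, a random forest
# classifier, Platt-scaling calibration, and AUROC/F2 evaluation.
# The narratives and labels below are hypothetical stand-ins for the real data.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.ensemble import RandomForestClassifier
from sklearn.calibration import CalibratedClassifierCV
from sklearn.metrics import roc_auc_score, fbeta_score
from sklearn.pipeline import make_pipeline

# Toy stand-ins for concatenated narrative sections
# (1 = accidental/undetermined fatal drug overdose, 0 = other manner/cause of death).
train_texts = [
    "decedent found unresponsive with drug paraphernalia fentanyl toxicity accident",
    "history of substance use acute combined drug toxicity accidental overdose",
    "motor vehicle collision blunt force injuries of head and torso",
    "atherosclerotic cardiovascular disease natural manner of death",
]
y_train = [1, 1, 0, 0]
test_texts = [
    "found unresponsive acute fentanyl intoxication accident",
    "gunshot wound of chest homicide",
]
y_test = [1, 0]

# Bag-of-words with TF-IDF weighting feeding a random forest, mirroring the Methods.
model = make_pipeline(
    TfidfVectorizer(stop_words="english"),
    RandomForestClassifier(n_estimators=200, random_state=0),
)

# Platt scaling: a sigmoid (logistic) calibrator fitted on held-out predictions.
# With real data this would use the separate calibration split described above.
calibrated = CalibratedClassifierCV(model, method="sigmoid", cv=2)
calibrated.fit(train_texts, y_train)

probs = calibrated.predict_proba(test_texts)[:, 1]
preds = (probs >= 0.5).astype(int)
print("AUROC:", roc_auc_score(y_test, probs))
print("F2:", fbeta_score(y_test, preds, beta=2))
```

With real data, per-term SHAP values for the fitted tree-based models could then be computed to inspect influential vocabulary (e.g., “fentanyl”, “accident”), as the abstract describes.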