Introduction
Machine learning (ML) for adverse drug reaction or event (ADR/ADE) prediction is emerging as a promising method for improving the quality of medical care. We aimed to conduct a systematic review that comprehensively summarizes ML-based prediction of ADR/ADE from electronic health records (EHR).
Materials and methods
We systematically searched the PubMed, Web of Science, Embase, and IEEE Xplore databases from database inception to 21 November 2023 to identify eligible studies. Any study that developed an ML model to predict multiple ADR/ADE based on EHR data was included in the final review. Binary accuracy data were extracted for meta-analysis, from which the pooled sensitivity and specificity with their 95% confidence intervals (CI) and the area under the curve (AUC) were derived.
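To illustrate how binary accuracy data translate into sensitivity and specificity with 95% CI, the following is a minimal sketch and not the procedure used in this review. It assumes each study contributes a 2x2 table of counts (TP, FP, FN, TN), pools studies by simply summing the counts, and uses Wilson score intervals. The function names and the example counts are hypothetical; a formal diagnostic meta-analysis would typically fit a bivariate random-effects model instead of this naive pooling.

```python
import math


def wilson_ci(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return centre - half, centre + half


def pooled_accuracy(tables):
    """Naively pool 2x2 tables given as (tp, fp, fn, tn) tuples by summing counts.

    Assumption: studies are treated as one combined sample, which ignores
    between-study heterogeneity.
    """
    tp = sum(t[0] for t in tables)
    fp = sum(t[1] for t in tables)
    fn = sum(t[2] for t in tables)
    tn = sum(t[3] for t in tables)
    sensitivity = tp / (tp + fn)   # proportion of true ADR/ADE cases detected
    specificity = tn / (tn + fp)   # proportion of non-cases correctly ruled out
    return {
        "sensitivity": (sensitivity, wilson_ci(tp, tp + fn)),
        "specificity": (specificity, wilson_ci(tn, tn + fp)),
    }


# Hypothetical counts for two studies, for illustration only.
example_tables = [(80, 30, 20, 170), (45, 10, 15, 130)]
result = pooled_accuracy(example_tables)
for name, (estimate, (low, high)) in result.items():
    print(f"{name}: {estimate:.3f} (95% CI {low:.3f}-{high:.3f})")
```

Summing counts gives larger studies more weight and tends to produce intervals that are too narrow when studies differ, which is why this sketch should be read only as an illustration of the quantities involved.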
Results
A total of 5704 studies were identified, of which 10 met the inclusion criteria. Among the 20 ML methods reported in the included studies, Random Forest (RF) was reported most often (n=9), followed by AdaBoost (n=4), eXtreme Gradient Boosting (n=3), and support vector machine (n=3). The mean AUC for ML prediction was 75.71% (26.00-94.57). RF combined with resampling-based approaches tended to achieve a high AUC, with a mean of 82.92% (94.48-94.57). Length of stay, number of drugs, admission type, age, and use of high-risk drugs, such as antiviral agents and rifamycin, were the common risk factors for ADR/ADE prediction. The pooled estimated AUC of the summary receiver operating characteristic curve was 72.00% (68.00-75.00).
Conclusions
ML algorithms showed acceptable performance in predicting ADR/ADE. More rigorous reporting standards and new ML methods that take into account the unique challenges of ML research could improve future studies and support the application of ML models in clinical practice.