BACKGROUND
Artificial Intelligence (AI) applied to real-world data (RWD), e.g., Electronic Healthcare Records (EHR), has been identified as a promising technical paradigm for the pharmacovigilance (PV) field. There are several applications of AI approaches to RWD; however, most studies focus on unstructured RWD, i.e., they apply Natural Language Processing (NLP) to various data sources (e.g., clinical notes, social media, and blogs). Hence, it is essential to investigate how AI is already applied to structured RWD in PV and how new approaches could enrich the existing methodology.
OBJECTIVE
This manuscript presents a Systematic Literature Review (SLR) of the emerging use of AI on structured RWD for PV purposes, aiming to identify relevant trends and potential research gaps.
METHODS
The SLR methodology is based on the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) methodology. Relevant scientific manuscripts were retrieved from PubMed on January 31, 2024. The included studies were mapped against a set of evaluation criteria, including the AI approaches applied, code availability, description of the data preprocessing pipeline, implementation of trustworthy AI criteria, and clinical validation of the AI models.
RESULTS
The SLR yielded 36 studies. The number of studies has increased markedly since 2019. Most of the articles (64%) focus on Adverse Drug Reaction (ADR) detection procedures for specific adverse effects. Furthermore, more than 90% of the studies used non-symbolic AI approaches, i.e., Machine Learning (ML) and Deep Learning (DL), with an emphasis on classification tasks. Random forest is the most popular ML approach in this review (47%). The most common RWD sources are EHRs (78%). Typically, these data are not available in a widely acknowledged data model that would facilitate interoperability, and they come from proprietary databases; thus, they are not available for reproducing results. Regarding the evaluation criteria, 10% of the studies published their code in public registries, 16% tested their AI models in clinical environments, and 36% provided information about the data preprocessing pipeline. Additionally, in terms of trustworthy AI, 89% of the articles follow at least half of the FUTURE AI initiative guidelines.
CONCLUSIONS
Artificial intelligence, combined with structured real-world data, constitutes a new and promising line of work for drug safety and PV. However, some AI approaches, such as Explainable AI and Causal AI, have not been examined extensively in this field. Moreover, a data preprocessing protocol for real-world data would help support pharmacovigilance processes. Finally, because of the sensitivity of personal data, evaluation procedures have to be investigated further.