Introduction and Objective Social media has been proposed as a possibly useful data source for pharmacovigilance signal detection. This study primarily aimed to evaluate the performance of established statistical signal detection algorithms in Twitter/Facebook for a broad range of drugs and adverse events. Methods Performance was assessed using a reference set by Harpaz et al., consisting of 62 US Food and Drug Administration labelling changes, and an internal WEB-RADR reference set consisting of 200 validated safety signals. In total, 75 drugs were studied. Twitter/Facebook posts were retrieved for the period March 2012 to March 2015, and drugs/events were extracted from the posts. We retrieved 4.3 million and 2.0 million posts for the WEB-RADR and Harpaz drugs, respectively. Individual case reports were extracted from VigiBase for the same period. Disproportionality algorithms based on the Information Component or the Proportional Reporting Ratio and crude post/report counting were applied in Twitter/Facebook and VigiBase. Receiver operating characteristic curves were generated, and the relative timing of alerting was analysed. Results Across all algorithms, the area under the receiver operating characteristic curve for Twitter/Facebook varied between 0.47 and 0.53 for the WEB-RADR reference set and between 0.48 and 0.53 for the Harpaz reference set. For VigiBase, the ranges were 0.64–0.69 and 0.55–0.67, respectively. In Twitter/Facebook, at best, 31 (16%) and four (6%) positive controls were detected prior to their index dates in the WEB-RADR and Harpaz references, respectively. In VigiBase, the corresponding numbers were 66 (33%) and 17 (27%). Conclusions Our results clearly suggest that broad-ranging statistical signal detection in Twitter and Facebook, using currently available methods for adverse event recognition, performs poorly and cannot be recommended at the expense of other pharmacovigilance activities. Electronic supplementary material The online version of this article (10.1007/s40264-018-0699-2) contains supplementary material, which is available to authorized users.
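The abstract above names two disproportionality measures without giving formulas. The sketch below uses their commonly cited textbook forms, computed from a 2x2 table of report counts; it is not necessarily the exact configuration used in the study. The function names and the counts are hypothetical, chosen only for illustration.

```python
import math

def prr(a, b, c, d):
    """Proportional Reporting Ratio from a 2x2 table.

    a: reports with the drug and the event
    b: reports with the drug, without the event
    c: reports with other drugs and the event
    d: reports with other drugs, without the event
    """
    return (a / (a + b)) / (c / (c + d))

def information_component(a, b, c, d):
    """Information Component with the usual +0.5 shrinkage (assumed form)."""
    n = a + b + c + d
    expected = (a + b) * (a + c) / n  # expected count if drug and event were independent
    return math.log2((a + 0.5) / (expected + 0.5))

# Hypothetical counts, not taken from the study
a, b, c, d = 20, 980, 100, 98900
prr_value = prr(a, b, c, d)
ic_value = information_component(a, b, c, d)
print(f"PRR = {prr_value:.2f}, IC = {ic_value:.2f}")
```

Both measures compare how often an event is reported with a drug against how often it is reported in the rest of the database; values well above 1 (PRR) or 0 (IC) indicate disproportionate reporting.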
Introduction and Objective Social media has been suggested as a source for safety information, supplementing existing safety surveillance data sources. This article summarises the activities undertaken, and the associated challenges, to create a benchmark reference dataset that can be used to evaluate the performance of automated methods and systems for adverse event recognition. Methods A retrospective analysis of public English-language Twitter posts (Tweets) was performed. We sampled 57,473 Tweets out of 5,645,336 Tweets created between 1 March, 2012 and 1 March, 2015 that mentioned at least one of six medicinal products of interest (insulin glargine, levetiracetam, methylphenidate, sorafenib, terbinafine, zolpidem). Products, adverse events, indications, product-event combinations, and product-indication combinations were extracted and coded by two independent teams of safety reviewers. Results The benchmark reference dataset consisted of 1056 positive controls ("adverse event Tweets") and 56,417 negative controls ("non-adverse event Tweets"). The 1056 adverse event Tweets contained 1396 product-event combinations referring to personal adverse event experiences, comprising 292 different MedDRA® Preferred Terms. Of these, 1171 product-event combinations (83.9%) were confined to four MedDRA® System Organ Classes. Of the adverse event Tweets, 195 (18.5%) contained indication information, comprising 25 different Preferred Terms. Conclusions A manually curated benchmark reference dataset based on Twitter data has been created and is made available to the research community to evaluate the performance of automated methods and systems for adverse event recognition in unstructured free-text information.
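As a rough illustration of how such a benchmark might be used, the sketch below scores a hypothetical adverse event recognizer with precision, recall, and F1. Only the class sizes (1056 positive and 56,417 negative controls) come from the abstract; the function name and the "flag every Tweet" baseline are assumptions made for the example.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

# Class sizes reported for the benchmark reference dataset
POSITIVES = 1056
NEGATIVES = 56417

# Hypothetical baseline that flags every Tweet as an adverse event Tweet
p, r, f1 = precision_recall_f1(tp=POSITIVES, fp=NEGATIVES, fn=0)
print(f"precision = {p:.4f}, recall = {r:.1f}, F1 = {f1:.4f}")
```

Because fewer than 2% of the sampled Tweets are positive controls, this trivial baseline reaches full recall at very low precision, which is why precision and recall are more informative here than plain accuracy.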
Introduction Statistical signal detection is a crucial tool for rapidly identifying potential risks associated with pharmaceutical products. The unprecedented environment created by the coronavirus disease 2019 (COVID-19) pandemic for vaccine surveillance predisposes commonly applied signal detection methodologies to a statistical issue called the masking effect, in which signals for a vaccine of interest are hidden by the presence of other reported vaccines. This masking effect may in turn limit or delay our understanding of the risks associated with new and established vaccines. Objective The aim is to investigate the problem of masking in the context of COVID-19 vaccine signal detection, assessing its impact, extent, and root causes. Methods Based on data underlying the Vaccine Adverse Event Reporting System, three commonly applied statistical signal detection methodologies, and a more advanced regression-based methodology, we investigate the temporal evolution of signals corresponding to five largely recognized adverse events and two potentially new adverse events. Results The results demonstrate that signals of adverse events related to COVID-19 vaccines may be undetected or delayed due to masking when generated by methodologies currently utilized by pharmacovigilance organizations, and that a class of advanced methodologies can partially alleviate the problem. The results indicate that while masking is rare relative to all possible statistical associations, it is much more likely to occur in COVID-19 vaccine signaling, and that its extent, direction, impact, and roots are not static, but rather changing in accordance with the changing nature of data. Conclusions Masking is an addressable problem that merits careful consideration, especially in situations such as COVID-19 vaccine safety surveillance and other emergency use authorization products. 
Supplementary Information The online version contains supplementary material available at 10.1007/s40264-022-01186-z.
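The masking effect described above can be illustrated with a toy Proportional Reporting Ratio calculation. This is a minimal sketch with invented counts and hypothetical names. It shows only how a heavily reported comparator can hide a signal, and it is not the regression-based methodology the abstract refers to.

```python
def prr(event_target, total_target, event_background, total_background):
    """Proportional Reporting Ratio of a target product against a background."""
    return (event_target / total_target) / (event_background / total_background)

# Invented counts: (reports with the event, all reports)
vaccine_of_interest = (30, 1000)    # 3% of its reports mention the event
other_vaccines = (100, 10000)       # 1% of their reports mention the event
masking_vaccine = (2000, 20000)     # 10% of its reports mention the event

# Background that includes the heavily reported vaccine
bg_event = other_vaccines[0] + masking_vaccine[0]
bg_total = other_vaccines[1] + masking_vaccine[1]
prr_masked = prr(*vaccine_of_interest, bg_event, bg_total)

# Background with the masking vaccine excluded
prr_unmasked = prr(*vaccine_of_interest, *other_vaccines)

print(f"PRR with masking vaccine in background: {prr_masked:.2f}")
print(f"PRR with masking vaccine excluded:      {prr_unmasked:.2f}")
```

With these counts the ratio falls below 1 when the masking vaccine is part of the background and rises to about 3 when it is excluded, so the same data can either hide or reveal the signal depending on the comparator.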
Introduction Signal validation in pharmacovigilance is the process of evaluating data to decide whether evidence is sufficient to justify further assessment of a detected signal. During the signal validation process, safety experts in our organization are required to review signals of disproportionate reporting (SDRs) and classify them into one of six predefined categories. Objective This experiment explored the extent to which predictive machine learning (ML) models can support the decision making of safety experts by accurately identifying the most appropriate predefined signal validation category. Methods We extracted cumulative data for six medicinal products, consisting of historic SDR validations and Individual Case Safety Reports, from the company’s safety database for training and testing of the ML model. We implemented a decision tree-based supervised multiclass classifier model termed Gradient Boosted Trees followed by a SHapley Additive exPlanations (SHAP) analysis to mitigate the “black box” effect of the ensemble model by identifying the key predicting features in the model. Following a retrospective analysis, a prospective experiment was conducted to test the model accuracy and user acceptance in a real-life setting. Results The prediction accuracy of our ML model ranged from 83 to 86% over 3 months for the six medicinal products. The applicability of the model was confirmed by the company’s safety experts. Additionally, the systematic predictions provided valuable information to the safety experts and assisted them in reviewing the SDRs efficiently and consistently. Conclusions This experiment demonstrated that it is possible to train a multiclass classification model to accurately predict signal validation categories for SDRs. More importantly, the transparency of the predictions provided by the SHAP analysis led to high acceptance by the safety experts.