Introduction
Adverse event (AE) under-reporting has been a recurrent issue raised during health authority Good Clinical Practice (GCP) inspections and audits. Moreover, safety under-reporting poses a risk to patient safety and data integrity. Current clinical Quality Assurance (QA) practices for detecting AE under-reporting rely heavily on investigator site and study audits. Yet several sponsors and institutions have had repeated findings related to safety reporting, which has led to delays in regulatory submissions. Recent developments in data management and IT systems allow data scientists to apply techniques such as machine learning to detect AE under-reporting in an automated fashion.
Objective
In this project, we developed a predictive model that enables Roche/Genentech Quality Program Leads to oversee AE reporting at the program, study, site, and patient level. The project was part of a broader effort at Roche/Genentech Product Development Quality to apply advanced analytics to augment and complement traditional clinical QA approaches.
Method
We used a curated data set from 104 completed Roche/Genentech-sponsored clinical studies to train a machine learning model to predict the expected number of AEs. Our final model used 54 features built on patient attributes (e.g., demographics, vitals) and study attributes (e.g., molecule class, disease area).
Results
To evaluate model performance, we tested how well the model detected simulated test cases based on data not used for training. For the relevant simulation scenarios of 25%, 50%, and 75% under-reporting at the site level, the model scored an area under the curve (AUC) of the receiver operating characteristic (ROC) curve of 0.62, 0.79, and 0.92, respectively.
Conclusion
The model has been deployed to evaluate safety reporting performance in a set of ongoing studies in the form of a QA dashboard/cockpit available to Roche Quality Program Leads. Applicability and production performance will be assessed over the next 12–24 months, during which we will develop a validation strategy to fully integrate the model into Roche QA processes.
Electronic supplementary material
The online version of this article (10.1007/s40264-019-00831-4) contains supplementary material, which is available to authorized users.
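The evaluation approach described above (scoring simulated under-reporting scenarios with ROC AUC) can be illustrated with a short sketch. The abstract does not publish the underlying code; the Poisson tail-probability detection score, the synthetic per-site counts, and all names below are illustrative assumptions, not the paper's actual model.

```python
# Sketch: evaluating an AE under-reporting detector on simulated scenarios.
# In the paper, `expected` comes from the trained model; here both the
# expected and the observed per-site AE counts are synthetic.
import numpy as np
from scipy.stats import poisson
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n_sites = 500
expected = rng.gamma(shape=3.0, scale=5.0, size=n_sites)  # model-predicted AE counts
observed = rng.poisson(expected).astype(float)            # fully reported AE counts

def auc_for_scenario(under_reporting: float) -> float:
    """Flag half the sites as under-reporting and score the detector."""
    labels = np.zeros(n_sites, dtype=int)
    flagged = rng.choice(n_sites, size=n_sites // 2, replace=False)
    labels[flagged] = 1
    counts = observed.copy()
    counts[flagged] *= 1.0 - under_reporting  # remove a fraction of the AEs
    # Detection score: Poisson tail probability of seeing this few AEs given
    # the model's expectation -- the smaller the tail, the more suspicious.
    tail = poisson.cdf(counts, expected)
    return roc_auc_score(labels, 1.0 - tail)

for scenario in (0.25, 0.50, 0.75):
    print(f"{scenario:.0%} under-reporting: ROC AUC = {auc_for_scenario(scenario):.2f}")
```

As in the reported results, heavier under-reporting separates more cleanly from natural variation, so the AUC rises with the simulated under-reporting fraction.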
Background
The increasing number and complexity of clinical trials make it challenging to detect and identify clinical quality issues in a timely manner. Despite extensive sponsor audit programs and monitoring activities, issues related to data integrity, safety, sponsor oversight, and patient consent remain recurring audit and inspection findings. Recent developments in data management and IT systems allow statistical modeling to provide insights to clinical Quality Assurance (QA) professionals and help mitigate some of the key clinical quality issues more holistically and efficiently.
Methods
We used findings from a curated data set of Roche/Genentech operational and quality assurance study data covering a span of 8 years (2011-2018) and grouped them into five clinical impact factor categories, for which we modeled the risk with logistic regression using hand-crafted features.
Results
We trained five interpretable, cross-validated models with several distinguished risk factors, many of which confirmed field observations of our quality professionals. Despite a low signal-to-noise ratio in our data set, our models reliably predicted decreases in risk of 12-44%, with 2-8 coefficients each.
Conclusion
We propose a modeling strategy that can provide insights to clinical QA professionals and help them mitigate key clinical quality issues (e.g., safety, consent, data integrity) in a more sustained, data-driven way, turning the traditionally reactive approach into a more proactive monitoring and alerting approach. We also call for cross-sponsor collaboration and data sharing to improve and further validate the use of statistical models in clinical QA.
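As a rough illustration of the modeling strategy (an interpretable, cross-validated logistic regression per impact factor category over hand-crafted features), here is a minimal sketch. The feature names, the synthetic outcome, and the L1-sparsity choice are assumptions for illustration, not the paper's actual pipeline.

```python
# Sketch: one interpretable, cross-validated logistic regression per clinical
# impact factor category. Features and outcome are synthetic placeholders.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegressionCV

rng = np.random.default_rng(1)
n = 1000
X = pd.DataFrame({
    "n_open_issues": rng.poisson(2, n),                 # hypothetical hand-crafted
    "site_enrollment_rate": rng.normal(1.0, 0.3, n),    # features; real ones are
    "days_since_last_audit": rng.integers(30, 720, n),  # study-specific
})
# Synthetic outcome: 1 = a finding in this impact-factor category.
logits = 0.8 * (X["n_open_issues"] - 2) - 1.5 * (X["site_enrollment_rate"] - 1.0)
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logits))).astype(int)

# An L1 penalty keeps the model sparse (few non-zero coefficients), matching
# the interpretability goal; cross-validation picks the penalty strength.
model = LogisticRegressionCV(Cs=10, cv=5, penalty="l1", solver="liblinear",
                             scoring="roc_auc").fit(X, y)
for name, coef in zip(X.columns, model.coef_[0]):
    if coef != 0.0:
        print(f"{name}: odds ratio per unit = {np.exp(coef):.2f}")
```

Sparse coefficients reported as odds ratios are one way to obtain models with only a handful of terms each, consistent with the 2-8 coefficients described above.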
Dear Editor,
In a previous project [1], we developed a predictive model that enabled Roche/Genentech quality leads to oversee adverse event (AE) reporting. External clinical trial datasets such as Project Data Sphere (PDS) [2] allowed us to further test our machine learning-based approach, alleviating concerns of overfitting and demonstrating the reproducibility of our research. Our primary objective was to further validate our model for the detection of AE under-reporting using PDS data. Our secondary objective was to build an oncology-specific model using a combined dataset of Roche and PDS data. The scope remained the prediction of AEs (not adverse drug reactions) that occur in clinical trials. Good clinical practice requires all AEs, regardless of the causal relationship between drug intake and the event, to be reported in a timely manner [3].
The curation process of downloadable PDS studies (as of November 2019) left five studies that fulfilled our data requirements, as sponsors are not required to share full datasets. They were large phase III trials comprising 742 investigator sites, 2363 subjects, and 51,847 visits. Hence, we could use PDS data to achieve our objectives. The oncology-specific model was built using the methodology described in our previous manuscript [1]. We used a combined dataset of 53 completed oncology studies (Roche + PDS). Our final model used 38 features built from patient and study attributes.
To test whether our model can be applied to non-Roche studies, we compared the quality of the predictions using a scatter plot (Fig. 1a) and found that, within a range of 0-150 on both axes (>94% of all data points, our region of interest [ROI]), the predictions matched the observed values for both datasets equally well. To quantify the goodness of fit, we used scale-independent performance metrics, which are adequate for comparing the goodness of fit of different datasets used by the same model [4]: the symmetric mean absolute percentage error (SMAPE) [5] and the symmetric mean absolute Poisson significance level (SMASL). The latter is calculated by subtracting 0.5 from each Poisson significance level measurement, taking the absolute value, and averaging. SMASL puts equal weight on over- and under-predicting and ranges from 0 to 0.5 (the smaller the value, the better the fit). Considering SMAPE, average predictions for the PDS study sites were slightly better than for the Roche study sites, whereas the reverse was true for SMASL (Fig. 1b). We concluded that the goodness of fit for both datasets using our model was very similar within the ROI.
For the secondary objective, we tested how well the oncology model (using Roche and PDS data and the same algorithm [1]) detected simulated test cases on data not used for model training. For the relevant simulation scenarios of 25%, 50%, and 75% under-reporting at the site level, the model scored an area under the curve (AUC) of the receiver operating characteristic curve of 0.60, 0.77, and 0.90, respectively. These AUC values were on...
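The two fit metrics can be written compactly. SMAPE follows its standard definition, and SMASL follows the definition quoted above; using the Poisson cumulative distribution function as the "significance level" is our assumption of one plausible reading, and the example counts are invented.

```python
# Sketch: the two scale-independent goodness-of-fit metrics. `observed` are
# per-site AE counts, `expected` the model predictions. Using the Poisson CDF
# as the significance level is an assumption about the exact definition.
import numpy as np
from scipy.stats import poisson

def smape(observed, expected):
    """Symmetric mean absolute percentage error (standard definition)."""
    o, e = np.asarray(observed, float), np.asarray(expected, float)
    return np.mean(2.0 * np.abs(e - o) / (np.abs(o) + np.abs(e)))

def smasl(observed, expected):
    """Symmetric mean absolute Poisson significance level: subtract 0.5 from
    each significance level, take absolute values, and average. Ranges from
    0 to 0.5; smaller is better, and over- and under-prediction weigh equally."""
    p = poisson.cdf(np.asarray(observed), np.asarray(expected, float))
    return np.mean(np.abs(p - 0.5))

obs = [12, 7, 30, 1, 18]   # hypothetical observed AE counts per site
exp = [10.5, 8.0, 25.0, 2.0, 20.0]  # hypothetical model predictions
print(f"SMAPE = {smape(obs, exp):.3f}, SMASL = {smasl(obs, exp):.3f}")
```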
Background
The European Medicines Agency Good Pharmacovigilance Practices (GVP) guidelines provide a framework for pharmacovigilance (PV) audits, including limited guidance on risk assessment methods. Quality assurance (QA) teams of large and medium-sized pharmaceutical companies generally conduct annual risk assessments of the PV system, based on retrospective review of data and pre-defined impact factors, to plan PV audits; this requires a high volume of manual work and resources. In addition, for companies of this size, auditing the entire “universe” of individual entities on an annual basis is generally prohibitive due to sheer volume. A risk assessment approach that enables efficient, temporal, and targeted PV audits is not currently available.
Methods
In this project, we developed a statistical model to enable holistic and efficient risk assessment of certain aspects of the PV system. We used findings from a curated data set of Roche operational and quality assurance PV data covering a span of over 8 years (2011–2019), and we modeled the risk with a logistic regression on quality PV risk indicators defined as data-stream statistics over sliding windows.
Results
We produced a model for each PV impact factor (e.g., 'Compliance to Individual Case Safety Report') for which we had enough features. For PV impact factors where modeling was not feasible, we used descriptive statistics. All outputs were consolidated and displayed in a QA dashboard built on Spotfire®.
Conclusion
The model has been deployed as a quality decisioning tool available to Roche Quality professionals. It is used, for example, to inform decisions on which affiliates (i.e., pharmaceutical company commercial entities) undergo audit for PV activities. The model will be continuously monitored and fine-tuned to ensure its reliability.
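A small sketch of what "risk indicators defined as data-stream statistics over sliding windows" could look like in practice. The table layout, column names, and the 180-day window are hypothetical; in the described setup, indicators like these would become inputs to the per-impact-factor logistic regression.

```python
# Sketch: sliding-window "data stream" statistics as PV risk indicators,
# computed per affiliate. The table, column names, and the 180-day window
# are illustrative assumptions; real indicators would feed the regression.
import pandas as pd

cases = pd.DataFrame({
    "affiliate": ["DE", "DE", "DE", "FR", "FR"],
    "received": pd.to_datetime(["2019-01-02", "2019-03-15", "2019-06-01",
                                "2019-02-10", "2019-05-20"]),
    "days_to_submit": [10, 20, 3, 40, 12],  # ICSR processing time
    "late": [0, 1, 0, 1, 0],                # 1 = submitted past the deadline
})

indicators = (
    cases.set_index("received").sort_index()
         .groupby("affiliate")[["late", "days_to_submit"]]
         .rolling("180D").mean()  # per-affiliate sliding 180-day window
         .rename(columns={"late": "late_rate",
                          "days_to_submit": "mean_days_to_submit"})
)
print(indicators)  # one row per case, indexed by (affiliate, received date)
```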