There is a growing need to semantically process and integrate clinical data from different sources for clinical research. This paper presents an approach to integrate EHRs from heterogeneous sources and to generate integrated data in different formats or semantics to support various clinical research applications. The proposed approach builds semantic data virtualization layers on top of the data sources, which generate data in the requested semantics or formats on demand. This avoids dumping the data upfront into, and keeping it synchronized across, multiple representations. Data from different EHR systems are first mapped to RDF data with source semantics and then converted to representations with harmonized domain semantics, where domain ontologies and terminologies are used to improve reusability. Data can be further converted to application semantics and stored in clinical research databases, e.g., i2b2 or OMOP, to support different clinical research settings. Semantic conversions between representations are expressed explicitly as N3 rules and executed by an N3 reasoner (EYE), which can also generate proofs of the conversion processes. The solution presented in this paper has been applied to real-world applications that process large-scale EHR data.
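The paper expresses its conversions as N3 rules executed by the EYE reasoner; the Python sketch below only approximates the two-step idea (lift a source record into RDF with source semantics, then apply a declarative conversion to harmonized domain semantics) using rdflib, with a SPARQL CONSTRUCT query standing in for one conversion rule. All namespaces, predicates, and codes are hypothetical placeholders, not the vocabularies used in the paper.

```python
# Illustrative sketch only: the paper uses N3 rules executed by the EYE reasoner;
# here a SPARQL CONSTRUCT query stands in for one such conversion rule.
# All namespaces and codes below are hypothetical placeholders.
from rdflib import Graph, Literal, Namespace, RDF, URIRef

SRC = Namespace("http://example.org/source-ehr#")   # hypothetical source semantics
DOM = Namespace("http://example.org/domain#")       # hypothetical harmonized semantics

# Step 1: map a raw EHR record to RDF with source semantics.
record = {"patient_id": "p42", "dx_code": "401.9"}  # toy source record
g = Graph()
patient = URIRef(f"http://example.org/patient/{record['patient_id']}")
g.add((patient, RDF.type, SRC.Patient))
g.add((patient, SRC.localDiagnosisCode, Literal(record["dx_code"])))

# Step 2: convert to harmonized domain semantics. The CONSTRUCT query plays the
# role of one declarative conversion rule (the paper would express this in N3).
CONVERT = """
PREFIX src: <http://example.org/source-ehr#>
PREFIX dom: <http://example.org/domain#>
CONSTRUCT {
  ?p a dom:Patient ;
     dom:hasCondition [ a dom:Condition ; dom:code ?code ] .
}
WHERE {
  ?p a src:Patient ;
     src:localDiagnosisCode ?code .
}
"""
harmonized = Graph()
for triple in g.query(CONVERT):
    harmonized.add(triple)

print(harmonized.serialize(format="turtle"))
```

A real deployment would generate such conversions on demand per request rather than materializing the harmonized graph, which is the point of the virtualization layers described above.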
Because they depend mostly on voluntarily submitted spontaneous reports, pharmacovigilance studies are hampered by the low quantity and quality of patient data. Our objective is to improve postmarket safety studies by enabling safety analysts to seamlessly access a wide range of EHR sources, collect deidentified medical data sets for selected patient populations, and trace reported incidents back to the original EHRs. We have developed an ontological framework in which EHR sources and target clinical research systems continue to use their own local data models, interfaces, and terminology systems, while structural and semantic interoperability are handled through rule-based reasoning on formal representations of the different models and terminology systems maintained in the SALUS Semantic Resource Set. The SALUS Common Information Model, at the core of this set, acts as the common mediator. We demonstrate the capabilities of our framework through one of the SALUS safety analysis tools, the Case Series Characterization Tool, which has been deployed on top of the regional EHR data warehouse of the Lombardy Region, containing about 1 billion records from 16 million patients, and validated by several pharmacovigilance researchers with real-life cases. The results confirm significant improvements in signal detection and evaluation compared with traditional methods that lack this background information.
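As a rough illustration of the mediation idea only (not the actual SALUS models, interfaces, or terminology resources), the sketch below shows how a common information model can decouple a source's local data model and codes from the consuming analysis tool: the source record is lifted into common-model fields and its local codes are translated through a centrally maintained mapping. All field names and code mappings here are invented for the example.

```python
# Rough illustration of model/terminology mediation; field names and code
# mappings are invented and do not reflect the actual SALUS resources.
from dataclasses import dataclass

# Hypothetical slice of a common information model (the mediator schema).
@dataclass
class CommonCondition:
    patient_id: str
    condition_code: str   # harmonized terminology, e.g. an ICD-10 code
    code_system: str

# Hypothetical, centrally maintained terminology mapping (local -> harmonized).
LOCAL_TO_ICD10 = {
    "LOC-123": ("I10", "ICD-10"),   # local hypertension code -> ICD-10 I10
}

def lift_to_common_model(source_row: dict) -> CommonCondition:
    """Translate a source-specific row into the common information model."""
    code, system = LOCAL_TO_ICD10[source_row["local_dx"]]
    return CommonCondition(
        patient_id=source_row["pid"],
        condition_code=code,
        code_system=system,
    )

# A safety-analysis tool only ever queries the common model, never the local one.
print(lift_to_common_model({"pid": "p42", "local_dx": "LOC-123"}))
```

In the framework described above, this translation is not hard-coded per source but derived by rule-based reasoning over formal representations of the local and common models.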
Background: Machine learning algorithms are currently used in a wide array of clinical domains to produce models that predict clinical risk events. Most models are developed and evaluated with retrospective data, very few are evaluated in a clinical workflow, and even fewer report performance across different hospitals. In this study, we provide detailed evaluations of clinical risk prediction models in live clinical workflows for three different use cases in three different hospitals.
Objective: The main objective of this study was to evaluate clinical risk prediction models in live clinical workflows and to compare their performance in these settings with their performance on retrospective data. We also aimed to generalize the results by applying our investigation to three different use cases in three different hospitals.
Methods: We trained clinical risk prediction models for three use cases (delirium, sepsis, and acute kidney injury) in three different hospitals with retrospective data. We used machine learning, specifically deep learning, to train models based on the Transformer architecture. The models were trained using a calibration tool common to all hospitals and use cases: they shared a common design but were calibrated with each hospital's specific data. The models were deployed in the three hospitals and used in daily clinical practice. The predictions made by these models were logged and correlated with the diagnosis at discharge. We compared their performance with evaluations on retrospective data and conducted cross-hospital evaluations.
Results: The performance of the prediction models on data from live clinical workflows was similar to their performance on retrospective data. The average area under the receiver operating characteristic curve (AUROC) decreased slightly, by 0.6 percentage points (from 94.8% to 94.2% at discharge). The cross-hospital evaluations showed severely reduced performance: the average AUROC decreased by 8 percentage points (from 94.2% to 86.3% at discharge), which indicates the importance of calibrating the model with data from the deployment hospital.
Conclusions: Calibrating the prediction model with data from each deployment hospital led to good performance in live settings. The performance degradation in the cross-hospital evaluation reveals the limitations of developing one generic model for different hospitals. Designing a generic model-development process that generates a specialized prediction model for each hospital helps ensure model performance across hospitals.
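The evaluation protocol (per-hospital calibration versus cross-hospital transfer, compared by AUROC) can be sketched as follows. The study itself uses Transformer-based deep models on EHR data; this sketch substitutes a logistic-regression stand-in on synthetic data purely to show how the AUROC matrix over training and deployment hospitals would be computed.

```python
# Sketch of the cross-hospital evaluation protocol only: a logistic-regression
# stand-in on synthetic data replaces the study's Transformer-based models.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
hospitals = ["A", "B", "C"]

def synthetic_cohort(shift: float, n: int = 2000):
    """Toy cohort; `shift` mimics hospital-specific population/coding drift."""
    X = rng.normal(loc=shift, scale=1.0, size=(n, 10))
    logits = X[:, 0] - 0.5 * X[:, 1] + shift
    y = (rng.random(n) < 1 / (1 + np.exp(-logits))).astype(int)
    return X, y

cohorts = {h: synthetic_cohort(shift=i * 0.8) for i, h in enumerate(hospitals)}

# Train one model per hospital (per-hospital calibration), then evaluate every
# model on every hospital's data: the diagonal corresponds to the calibrated
# setting, the off-diagonal entries to the cross-hospital transfer setting.
for train_h in hospitals:
    X_tr, y_tr = cohorts[train_h]
    model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    for eval_h in hospitals:
        X_ev, y_ev = cohorts[eval_h]
        auroc = roc_auc_score(y_ev, model.predict_proba(X_ev)[:, 1])
        print(f"train={train_h} eval={eval_h} AUROC={auroc:.3f}")
```

The drop between diagonal and off-diagonal entries of such a matrix is the quantity the study reports as the cross-hospital performance degradation.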