OHDSI, a fast-growing open-science research community, seeks to enable researchers from around the globe to conduct network studies based on standardized data and vocabularies. No comprehensive review is available of publications about OHDSI’s standard, the OMOP Common Data Model (CDM), and its usage. In this work, we aim to close this gap and provide a summary of existing publications, including an analysis of their meta-information, such as the choice of journals, journal types, and countries, as well as an analysis by topic based on title and abstract screening. Since 2016, the number of publications has grown steadily, and the relevance of the OMOP CDM is increasing for multi-country studies based on observational patient data.
Background Many research initiatives aim at using data from electronic health records (EHRs) in observational studies. Participating sites of the German Medical Informatics Initiative (MII) established data integration centers to integrate EHR data within research data repositories to support local and federated analyses. To address concerns regarding possible data quality (DQ) issues of hospital routine data compared with data specifically collected for scientific purposes, we have previously presented a data quality assessment (DQA) tool providing a standardized approach to assess DQ of the research data repositories at the MIRACUM consortium's partner sites.
Objectives Major limitations of the former approach included manual interpretation of the results and hard coding of analyses, which made their expansion to new data elements and databases time-consuming and error-prone. Here we present an enhanced version of the DQA tool that is linked to common data element definitions stored in a metadata repository (MDR) and adopts the harmonized DQA framework of Kahn et al., and we describe its application within the MIRACUM consortium.
Methods Data quality checks were consistently aligned to a harmonized DQA terminology. Database-specific information was systematically identified and represented in an MDR. Furthermore, a structured representation of logical relations between data elements was developed to model plausibility statements in the MDR.
Results The MIRACUM DQA tool was linked to data element definitions stored in a consortium-wide MDR. Additional databases used within MIRACUM were linked to the DQ checks by extending the respective data elements in the MDR with the required information. The evaluation of DQ checks was automated. An adaptable software implementation is provided with the R package DQAstats.
Conclusion The enhancements of the DQA tool facilitate the future integration of new data elements and make the tool scalable to other databases and data models. The tool has been provided to all ten MIRACUM partners and has been successfully deployed and integrated into their respective data integration center infrastructures.
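To illustrate what a structured representation of a logical relation between two data elements might look like, the following is a minimal Python sketch. It is not the schema used in the MIRACUM MDR, nor the implementation in the DQAstats R package; the class `PlausibilityRule`, the function `find_violations`, the field names, and the sample records are all hypothetical and serve only to show how a plausibility statement of the form "if element A has value x, then element B must hold one of a set of allowed values" could be stored as data and evaluated automatically.

```python
from dataclasses import dataclass
from typing import Any, Dict, FrozenSet, List


# Hypothetical rule structure (assumption): one conditional relation
# between two data elements, stored as data rather than hard-coded logic.
@dataclass(frozen=True)
class PlausibilityRule:
    name: str
    if_element: str                 # data element that triggers the rule
    if_value: Any                   # value of the triggering element
    then_element: str               # data element that is constrained
    allowed_values: FrozenSet[Any]  # values considered plausible


def find_violations(rule: PlausibilityRule,
                    records: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return records that match the rule's condition but hold an
    implausible (or missing) value in the constrained element."""
    return [
        r for r in records
        if r.get(rule.if_element) == rule.if_value
        and r.get(rule.then_element) not in rule.allowed_values
    ]


# Illustrative rule: a delivery diagnosis code is only plausible
# together with female sex. Codes and values are example data.
rule = PlausibilityRule(
    name="delivery_code_requires_female_sex",
    if_element="diagnosis_code",
    if_value="O80",
    then_element="sex",
    allowed_values=frozenset({"female"}),
)

# Made-up sample records, not taken from any real data set.
records = [
    {"patient_id": 1, "diagnosis_code": "O80", "sex": "female"},
    {"patient_id": 2, "diagnosis_code": "O80", "sex": "male"},
    {"patient_id": 3, "diagnosis_code": "I10", "sex": "male"},
]

violations = find_violations(rule, records)
print([r["patient_id"] for r in violations])  # prints [2]
```

Keeping the rule as plain data is the point of the sketch: a new plausibility statement could then be added by creating another rule entry, without changing the evaluation code.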