Motivation: Record linkage has versatile applications in real-world data analysis, where several data sets need to be linked at the record level in the absence of any exact identifier connecting related records. One example is patient databases spread across institutions that have to be linked on personally identifiable entries such as name, date of birth, or ZIP code. At the same time, privacy laws may prohibit the exchange of this personally identifiable information (PII) across institutional boundaries, ruling out the outsourcing of the record linkage task to a trusted third party. We propose to employ privacy-preserving record linkage (PPRL) techniques that prevent, to various degrees, the leakage of PII while still allowing for the linkage of related records. Results: We develop a framework for fault-tolerant PPRL using secure multi-party computation (sMPC) with the medical record keeping software Mainzelliste as the data source. Our solution does not rely on any trusted third party, and all PII is guaranteed not to leak under common cryptographic security assumptions. Benchmarks show the feasibility of our approach in realistic networking settings: linkage of a patient record against a database of 10,000 records can be done in 48 s over a heavily delayed (100 ms) network connection, or in 3.9 s over a low-latency connection. Availability and implementation: The source code of the sMPC node is freely available on GitHub at https://github.com/medicalinformatics/SecureEpilinker subject to the AGPLv3 license. The source code of the modified Mainzelliste is available at https://github.com/medicalinformatics/MainzellisteSEL.
Medical research and treatments rely increasingly on genomic data. Queries on so-called variants are of high importance in, e.g., biomarker identification and general disease association studies. However, the human genome is highly sensitive information that is worth protecting. By observing queries and responses to classical genomic databases, medical conditions can be inferred. The Beacon project is an example of a public genomic querying service, which undermines the privacy of the querier as well as of individuals in the database. By secure outsourcing via secure multi-party computation (SMPC), we enable privacy-preserving genomic database queries that protect sensitive data contained in the queries and their respective responses. At the same time, we allow multiple genomic databases to combine their datasets to achieve a much larger search space, without revealing the actual databases' contents to third parties. SMPC is generic and allows further processing, such as aggregation, to be applied to query results. We measure the performance of our approach for realistic parameters and achieve runtimes fast enough to make our protocol applicable to real-world medical data integration settings. Our prototype implementation can process a private query with 5 genetic variant conditions against a person's exome with 100,000 genomic variants in less than 180 ms online runtime, including additional range and equality checks for auxiliary data.
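To illustrate the aggregation idea mentioned in the abstract above, the following is a minimal sketch of additive secret sharing, one common building block of SMPC. It is not the authors' protocol: the modulus, the function names, and the example match counts are all assumptions made for illustration. Each database splits its private match count into random shares, each computing party adds up the shares it holds, and only the combined total is ever reconstructed.

```python
import secrets

# Hypothetical modulus for the share arithmetic (an assumption for this sketch).
MODULUS = 2**61 - 1


def share(value, n_parties):
    """Split `value` into n additive shares that sum to it modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]


def reconstruct(shares):
    """Recombine all shares; any strict subset looks uniformly random."""
    return sum(shares) % MODULUS


def aggregate(per_database_shares):
    """Party i locally adds the i-th share received from every database."""
    n_parties = len(per_database_shares[0])
    return [
        sum(db_shares[i] for db_shares in per_database_shares) % MODULUS
        for i in range(n_parties)
    ]


# Hypothetical per-database match counts; they are never sent in the clear.
counts = [3, 0, 7]
shared = [share(c, n_parties=2) for c in counts]
total_shares = aggregate(shared)
total = reconstruct(total_shares)
print(total)
```

Addition works on shares without any interaction between the parties, which is why aggregation is cheap in this setting. Comparisons and equality checks, as used for the actual variant queries, need more involved protocols that this sketch does not cover.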
Gromacs is one of the most popular molecular simulation suites currently available. In this contribution we present streaMD, the first interface between Gromacs trajectory files and the statistical language R. The amount of data created by ever-increasing computational power makes fast and efficient analysis of trajectories a challenge, especially as standard approaches such as root-mean-square fluctuations provide only limited physical insight. In our streaMD package, integrating the Gromacs I/O libraries with advanced, graph-based analysis methods such as the Java library Stream improves both speed and analysis depth. We benchmark our results and highlight the applicability of the package on a problem in RNA design, namely the interaction of tetracycline with an aptamer. © 2018 Wiley Periodicals, Inc.
Genomic data is crucial to the understanding of many diseases and to the guidance of medical treatments. Pharmacogenomics and cancer genomics are just two areas of precision medicine in which its use is growing rapidly. At the same time, whole-genome sequencing costs are plummeting below $1000, meaning that a rapid growth in full-genome data storage requirements is foreseeable. While privacy protection of genomic data is receiving growing attention, integrity protection of this long-lived and highly sensitive data has received much less. We consider a scenario inspired by future pharmacogenomics, in which a patient's genome data is stored over a long time period while random parts of it are periodically accessed by authorized parties such as doctors and clinicians. We describe a protection scheme that preserves the integrity of the genomic data in that scenario over a time horizon of 100 years. Over such a long time period, cryptographic schemes will potentially break, and therefore our scheme allows the integrity protection to be updated. Furthermore, the integrity of parts of the genomic data can be verified without compromising the privacy of the remaining data. Finally, a performance evaluation and cost projection show that privacy-preserving long-term integrity protection of genomic data is resource demanding, but within reach of current and future hardware technology, and has negligible storage costs.
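The idea of verifying one part of a data set without handing over the rest can be sketched with a Merkle hash tree. The abstract above does not name its construction, so this is an assumption chosen for illustration, as are SHA-256, the block contents, and all function names. Note also that plain hashes only hide the other blocks computationally and that this sketch has no mechanism for updating the protection once the hash function weakens, which is the part the abstract emphasizes.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(blocks):
    """Return all tree levels, from the leaf hashes up to the single root."""
    level = [h(b"\x00" + b) for b in blocks]  # prefix separates leaves from inner nodes
    levels = [level]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level = level + [level[-1]]  # pad odd levels (simple choice for a sketch)
            levels[-1] = level
        level = [h(b"\x01" + level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def prove(levels, index):
    """Collect the sibling hashes on the path from leaf `index` to the root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify(block, proof, root):
    """Recompute the root from one block and its proof; no other block is needed."""
    node = h(b"\x00" + block)
    for sibling, sibling_is_left in proof:
        node = h(b"\x01" + (sibling + node if sibling_is_left else node + sibling))
    return node == root


# Hypothetical genome split into five blocks.
blocks = [f"variant-block-{i}".encode() for i in range(5)]
levels = build_levels(blocks)
root = levels[-1][0]
proof = prove(levels, 4)
print(verify(blocks[4], proof, root))
```

The verifier sees only the requested block, a logarithmic number of sibling hashes, and the root, so the proof size stays small even for a whole genome.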