2023
DOI: 10.1093/bioinformatics/btad274

Privacy preserving identification of population stratification for collaborative genomic research

Abstract: The rapid improvements in genomic sequencing technology have led to the proliferation of locally collected genomic datasets. Given the sensitivity of genomic data, it is crucial to conduct collaborative studies while preserving the privacy of the individuals. However, before starting any collaborative research effort, the quality of the data needs to be assessed. One of the essential steps of the quality control process is population stratification: identifying the presence of genetic difference in individuals…

Cited by 3 publications (5 citation statements)
References 22 publications
“…For solutions based on differential privacy, it's important to note that even though differential privacy provides high robustness against membership inference attacks [ 44 ]; the most well-known attacks to the GWAS computations [ 45 ]; however, the added noise, necessary to ensure the privacy, comes with a significant reduction in the performances [ 84 ].…”
Section: Discussion Challenges Vision
mentioning
confidence: 99%
“…Therefore, Dervishi et al. [ 45 ] focus on performing population stratification analysis by assigning individuals to corresponding populations in a collaborative study manner while maintaining privacy and thus using PCA. The aim is to correctly classify the samples (individuals) that are present in researchers’s local datasets to the corresponding population cluster based on the data sent by the researcher.…”
Section: Software-based Solutions
mentioning
confidence: 99%
“…PADI : Within the “privacy, accountability and data integrity” thrust (PADI), early results include novel privacy‐preserving techniques applied to sequential data sharing (Jiang, Yilmaz, and Ayday 2023), the sharing of summary statistics from sensitive databases, and collaborative quality control for research databases (Dervishi et al. 2023). In ongoing research, we are leveraging model cards and model ontologies for accountability AI models and their use.…”
Section: Highlighted Accomplishments
mentioning
confidence: 99%
“…Wang et al (Wang et al 2022) proposed a homomorphic encryption method for identifying genetic relationships across parties, but their approach requires kinship computation for all pairs of samples, which does not scale to large datasets. Other previous approaches (Dervishi et al 2023; Glusman et al 2017; Hormozdiari et al 2014; Robinson and Glusman 2018) rely on sharing a limited amount of processed data between parties to find related samples, which sacrifices both privacy and accuracy to some extent. For instance, Dervishi et al (Dervishi et al 2023) introduced a solution in which the parties reveal a subset of SNPs in a shuffled order for their respective samples to estimate the kinship coefficients.…”
mentioning
confidence: 99%
“…Other previous approaches (Dervishi et al 2023; Glusman et al 2017; Hormozdiari et al 2014; Robinson and Glusman 2018) rely on sharing a limited amount of processed data between parties to find related samples, which sacrifices both privacy and accuracy to some extent. For instance, Dervishi et al (Dervishi et al 2023) introduced a solution in which the parties reveal a subset of SNPs in a shuffled order for their respective samples to estimate the kinship coefficients. Robinson and Glusman (2018) and Glusman et al (2017) proposed to compare "fingerprints" obtained by applying a random projection to genomic samples to infer relatedness He et al (2014).…”
mentioning
confidence: 99%