2018
DOI: 10.1126/science.aau4832

Identity inference of genomic data using long-range familial searches

Abstract: Consumer genomics databases have reached the scale of millions of individuals. Recently, law enforcement authorities have exploited some of these databases to identify suspects via distant familial relatives. Using genomic data of 1.28 million individuals tested with consumer genomics, we investigated the power of this technique. We project that about 60% of the searches for individuals of European descent will result in a third-cousin or closer match, which theoretically allows their identification using demo…

Cited by 267 publications (241 citation statements)
References 22 publications
“…This material was originally posted on the Coop lab site on May 7th, 2018, soon after the reporting of the arrest of Joseph DeAngelo in the Golden State Killer case, one of the first high-profile uses of long-range familial search. Subsequently, Erlich et al (2018) published a detailed analysis in a large empirical dataset along with a theoretical analysis of a model similar to the one we use here, obtaining results broadly consistent with the ones presented here. Because Erlich and colleagues kindly cited this work when describing their model, we thought it would be appropriate to post this material in a venue where it is more easily cited.…”
Section: Note (supporting)
confidence: 79%
“…These entities generally offer some subset of their services at no charge to uploaders, which helps to grow their databases. Upload services have also been used by law enforcement, with the goal of identifying relatives of the source of a crime-scene sample (Erlich et al, 2018; Edge and Coop, 2019), prompting discussion about genetic privacy (Court, 2018; Ram et al, 2018; Kennett, 2019; Scudder et al, 2019).…”
Section: Introduction (mentioning)
confidence: 99%
“…Previously the concerns have been focused on the potential discrimination by employers and health insurance companies against carriers of certain mutations [13,14]. Recently, a new concern [15] has been raised regarding law enforcement's access to biobank scale genomic databases for solving criminal cases by connecting remote relatives genetically. Our study demonstrated that those two concerns are indeed two faces of the same coin as one's genome can be largely reconstructed by the genomes of his/her genetic cousins traditionally thought as "unrelated".…”
Section: Discussion (mentioning)
confidence: 99%