2020
DOI: 10.1016/j.fsigen.2020.102259

How to choose sets of ancestry informative markers: A supervised feature selection approach

Abstract (Highlights):
• We provide AIMsetfinder, a tool to systematically select ancestry informative markers (AIMs).
• Simulations of human population structure can be used to assess the performance of AIM selection procedures.
• 17 SNPs identified by AIMsetfinder suffice to classify all African, European, East Asian, and South Asian individuals in the 1000 Genomes Project correctly.

Cited by 27 publications (34 citation statements)
References 48 publications
“…The former comes with 128 SNPs, and we ignore seven tri-allelic SNPs (rs17287498, rs2069945, rs2184030, rs433342, rs4540055, rs5030240, rs12402499), since our methods currently rely on bi-allelic SNPs. It was designed to distinguish Africa, Europe, East Asia, Native America, and Oceania, but was shown to perform well on the 1000 genomes dataset, also for distinguishing South Asia, even when ignoring the tri-allelic SNPs [6]. The latter comes with 55 bi-allelic SNPs and was introduced as a global AIMset differentiating between 73 populations.…”
Section: Data From the 1000 Genomes Project
Citation type: mentioning
confidence: 99%
“…More precisely, we simulate a sample of 400 individuals per island, each with 20 recombining chromosomes, each with about $2.5 \cdot 10^4$ SNPs. From these $\sim 5 \cdot 10^5$ SNPs, we use the step-wise approach from [6] to look for 10 Ancestry Informative Markers (AIMs). When using a naive Bayes approach as in SNIPPER [5], this AIMset gives a vanishing misclassification error for the task of classifying the 3×400 simulated, non-admixed individuals.…”
Section: Classification Fails Frequently For Recently Admixed Individ…
Citation type: mentioning
confidence: 99%
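
The naive Bayes classification step mentioned in the statement above can be illustrated with a minimal sketch. It is not SNIPPER's implementation and does not reproduce the step-wise marker selection from [6]. It assumes bi-allelic SNPs coded as alternate-allele counts (0, 1, or 2), Hardy-Weinberg proportions within each population, independent markers, and a uniform prior over populations. The function names, the pseudocount, and the allele frequencies used in `demo` are hypothetical and chosen only for illustration.

```python
import math
import random


def estimate_frequencies(genotypes, labels, pseudocount=0.5):
    """Estimate the alternate-allele frequency of every SNP per population.

    The pseudocount (an assumption of this sketch) keeps each estimate
    strictly between 0 and 1 so that the logarithms below are defined.
    """
    n_snps = len(genotypes[0])
    freqs = {}
    for pop in sorted(set(labels)):
        rows = [g for g, lab in zip(genotypes, labels) if lab == pop]
        freqs[pop] = [
            (sum(r[j] for r in rows) + pseudocount)
            / (2 * len(rows) + 2 * pseudocount)
            for j in range(n_snps)
        ]
    return freqs


def log_likelihood(genotype, pop_freqs):
    """Log-probability of a genotype: product of Binomial(2, p) terms."""
    total = 0.0
    for g, p in zip(genotype, pop_freqs):
        total += (math.log(math.comb(2, g))
                  + g * math.log(p)
                  + (2 - g) * math.log(1 - p))
    return total


def classify(genotype, freqs):
    """Return the population with the highest likelihood (uniform prior)."""
    return max(freqs, key=lambda pop: log_likelihood(genotype, freqs[pop]))


def demo(n_per_island=400, n_markers=10, seed=1):
    """Simulate three non-admixed islands and return the training accuracy."""
    rng = random.Random(seed)
    base = (0.05, 0.5, 0.95)  # hypothetical, strongly differentiated frequencies
    islands = ["island_1", "island_2", "island_3"]
    genotypes, labels = [], []
    for i, island in enumerate(islands):
        # Rotate the frequency pattern so each marker separates the islands.
        p = [base[(i + j) % 3] for j in range(n_markers)]
        for _ in range(n_per_island):
            # Each genotype is the sum of two independent allele draws.
            genotypes.append(
                [(rng.random() < pj) + (rng.random() < pj) for pj in p]
            )
            labels.append(island)
    freqs = estimate_frequencies(genotypes, labels)
    hits = sum(classify(g, freqs) == lab for g, lab in zip(genotypes, labels))
    return hits / len(labels)
```

Calling `demo()` returns the fraction of the 3×400 simulated individuals assigned to their island of origin. Like the task described in the quoted statement, the sketch evaluates on the same individuals used to estimate the frequencies, so the result is a training accuracy rather than a held-out estimate.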