2017
DOI: 10.1101/229070
Preprint

Localization of adaptive variants in human genomes using averaged one-dependence estimation

Abstract: Statistical methods for identifying adaptive mutations from population-genetic data face several obstacles: assessing the significance of genomic outliers, integrating correlated measures of selection into one analytic framework, and distinguishing adaptive variants from hitchhiking neutral variants. Here, we introduce SWIF(r), a probabilistic method that detects selective sweeps by learning the distributions of multiple selection statistics under different evolutionary scenarios and calculating the posterior …

Cited by 20 publications
(34 citation statements)
References 142 publications
“…Our model, similar to others (e.g., Lin et al, 2011; Schrider and Kern, 2016b; Sugden et al, 2018), not only assigns class labels, but also provides a probability for each of the K classes. A properly-calibrated classifier should be one in which the probability of observing a given class is the actual fraction of times that the classifier chooses this class.…”
Section: Calibrating Class Probabilities (mentioning)
confidence: 99%
“…Several methods have recently been developed that incorporate information from multiple summary statistics to locate positively-selected genomic regions (Lin et al, 2011; Ronen et al, 2013; Pybus et al, 2015; Schrider and Kern, 2016b; Sheehan and Song, 2016; Kern and Schrider, 2018; Sugden et al, 2018). Most existing supervised learning approaches for detecting sweeps use combinations of summary statistics calculated in genomic windows of simulated chromosomes to train classifiers using methods such as support vector machines, random forests, neural networks, and boosting.…”
Section: Introduction (mentioning)
confidence: 99%
“…[9][10][11]). The hitch-hiking effect provides a key signature of selection in modern datasets [12]. [12,13].…”
Section: Introduction (mentioning)
confidence: 99%
“…However, a major disadvantage of such approaches is that the amount of simulation necessary to obtain an accurate estimate grows dramatically with the dimensionality of the observed data (for a discussion, see e.g. [36]); similar issues arise in the process of training machine learning methods (e.g. [22]), requiring considerations to prevent overfitting and avoid excessive simulation.…”
Section: Introduction (mentioning)
confidence: 99%
“…Instead, we recommend that the reader take our chi-squared-distributed P-values with a grain of salt, and merely use them as a way to prioritize regions for more extensive downstream modeling and validation (for example, using methods like those described in refs. [51][52][53]).…”
Section: Selection Of Candidate Regions (mentioning)
confidence: 99%