2016
DOI: 10.1371/journal.pcbi.1004845

Deep Learning for Population Genetic Inference

Abstract: Given genomic variation data from multiple individuals, computing the likelihood of complex population genetic models is often infeasible. To circumvent this problem, we introduce a novel likelihood-free inference framework by applying deep learning, a powerful modern technique in machine learning. Deep learning makes use of multilayer neural networks to learn a feature-based function from the input (e.g., hundreds of correlated summary statistics of data) to the output (e.g., population genetic parameters of …

Cited by 253 publications
(281 citation statements)
References 67 publications
“…Drosophila demographics is a potential compounding factor because not only derived fruit fly populations have been associated with severe bottlenecks [47], but the ancestral range population in Sub-Saharan Africa is also predicted to have undergone a significant bottleneck [32]. These past events should reduce the magnitude of the effective population size in our framework; our predictions of θ (and using mutation rates per nucleotide from previous mutation-accumulation studies [42,43]) yield N_eff ≈ 10^6, reasonably consistent with the population size estimates from Ref. 32. In summary, we have developed an ABC inference framework for simultaneous genome-wide prediction of selection strengths, mutation rates, and the fraction of viable alleles.…”
Section: Discussion (supporting)
confidence: 70%
“…We conclude that the ABC inference procedure applied to D. melanogaster genomic data has sufficient internal consistency for at least qualitative conclusions regarding the magnitude of mutation and selection forces. This observation does not however preclude the possibility that our results are affected by the phenomena that are not explicitly included into the ABC model formulated above, such as recombination [36-41], demographic effects [32,47], and the assumption that any sequence in the 100-bp window can mutate into every other sequence [21].…”
Section: Evolutionary Parameter Inference In D Melanogaster (mentioning)
confidence: 99%
“…For example, deep learning has been utilized recently to jointly infer population size changes and selection in Drosophila [63] and can be similarly applied to data from human populations. Overall, with the combination of emerging novel methods and high-quality large-scale genetic sequencing, we expect a much more refined and accurate picture of population size history to be inferred in the near future for many more populations, and with the ability to focus on more and more recent history and to account for the X chromosome, admixture events, and joint-analysis of many populations.…”
Section: Discussion (mentioning)
confidence: 99%
“…In genomics, they have been applied to predict the expression of all genes from a carefully selected subset of landmark genes [22], predict enhancers, [23] and to distinguish active enhancers and promoters from background sequences [24]. An early study also applied an architecture with three hidden layers and 60 neurons to estimate historical effective population size and selection for a genomic segment with reasonable results [25]. However, carefully chosen summary statistics were used as input, so there were limited gains from the traditional benefit of a network being able to figure out relevant features from raw data.…”
Section: New Applications To Functional Genomics Data (mentioning)
confidence: 99%