2019
DOI: 10.1101/2019.12.11.872051
Preprint

Predicting Geographic Location from Genetic Variation with Deep Neural Networks

Abstract: Most organisms are more closely related to nearby than distant members of their species, creating spatial autocorrelations in genetic data. This allows us to predict the location of origin of a genetic sample by comparing it to a set of samples of known geographic origin. Here we describe a deep learning method, which we call Locator, to accomplish this task faster and more accurately than existing approaches. In simulations, Locator infers sample location to within 4.1 generations of dispersal an…

Cited by 25 publications (46 citation statements)
References 61 publications
“…The tarpan horse came about following admixture between horses native to Europe (modelled as having 28.8-34.2% and 32.2-33.2% CWC ancestry in OrientAGraph 19 and qpAdm 17, respectively) and horses closely related to DOM2. This is consistent with LOCATOR 20 predicting ancestors in western Ukraine (Fig 3c) and refutes previous hypotheses depicting tarpans as the wild ancestor or a feral version of DOM2, or a hybrid with Przewalski's horses 34.…”
Section: Article (supporting)
confidence: 91%
“…This eliminates the possibility of DOM2 ancestors further west than C-PONT and the Dnieper steppes. Furthermore, patterns of spatial autocorrelations in the genetic data 20 indicated Western Eurasia steppes as the most likely geographic location of DOM2 ancestors (Fig. 3c).…”
Section: Article (mentioning)
confidence: 96%
“…Exploring more ways to extract information from genomic data using deep learning is likely to continue being an active area of research not only in terms of designing neural network architectures but also for thinking about how to represent genetic variation. The way that we represent genomic data as an image differs from other approaches that use the actual nucleotide alignment or genotype matrix as their input for CNN training (Battey et al., 2020; Flagel et al., 2018; Suvorov et al., 2019). We chose our data representation because we believed it to be an intuitive summary of pairwise coalescence times between species organized by the pattern of divergence in the underlying phylogeny.…”
Section: Limitations and Future Directions (mentioning)
confidence: 99%
“…This could be used, for example, to ascertain the geographic origin of poached individuals, or to estimate post-natal dispersal. To this end, we used the novel, deep-learning method LOCATOR (Battey, Ralph, & Kern, 2020) to predict the geographic origin of samples without relying upon explicit assumptions about population genetic processes underlying spatial genetic differentiation (Bradburd & Ralph, 2019). This analysis was performed iteratively across each individual, using the remaining samples to train the LOCATOR classifier, with 100 bootstrap pseudo-replicates to assess variance in geolocation.…”
Section: Relative Dispersal By Age and Sex (mentioning)
confidence: 99%
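
The leave-one-out procedure with bootstrap pseudo-replicates described in the last statement can be illustrated with a minimal sketch. This is not the LOCATOR implementation or its API: a simple nearest-neighbour averaging rule stands in for the neural network, the bootstrap is assumed to resample variant sites with replacement (the statement does not say what is resampled), and every name below (`knn_predict`, `leave_one_out_bootstrap`, the toy `geno` and `xy` arrays) is hypothetical.

```python
import numpy as np


def knn_predict(train_geno, train_xy, test_geno, k=3):
    """Stand-in predictor (NOT Locator): average the coordinates of the
    k training samples whose genotypes are closest to the test sample."""
    dists = np.abs(train_geno - test_geno).sum(axis=1)
    nearest = np.argsort(dists)[:k]
    return train_xy[nearest].mean(axis=0)


def leave_one_out_bootstrap(geno, xy, n_boot=100, k=3, seed=0):
    """Hold out each sample in turn, train on the remaining samples, and
    repeat the prediction over n_boot bootstrap pseudo-replicates.

    Assumption: each pseudo-replicate resamples sites (columns) with
    replacement. Returns an array of shape (n_samples, n_boot, 2).
    """
    rng = np.random.default_rng(seed)
    n_samples, n_sites = geno.shape
    preds = np.empty((n_samples, n_boot, 2))
    for i in range(n_samples):
        keep = np.arange(n_samples) != i  # every sample except the held-out one
        for b in range(n_boot):
            sites = rng.integers(0, n_sites, size=n_sites)
            preds[i, b] = knn_predict(
                geno[keep][:, sites], xy[keep], geno[i, sites], k
            )
    return preds


# Toy data: 12 samples, 50 sites, genotypes coded 0/1/2, random coordinates.
toy_rng = np.random.default_rng(1)
xy = toy_rng.uniform(0, 10, size=(12, 2))
geno = toy_rng.integers(0, 3, size=(12, 50))

preds = leave_one_out_bootstrap(geno, xy, n_boot=100)
center = preds.mean(axis=1)  # point estimate per held-out sample
spread = preds.std(axis=1)   # variability across pseudo-replicates
```

The spread across pseudo-replicates gives a rough per-sample measure of how stable the predicted location is; with random toy genotypes it carries no biological meaning.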