Semi‐permeable barriers to geneflow in principle allow distantly related organisms to capture and exchange pre‐adapted genes, potentially speeding adaptation. However, describing barriers to geneflow on a genomic scale is non‐trivial. We extend classic diagnostic allele counting measures of geneflow across a barrier to the case of genome‐scale data. Diagnostic index expectation maximisation (diem) polarises the labelling of bistate markers with respect to the sides of a barrier. An initial state of ignorance is enforced by starting with randomly generated marker polarisations. This means there is no prior on population or taxon membership of the genomes concerned. Using a deterministic data labelling, small numbers of classic diagnostic markers can be replaced by large numbers of markers, each with a diagnostic index. Individuals' hybrid indices (genome admixture proportions) are then calculated genome wide conditioned on marker diagnosticity; within diploid, haplodiploid and/or haploid genome compartments; or indeed over any subset of markers, allowing classical cline width/barrier strength comparisons along genomes. Along‐genome barrier strength heterogeneity allows for barrier regions to be identified. Furthermore, blocks of genetic material that have introgressed across a barrier are easily identified with high power. diem indicates panmixis among Myotis myotis bat genomes, with a barrier separating low data quality outliers. In a Mus musculus domesticus/Mus spretus system, diem adds multiple introgressions of olfactory (and vomeronasal) gene clusters in one direction to previous demonstrations of a pesticide resistance gene introgressing in the opposite direction across a strong species barrier. diem is a genome analysis solution that scales from reduced representation genomics of thousands of markers to treatment of all variant sites in large genomes.
While the method lends itself to visualisation, its output of markers with barrier‐informative annotation will fuel research in population genetics, phylogenetics and association studies. diem can equip such downstream applications with millions of informative markers.
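The idea of polarising bistate markers from a random starting point can be illustrated with a toy example. The sketch below is not the diem algorithm: diem uses a likelihood-based expectation maximisation, whereas this simplification uses hard assignments and a plain association score. All function names, the 0/1/2 genotype encoding and the stopping rule are assumptions made for illustration only.

```python
import random


def polarised(g, flip):
    """Return a genotype as a count of 'side B' alleles (None = missing)."""
    if g is None:
        return None
    return 2 - g if flip else g


def hybrid_indices(genos, flips):
    """Per-individual share of 'side B' alleles under the current polarities."""
    n_individuals = len(genos[0])
    out = []
    for j in range(n_individuals):
        calls = [polarised(row[j], f) for row, f in zip(genos, flips)]
        calls = [c for c in calls if c is not None]
        out.append(sum(calls) / (2 * len(calls)) if calls else None)
    return out


def toy_polarise(genos, seed=0, max_iter=50):
    """Iteratively relabel markers so that they agree with each other.

    genos: one row per marker, one column per diploid individual,
    entries 0/1/2 (copies of the arbitrarily labelled allele) or None.
    Returns one boolean per marker: True means 'swap the allele labels'.
    """
    rng = random.Random(seed)
    # State of ignorance: start from random marker polarities.
    flips = [rng.random() < 0.5 for _ in genos]
    for _ in range(max_iter):
        his = hybrid_indices(genos, flips)
        known = [h for h in his if h is not None]
        if not known:
            return flips
        mean_hi = sum(known) / len(known)
        new_flips = []
        for row, old in zip(genos, flips):
            # Association between raw allele counts and current hybrid indices.
            score = sum(
                g * (h - mean_hi)
                for g, h in zip(row, his)
                if g is not None and h is not None
            )
            # A negative association means the labels should be swapped;
            # a score of exactly zero is uninformative, so keep the old polarity.
            new_flips.append(score < 0 if score != 0 else old)
        if new_flips == flips:
            break
        flips = new_flips
    return flips
```

Which side ends up labelled 'B' depends on the random start, so only the consistency of the labelling across markers is meaningful here, not the absolute direction.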
We extend classic allele counting measures of geneflow to the case of genome-scale data. Using a deterministic data labelling, small numbers of diagnostic markers can be replaced by large numbers of markers, each with a diagnostic index. Individuals' hybrid indices can then be calculated genome wide conditioned on marker diagnosticity; within diploid, haplodiploid and/or haploid genome compartments; or indeed over any subset of markers, allowing standard cline width/barrier strength comparisons. Diagnostic index expectation maximisation (diem) estimates bistate marker labelling polarities with respect to the sides of a barrier, also returning polarity support and a likelihood-based diagnostic index. Polarised markers allow detection of geneflow at the scale of whole genomes. The diem method is implemented in Mathematica and R. The Mathematica code is available on GitHub at https://github.com/StuartJEBaird, and the R package diemr is available on CRAN at https://CRAN.R-project.org/package=diemr.
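As a minimal sketch of the allele-counting idea, a hybrid index can be computed as the share of alleles assigned to one side of the barrier among all called alleles, optionally restricted to a subset of markers. This is not the diemr API, and it does not reproduce diem's conditioning on marker diagnosticity; the function names, the genotype encoding (copies of the 'side B' allele, None for missing) and the ploidy argument are assumptions for illustration.

```python
from typing import Optional, Sequence

Genotype = Optional[int]  # copies of the 'side B' allele; None = missing call


def hybrid_index(genotypes: Sequence[Genotype], ploidy: int = 2) -> Optional[float]:
    """Proportion of 'side B' alleles among called alleles for one individual.

    ploidy is 2 for a diploid compartment and 1 for a haploid one.
    Returns None when the individual has no called genotypes.
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        return None
    return sum(called) / (ploidy * len(called))


def hybrid_index_for_subset(
    genotypes: Sequence[Genotype],
    marker_positions: Sequence[int],
    ploidy: int = 2,
) -> Optional[float]:
    """Hybrid index restricted to the markers at the given positions."""
    subset = [genotypes[i] for i in marker_positions]
    return hybrid_index(subset, ploidy=ploidy)
```

Computing the index separately for different marker subsets (for example one chromosome or one genome compartment at a time) is what makes along-genome comparisons possible in this simplified setting.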