Adaptation to local environments often occurs through natural selection acting on a large number of loci, each having a weak phenotypic effect. One way to detect these loci is to identify genetic polymorphisms that exhibit high correlation with environmental variables used as proxies for ecological pressures. Here, we propose new algorithms based on population genetics, ecological modeling, and statistical learning techniques to screen genomes for signatures of local adaptation. Implemented in the computer program “latent factor mixed model” (LFMM), these algorithms employ an approach in which population structure is introduced using unobserved variables. These fast and computationally efficient algorithms detect correlations between environmental and genetic variation while simultaneously inferring background levels of population structure. Comparing these new algorithms with related methods provides evidence that LFMM can efficiently estimate random effects due to population history and isolation-by-distance patterns when computing gene-environment correlations, and decrease the number of false-positive associations in genome scans. We then apply these models to plant and human genetic data, identifying several genes with functions related to development that exhibit strong correlations with climatic gradients.
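The idea of testing gene-environment correlations while inferring population structure through unobserved variables can be illustrated with a minimal sketch. It is not the algorithm implemented in LFMM: it assumes a simple alternating least-squares fit of the model G ≈ x bᵀ + W, where W is a rank-K matrix standing in for the latent factors. The function name `fit_latent_factor_regression`, the simulated data, and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np

def fit_latent_factor_regression(G, x, K, n_iter=20):
    """Alternating least squares for G ~ x b^T + W with rank(W) <= K.

    G : (n, L) genotype matrix, x : (n,) environmental variable.
    Illustrative sketch only; not the LFMM estimation algorithm.
    """
    G = G - G.mean(axis=0)                # center each locus
    x = (x - x.mean()).reshape(-1, 1)     # center the environmental variable
    b = np.zeros((1, G.shape[1]))         # per-locus environmental effects
    for _ in range(n_iter):
        # Latent step: best rank-K approximation of what x does not explain.
        U, s, Vt = np.linalg.svd(G - x @ b, full_matrices=False)
        W = (U[:, :K] * s[:K]) @ Vt[:K]
        # Effect step: per-locus least squares on the structure-corrected data.
        b = (x.T @ (G - W)) / (x.T @ x)
    return b.ravel(), W

# Simulated data (assumption): continuous values stand in for genotypes,
# K = 2 latent factors, and the first five loci respond to the environment.
rng = np.random.default_rng(0)
n, L, K = 200, 200, 2
U_true = rng.normal(size=(n, K))
V_true = rng.normal(size=(L, K))
x = rng.normal(size=n)
beta = np.zeros(L)
beta[:5] = 2.0
G = U_true @ V_true.T + np.outer(x, beta) + 0.5 * rng.normal(size=(n, L))

b_hat, W = fit_latent_factor_regression(G, x, K)
top = np.argsort(-np.abs(b_hat))[:5]      # loci with the largest estimated effects
print(sorted(top.tolist()))
```

In this sketch the latent matrix W absorbs the background structure, so the per-locus effects are estimated from the residual variation only, which is the intuition behind reducing false-positive associations.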
Inference of individual ancestry coefficients, which is important for population genetic and association studies, is commonly performed using computer-intensive likelihood algorithms. With the availability of large population genomic data sets, fast versions of likelihood algorithms have attracted considerable attention. Reducing the computational burden of estimation algorithms remains, however, a major challenge. Here, we present a fast and efficient method for estimating individual ancestry coefficients based on sparse nonnegative matrix factorization algorithms. We implemented our method in the computer program sNMF and applied it to human and plant data sets. The performance of sNMF was then compared with that of the likelihood algorithm implemented in the computer program ADMIXTURE. Without loss of accuracy, sNMF computed estimates of ancestry coefficients with runtimes 10-30 times shorter than those of ADMIXTURE.

Inference of population structure from multilocus genotype data is commonly performed using likelihood methods implemented in the computer programs STRUCTURE, FRAPPE, and ADMIXTURE (Pritchard et al. 2000a; Tang et al. 2005; Alexander et al. 2009). These programs compute probabilistic quantities called ancestry coefficients that represent the proportions of an individual genome that originate from multiple ancestral gene pools. Estimation of ancestry proportions is important in many respects, for example in delineating genetic clusters, drawing inference about the history of a species, screening genomes for signatures of natural selection, and performing statistical corrections in genome-wide association studies (Pritchard et al. 2000b; Marchini et al. 2004; Price et al. 2006; Frichot et al. 2013).

Individual ancestry coefficients can be estimated using either supervised or unsupervised statistical methods. Supervised estimation methods use predefined source populations as ancestral populations.
Classical supervised estimation approaches were based on least-squares regression of allele frequencies in hybrid and source populations (Roberts and Hiorns 1965; Cavalli-Sforza and Bodmer 1971). Unsupervised approaches attempt to infer ancestral gene pools from the data, using likelihood methods. An undesired feature of likelihood methods is that they can be computer intensive, with typical runs lasting several hours or more. With the use of dense genomic data and increased sample sizes, reducing the time needed to perform estimation is a major challenge of population genetic data analysis.

A fast approach to the estimation of ancestry coefficients is to use principal component analysis (PCA). PCA is an exploratory method that describes high-dimensional data using a small number of dimensions and makes no assumptions about sampled and ancestral populations. Using PCA can lead to results surprisingly close to those of likelihood methods, and connections between the methods have been intensively investigated in recent years (Engelhardt and Stephens 2010; Frichot et al. 2012). But a drawback of PCA is that interpr...
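The matrix-factorization view of ancestry estimation mentioned in the sNMF abstract above can be sketched as follows. This is a minimal illustration, not the sNMF algorithm: it swaps in plain multiplicative-update nonnegative matrix factorization without any sparsity penalty, and it then rescales each individual's factor loadings to sum to one so that they can be read as ancestry proportions. The function name `ancestry_nmf`, the simulated two-population data, and the choice K = 2 are assumptions made for the example.

```python
import numpy as np

def ancestry_nmf(X, K, n_iter=200, seed=0, eps=1e-9):
    """Factor a nonnegative (n, m) matrix X as Q @ F with Q, F >= 0.

    Uses standard multiplicative updates for the least-squares objective
    (an assumption for this sketch; sNMF itself uses a sparse formulation).
    Rows of the returned Q are rescaled to sum to 1.
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    Q = rng.random((n, K)) + eps   # individual loadings
    F = rng.random((K, m)) + eps   # per-locus factor profiles
    for _ in range(n_iter):
        Q *= (X @ F.T) / (Q @ F @ F.T + eps)
        F *= (Q.T @ X) / (Q.T @ Q @ F + eps)
    # Rescale loadings into proportions that sum to 1 for each individual.
    Q = Q / np.maximum(Q.sum(axis=1, keepdims=True), eps)
    return Q, F

# Simulated genotypes (assumption): two populations of 10 diploid individuals
# with opposite allele frequencies at 100 loci, coded 0/1/2.
rng = np.random.default_rng(1)
freq_a = np.r_[np.full(50, 0.9), np.full(50, 0.1)]
freq_b = freq_a[::-1]
geno = np.vstack([
    rng.binomial(2, freq_a, size=(10, 100)),
    rng.binomial(2, freq_b, size=(10, 100)),
])

Q, F = ancestry_nmf(geno / 2.0, K=2)
print(np.round(Q, 2))
```

Each update is a handful of matrix products, which gives some intuition for why factorization approaches can run faster than iterating a full likelihood model.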
Smilei is a collaborative, open-source, object-oriented (C++) particle-in-cell code. To benefit from the latest advances in high-performance computing (HPC), Smilei is co-developed by both physicists and HPC experts. The code's structure, capabilities, parallelization strategy and performance are discussed. Additional modules (e.g. to treat ionization or collisions), benchmarks and physics highlights are also presented. Multi-purpose and evolutive, Smilei is applied today to a wide range of physics studies, from relativistic laser-plasma interaction to astrophysical plasmas.

Nature of the problem: The kinetic simulation of plasmas is at the center of various physics studies, from laser-plasma interaction to astrophysics. To address today's challenges, a versatile simulation tool requires high-performance computing on massively parallel super-computers.

Solution method: The Vlasov-Maxwell system describing the self-consistent evolution of a collisionless plasma is solved using the Particle-In-Cell (PIC) method. Additional physics modules account for further effects such as collisions and/or ionization. A hybrid MPI-OpenMP strategy, based on a patch-based super-decomposition, allows for efficient cache use, dynamic load balancing and high performance on massively parallel super-computers.
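The basic Particle-In-Cell cycle (deposit charge on a grid, solve for the field, interpolate the field back to the particles, push the particles) can be sketched in a few lines. This is a heavily simplified illustration and not Smilei's implementation: it assumes a 1D periodic electrostatic model with a spectral Poisson solve instead of the full Maxwell equations, normalized units with ε₀ = 1, a uniform neutralizing background, and no parallelization. The function name `pic_step` and all parameter values are hypothetical.

```python
import numpy as np

def pic_step(x, v, q, m, nx, length, dt):
    """One step of a 1D periodic electrostatic PIC loop (illustrative sketch)."""
    dx = length / nx
    # 1. Deposit: linear (cloud-in-cell) weighting onto the two nearest nodes.
    s = x / dx
    cell = np.floor(s)
    w = s - cell
    i = cell.astype(int) % nx
    rho = np.zeros(nx)
    np.add.at(rho, i, q * (1.0 - w) / dx)
    np.add.at(rho, (i + 1) % nx, q * w / dx)
    rho -= rho.mean()                      # uniform neutralizing background
    # 2. Field solve: dE/dx = rho in Fourier space (eps0 = 1 assumed).
    k = 2.0 * np.pi * np.fft.fftfreq(nx, d=dx)
    rho_k = np.fft.fft(rho)
    E_k = np.zeros_like(rho_k)
    E_k[1:] = rho_k[1:] / (1j * k[1:])     # skip k = 0 (mean field set to zero)
    E = np.real(np.fft.ifft(E_k))
    # 3. Gather: interpolate E to particles with the same weights.
    E_p = E[i] * (1.0 - w) + E[(i + 1) % nx] * w
    # 4. Push: update velocities, then positions, with periodic wrapping.
    v_new = v + (q / m) * E_p * dt
    x_new = (x + v_new * dt) % length
    return x_new, v_new, rho, E

# Example setup (assumption): 1000 electron-like macro-particles with a small
# sinusoidal density perturbation, in units where the plasma frequency is 1.
rng = np.random.default_rng(0)
n_part, nx, length, dt = 1000, 32, 2.0 * np.pi, 0.1
q = -length / n_part
m = -q
x0 = np.linspace(0.0, length, n_part, endpoint=False)
x0 = (x0 + 0.05 * np.sin(x0)) % length
v0 = 0.01 * rng.normal(size=n_part)

x1, v1, rho, E = pic_step(x0, v0, q, m, nx, length, dt)
```

Using the same weights for deposition and interpolation is a common design choice because it avoids a net self-force on the particles, so total momentum is preserved up to round-off in this sketch.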