Background: The dramatic progress in sequencing technologies offers unprecedented prospects for deciphering the organization of natural populations in space and time. However, the size of the datasets generated also poses some daunting challenges. In particular, Bayesian clustering algorithms based on pre-defined population genetics models such as the STRUCTURE or BAPS software may not be able to cope with this unprecedented amount of data. Thus, there is a need for less computer-intensive approaches. Multivariate analyses seem particularly appealing as they are specifically devoted to extracting information from large datasets. Unfortunately, currently available multivariate methods still lack some essential features needed to study the genetic structure of natural populations. Results: We introduce the Discriminant Analysis of Principal Components (DAPC), a multivariate method designed to identify and describe clusters of genetically related individuals. When group priors are lacking, DAPC uses sequential K-means and model selection to infer genetic clusters. Our approach extracts rich information from genetic data, providing assignment of individuals to groups, a visual assessment of between-population differentiation, and the contribution of individual alleles to population structuring. We evaluate the performance of our method using simulated data, which were also analyzed using STRUCTURE as a benchmark. Additionally, we illustrate the method by analyzing microsatellite polymorphism in worldwide human populations and hemagglutinin gene sequence variation in seasonal influenza. Conclusions: Analysis of simulated data revealed that our approach generally performs better than STRUCTURE at characterizing population subdivision. The tools implemented in DAPC for the identification of clusters and graphical representation of between-group structures make it possible to unravel complex population structures. Our approach is also faster than Bayesian clustering algorithms by several orders of magnitude, and may be applicable to a wider range of datasets.
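The workflow summarized in the abstract above (principal components, then sequential K-means with model selection, then discriminant analysis) can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' implementation: the simulated genotypes, the farthest-point K-means initialisation, the BIC-like formula `n*log(WSS/n) + k*log(n)`, and all function names are assumptions made for the example.

```python
import numpy as np

def simulate_genotypes(rng, n_per_pop=30, n_loci=60):
    """Hypothetical data: 3 populations, biallelic counts in {0, 1, 2}."""
    block = n_loci // 3
    X, pops = [], []
    for g in range(3):
        p = np.full(n_loci, 0.1)
        p[g * block:(g + 1) * block] = 0.9  # alleles common only in population g
        X.append(rng.binomial(2, p, size=(n_per_pop, n_loci)))
        pops += [g] * n_per_pop
    return np.vstack(X).astype(float), np.array(pops)

def pca_scores(X, n_pcs):
    """Scores of the individuals on the first n_pcs principal components."""
    Xc = X - X.mean(axis=0)
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :n_pcs] * s[:n_pcs]

def kmeans(Z, k, n_iter=100):
    """Lloyd's K-means with a deterministic farthest-point start (a simplification)."""
    centers = [Z[0]]
    for _ in range(1, k):
        d = np.min([((Z - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(Z[d.argmax()])
    centers = np.array(centers)
    for _ in range(n_iter):
        d = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        new = np.array([Z[labels == j].mean(axis=0) if np.any(labels == j)
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    d = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1), d.min(axis=1).sum()  # labels, within-cluster sum of squares

def bic_curve(Z, max_k):
    """BIC-like criterion for k = 1..max_k (assumed form, lower is better)."""
    n = len(Z)
    return {k: n * np.log(kmeans(Z, k)[1] / n) + k * np.log(n)
            for k in range(1, max_k + 1)}

def discriminant_scores(Z, labels):
    """Linear discriminant functions computed on the PC scores."""
    groups = np.unique(labels)
    grand = Z.mean(axis=0)
    p = Z.shape[1]
    W, B = np.zeros((p, p)), np.zeros((p, p))
    for g in groups:
        Zg = Z[labels == g]
        mg = Zg.mean(axis=0)
        W += (Zg - mg).T @ (Zg - mg)                      # within-group scatter
        B += len(Zg) * np.outer(mg - grand, mg - grand)   # between-group scatter
    L = np.linalg.cholesky(W)
    M = np.linalg.solve(L, np.linalg.solve(L, B).T)       # L^-1 B L^-T
    evals, evecs = np.linalg.eigh((M + M.T) / 2)
    top = np.argsort(evals)[::-1][:len(groups) - 1]
    A = np.linalg.solve(L.T, evecs[:, top])               # loadings in PC space
    return (Z - grand) @ A

rng = np.random.default_rng(0)
X, pops = simulate_genotypes(rng)
Z = pca_scores(X, n_pcs=10)
bic = bic_curve(Z, max_k=6)
best_k = min(bic, key=bic.get)
labels, _ = kmeans(Z, best_k)
D = discriminant_scores(Z, labels)
print("BIC by k:", {k: round(float(v), 1) for k, v in bic.items()})
print("selected k:", best_k, "| discriminant scores shape:", D.shape)
```

Running the discriminant step on PC scores rather than on raw allele counts keeps the within-group scatter matrix invertible even when there are more alleles than individuals, which is the practical reason for the PCA step in this sketch. Taking the minimum of the BIC-like curve is only one possible selection rule; inspecting the whole curve is often more informative.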
Increasing attention is being devoted to taking landscape information into account in genetic studies. Among landscape variables, space is often considered one of the most important. To reveal spatial patterns, a statistical method should be spatially explicit, that is, it should directly take spatial information into account as a component of the adjusted model or of the optimized criterion. In this paper we propose a new spatially explicit multivariate method, spatial principal component analysis (sPCA), to investigate the spatial pattern of genetic variability using allelic frequency data of individuals or populations. This analysis does not require data to meet Hardy-Weinberg expectations or linkage equilibrium to exist between loci. The sPCA yields scores summarizing both the genetic variability and the spatial structure among individuals (or populations). Global structures (patches, clines and intermediates) are disentangled from local ones (strong genetic differences between neighbors) and from random noise. Two statistical tests are proposed to detect the existence of both types of patterns. As an illustration, the results of principal component analysis (PCA) and sPCA are compared using simulated datasets and real georeferenced microsatellite data of Scandinavian brown bear individuals (Ursus arctos). sPCA performed better than PCA at revealing spatial genetic patterns. The proposed methodology is implemented in the adegenet package of the free software R.
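One way to picture scores that summarize "both the genetic variability and the spatial structure" is an eigenanalysis whose eigenvalues equal the variance of a score multiplied by its Moran's I. The sketch below follows that idea under stated assumptions: the abstract gives no formula, so the matrix `Xc.T @ (L + L.T) @ Xc / (2n)`, the k-nearest-neighbour weights, the simulated cline, and every function name here are illustrative choices rather than the adegenet implementation.

```python
import numpy as np

def knn_weights(coords, k=5):
    """Row-standardised spatial weights: each individual's k nearest neighbours get 1/k."""
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)                  # an individual is not its own neighbour
    nn = np.argsort(d, axis=1)[:, :k]
    L = np.zeros((len(coords), len(coords)))
    L[np.arange(len(coords))[:, None], nn] = 1.0 / k
    return L

def morans_i(x, L):
    """Moran's I for row-standardised weights L."""
    xc = x - x.mean()
    return (xc @ L @ xc) / (xc @ xc)

def spca(X, L):
    """Eigenanalysis of Xc' (L + L')/2 Xc / n; eigenvalue = variance * Moran's I."""
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    M = Xc.T @ ((L + L.T) / 2) @ Xc / n
    evals, evecs = np.linalg.eigh(M)
    order = np.argsort(evals)[::-1]              # most positive (global) first
    return evals[order], evecs[:, order], Xc @ evecs[:, order]

# Hypothetical data: 100 individuals on a unit square, 40 loci following an east-west cline.
rng = np.random.default_rng(1)
coords = rng.uniform(size=(100, 2))
p = 0.1 + 0.8 * coords[:, 0]                     # allele frequency rises with x
X = rng.binomial(2, p[:, None], size=(100, 40)) / 2.0   # individual allele frequencies

L = knn_weights(coords, k=5)
evals, axes, scores = spca(X, L)
first = scores[:, 0]
print("largest eigenvalue:", evals[0])
print("variance * Moran's I:", first.var() * morans_i(first, L))
print("most negative eigenvalue:", evals[-1])
```

In this formulation large positive eigenvalues correspond to global structures (positive spatial autocorrelation, as in the simulated cline) and large negative ones to local structures (neighbours that differ strongly), which mirrors the global/local distinction drawn in the abstract.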
Oli and Dobson proposed that the ratio between the magnitude and the onset of reproduction (F/α ratio) allows one to predict the relative importance of vital rates on population growth rate in mammalian populations and provides a reliable measure of the ranking of mammalian species on the slow-fast continuum of life-history tactics. We show that the choice of the ratio F/α is arbitrary and is not grounded in demographic theory. We estimate the position on the slow-fast continuum using the first axis of a principal components analysis of all life-history variables studied by Oli and Dobson and show that most individual vital rates perform as well as the F/α ratio. Finally, we find, in agreement with previous studies, that the age of first reproduction is a reliable predictor of the ranking of mammalian populations along the slow-fast continuum and that both body mass and phylogeny markedly influence the generation time of mammalian species. We conclude that arbitrary ratios such as F/α correlate with life-history types in mammals simply because life-history variables are highly correlated in response to allometric, phylogenetic, and environmental influences. We suggest that generation time is a reliable metric to measure life-history variation among mammalian populations and should be preferred to any arbitrary combination between vital rates.
Background: Toxoplasma gondii is found worldwide, but the distribution of its genotypes as well as the clinical expression of human toxoplasmosis varies across the continents. Several studies in Europe, North America and South America argued for a role of genotypes in the clinical expression of human toxoplasmosis. Genetic data concerning T. gondii isolates from Africa are scarce and not sufficient to investigate the population structure, a fundamental analysis for a better understanding of distribution, circulation, and transmission. Methodology/Principal Findings: Seropositive animals originating from urban and rural areas in Gabon were analyzed for T. gondii isolation and genotyping. Sixty-eight isolates, including one mixed infection (69 strains), were obtained by bioassay in mice. Genotyping was performed using length polymorphism of 13 microsatellite markers located on 10 different chromosomes. Results were analyzed in terms of population structure by Bayesian statistical modeling, neighbor-joining tree reconstruction based on genetic distances, F_ST and linkage disequilibrium. A moderate genetic diversity was detected. The 27 genotypes clustered into three haplogroups and one single genotype. The majority of strains belonged to one haplogroup corresponding to the worldwide Type III. The remaining strains were distributed into two haplogroups (Africa 1 and 3) and one single genotype. Mouse virulence at isolation was significantly different between haplogroups. Africa 1 haplogroup was the most virulent. Conclusion: Africa 1 and 3 haplogroups were proposed as being new major haplogroups of T. gondii circulating in Africa. A possible link with strains circulating in South and Central America is discussed. Analysis of population structure demonstrated a local spread within a rural area and strain circulation between the main cities of the country. This circulation, favored by human activity, could lead to genetic exchanges. For the first time, key epidemiological questions were addressed for the West African T. gondii population, using the high discriminatory power of microsatellite markers, thus creating a basis for further epidemiological and clinical investigations.