The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. Here we report completion of the project, having reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping. We characterized a broad spectrum of genetic variation, in total over 88 million variants (84.7 million single nucleotide polymorphisms (SNPs), 3.6 million short insertions/deletions (indels), and 60,000 structural variants), all phased onto high-quality haplotypes. This resource includes >99% of SNP variants with a frequency of >1% for a variety of ancestries. We describe the distribution of genetic variation across the global sample, and discuss the implications for common disease studies.
Mexico harbors great cultural and ethnic diversity, yet fine-scale patterns of human genome-wide variation from this region remain largely uncharacterized. We studied genomic variation within Mexico from over 1,000 individuals representing 20 indigenous and 11 mestizo populations. We found striking genetic stratification among indigenous populations within Mexico at varying degrees of geographic isolation. Some groups were as differentiated as Europeans are from East Asians. Pre-Columbian genetic substructure is recapitulated in the indigenous ancestry of admixed mestizo individuals across the country. Furthermore, two independently phenotyped cohorts of Mexicans and Mexican Americans showed a significant association between sub-continental ancestry and lung function. Thus, accounting for fine-scale ancestry patterns is critical for medical and population genetic studies within Mexico, in Mexican-descent populations, and likely in many other populations worldwide.
There is great scientific and popular interest in understanding the genetic history of populations in the Americas. We wish to understand when different regions of the continent were inhabited, where settlers came from, and how current inhabitants relate genetically to earlier populations. Recent studies unraveled parts of the genetic history of the continent using genotyping arrays and uniparental markers. The 1000 Genomes Project provides a unique opportunity for improving our understanding of population genetic history by providing over a hundred sequenced low coverage genomes and exomes from Colombian (CLM), Mexican-American (MXL), and Puerto Rican (PUR) populations. Here, we explore the genomic contributions of African, European, and especially Native American ancestry to these populations. Estimated Native American ancestry is in MXL, in CLM, and in PUR. Native American ancestry in PUR is most closely related to populations surrounding the Orinoco River basin, confirming the Southern America ancestry of the Taíno people of the Caribbean. We present new methods to estimate the allele frequencies in the Native American fraction of the populations, and model their distribution using a demographic model for three ancestral Native American populations. These ancestral populations likely split in close succession: the most likely scenario, based on a peopling of the Americas thousand years ago (kya), supports that the MXL Ancestors split kya, with a subsequent split of the ancestors to CLM and PUR kya. The model also features effective populations of in Mexico, in Colombia, and in Puerto Rico. Modeling Identity-by-descent (IBD) and ancestry tract length, we show that post-contact populations also differ markedly in their effective sizes and migration patterns, with Puerto Rico showing the smallest effective size and the earlier migration from Europe. Finally, we compare IBD and ancestry assignments to find evidence for relatedness among European founders to the three populations.
Most individuals throughout the Americas are admixed descendants of Native American, European, and African ancestors. Complex historical factors have resulted in varying proportions of ancestral contributions between individuals within and among ethnic groups. We developed a panel of 446 ancestry informative markers (AIMs) optimized to estimate ancestral proportions in individuals and populations throughout Latin America. We used genome-wide data from 953 individuals from diverse African, European, and Native American populations to select AIMs optimized for each of the three main continental populations that form the basis of modern Latin American populations. We selected markers on the basis of locus-specific branch length to be informative, well distributed throughout the genome, capable of being genotyped on widely available commercial platforms, and applicable throughout the Americas by minimizing within-continent heterogeneity. We then validated the panel in samples from four admixed populations by comparing ancestry estimates based on the AIMs panel to estimates based on genome-wide association study (GWAS) data. The panel provided balanced discriminatory power among the three ancestral populations and accurate estimates of individual ancestry proportions (R2>0.9 for ancestral components with significant between-subject variance). Finally, we genotyped samples from 18 populations from Latin America using the AIMs panel and estimated variability in ancestry within and between these populations. This panel and its reference genotype information will be useful resources to explore population history of admixture in Latin America and to correct for the potential effects of population stratification in admixed samples in the region.
The population of Argentina is the result of the intermixing between several groups, including Indigenous American, European and African populations. Despite the commonly held idea that the population of Argentina is of mostly European origin, multiple studies have shown that this process of admixture had an impact in the entire Argentine population. In the present study we characterized the distribution of Indigenous American, European and African ancestry among individuals from different regions of Argentina and evaluated the level of discrepancy between self-reported grandparental origin and genetic ancestry estimates. A set of 99 autosomal ancestry informative markers (AIMs) was genotyped in a sample of 441 Argentine individuals to estimate genetic ancestry. We used non-parametric tests to evaluate statistical significance. The average ancestry for the Argentine sample overall was 65% European (95%CI: 63–68%), 31% Indigenous American (28–33%) and 4% African (3–4%). We observed statistically significant differences in European ancestry across Argentine regions [Buenos Aires province (BA) 76%, 95%CI: 73–79%; Northeast (NEA) 54%, 95%CI: 49–58%; Northwest (NWA) 33%, 95%CI: 21–41%; South 54%, 95%CI: 49–59%; p<0.0001] as well as between the capital and immediate suburbs of Buenos Aires city compared to more distant suburbs [80% (95%CI: 75–86%) versus 68% (95%CI: 58–77%), p = 0.01]. European ancestry among individuals that declared all grandparents born in Europe was 91% (95%CI: 88–94%) compared to 54% (95%CI: 51–57%) among those with no European grandparents (p<0.001). Our results demonstrate the range of variation in genetic ancestry among Argentine individuals from different regions in the country, highlighting the importance of taking this variation into account in genetic association and admixture mapping studies in this population.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.