We examine the distribution and structure of human genetic diversity for 710 individuals representing 31 populations from Africa, East Asia, Europe, and India using 100 Alu insertion polymorphisms from all 22 autosomes. Alu diversity is highest in Africans (0.349) and lowest in Europeans (0.297). Alu insertion frequency is lowest in Africans (0.463) and higher in Indians (0.544), E. Asians (0.557), and Europeans (0.559). Large genetic distances are observed among African populations and between African and non-African populations. The root of a neighbor-joining network is located closest to the African populations. These findings are consistent with an African origin of modern humans and with a bottleneck effect in the human populations that left Africa to colonize the rest of the world. Genetic distances among all pairs of populations show a significant product-moment correlation with geographic distances (r = 0.69, P < 0.00001). F_ST, the proportion of genetic diversity attributable to population subdivision, is 0.141 for Africans/E. Asians/Europeans, 0.047 for E. Asians/Indians/Europeans, and 0.090 for all 31 populations. Resampling analyses show that ∼50 Alu polymorphisms are sufficient to obtain accurate and reliable genetic distance estimates. These analyses also demonstrate that markers with higher F_ST values have greater resolving power and produce more consistent genetic distance estimates. [Supplemental material is available online at www.genome.org. The following individuals kindly provided reagents, samples, or unpublished information as indicated in the paper:
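To make the F_ST figures in the abstract above more concrete, the following is a minimal sketch of how the proportion of diversity attributable to subdivision can be computed for a single biallelic Alu locus. It assumes the textbook definition (H_T - H_S) / H_T with equal weighting of populations; the abstract does not say which estimator the authors used, and the function names and example frequencies are hypothetical.

```python
def expected_heterozygosity(p):
    """Gene diversity at a biallelic locus with insertion frequency p."""
    return 2.0 * p * (1.0 - p)


def fst_single_locus(freqs):
    """F_ST = (H_T - H_S) / H_T for one locus.

    freqs: Alu insertion frequencies, one per population.
    Assumption: populations are weighted equally (no sample-size correction).
    """
    n = len(freqs)
    p_bar = sum(freqs) / n                      # pooled insertion frequency
    h_t = expected_heterozygosity(p_bar)        # total diversity
    if h_t == 0.0:
        return 0.0                              # locus is monomorphic overall
    # Mean within-population diversity
    h_s = sum(expected_heterozygosity(p) for p in freqs) / n
    return (h_t - h_s) / h_t


# Hypothetical frequencies for three populations at one locus
example_fst = fst_single_locus([0.30, 0.55, 0.60])
print(round(example_fst, 3))
```

A multi-locus value would typically combine H_T and H_S across loci before taking the ratio, rather than averaging per-locus ratios; that choice is also an assumption here.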
We have analyzed 35 widely distributed, polymorphic Alu loci in 715 individuals from 31 world populations. The average frequency of Alu insertions (the derived state) is lowest in Africa (.42) but is higher and similar in India (.55), Europe (.56), and Asia (.57). A comparison with 30 restriction-site polymorphisms (RSPs) for which the ancestral state has been determined shows that the frequency of derived RSP alleles is also lower in Africa (.35) than it is in Asia (.45) and in Europe (.46). Neighbor-joining networks based on Alu insertions or RSPs are rooted in Africa and show African populations as separate from other populations, with high statistical support. Correlations between genetic distances based on Alu and nuclear RSPs, short tandem-repeat polymorphisms, and mtDNA, in the same individuals, are high and significant. For the 35 loci, Alu gene diversity and the diversity attributable to population subdivision are highest in Africa but are lower and similar in Europe and Asia. The distribution of ancestral alleles is consistent with an origin of early modern human populations in sub-Saharan Africa, the isolation and preservation of ancestral alleles within Africa, and an expansion out of Africa into Eurasia. This expansion is characterized by increasing frequencies of Alu inserts and by derived RSP alleles with reduced genetic diversity in non-African populations.
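The correlations between distance matrices mentioned in the abstract above can be illustrated with a short sketch. It computes a product-moment (Pearson) correlation over the unique population pairs of two symmetric distance matrices. The matrices and function names are hypothetical, and the sketch does not assess significance: because pairwise distances are not independent, that usually requires a permutation procedure such as a Mantel test, which the abstract does not describe.

```python
import math


def upper_triangle(matrix):
    """Return the entries above the diagonal of a square matrix, row by row."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def distance_matrix_correlation(a, b):
    """Correlate two symmetric distance matrices over unique population pairs."""
    return pearson_r(upper_triangle(a), upper_triangle(b))


# Hypothetical distances among three populations from two marker systems
alu_dist = [[0.0, 0.10, 0.25],
            [0.10, 0.0, 0.20],
            [0.25, 0.20, 0.0]]
rsp_dist = [[0.0, 0.08, 0.30],
            [0.08, 0.0, 0.22],
            [0.30, 0.22, 0.0]]
print(round(distance_matrix_correlation(alu_dist, rsp_dist), 3))
```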
Alu elements comprise >10% of the human genome. We have used a computational biology approach to analyze the human genomic DNA sequence databases to determine the impact of gene conversion on the sequence diversity of recently integrated Alu elements and to identify Alu elements that were potentially retroposition competent. We analyzed 269 Alu Ya5 elements and identified 23 members of a new Alu subfamily termed Ya5a2 with an estimated copy number of 35 members, including the de novo Alu insertion in the NF1 gene. Our analysis of Alu elements containing one to four (Ya1-Ya4) of the Ya5 subfamily-specific mutations suggests that gene conversion contributed as much as 10%-20% of the variation between recently integrated Alu elements. In addition, analysis of the middle A-rich region of the different Alu Ya5 members indicates a tendency toward expansion of this region and subsequent generation of simple sequence repeats. Mining the databases for putative retroposition-competent elements that share 100% nucleotide identity to the previously reported de novo Alu insertions linked to human diseases resulted in the retrieval of 13 exact matches to the NF1 Alu repeat, three to the Alu element in BRCA2, and one to the Alu element in FGFR2 (Apert syndrome). Transient transfections of the potential source gene for the Apert's Alu with its endogenous flanking genomic sequences demonstrated the transcriptional and presumptive transpositional competency of the element.