To provide a resource for assessing continental ancestry in a wide variety of genetic studies, we identified, validated, and characterized a set of 128 ancestry informative markers (AIMs). The markers were chosen for informativeness, genome-wide distribution, and genotype reproducibility on two platforms (TaqMan® assays and Illumina arrays). We analyzed genotyping data from 825 subjects with diverse ancestry, including European, East Asian, Amerindian, African, South Asian, Mexican, and Puerto Rican. A comprehensive set of 128 AIMs and subsets as small as 24 AIMs are shown to be useful tools for ascertaining the origin of subjects from particular continents and for correcting for population stratification in admixed population sample sets. Our findings provide general guidelines for the application of specific AIM subsets as a resource for wide application. We conclude that investigators can use TaqMan assays for the selected AIMs as a simple and cost-efficient tool to control for differences in continental ancestry when conducting association studies in ethnically diverse populations.
Using a genome-wide single nucleotide polymorphism (SNP) panel, we observed population structure in a diverse group of Europeans and European Americans. Under a variety of conditions and tests, there is a consistent and reproducible distinction between “northern” and “southern” European population groups: most individual participants with southern European ancestry (Italian, Spanish, Portuguese, and Greek) have >85% membership in the “southern” population, and most northern, western, eastern, and central Europeans have >90% membership in the “northern” population group. Participants of Ashkenazi Jewish as well as Sephardic Jewish origin also showed >85% membership in the “southern” population, consistent with a later Mediterranean origin of these ethnic groups. Based on this work, we have developed a core set of informative SNP markers that can control for this partition in European population structure in a variety of clinical and genetic studies.
Background: Case-control genetic studies of complex human diseases can be confounded by population stratification. This issue can be addressed using panels of ancestry informative markers (AIMs) that can provide substantial population substructure information. Previously, we described a panel of 128 SNP AIMs that were designed as a tool for ascertaining the origins of subjects from Europe, Sub-Saharan Africa, the Americas, and East Asia.
For admixture mapping studies in Mexican Americans (MAM), we define a genomewide single-nucleotide-polymorphism (SNP) panel that can distinguish between chromosomal segments of Amerindian (AMI) or European (EUR) ancestry. These studies used genotypes for >400,000 SNPs, defined in EUR and both Pima and Mayan AMI, to define a set of ancestry-informative markers (AIMs). The use of two AMI populations was necessary to remove a subset of SNPs that distinguished genotypes of only one AMI subgroup from EUR genotypes. The AIMs set contained 8,144 SNPs separated by a minimum of 50 kb, with only three intermarker intervals >1 Mb, and had EUR/AMI FST values >0.30 (mean FST = 0.48) and Mayan/Pima FST values <0.05 (mean FST < 0.01). Analysis of a subset of these SNP AIMs suggested that this panel may also distinguish ancestry between EUR and other disparate AMI groups, including Quechuan from South America. We show, using realistic simulation parameters that are based on our analyses of MAM genotyping results, that this panel of SNP AIMs provides good power for detecting disease-associated chromosomal segments for genes with modest ethnicity risk ratios. A reduced set of 5,287 SNP AIMs captured almost the same admixture mapping information, but smaller SNP sets showed substantial drop-off in admixture mapping information and power. The results will enable studies of type 2 diabetes, rheumatoid arthritis, and other diseases for which epidemiological studies suggest differences in the distribution of ancestry-associated susceptibility.
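The selection criteria described above (high EUR/AMI FST, low Mayan/Pima FST, and a minimum spacing between markers) can be illustrated with a minimal sketch. This is not the authors' pipeline: the abstract does not say which FST estimator was used, so the code assumes a simple heterozygosity-based two-population FST with equal weights, pools the two AMI groups by averaging their allele frequencies, and thins markers greedily by position. The `Snp` record, `fst`, and `select_aims` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Snp:
    name: str
    chrom: str
    pos: int        # base-pair position on the chromosome
    p_eur: float    # reference-allele frequency in EUR
    p_maya: float   # reference-allele frequency in Mayan AMI
    p_pima: float   # reference-allele frequency in Pima AMI


def fst(p1: float, p2: float) -> float:
    """Two-population FST as (H_T - H_S) / H_T, equal weights (an assumption)."""
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)            # expected heterozygosity, pooled
    if h_t == 0:
        return 0.0                            # monomorphic in both populations
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2   # mean within-population
    return (h_t - h_s) / h_t


def select_aims(snps, min_between=0.30, max_within=0.05, min_gap=50_000):
    """Keep SNPs that separate EUR from AMI but not Mayan from Pima,
    then thin them so kept markers on a chromosome are >= min_gap apart."""
    candidates = []
    for s in snps:
        p_ami = (s.p_maya + s.p_pima) / 2     # pooled AMI frequency (assumption)
        if fst(s.p_eur, p_ami) > min_between and fst(s.p_maya, s.p_pima) < max_within:
            candidates.append(s)

    candidates.sort(key=lambda s: (s.chrom, s.pos))
    kept, last_pos = [], {}
    for s in candidates:
        # Greedy thinning: accept a marker only if far enough from the last kept one.
        if s.chrom not in last_pos or s.pos - last_pos[s.chrom] >= min_gap:
            kept.append(s)
            last_pos[s.chrom] = s.pos
    return kept


if __name__ == "__main__":
    # Toy frequencies, invented purely to exercise the filter.
    toy = [
        Snp("a", "1", 1_000, 0.10, 0.90, 0.88),
        Snp("b", "1", 20_000, 0.10, 0.90, 0.88),   # too close to "a"
        Snp("c", "1", 80_000, 0.10, 0.90, 0.88),
        Snp("d", "2", 500, 0.10, 0.90, 0.30),      # Mayan and Pima disagree
        Snp("e", "2", 1_000, 0.50, 0.50, 0.50),    # uninformative
    ]
    print([s.name for s in select_aims(toy)])
```

The default thresholds mirror the values quoted in the abstract (0.30, 0.05, and 50 kb). A greedy pass keeps the first qualifying marker in each window rather than the most informative one; a real panel design would likely rank candidates by informativeness before thinning.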
We and others have identified several hundred ancestry informative markers (AIMs) with large allele frequency differences between different major ancestral groups. For this study, a panel of 199 widely distributed AIMs was used to examine a diverse set of 796 DNA samples including self-identified European Americans, West Africans, East Asians, Amerindians, African Americans, Mexicans, Mexican Americans, Puerto Ricans and South Asians. Analysis using a Bayesian clustering algorithm (STRUCTURE) showed grouping of individuals with similar ethnic identity without any identifier other than the AIMs genotyping and showed admixture proportions that clearly distinguished different individuals of mixed ancestry. Additional analyses showed that, for the majority of samples, the predicted ethnic identity corresponded with the self-identified ethnicity at high probability (P > 0.99). Overall, the study demonstrates that AIMs can provide a useful adjunct to forensic medicine, pharmacogenomics and disease studies in which major ancestry or ethnic affiliation might be linked to specific outcomes.