We have studied genetic variation at nine autosomal short tandem repeat (STR) loci in 20 globally distributed human populations defined by geographic and ethnic origins, viz., African, Caucasian, Asian, Native American and Oceanic. The purpose of this study is to evaluate the utility and applicability of these nine loci in forensic analysis in worldwide populations. The levels of genetic variation, measured by number of alleles, allele size variance and heterozygosity, are high in all populations irrespective of their effective sizes. Single- as well as multi-locus genotype frequencies conform to the assumptions of Hardy-Weinberg equilibrium. Further, alleles across the entire set of nine loci are mutually independent in all populations. Gene diversity analysis shows that pooling of population data by major geographic groupings does not introduce substructure effects beyond the levels recommended by the National Research Council, validating the establishment of population databases based on major geographic and ethnic groupings. A network tree based on genetic distances, in which populations of common ancestry cluster together, further supports this assertion. With respect to the power of discrimination and exclusion probabilities, even the relatively reduced levels of genetic variation at these nine STR loci in smaller and isolated populations provide an exclusionary power over 99%. However, in paternity testing where the genotype of the mother is unknown, the power of exclusion could fall below 80% in some isolated populations, and in such cases the use of additional loci to supplement the battery of nine loci is recommended.
European Journal of Human Genetics (2003) 11, 39-49. doi:10.1038
Keywords: forensic genetics; population genetics; STR database; forensic markers; parentage testing; Hardy-Weinberg equilibrium
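As a rough illustration of how per-locus variation translates into multi-locus discriminating power, the following minimal sketch computes expected heterozygosity and the power of discrimination for a single locus, then combines loci under the independence assumption stated above. It assumes Hardy-Weinberg genotype proportions; the function names and the allele frequencies are hypothetical and are not taken from the study's data or methods.

```python
from itertools import combinations
from math import prod


def expected_heterozygosity(freqs):
    """Expected heterozygosity at one locus: 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs)


def power_of_discrimination(freqs):
    """Probability that two random individuals differ in genotype at one locus.

    Genotype frequencies are derived from allele frequencies assuming
    Hardy-Weinberg proportions: p_i^2 for homozygotes, 2*p_i*p_j for heterozygotes.
    """
    homozygotes = [p * p for p in freqs]
    heterozygotes = [2.0 * p * q for p, q in combinations(freqs, 2)]
    return 1.0 - sum(g * g for g in homozygotes + heterozygotes)


def combined_power(per_locus_powers):
    """Combine per-locus powers, assuming the loci are mutually independent."""
    return 1.0 - prod(1.0 - pd for pd in per_locus_powers)


# Hypothetical allele frequencies for two illustrative loci (not real data).
locus_a = [0.5, 0.5]
locus_b = [0.4, 0.3, 0.2, 0.1]

pd_a = power_of_discrimination(locus_a)
pd_b = power_of_discrimination(locus_b)
print(expected_heterozygosity(locus_a), pd_a, pd_b, combined_power([pd_a, pd_b]))
```

The same product rule applies to exclusion probabilities, which is why adding loci raises the combined power even when each locus is only moderately variable.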
Introduction
The remarkable progress made in DNA technology in the past decade has had an enormous impact on several disciplines, including forensic science. The identification of thousands of genetic markers distributed throughout the human genome, particularly the short tandem repeat (STR) loci, and their analysis using polymerase chain reaction (PCR) based techniques have tremendously augmented the efficiency of individual identification and of determining genetic relationships among individuals. Based on population genetic characteristics desired in forensic analysis, such as adherence to the expectations of Hardy-Weinberg equilibrium (HWE) and independence of alleles across loci, 13 STR loci (viz., D3S1358, vWA, FGA, D8S1179, D21S11, D18S51, D5S818, D13S317, D7S820, D16S539, CSF1PO, TPOX, TH01) have been established as the core genetic markers for use in DNA forensic analysis and parentage testing. 1,2 These developments, together with the recommendations of the National Research Council (NRC) 3 with respect to statistical interpretation of DNA evidence, have been instrumental in the worldwide acceptance of DNA evidence in the criminal justice system. However, the databases on these 13 loci are largely...