Mesoamerica played an important role in the expansion of Paleoamericans as the route to South America. In this study, we determined complete mitogenome sequences of 113 unrelated individuals from two indigenous populations of Mesoamerica, the Mazahua and the Zapotec. All but one of the newly sequenced mitogenomes could be classified into haplogroups A2, B2, C1 and D1; the exception, a sequence from the Mazahua, belonged to D4h3a, a subclade of haplogroup D4. This haplogroup has been found mostly in South America along the Pacific coast. Haplogroup X2a was not found in either population. Phylogenetic tree construction and principal component analysis showed that these two populations are only distantly related to each other. Indeed, the Mazahua and the Zapotec shared no sequences (haplotypes), and each showed a number of unique subclades. Surprisingly, the Zapotec formed a cluster with indigenous populations living in an area from central Mesoamerica to Central America. By contrast, the Mazahua formed a group with indigenous populations living in external areas, including southwestern North America and South America. This intriguing genetic relationship suggests the presence of two paleo-Mesoamerican groups and points to a scenario in which one group expanded into South America while the other remained in Mesoamerica.
A depth of coverage greater than 15× is generally considered necessary for next-generation sequencing (NGS) data to yield reliable complete nucleotide sequences of the mitogenome. However, it is difficult to satisfy this requirement at every nucleotide position, because poorly preserved materials rarely give a uniform depth of coverage. We therefore propose an imputation approach that allows a complete mitogenome sequence to be deduced from low-depth-coverage NGS data. We used four types of mitogenome data files as panels for imputation: a worldwide panel comprising all the major haplogroups, a worldwide panel comprising sequences belonging to the estimated haplogroup alone, a panel comprising sequences from the population most closely related to the individual under investigation, and a panel comprising sequences belonging to the estimated haplogroup from the population most closely related to the individual under investigation. Imputation drastically reduced the number of missing nucleotides with every panel, but the imputed bases differed considerably among panels, as did the efficiency of imputation. The missing nucleotides were most credibly imputed when the panel consisted of sequences of the estimated haplogroup from the population most closely related to the individual under investigation.
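To illustrate why the choice of panel can change the imputed bases, the following is a minimal sketch of panel-based imputation. It is not the procedure used in the study: it assumes pre-aligned sequences of equal length, uses `N` for a missing nucleotide, and fills each gap by majority vote among the `k` panel sequences that disagree least with the sample at its observed positions. The names `mismatches`, `impute`, and `k`, and the toy sequences, are hypothetical.

```python
from collections import Counter

MISSING = "N"  # assumed symbol for an uncalled nucleotide


def mismatches(sample, ref):
    """Count disagreements at positions where both sequences are called."""
    return sum(
        1
        for s, r in zip(sample, ref)
        if s != MISSING and r != MISSING and s != r
    )


def impute(sample, panel, k=3):
    """Fill missing positions by majority vote among the k closest panel sequences."""
    if any(len(ref) != len(sample) for ref in panel):
        raise ValueError("all sequences must be aligned to the same length")
    # Rank panel sequences by agreement with the observed part of the sample.
    nearest = sorted(panel, key=lambda ref: mismatches(sample, ref))[:k]
    out = []
    for i, base in enumerate(sample):
        if base != MISSING:
            out.append(base)  # observed bases are never overwritten
            continue
        votes = Counter(ref[i] for ref in nearest if ref[i] != MISSING)
        # Leave the gap open if no neighbour has a call at this position.
        out.append(votes.most_common(1)[0][0] if votes else MISSING)
    return "".join(out)


# Toy data: a broad panel versus a panel restricted to one distant lineage.
broad_panel = ["ACGTACGT", "ACGTACGT", "ACGTACGA", "TCGAACGA"]
narrow_panel = ["TCGAACGA"]
sample = "ACNTACNN"

print(impute(sample, broad_panel, k=3))   # ACGTACGT
print(impute(sample, narrow_panel, k=1))  # ACGTACGA
```

The two calls fill the same gaps but disagree at the last position, which mirrors the observation that the imputed content depends on the panel.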
The incompleteness of partial human mitochondrial genome sequences makes it difficult to perform relevant comparisons among multiple resources. To deal with this issue, we propose a computational framework for deducing missing nucleotides in the human mitochondrial genome. We applied it to worldwide mitochondrial haplogroup lineages and assessed its performance. Our approach can deduce the missing nucleotides with a precision of 0.99 or higher in most human mitochondrial DNA lineages. Furthermore, although low-coverage mitochondrial genome sequences often lead to a blurred relationship in the multidimensional scaling analysis, our approach can correct this positional arrangement according to the corresponding mitochondrial DNA lineages. Therefore, our framework will provide a practical solution to compensate for the lack of genome coverage in partial and fragmented human mitochondrial genome sequences. In this study, we developed an open-source computer program, MitoIMP, implementing our imputation procedure. MitoIMP is freely available from https://github.com/omics-tools/mitoimp.
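As a rough illustration of how a precision figure of this kind could be computed, the sketch below masks known positions of a complete sequence, imputes them, and reports the fraction of imputed calls that match the truth. The definition used here (correct imputed calls divided by all imputed calls) is an assumption and may differ from the one used to evaluate MitoIMP. The function name `imputation_precision` and the example sequences are hypothetical.

```python
def imputation_precision(truth, masked, imputed, missing="N"):
    """Fraction of imputed calls that match the true base.

    Only positions that were missing in `masked` and received a base in
    `imputed` are counted; positions left unimputed are ignored.
    """
    if not (len(truth) == len(masked) == len(imputed)):
        raise ValueError("sequences must have equal length")
    called = 0
    correct = 0
    for t, m, p in zip(truth, masked, imputed):
        if m == missing and p != missing:
            called += 1
            if t == p:
                correct += 1
    # No imputed calls means precision is undefined.
    return correct / called if called else float("nan")


truth = "ACGTACGT"
masked = "ACNTACNN"   # three positions hidden on purpose
imputed = "ACGTACGA"  # hypothetical output of an imputation tool

print(imputation_precision(truth, masked, imputed))
```

In this toy case three positions are imputed and two are correct, so the value is 2/3.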