HaploCart: Human mtDNA haplogroup classification using a pangenomic reference graph

Rubin, Joshua Daniel; Vogel, Nicola Alexandra; Gopalakrishnan, Shyam; Sackett, Peter Wad; Renaud, Gabriel

doi:10.1371/journal.pcbi.1011148

Cited by 6 publications

(1 citation statement)

References 48 publications


“…A primary assignment of haplogroup names to haplotypes based on a set of mutations present in a given sequence. This step is commonly performed by haplogroup callers such as HaploGrep3 [11], HaploCart [13], and HaploGrouper [14].…”

Section: Text Box 2 - mtDNA Groupings (mentioning)

confidence: 99%

mtDNA “Nomenclutter” and its Consequences on the Interpretation of Genetic Data

Bajić, Schulmann, Nowick. 2023. Preprint.


Population-based studies of mitochondrial genetic diversity often require the classification of mitochondrial DNA (mtDNA) haplotypes into more than 2000 described haplogroups, and further grouping those into hierarchically higher haplogroups. Such secondary haplogroup groupings (e.g. “macro-haplogroups”) vary across studies, as they depend on the sample quality, technical factors of haplogroup calling, the aims of the study, and the researchers’ understanding of the mtDNA haplogroup nomenclature. Retention of historical nomenclature coupled with a growing number of newly described mtDNA lineages results in increasingly complex and inconsistent nomenclature that does not reflect phylogeny well. This “clutter” leaves room for grouping errors and inconsistencies between scientific publications, especially when the haplogroup names are used as a proxy for secondary groupings, and represents a source for scientific misinterpretation.

Here we explore the effects of phylogenetically insensitive secondary mtDNA haplogroup groupings, and the lack of standardized secondary haplogroup groupings on downstream analyses and interpretation of genetic data. We demonstrate that frequency-based analyses produce inconsistent results when different secondary mtDNA groupings are applied, and thus allow for vastly different interpretations of the same genetic data. The lack of guidelines and recommendations on how to choose appropriate secondary haplogroup groupings presents an issue for the interpretation of results, as well as their comparison and reproducibility across studies.

To reduce biases introduced by arbitrarily defined secondary nomenclature-based groupings, we suggest the implementation of phylogenetically meaningful algorithm-based groupings to define a standardized set of “macro-haplogroups”, “meso-haplogroups”, and “micro-haplogroups”. Such phylogenetically informative levels of haplogroup groupings can be easily implemented into haplogroup callers such as HaploGrep3. This would foster reproducibility across studies, provide a grouping standard for population-based studies, and reduce errors associated with haplogroup nomenclatures in future studies.


Identification of the 18 World War II executed citizens of Adele, Rethymnon, Crete using an ancient DNA approach and low coverage genomes

Psonis, Vassou, Nafplioti et al. 2024. Forensic Science International: Genetics.


`soibean`: High-Resolution Taxonomic Identification of Ancient Environmental DNA Using Mitochondrial Pangenome Graphs

Vogel, Rubin, Pedersen et al. 2024. Molecular Biology and Evolution.


Ancient environmental DNA (aeDNA) is becoming a powerful tool to gain insights about past ecosystems, overcoming the limitations of conventional fossil records. However, several methodological challenges remain, particularly for classifying the DNA to species level and conducting phylogenetic analysis. Current methods, primarily tailored for modern datasets, fail to capture several idiosyncrasies of aeDNA, including species mixtures from closely related species and ancestral divergence. We introduce soibean, a novel tool that utilises mitochondrial pangenomic graphs for identifying species from aeDNA reads. It outperforms existing methods in accurately identifying species from multiple closely related sources within a sample, enhancing phylogenetic analysis for aeDNA. soibean employs a damage-aware likelihood model for precise identification at low coverage with a high damage rate. Additionally, we reconstructed ancestral sequences for soibean's database to handle aeDNA that is highly diverged from modern references. soibean demonstrates effectiveness through simulated data tests and empirical validation. Notably, our method uncovered new empirical results in published datasets, including using porpoise whales as food in a Mesolithic community in Sweden, demonstrating its potential to reveal previously unrecognised findings in aeDNA studies.
