2022
DOI: 10.1038/s41467-022-31724-3

Pan-African genome demonstrates how population-specific genome graphs improve high-throughput sequencing data analysis

Abstract: Graph-based genome reference representations have seen significant development, motivated by the inadequacy of the current human genome reference to represent the diverse genetic information from different human populations and its inability to maintain the same level of accuracy for non-European ancestries. While there have been many efforts to develop computationally efficient graph-based toolkits for NGS read alignment and variant calling, methods to curate genomic variants and subsequently construct genome…

Cited by 8 publications (9 citation statements)
References 52 publications
“…Approximately one-third of low-mapping-quality reads could be successfully mapped to the NRSs with high mapping quality, indicating the presence of numerous suboptimal alignments when relying solely on the reference genome. Suboptimal alignments can be attributed to the frequent presence of false-positive alignments in BWA when default parameters are used [ 16 ]. The mapping procedure based on the graph genome is known to be time-consuming and memory-intensive [ 3 ].…”
Section: Discussion (citation type: mentioning)
confidence: 99%
“…The recent development of graph-based genome structures has enabled the construction of pangenome graphs [ 12 , 13 ]. Compared to a linear reference genome, the genome graph is capable of representing the genetic information from diverse breeds within a species, thereby reducing mapping bias and increasing sensitivity in detecting variants, particularly structural variants [ 14 16 ]. Typically, a pangenome graph establishes the reference genome as the backbone, thus preserving the coordinate system of the reference genome.…”
Section: Introduction (citation type: mentioning)
confidence: 99%
“…This is with the hope that such will allow for more utility and compatibility of the outcomes of whole genome sequencing in such populations and reduce biases. For instance, results from the 1,000 Genomes dataset reveal that the African super-population genomic data has the largest divergence from the GRCh38 reference ( Tetikol et al, 2022 ). In fact, Sherman et al (2019) also observed that the African pan-genome used in their study contains ∼10% more DNA than the current human reference genome.…”
Section: Challenges In the Application Of Personalized Medicine In Af... (citation type: mentioning)
confidence: 99%
“…This lack of representation further extends to the human reference genome, from which the identification of novel or known genetic variants from NGS data is largely dependent [76]. The underrepresentation of African populations may thus exclude them from understanding disease aetiology as well as the detection and diagnosis of disease, as seen in MSMD [77].…”
Section: Under-representation In the Human Reference Genome (citation type: mentioning)
confidence: 99%
“…The underrepresentation of African populations may thus exclude them from understanding disease aetiology as well as the detection and diagnosis of disease, as seen in MSMD [77]. One of the many implications of excluding African populations is the efficacy of medications: cures that are effective in certain populations are ineffectual in others [45]; therefore, additional sequencing efforts from diverse African populations are required to contribute to large-scale publicly available datasets and facilitate the construction of African-specific reference genomes in order to better characterise the spectrum of variation in humans [71,74,76]. African genetic diversity may give insight to elucidate novel disease susceptibility, which increases the possibility of correct diagnosis, with the significant potential to inform clinical care [73].…”
Section: Under-representation In the Human Reference Genome (citation type: mentioning)
confidence: 99%