2021
DOI: 10.1101/2021.03.19.436173
Preprint

Population-specific genome graphs improve high-throughput sequencing data analysis: A case study on the Pan-African genome

Abstract: Graph-based genome reference representations have seen significant development, motivated by the inadequacy of the current human genome reference for capturing the diverse genetic information from different human populations and its inability to maintain the same level of accuracy for non-European ancestries. While there have been many efforts to develop computationally efficient graph-based bioinformatics toolkits, how to curate genomic variants and subsequently construct genome graphs remains an understudied…

Citations: cited by 2 publications (1 citation statement)
References: 66 publications (109 reference statements)
“…A recent report suggests that using a pan-genome reference assembled from 910 subjects of African descent 55 and a graph-based genome alignment strategy can improve variant calling from the African population. 56 In summary, our data show that the intrinsic differences between the GRCh37 and GRCh38 references significantly impact variant calling for certain genomic regions including 206 genes (57 paralogous genes, 28 pseudogenes, 8 genes implicated in Mendelian diseases on OMIM, no ACMG genes). Only 3 known P/LP variants and 15 rare, putatively deleterious variants were discordantly called due to reference differences.…”
Section: Discussion (mentioning)
confidence: 82%