2017
DOI: 10.1093/bib/bbx037

Comparative analysis of de novo assemblers for variation discovery in personal genomes

Abstract: Current variant discovery approaches often rely on an initial read mapping to the reference sequence. Their effectiveness is limited by the presence of gaps, potential misassemblies, regions of duplicates with a high-sequence similarity and regions of high-sequence divergence in the reference. Also, mapping-based approaches are less sensitive to large INDELs and complex variations and provide little phase information in personal genomes. A few de novo assemblers have been developed to identify variants through…

Cited by 14 publications (14 citation statements)
References 58 publications
“…Cortex uses a colored de Bruijn graph (see Table 2 for definition) to simultaneously infer SVs and complex combinations of SNVs, indels, and rearrangements [31]. SGVar [32] is a more recent string graph-based (see Table 2 for definition) de novo assembly pipeline based on the SGA assembler [75] that also uses short-read sequencing data. SGVar uses a stringent read preprocessing based on the read length and read quality.…”
Section: De Novo Assembly-based Approach (mentioning)
confidence: 99%
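The colored de Bruijn graph idea mentioned in this statement can be illustrated with a minimal sketch. It is not Cortex's implementation: it only tags each k-mer (a graph node) with the set of samples ("colors") containing it, then lists k-mers private to one sample, which is where a variant bubble would open. The function names `colored_kmers` and `private_kmers`, the toy sequences, and the choice `k = 4` are all assumptions made for illustration.

```python
from collections import defaultdict


def colored_kmers(sequences_by_color, k):
    """Map each k-mer to the set of colors (sample labels) that contain it.

    sequences_by_color: dict of color label -> list of sequences (hypothetical input shape).
    """
    colors = defaultdict(set)
    for color, seqs in sequences_by_color.items():
        for seq in seqs:
            # Slide a window of width k across the sequence.
            for i in range(len(seq) - k + 1):
                colors[seq[i:i + k]].add(color)
    return dict(colors)


def private_kmers(colors, color):
    """Return the sorted k-mers seen only in the given color."""
    return sorted(kmer for kmer, cs in colors.items() if cs == {color})


# Toy example: the sample differs from the reference by one substitution (A -> T).
colors = colored_kmers({"ref": ["ACGTACGGA"], "sample": ["ACGTTCGGA"]}, k=4)
sample_only = private_kmers(colors, "sample")
print(sample_only)
```

In this toy case a single substitution produces k sample-only k-mers flanked by k-mers shared with the reference; real tools additionally handle sequencing errors, coverage, and reverse complements, none of which are modeled here.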
“…It requires a perfect match to merge reads or sequences, which improves the assembly quality. Using both simulated and real data (chromosome six of the human genome), SGVar has been shown to outperform other methods, such as Cortex, for insertion and deletion identification [32].…”
Section: De Novo Assembly-based Approach (mentioning)
confidence: 99%
“…Their effectiveness is limited by the presence of gaps, potential misassemblies, regions of duplicates with a high-sequence similarity and regions of high-sequence divergence in the reference. Also, mapping-based approaches are less sensitive to large INDELs and complex variations" (1) "We document that 18.6% of SNP genotype calls in HLA genes are incorrect and that allele frequencies are estimated with an error greater than ±0.1 at approximately 25% of the SNPs in HLA genes. We found a bias toward overestimation of reference allele frequency for the 1000G data, indicating mapping bias is an important cause of error in frequency estimation in this dataset."…”
Section: Recreate the Genome Using Prior Knowledge With Reference Bas (mentioning)
confidence: 97%
“…(McKenna et al., 2010; Rimmer et al., 2014). Conversely, recovery of large-scale structural variants is typically achieved by first assembling individual genomes, which are then combined into a whole genome alignment (WGA) (Tian et al., 2018). The WGA enables the accurate location of large indels (typically larger than 3 kb) (Nattestad and Schatz, 2016; Tian et al., 2018).…”
Section: Introduction (mentioning)
confidence: 99%