2019
DOI: 10.1007/s40484-019-0166-9

Current challenges and solutions of de novo assembly

Abstract: Background: Next-generation sequencing (NGS) technologies have fostered an unprecedented proliferation of high-throughput sequencing projects and a concomitant development of novel algorithms for the assembly of short reads. However, numerous technical or computational challenges in de novo assembly still remain, although many new ideas and solutions have been suggested to tackle the challenges in both experimental and computational settings. Results: In this review, we first briefly introduce some of the major…

Citations: cited by 60 publications (55 citation statements)
References: 143 publications
“…First, as read extension performs pair-wise overlapping of clipped reads, the computational process is efficient and the resulting long sequences are accurate. Second, unlike assembly-based methods that usually rely on reads with a high coverage and try to calculate a long consensus sequence representing that region (20), the read extension method faithfully keeps useful and informative reads and the extended sequences can be treated as a single long read to support an SV event. Third, spliced alignment can identify deletions and short insertions by aligning the clipped reads to the boundary sequences of SV events.…”
Section: Results
mentioning
confidence: 99%
“…CIAlign was applied to the Example 2 alignment with the following options: In order to demonstrate the use of CIAlign on real biological sequences, an alignment was generated based on the COI gene commonly used in phylogenetic analysis and DNA barcoding [30]. As CIAlign addresses some common problems encountered when generating an MSA based on de novo assembled transcripts, which tend to have higher error rates at transcript ends, gaps due to difficult-to-assemble regions and divergent sequences due to chimeric connections between unrelated regions [11,32], COI-like transcripts were identified by searching the NCBI transcriptome shotgun assembly database. Aligning these transcripts demonstrated several common problems: multiple insertions, poor alignment at the starts and ends of sequences, and a few divergent sequences resulting in excessive gaps (Fig 5A).…”
Section: Results
mentioning
confidence: 99%
“…The quality of a genome assembly provides a measure of the degree to which the sequence has been correctly assembled and the sequences are reliable, and is thus of great importance. Assembly quality can be assessed using different statistics, which offer a measure of genome completeness and contiguity (Yandell and Ence, 2012; Lachance and Tishkoff, 2013; Simao et al., 2015; Liao, 2019). Excellent reviews are available on de novo genome assembly (Liao, 2019) and use of genome sequencing in non-model organisms (Ellegren, 2014).…”
Section: Genomes and Transcriptomes Can Assist Predicting Genetic Inc…
mentioning
confidence: 99%