2015
DOI: 10.1038/srep10814

Illumina Synthetic Long Read Sequencing Allows Recovery of Missing Sequences even in the “Finished” C. elegans Genome

Abstract: Most next-generation sequencing platforms permit acquisition of high-throughput DNA sequences, but the relatively short read length limits their use in genome assembly or finishing. Illumina has recently released a technology called Synthetic Long-Read Sequencing that can produce reads of unusual length, i.e., predominately around 10 Kb. However, a systematic assessment of their use in genome finishing and assembly is still lacking. We evaluate the promise and deficiency of the long reads in these aspects usin…

Cited by 57 publications (57 citation statements)
References 30 publications
“…We previously demonstrated that Illumina synthetic long reads not only are able to recover nonrepetitive sequences but also are capable of recovering most types of repetitive sequence except for those arranged in a long stretch of tandem repeats (Li et al 2015). To facilitate comparative analysis of expression of small RNAs, TEs, and protein-coding genes between wildtype and hybrid strains, we produced approximately 20× coverage of Illumina synthetic long reads for C. nigoni as described previously for a C. elegans genome assembly (Li et al 2015). Most of the reads are ∼10 kbp in length with a minimum size of 1.5 kbp.…”
Section: Results
Citation type: mentioning
confidence: 99%
“…In particular, the short reads are problematic for assembling contigs that are rich in repetitive sequences, which will prevent accurate annotation of these sequences, including TEs and some small RNAs that are to be addressed in this study. We previously demonstrated that Illumina synthetic long reads not only are able to recover nonrepetitive sequences but also are capable of recovering most types of repetitive sequence except for those arranged in a long stretch of tandem repeats (Li et al 2015). To facilitate comparative analysis of expression of small RNAs, TEs, and protein-coding genes between wildtype and hybrid strains, we produced approximately 20× coverage of Illumina synthetic long reads for C. nigoni as described previously for a C. elegans genome assembly (Li et al 2015).…”
Section: Results
Citation type: mentioning
confidence: 99%
“…This was due to more accurate assembly of large repetitive elements in the C. elegans genome, resulting in an improved genomic reference. It has recently been reported that the extent of some repetitive elements in the reference genome are truncated (Li et al 2015). The MinION-generated genome presented here expands repetitive elements throughout the genome, adding more than 2 Mb to the reference genome.…”
Section: Minion Sequence Improves the C Elegans Reference Genome By
Citation type: mentioning
confidence: 98%
“…These assemblies are significantly more contiguous than those generated by Illumina Synthetic Long-Read Sequencing or by PacBio sequencing for the C. elegans genome. MinION-derived assemblies resulted in an assembly with an N50 contig size of 3.99 Mb compared to an N50 of 86 kb for Synthetic Long-Read Sequencing (Li et al 2015) and an N50 of 1.6 Mb for PacBio sequencing (https://github.com/PacificBiosciences/DevNet/wiki/C.-elegans-data-set). Combining flow cell data improved the assemblies.…”
Section: Assemblies Are Most Impacted By Read Length
Citation type: mentioning
confidence: 99%