2019
DOI: 10.1186/s12859-018-2570-y

Stepwise large genome assembly approach: a case of Siberian larch (Larix sibirica Ledeb)

Abstract: Background: De novo assembly of large genomes, such as those of conifers (~12–30 Gbp), which also consist of ~80% repetitive DNA, is a very complex and computationally intensive endeavor. One of the main problems in assembling such genomes lies in the computing limitations of nucleotide sequence assembly programs (DNA assemblers). As a rule, modern assemblers are designed to assemble genomes no longer than the human genome (3.24 Gbp). Most assemblers cannot handle the amount of inpu…

Cited by 45 publications (50 citation statements)
References 33 publications
“…menziesii (Mirb.) Franco) [38], and Siberian larch (Larix sibirica Ledeb) [39] have had their giga-genomes sequenced. These genome sequences offer an opportunity to identify millions of qualified SNPs through resequencing projects, and recently this has been initiated in loblolly pine by resequencing 10 different individual genomes [40].…”
Section: Discussion (mentioning)
confidence: 99%
“…We used the Trinity software program to de novo assemble the high-quality sequencing data [19], which produced 120,951 transcripts with an N50 length of 1,467 bp and a mean length of 871 bp. The length distribution of the transcripts is shown in Supplemental Figure S1A and Table S2.…”
Section: RNA-seq and Transcriptome Assembly (mentioning)
confidence: 99%
“…Recent studies have shown that RNA-Seq is a very effective and powerful technique for gathering extensive transcriptome information in many plants [5][6][7][8]. When compared with other hardwood species [9][10][11][12][13][14][15], conifers tend to have larger and highly repetitive genomes [16][17][18][19][20][21]. Sequencing the P. neoveitchii genome remains expensive, even when using high-throughput Illumina sequencing technology.…”
Section: Introduction (mentioning)
confidence: 99%
“…At the same time, it contained a very high fraction of contigs containing complete open reading frames, as revealed by the detection of 89.7% complete and just 4.1% fragmented tetrapod BUSCOs. Overall, the quality of the C. orientalis transcriptome, both in terms of number of orthologs present and in terms of complete BUSCOs identified, was significantly higher than most available amphibian reference transcriptomes [6][21][22][23]. Completeness and fragmentation rates were comparable with the outcome of the most recent high-depth approaches targeting multiple tissues [24][25][26] (Fig.…”
Section: Results (mentioning)
confidence: 87%
“…These factors currently represent a significant challenge for whole genome assembly [5], in particular due to the high computational resources required for handling the large amount of input sequencing data and the technical limitations of most assembly algorithms, optimized for genomes with a size comparable with human or smaller. The only exception was the axolotl genome (~32 Gb) [6], which has been produced through a huge joint effort and at high costs, together with the development of ad hoc bioinformatics tools and computational strategies to manage the unusual size of the assembly.…”
mentioning
confidence: 99%