2019
DOI: 10.1038/s41598-019-41502-9

Improving in-silico normalization using read weights

Abstract: Specialized de novo assemblers for diverse datatypes have been developed and are in widespread use for the analyses of single-cell genomics, metagenomics and RNA-seq data. However, assembly of large sequencing datasets produced by modern technologies is challenging and computationally intensive. In-silico read normalization has been suggested as a computational strategy to reduce redundancy in read datasets, which leads to significant speedups and memory savings of assembly pipelines. Previously, we presented …
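As background for the citation statements below, the sketch that follows illustrates the baseline digital-normalization idea the abstract alludes to: a read's coverage is estimated from the counts of its constituent k-mers, and the read is retained only if the reads kept so far do not already cover that region deeply. This is a minimal, one-pass illustration of the general strategy, not the weighted read-selection scheme introduced in the paper; the k-mer size, coverage target and function names are illustrative choices.

```python
from collections import defaultdict
from statistics import median

def digital_normalization(reads, k=21, target=20):
    """Minimal sketch of classic digital normalization (not the paper's
    weighted variant): keep a read only while the median count of its
    k-mers among the reads kept so far is below the coverage target."""
    kmer_counts = defaultdict(int)   # k-mer -> count over kept reads only
    kept = []
    for read in reads:
        kmers = [read[i:i + k] for i in range(len(read) - k + 1)]
        if not kmers:
            continue                  # read shorter than k
        if median(kmer_counts[km] for km in kmers) < target:
            kept.append(read)
            for km in kmers:          # update counts only for kept reads
                kmer_counts[km] += 1
    return kept

# Toy example: 50 identical reads collapse to roughly `target` copies.
reads = ["ACGTACGTACGTACGTACGTACGTACGT"] * 50
print(len(digital_normalization(reads)))   # ~20 instead of 50
```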

Cited by 13 publications (16 citation statements)
References 26 publications
“…The outcome is a strong reduction of the read volume in such a manner that full-length reconstruction of a large majority of the transcript cohort can be achieved despite fewer reads being input to the assembler [45, 46]. Some tools that can perform in silico read normalization include (using the algorithm) [47], [48], [49] and [50]. The [46] assembler also offers in-built in silico normalization [45, 46].…”
Section: Pre-assembly Quality Control and Filtering
Mentioning confidence: 99%
“…Quality-controlled reads were error-corrected and digitally normalized to a target coverage of 100× using BBNorm (BBTools suite). Digital normalization was performed to remove redundant reads, thus reducing memory requirements in the assembly step [27, 28]. The k-mer abundances of the sequencing reads were calculated, and reads with an estimated mean coverage above a user-defined threshold were rejected.…”
Section: Quality Control and Assembly
Mentioning confidence: 99%
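The statement above describes BBNorm's approach at a high level: k-mer abundances are tabulated across the read set, each read's coverage is estimated from the counts of its k-mers, and reads whose estimated coverage exceeds the target (100× in the cited study) are discarded. The sketch below is a simplified two-pass rendering of that rejection rule, not BBNorm's actual implementation (which relies on probabilistic counting structures and handles paired reads, error k-mers and quality values); the k-mer size, the plain mean as coverage estimator and the function names are assumptions for illustration.

```python
from collections import Counter

def build_kmer_table(reads, k=31):
    """First pass: count every k-mer across the whole read set."""
    table = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            table[read[i:i + k]] += 1
    return table

def reject_high_coverage(reads, table, k=31, target=100):
    """Second pass: estimate each read's coverage as the mean count of its
    k-mers and keep only reads at or below the target coverage."""
    kept = []
    for read in reads:
        counts = [table[read[i:i + k]] for i in range(len(read) - k + 1)]
        if counts and sum(counts) / len(counts) <= target:
            kept.append(read)
    return kept

# Usage: normalized = reject_high_coverage(reads, build_kmer_table(reads), target=100)
```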
“…Paired-end reads were de novo assembled using fq2dna version 21.06 (https://gitlab.pasteur.fr/GIPhy/fq2dna; strategy B; default settings). The corresponding fq2dna pipeline consists of trimming and clipping of low-quality reads and adapters with AlienTrimmer (version 2.0) [77], sequencing-error correction with Musket (version 1.1) [78], paired-end read merging with FLASh (version 1.2.11) [79], coverage homogenization with ROCK (version 1.9.3; https://gitlab.pasteur.fr/vlegrand/ROCK) [81, 82, 90], and de novo assembly with SPAdes (version 3.15.0) [57]. In brief, the paired-end reads were first pre-processed through deduplication, clipping, trimming (Phred score threshold: 15; minimum read length: 50 bp) and error correction.…”
Section: Methods
Mentioning confidence: 99%