2017
DOI: 10.1186/s12859-017-1881-8

K-mer clustering algorithm using a MapReduce framework: application to the parallelization of the Inchworm module of Trinity

Abstract: Background: De novo transcriptome assembly is an important technique for understanding gene expression in non-model organisms. Many de novo assemblers that use the de Bruijn graph of a set of RNA sequences rely on an in-memory representation of this graph. However, current methods analyse the complete set of read-derived k-mer sequences at once, which requires computer hardware with large shared memory. Results: We introduce a novel approach that clusters k-mers as the first step. The clusters correspond to…
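To make the idea of clustering k-mers before assembly concrete, here is a minimal single-machine sketch. It is not the paper's MapReduce algorithm, whose details are truncated in the abstract above. It assumes that two k-mers belong to the same cluster when they are linked by a (k-1)-base overlap in the de Bruijn graph, and it ignores reverse complements and abundance thresholds. The function names `kmers` and `cluster_kmers` are hypothetical.

```python
from collections import Counter


def kmers(read, k):
    """Return every length-k substring of a read, in order."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def cluster_kmers(reads, k):
    """Group k-mers into connected components of the de Bruijn graph.

    Assumption for illustration: an edge joins k-mer x to k-mer y when
    the last k-1 bases of x equal the first k-1 bases of y.
    """
    counts = Counter(km for read in reads for km in kmers(read, k))

    # Union-find over the observed k-mers.
    parent = {km: km for km in counts}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Link each k-mer to every observed successor that overlaps by k-1 bases.
    for km in counts:
        suffix = km[1:]
        for base in "ACGT":
            successor = suffix + base
            if successor in counts:
                union(km, successor)

    clusters = {}
    for km in counts:
        clusters.setdefault(find(km), set()).add(km)
    return list(clusters.values())


if __name__ == "__main__":
    for cluster in cluster_kmers(["ACGTAC", "TTTGGG"], k=3):
        print(sorted(cluster))
```

Each resulting cluster could in principle be assembled independently, which is the motivation the abstract gives for avoiding one large shared-memory graph.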

Cited by 18 publications (6 citation statements)
References 46 publications
“…After obtaining high-quality sequencing data, Trinity was employed for sequence assembly. The k-mer library was constructed by interrupting the reads according to the short k-mer fragment (k-mer) [58]. We used HISAT2 v2.1.0 to align the quality-controlled transcriptome reads to the yellow mushroom (Floccularia luteovirens) genome ( (accessed on 15 October 2019)).…”
Section: Methods (mentioning)
confidence: 99%
“…Kim et al. [7] proposed a modification to the Trinity assembler by pre-clustering the input k-mers using the MapReduce framework, which works if a suitable infrastructure is available. Techniques such as entropy based compression [8] can also be used to reduce the memory required to store k-mers.…”
Section: Introduction (mentioning)
confidence: 99%
“…Regarding the practical applications of the constructed MISs, one potentiality is their usage in k-mer-based reads clustering problems [6,7], where the choice of centers is crucial. Because an MIS contains a group of representatives for the k-mer space that are guaranteed to be a certain edit distance apart, it is ideal for generating even-sized clusters.…”
Section: Conclusion and Discussion (mentioning)
confidence: 99%