2017
DOI: 10.1101/149948
Preprint

K-mer clustering algorithm using a MapReduce framework: application to the parallelization of the Inchworm module of Trinity

Abstract: Background: De novo transcriptome assembly is an important technique for understanding gene expression in non-model organisms. Many de novo assemblers using the de Bruijn graph of a set of the RNA sequences rely on in-memory representation of this graph. However, current methods analyse the complete set of read-derived k-mer sequences at once, resulting in the need for computer hardware with large shared memory. Results: We introduce a novel approach that clusters k-mers as the first step. The clusters correspond to…

citations
Cited by 2 publications
(2 citation statements)
references
References 44 publications
“…Kim et al. [7] proposed a modification to the Trinity assembler by pre-clustering the input k-mers using the MapReduce framework, which works if a suitable infrastructure is available. Techniques such as entropy-based compression [8] can also be used to reduce the memory required to store k-mers.…”
Section: Introduction (mentioning)
confidence: 99%
“…2) or perform de novo assembly. For this purpose, bioinformatics uses software such as rnaSPADes [24], Trinity [25][26][27][28], Oases [29,30], SOAPdenovo-trans, Abyss [31][32][33], NextGene Floton, and others. Longer reads or paired-end reads (as when sequencing both DNA strands) help produce better de novo assembly results.…”
Section: Read mapping/alignment and de novo assembly (unclassified)