2017
DOI: 10.1142/s0219720017400030

Large-scale parallel genome assembler over cloud computing environment

Abstract: The size of high throughput DNA sequencing data has already reached the terabyte scale. To manage this huge volume of data, many downstream sequencing applications started using locality-based computing over different cloud infrastructures to take advantage of elastic (pay as you go) resources at a lower cost. However, the locality-based programming model (e.g. MapReduce) is relatively new. Consequently, developing scalable data-intensive bioinformatics applications using this model and understanding the hardw…

citations
Cited by 5 publications
(2 citation statements)
references
References 20 publications
“…We expect that the performance of ParLECH can be significantly improved by using multiple HDDs per node and/or SSD. Our previous work [31–33] demonstrates the effects of various computing environments for large-scale data processing.…”
Section: Results (mentioning)
confidence: 99%
“…For example, new generation DNA sequencers can now produce a large amount of data, in the terabyte range, at a very low cost. Analyzing these required the development of computing tools for large scale sequence alignment, which relies upon HPC resources (Das et al, 2017; Yu et al, 2017). Similarly, the reconstruction of Gene Regulatory Networks (GRNs) from high-throughput experimental data comes with a high computational cost, leading to the development of parallel algorithms (Xiao et al, 2015; Zheng et al, 2016; Lee et al, 2014).…”
Section: Introduction (mentioning)
confidence: 99%