2013
DOI: 10.1002/cpe.3013

Advancing next‐generation sequencing data analytics with scalable distributed infrastructure

Abstract: With the emergence of popular next-generation sequencing (NGS)-based genome-wide protocols such as chromatin immunoprecipitation followed by sequencing (ChIP-Seq) and RNA-Seq, there is a growing need for research and infrastructure to support the requirement of effectively analyzing NGS data. Such research and infrastructure do not replace but complement algorithmic advances and developments in analyzing NGS data. We present a runtime environment, Distributed Application Runtime Environment, that supports t…


Cited by 5 publications (6 citation statements)
References 43 publications
“…It comprises high-level and easy-to-use APIs for accessing distributed resources and provisioning of job submission, monitoring, and more. It has been successfully utilized for efficient executions of loosely coupled and embarrassingly parallel applications on distributed cyberinfrastructure [25, 26]. BigJob has three major components.…”
Section: Methods; citation type: mentioning
confidence: 99%
“…Also, cloud environments are increasingly becoming popular as a solution for massive data management, processing, and analysis [19, 20, 24]. Previously, SAGA-Pilot-based MapReduce and data parallelization strategies were demonstrated for life science problems, in particular, such as alignment of NGS reads [20, 25, 26]. Despite the successful cloud-oriented implementations of various bioinformatics tools, significantly fewer studies focused on the porting of complex structural bioinformatics algorithms to distributed computing platforms.…”
Section: Introduction; citation type: mentioning
confidence: 99%
“…RADICAL-SAGA is currently used by a wide range of projects ranging from tool developers to individual users. Illustrative but not exhaustive examples include the PANDA group (ATLAS project) [39], Gateways projects [40][41][42], as well as multiple Bioinformatics projects [43,44]. A combination of standards-based, open source and distributed development, supported by multiple independent contributors, has contributed to the sustainability of RADICAL-SAGA.…”
Section: The RADICAL-SAGA Community; citation type: mentioning
confidence: 98%
“…Jha et al. presents a runtime environment, Distributed Application Runtime Environment (DARE), that supports the scalable, flexible, and extensible composition of capabilities exploring the interoperability among heterogeneously distributed computing environments for pleasingly parallel applications. DARE is a SAGA‐BigJob‐based framework motivated by the next‐generation sequencing (NGS) analysis and other similar data‐intensive applications.…”
Section: Special Issue Papers; citation type: mentioning
confidence: 99%
“…Relevant contributions have been provided by Mitchel et al [1], by Lanc et al [2], by Luo et al [3], by Jha et al [4], by Ellingson et al [5], and by Schatz [6]. These contributions focus on the following:…”
Section: Introduction; citation type: mentioning
confidence: 99%