2009
DOI: 10.1007/s12145-009-0021-z

Efficient clustered server-side data analysis workflows using SWAMP

Abstract: Technology continues to enable scientists to set new records in data collection and production, intensifying a need for large scale tools to efficiently process and analyze the growing mountain of data. To complement growth in the number of data centers and the volume of data they store, we introduce our Script Workflow Analysis for MultiProcessing (SWAMP) system. Our system provides safe server-side processing capabilities that allow scientists to reuse familiar desktop-based analysis methods represented in s…

Cited by 11 publications (4 citation statements)
References 25 publications
“…An increasingly large number of workflows are available today to manage high-throughput genomics sequencing data, from basic data processing to high-quality visualization of results. Examples include shell-scripts [ 58 , 59 ], tool-specific graphical interfaces [ 60 , 61 ], and graphical workflow environments [ 62 , 63 , 64 , 65 ]. The graphical workflow environment are emerging as useful for constructing, modifying, interconnecting and executing computational genomics protocols using data processing workflows, also described as “pipelines” once the processes have been connected ( Table 2 ).…”
Section: Review of the Current Methodologies and Tools for NGS DNA
Citation type: mentioning
confidence: 99%
“…The study by Wang, Zender, & Jenks (2009) discusses how technology continues to enable scientists to set new records in data collection and production, intensifying the need for large-scale tools to efficiently process and analyze the ever-growing volume of data.…”
Section: A Literature Review
Citation type: unclassified
“…There are several alternative approaches for high-throughput analysis of large amounts of data such as using various types of shell-scripts [11,12] and employing tool-specific graphical interfaces [13-15]. The large-scale parallelization, increased network bandwidth, need for reproducibility, and wide proliferation of efficient and robust computational and communication resources are the driving forces behind this need for automation and high-throughput analysis.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…Contemporary informatics and genomic research require efficient, flexible and robust management of large heterogeneous data [ 1 , 2 ], advanced computational tools [ 3 ], powerful visualization [ 4 ], reliable hardware infrastructure [ 5 ], interoperability of computational resources [ 6 , 7 ], and provenance of data and protocols [ 8 - 10 ]. There are several alternative approaches for high-throughput analysis of large amounts of data such as using various types of shell-scripts [ 11 , 12 ] and employing tool-specific graphical interfaces [ 13 - 15 ]. The large-scale parallelization, increased network bandwidth, need for reproducibility, and wide proliferation of efficient and robust computational and communication resources are the driving forces behind this need for automation and high-throughput analysis.…”
Section: Introduction
Citation type: mentioning
confidence: 99%