Proceedings of the 2015 International Workshop on Data-Intensive Scalable Computing Systems 2015
DOI: 10.1145/2831244.2831253

Big data analytics on traditional HPC infrastructure using two-level storage

Abstract: Data-intensive computing has become one of the major workloads on traditional high-performance computing (HPC) clusters. Currently, deploying data-intensive computing software frameworks on HPC clusters still faces performance and scalability issues. In this paper, we develop a new two-level storage system by integrating Tachyon, an in-memory file system, with OrangeFS, a parallel file system. We model the I/O throughputs of four storage structures: HDFS, OrangeFS, Tachyon, and two-level storage. We conduct compu…
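The two-level idea in the abstract (an in-memory tier in front of a parallel file system) can be illustrated with a small, purely hypothetical sketch. It is not the authors' implementation and uses neither the Tachyon nor the OrangeFS API: the class name `TwoLevelStore`, the dict standing in for the parallel file system, the LRU eviction policy, and the write-through behavior are all assumptions chosen for illustration.

```python
from collections import OrderedDict


class TwoLevelStore:
    """Toy two-level store: a bounded in-memory tier over a slower backing tier.

    Hypothetical illustration only; the backing dict stands in for a
    parallel file system and the OrderedDict for an in-memory file system.
    """

    def __init__(self, backing, capacity):
        self.backing = backing        # stand-in for the parallel file system
        self.capacity = capacity      # maximum number of blocks held in memory
        self.memory = OrderedDict()   # in-memory tier, oldest entry first
        self.hits = 0
        self.misses = 0

    def read(self, key):
        # Serve from memory when possible and mark the block as recently used.
        if key in self.memory:
            self.memory.move_to_end(key)
            self.hits += 1
            return self.memory[key]
        # Otherwise fall back to the backing tier and cache the block.
        self.misses += 1
        value = self.backing[key]
        self._cache(key, value)
        return value

    def write(self, key, value):
        # Write-through (an assumption): update both tiers on every write.
        self.backing[key] = value
        self._cache(key, value)

    def _cache(self, key, value):
        self.memory[key] = value
        self.memory.move_to_end(key)
        # Evict least recently used blocks once the memory tier is full.
        while len(self.memory) > self.capacity:
            self.memory.popitem(last=False)
```

Repeated reads of the same block are served from the memory tier, while a working set larger than `capacity` falls back to the backing tier, which is the locality trade-off that throughput models of such a design have to capture.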

Cited by 13 publications (5 citation statements)
References 18 publications
“… 38 Traditionally, high-performance computing required specialized on-premises infrastructure with powerful hardware and dedicated resources. 39 Now, cloud computing can provide the high-performance computing and storage capabilities needed for processing such big data, 40 benefiting DEWS 4.0 by enabling efficient and timely data analysis.…”
Section: Interconnected Framework Modules (mentioning)
confidence: 99%
“…Results indicated that BDA platforms still suffered from the reduced locality offered by such a setting. Consequently, the authors in [105] proposed a two-layer storage system that exploits PFS performance but incorporates an intermediate in-memory storage system, with good results.…”
Section: Infrastructure: Networking and Accelerators (mentioning)
confidence: 99%
“…We made the choice to use a DFS to avoid the traditional HPC IO bottleneck and scale linearly in terms of bandwidth [20], without adding expensive hardware to scale up the PFS [8].…”
Section: Related Work (mentioning)
confidence: 99%