2018
DOI: 10.1145/3149376

Challenges and Solutions for Tracing Storage Systems

Abstract: IBM Spectrum Scale’s parallel file system General Parallel File System (GPFS) has a 20-year development history with over 100 contributing developers. Its ability to support strict POSIX semantics across more than 10K clients leads to a complex design with intricate interactions between the cluster nodes. Tracing has proven to be a vital tool to understand the behavior and the anomalies of such a complex software product. However, the necessary trace information is often buried in hundreds of gigabytes of by-p…

Cited by 13 publications (7 citation statements)
References 26 publications
“…Since the arrival of parallel file systems over two decades ago, file systems have continued to become increasingly more complex, spanning intricate logic over millions of lines of code [79] with the goal of offering a general-purpose solution for all applications (see Section 2). In order to allow for a familiar and portable interface that most applications can agree and rely on, viding a thinner set of API functions that may ease deployment and maintenance.…”
Section: Interfacing Ad Hoc File Systems (mentioning)
confidence: 99%
“…However, inodes and directory blocks were not designed for parallel accesses because a single block can only be accessed by one process at a time. This is particularly relevant in distributed systems when a huge number of files is created in a single directory from multiple processes, a common workload in HPC environments [12,25,31,32]. In general, such systems distribute data across all available storage targets.…”
Section: Related Work (mentioning)
confidence: 99%
“…The performance limitation can be attributed to the sequentialization enforced by underlying POSIX semantics which is particularly degrading throughput when a huge number of files is created in a single directory from multiple processes. This workload, common to HPC environments [3], [24], [25], [37], can become an even bigger challenge for upcoming data-science applications. GekkoFS is built on a new technique to handle directories and replaces directory entries by objects, stored within a strongly consistent key-value store which helps to achieve tens of millions of metadata operations for billions of files.…”
Section: Related Work (mentioning)
confidence: 99%
“…Although GekkoFS and Lustre have different goals, we point out the performances that can be gained by using GekkoFS as a burst buffer file system. In our experiments, mdtest performs create, stat, and remove operations in parallel in a single directory - an important workload in many HPC applications and among the most difficult workloads for a general-purpose PFS [37].…”
Section: A Metadata Performance (mentioning)
confidence: 99%