Proceedings of the Sixth Workshop on Parallel Data Storage 2011
DOI: 10.1145/2159352.2159359

Easing the burdens of HPC file management

Abstract: While the amount of data we can process and store grows, our ability to find data remains dependent upon our own memories more often than not. Manual metadata management is common among scientific users, consuming their time while not making use of the computing resources at hand. Our system design proposes to empower users with more powerful data finding tools, such as unified search spaces, provenance, and ranked file system search. By returning the responsibility of file management to the file system, we en…

Cited by 11 publications (3 citation statements)
References 18 publications
“…One such application, file-per-process (N-N) checkpointing, requires the metadata service to handle a huge number of file creates all at the beginning of the checkpoint [9]. Another example, storage management, produces a read-intensive metadata workload that typically scans the metadata of the entire file system to perform administrative tasks [28], [30]. Finally, even in the era of big data, most files in even the largest cluster file systems are small [19], [61], where median file size is often only hundreds of kilobytes.…”
Section: Introduction (mentioning)
confidence: 99%
“…The most similar work to TrueNames is that of Jones et al [17], who proposed a non-hierarchical HPC file system with automatically generated file names, chosen by examining the distribution of metadata fields. By contrast, our work uses a more robust and less complex scheme which puts the user and application in control of which metadata is used, and allows them to select attributes which are most appropriate for the file's semantic type, rather than relying on statistical techniques.…”
Section: Non-hierarchical and Semantic File Systems (mentioning)
confidence: 99%
“…One such example, checkpointing, requires the metadata service to handle large number of file creates and updates at very high speeds [6]. Another example, storage management, produces read-intensive metadata workload that typically scans the metadata of the entire file system to perform administration tasks for analyzing and querying metadata [11], [13].…”
Section: Introduction (mentioning)
confidence: 99%