2012 International Conference for High Performance Computing, Networking, Storage and Analysis 2012
DOI: 10.1109/sc.2012.110

Usage behavior of a large-scale scientific archive

Abstract: Archival storage systems for scientific data have been growing in both size and relevance over the past two decades, yet researchers and system designers alike must rely on limited and obsolete knowledge to guide archival management and design. To address this issue, we analyzed three years of file-level activities from the NCAR mass storage system, providing valuable insight into a large-scale scientific archive with over 1600 users, tens of millions of files, and petabytes of data. Our examination of …

Cited by 9 publications (4 citation statements)
References 14 publications
“…The definitions for cold storage and archival storage suggest a data set that may be accessed less frequently than hot or high intensity storage systems; however, there is no exact definition for an archival storage system. Previous studies have analyzed storage system behavior under different workloads, including both archival and high intensity workloads [5], [8], [9]. The challenge of rigorously characterizing storage system workloads does not outweigh the importance of using workload characteristics to understand how best to model and design storage systems [10].…”
Section: Archive Parameters (mentioning)
confidence: 99%
“…Because of this difference in log/trace format, system analysis is usually done per system. Such analysis can produce useful results on the behavior of a particular system, but sheds no light on how it compares to other similar systems, HPC or otherwise [6][7][8][9]. Additionally, programmer effort is wasted analyzing each system when the generated analytics are the same or similar for each system.…”
Section: Analysis Of Data (mentioning)
confidence: 99%
“…Additional tools of note are the Integrated Performance Monitoring for HPC, IOSIG [19], RIOT [20], ScalaIOTrace [5], and Linux blktrace.…”
Section: Additional User-level Tools (mentioning)
confidence: 99%
“…Identifying working sets accurately and reliably can greatly improve both the performance and the efficiency of storage systems [3]. Additionally, understanding workload characteristics is essential for optimal storage management and provisioning [1].…”
Section: Introduction (mentioning)
confidence: 99%