2012
DOI: 10.1145/2180905.2180907

Analysis of Workload Behavior in Scientific and Historical Long-Term Data Repositories

Abstract: The scope of archival systems is expanding beyond cheap tertiary storage: scientific and medical data is increasingly digital, and the public has a growing desire to digitally record their personal histories. Driven by the increased cost efficiency of hard drives compared to tape, and the rise of the Internet, content archives have become a means of providing the public with fast, cheap access to long-term data. Unfortunately, designers of purpose-built archival systems are either forced to rely on workload be…

Citations: cited by 38 publications (27 citation statements)
References: 24 publications
“…Our file-archive dataset is a database of vital records from the Washington state digital archives, where records are labeled with one of many type identifiers (e.g., "Birth Records," "Marriage Records") [Adams et al 2012]. We examined 5,321,692 accesses from 2007 through 2010 that were made to a 16.5TB database.…”
Section: Methods (mentioning)
confidence: 99%
“…Categorical grouping by definition requires some functional knowledge of the data that relies on human curation, either in manual labeling or metadata upkeep [Adams et al 2012]. Example categorical attributes include size, name, type, owner, path, or even whole file content.…”
Section: Statistical Grouping Versus Categorical Grouping (mentioning)
confidence: 99%
“…Since 1993, there have been only two studies that have looked explicitly at day-to-day activities on a large-scale scientific archive, each limited in their application to modern scientific archive design. Adams et al were stymied by the coarse granularity of their data, preventing them from examining user and file level behaviors [1]. Frank et al focused on evolutionary trends over time rather than a detailed analysis of a modern system's behavior [2].…”
Section: Introduction (mentioning)
confidence: 99%