2007
DOI: 10.1145/1288783.1288788

A five-year study of file-system metadata

Abstract: For five years, we collected annual snapshots of file-system metadata from over 60,000 Windows PC file systems in a large corporation. In this article, we use these snapshots to study temporal changes in file size, file age, file-type frequency, directory size, namespace structure, file-system population, storage capacity and consumption, and degree of file modification. We present a generative model that explains the namespace structure and the distribution of directory sizes. We find significant temporal tre…

Cited by 236 publications (176 citation statements)
References: 22 publications
“…The distributions from 1993 and 2005 shown in Figure 9.13 are remarkably similar, with just a slight shift to higher values. Other comparisons of the file-size distributions at the same location over a 20-year span (1984 and 2005) and a 4-year span (2000–2004) reached essentially the same conclusion [673,18].…”
Section: The Distribution Of File Sizes
Citation type: mentioning
confidence: 61%
“…The model can also be extended to allow for link deletions [255]. Essentially the same model has also been suggested in other domains (e.g., the creation of subdirectories in a file system [18]). …”
Section: Preferential Attachment
Citation type: mentioning
confidence: 98%
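
The preferential-attachment idea mentioned in the statement above can be illustrated with a minimal Python sketch. It is not taken from the paper or from the citing work: the function name `grow_directory_tree`, the `offset` parameter, and the rule "pick a parent with probability proportional to its current subdirectory count plus an offset" are assumptions chosen only to show the general mechanism.

```python
import random
from collections import Counter


def grow_directory_tree(num_dirs, offset=1.0, seed=0):
    """Grow a directory tree by preferential attachment (illustrative only).

    Directory 0 is the root. Each new directory picks its parent among the
    existing directories with probability proportional to
    (number of subdirectories the candidate already has) + offset.
    The offset is a hypothetical parameter; it gives childless directories
    a nonzero chance of being chosen.
    """
    rng = random.Random(seed)
    parent = [None]      # parent[i] is the index of directory i's parent
    child_count = [0]    # child_count[i] is directory i's subdirectory count
    for new_dir in range(1, num_dirs):
        weights = [count + offset for count in child_count]
        chosen = rng.choices(range(new_dir), weights=weights, k=1)[0]
        parent.append(chosen)
        child_count[chosen] += 1
        child_count.append(0)
    return parent, child_count


parent, child_count = grow_directory_tree(2000)
# Maps "number of subdirectories" to "how many directories have that many".
histogram = Counter(child_count)
```

Because already-popular directories are more likely to gain further subdirectories, a tree grown this way tends to have many directories with no subdirectories and a few with many. Rebuilding the weight list on every step keeps the sketch short at the cost of quadratic running time.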
“…Therefore, we may relax the consistency model for media files as they are often non-editable, while the replicas of other documents have to be consistent as they are frequently edited. In addition to that, fetching the document files on demand is not expensive compared to that for media files as the media files size [16] is often bigger than that of documents files [21] on average.…”
Section: Consistency and Availability
Citation type: mentioning
confidence: 99%
“…Thus, the access to update rate is often high for these files. Moreover, the size of these media files is often larger than the editable files such as doc or xml [16] [21]. Caching and replicating these files on every possible device improves the access time to those files and results in less network traffic, but it also increases the overhead of maintaining consistency between copies.…”
Section: High Availability With Weak Consistency
Citation type: mentioning
confidence: 99%
“…Moreover, the recent price drop for magnetic disk drives [1] has accelerated the explosive increase in the number of files within typical file systems [2].…”
Section: Introduction
Citation type: mentioning
confidence: 99%