2011
DOI: 10.1007/978-3-642-22351-8_29

Pantheon: Exascale File System Search for Scientific Computing

Abstract: Modern scientific computing generates petabytes of data in billions of files that must be managed. These files are often organized, by name, in a hierarchical directory tree common to most file systems. As the scale of data has increased, this has proven to be a poor method of file organization. Recent tools have allowed for users to navigate files based on file metadata attributes to provide more meaningful organization. In order to search this metadata, it is often stored on separate metadata serve…

Cited by 5 publications (5 citation statements)
References 9 publications
“…We see several ways that query optimization could assist with file system search. In previous work [5], we displayed that including a basic selectivity based query optimizer can provide significant decreases in query response time versus naïve query evaluation. This is an extremely simplistic implementation, with much room to experiment with both existing database query optimization models, and totally new, storage system specific models.…”
Section: Query Optimization (mentioning)
confidence: 98%
“…Instead of placing the database as another layer, sitting on top of the storage system, we instead look to move database components into the storage system itself. We have started this effort with our Pantheon system [5]. This work is based on the observation that, in order for the high performance computing community to fully leverage what databases have to offer, databases must act as more than just metadata servers.…”
Section: Introduction (mentioning)
confidence: 99%
“…While these performed well on their test data, they focused strictly on POSIX metadata. Loris [27] and Pantheon [21] were both indexing systems tested for system metadata only. Pantheon used B-trees, which are row-based, and will face challenges with sparse data.…”
Section: File System Indexing (mentioning)
confidence: 99%
“…While there are many workload studies for file system metadata [14,9,8,20,13,28], they have focused on POSIX metadata. Search systems based on them [19,17,27,21] attempt to extrapolate performance for other use cases. By contrast, we examine scientific metadata directly, in order to better understand the design space of scientific metadata and content indexing systems.…”
Section: Introduction (mentioning)
confidence: 99%
“…SciDB requires the scientific data to be loaded into the database and then use query languages, called Array Query Language (AQL) and Array Functional Language (AFL), to access data. Instead of developing a new database system, there are also efforts to modify file systems with high-level semantics [5], [15], [19]. While these efforts are likely to be accepted by a few scientific communities, we believe that the array data model needs to be supported as a first class citizen instead of being supported through layers of metadata.…”
mentioning
confidence: 99%