As data volumes grow rapidly across an ever-wider range of application fields in science, engineering, and information services, the challenges posed by data-intensive computing become increasingly important. The emergence of highly scalable infrastructures, e.g., for cloud computing and for petascale computing and beyond, raises additional issues for which scalable data management is an immediate need. This paper makes several contributions. First, it proposes a set of principles for designing highly scalable distributed storage systems that are optimized for heavy data access concurrency. In particular, we highlight the potentially large benefits of using versioning in this context. Second, based on these principles, we propose a set of versioning algorithms, both for data and for metadata, that enable high throughput under concurrency. Finally, we implement and evaluate these algorithms in the BlobSeer prototype, which we integrate as a storage backend into the Hadoop MapReduce framework. We perform extensive microbenchmarks as well as experiments with real MapReduce applications: they demonstrate that applying the principles advocated by our approach brings substantial benefits to data-intensive applications.