This publication reflects the views only of the authors, and the Commission cannot be held responsible for any use which may be made of the information contained therein.
There has been considerable recent interest in using highly available configuration management services based on the Paxos family of algorithms to address long-standing problems in the management of large-scale heterogeneous distributed systems. These problems include providing distributed locking services, determining group membership, electing a leader, and managing configuration parameters. While these services are finding their way into the management of distributed middleware systems and data centers in general, some areas of applicability remain largely unexplored. One such area is the management of metadata in distributed file systems. In this paper we show that a Paxos-based approach to building metadata services in distributed file systems can achieve high availability without incurring a performance penalty. Moreover, we demonstrate that such an approach is easy to retrofit to existing systems (such as PVFS and HDFS) that currently use different approaches to availability. Our overall approach is based on a general-purpose Paxos-compatible component (the embedded Oracle Berkeley database) along with a methodology for making it interoperate with existing distributed file system metadata services.
Resource allocation policies in public Clouds are today largely agnostic to the requirements that distributed applications place on their underlying infrastructure. As a result, assumptions about data-center topology that are built into distributed data-intensive applications are often violated, compromising performance and availability goals. In this paper we describe a management system that discovers a limited amount of information about Cloud allocation decisions (in particular, which VMs of the same user are collocated on a physical machine) so that data-intensive applications can adapt to those decisions and achieve their goals. Our distributed discovery process is based on either application-level techniques (measurements) or a novel lightweight and privacy-preserving Cloud management API proposed in this paper. Using the Hadoop Distributed File System as a case study, we show that VM collocation occurs in commercial Cloud platforms and that our methodologies can handle its impact in an effective, practical, and scalable manner.