Proceedings of the 2003 ACM/IEEE Conference on Supercomputing 2003
DOI: 10.1145/1048935.1050203

An Efficient Data Location Protocol for Self-organizing Storage Clusters

Abstract: Component additions and failures are common for large-scale storage clusters in production environments. To improve availability and manageability, we investigate and compare data location schemes for a large self-organizing storage cluster that can quickly adapt to the additions or departures of storage nodes. We further present an efficient location scheme that differentiates between small and large file blocks for reduced management overhead compared to uniform strategies. In our protocol, small blocks, whi…

citations
Cited by 21 publications
(18 citation statements)
references
References 48 publications
“…tolerance [1,4,6,7,9,10,12,17,24,25]. These systems do not consider the impact of the likely dual-role characteristics of cluster nodes, serving both as a compute node and also as a data server.…”
mentioning
confidence: 99%
“…SCAD-DAR uses remapping functions similar in flavor to those in RUSH, but does not support replication beyond simple offset-based replication as discussed in Section 3.3. Consistent hashing [11] has many of the qualities of RUSH, but has a high migration overhead and is less well-suited to read-write file systems; Tang and Yang [18] use consistent hashing as part of a scheme to distribute data in large-scale storage clusters.…”
Section: Related Work
mentioning
confidence: 99%
“…In these experiments, we only added new objects without deleting any existing items so that δ 0 is kept zero. The experiments presented in Table 2 considers both the deletion and addition of objects on each host when the initial state of BF on each host is optimized, this is, the number of hash functions is the optimal under the ratio between m and the initial number of objects n. This specific setting aims to emulate the real application where m/n and k are usually optimally or sub-optimally matched by dynamically adjusting the BF length m [3] or designing the BF length according to the average number of objects [12,6,2,4,5]. All the analytical results have been very closely matched by their real (experimental) counterparts consistently, strongly validating our theoretical models.…”
Section: Validation of the Theoretic Models via Experiments
mentioning
confidence: 99%
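
For context on the "optimal" number of hash functions mentioned in the statement above, the standard Bloom filter relations are sketched below, using the statement's own symbols (filter length m in bits, n stored objects, k hash functions). These are the usual textbook approximations, given here as background; they are not quoted from the citing paper, and p (the false-positive probability) is a symbol introduced only for this note.

```latex
p \approx \left(1 - e^{-kn/m}\right)^{k},
\qquad
k_{\mathrm{opt}} = \frac{m}{n}\,\ln 2,
\qquad
p_{\min} \approx \left(\tfrac{1}{2}\right)^{k_{\mathrm{opt}}} \approx 0.6185^{\,m/n}
```

As a worked instance, a filter sized at m/n = 10 bits per object gives k_opt ≈ 6.9, so k = 7 in practice, with a false-positive probability of roughly 0.8%.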
“…Ref. [3,4,5,6] use BFs to implement the function of mapping logical data identities to their physical locations in distributed storage systems. In such schemes, each storage node constructs a Bloom filter that summarizes the identities of data stored locally and broadcasts it to other nodes.…”
Section: Introduction
mentioning
confidence: 99%
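
The scheme described in the statement above (each node summarizes its local data identities in a Bloom filter and broadcasts the summary to the other nodes) can be sketched as follows. This is a minimal in-process illustration, not the protocol of the cited paper: the class names `BloomFilter` and `StorageNode`, the `broadcast` helper, the SHA-256-based hashing, and the default sizes (1024 bits, 4 hash functions) are all assumptions made for the example.

```python
import copy
import hashlib


class BloomFilter:
    """Fixed-size Bloom filter over string identifiers (illustrative only)."""

    def __init__(self, m_bits: int, k_hashes: int):
        self.m = m_bits
        self.k = k_hashes
        self.bits = 0  # a Python int used as an m-bit array

    def _positions(self, item: str):
        # Derive k bit positions by salting SHA-256 with the hash index.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item: str) -> bool:
        # True may be a false positive; False is always correct.
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


class StorageNode:
    """Hypothetical storage node holding blocks and peers' summaries."""

    def __init__(self, name: str, m_bits: int = 1024, k_hashes: int = 4):
        self.name = name
        self.blocks = {}
        self.summary = BloomFilter(m_bits, k_hashes)
        self.peer_summaries = {}  # peer name -> snapshot of that peer's filter

    def store(self, block_id: str, data: bytes) -> None:
        self.blocks[block_id] = data
        self.summary.add(block_id)

    def receive_summary(self, peer_name: str, summary: BloomFilter) -> None:
        self.peer_summaries[peer_name] = summary

    def candidates(self, block_id: str):
        """Peers whose summary may hold block_id (false positives possible)."""
        return [name for name, s in self.peer_summaries.items()
                if s.might_contain(block_id)]


def broadcast(nodes) -> None:
    """Stand-in for a network broadcast: every node sends a snapshot of its
    filter to every other node."""
    for src in nodes:
        for dst in nodes:
            if src is not dst:
                dst.receive_summary(src.name, copy.copy(src.summary))


a, b, c = StorageNode("a"), StorageNode("b"), StorageNode("c")
a.store("blk-1", b"payload")
broadcast([a, b, c])
print(b.candidates("blk-1"))
```

A lookup on any node checks the received summaries locally and contacts only the candidate peers. Because a Bloom filter can return false positives but never false negatives, a candidate must still be asked whether it actually holds the block, and summaries must be re-broadcast as local contents change.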