Distributed storage systems often employ erasure codes to achieve high data reliability while attaining space efficiency. Such storage systems are known to be susceptible to long tails in response time. It has been shown that in modern online applications such as Bing, Facebook, and Amazon, the long tail of latency is of particular concern, with 99.9th percentile response times that are orders of magnitude worse than the mean. Taming tail latency is very challenging in erasure-coded storage systems since quantify tail latency (i.e., xth-percentile latency for arbitrary x ∈ [0, 1]) has been a long-standing open problem. In this paper, we propose a mathematical model to quantify tail index of service latency for arbitrary erasure-coded storage systems, by characterizing the asymptotic behavior of latency distribution tails. When file size has a heavy tailed distribution, we find tail index, defined as the exponent at which latency tail probability diminishes to zero, in closed-form, and further show that a family of probabilistic scheduling algorithms are (asymptotically) optimal since they are able to achieve the exact tail index.
Storage clouds, such as Amazon S3, are being widely used for web services and Internet applications. It has been observed that the delay for retrieving data from and placing data into the clouds is quite random, and exhibits weak correlations between different read/write requests. This inspires us to investigate a key problem: can we reduce the delay by transmitting data replications in parallel or using powerful erasure codes?In this paper, we study the problem of reducing the delay of downloading data from cloud storage systems by leveraging multiple parallel threads, assuming that the data has been encoded and stored in the clouds using fixed rate forward error correction (FEC) codes with parameters (n, k). That is., each file is divided into k equal-sized chunks, which are then expanded into n chunks such that any k chunks out of the n are sufficient to successfully restore the original file. The model can be depicted as a multipleserver queue with arrivals of data retrieving requests and a server corresponding to a thread. However, this is not a typical queueing model because a server can terminate its operation, depending on when other servers complete their service (due to the redundancy that is spread across the threads). Hence, to the best of our knowledge, the analysis of this queueing model remains quite uncharted.Real traces from Amazon S3 show that the time to retrieve a fixed size chunk is random and can be accurately approximated as an i.i.d. exponentially distributed random variable. We show that any work-conserving scheme is delay-optimal when k = 1. When k > 1, we find that a simple greedy scheme, which allocates all available threads to the head of line request, is delay optimal, which appears surprising.
Our paper presents solutions using erasure coding, parallel connections to storage cloud and limited chunking (i.e., dividing the object into a few smaller segments) together to significantly improve the delay performance of uploading and downloading data in and out of cloud storage.TOFEC is a strategy that helps front-end proxy adapt to level of workload by treating scalable cloud storage (e.g. Amazon S3) as a shared resource requiring admission control. Under light workloads, TOFEC creates more smaller chunks and uses more parallel connections per file, minimizing service delay. Under heavy workloads, TOFEC automatically reduces the level of chunking (fewer chunks with increased size) and uses fewer parallel connections to reduce overhead, resulting in higher throughput and preventing queueing delay. Our trace-driven simulation results show that TOFEC's adaptation mechanism converges to an appropriate code that provides the optimal delay-throughput trade-off without reducing system capacity. Compared to a non-adaptive strategy optimized for throughput, TOFEC delivers 2.5× lower latency under light workloads; compared to a non-adaptive strategy optimized for latency, TOFEC can scale to support over 3× as many requests.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.