In current large-scale distributed key-value stores, a single end-user request may lead to key-value accesses across tens or hundreds of servers. The tail latency of these key-value accesses is crucial to user experience and greatly impacts revenue. To cut the tail latency, it is crucial for clients to choose the best possible replica server to serve each key-value access operation. To address the challenges of time-varying server performance and herd behavior, an adaptive replica selection scheme, C3, has recently been proposed. In C3, feedback from individual servers is incorporated into replica ranking to reflect the time-varying performance of servers, and distributed rate control and backpressure mechanisms are introduced. Despite C3's good performance, we reveal a timeliness issue in C3 that strongly affects both replica ranking and rate control. To address this issue, we propose the TAP (timeliness-aware prediction-based) replica selection algorithm, which predicts the queue size of replica servers under poor timeliness conditions, instead of using the exponentially weighted moving average of historical piggybacked queue sizes as C3 does. Consequently, compared with C3, TAP obtains more accurate queue-size estimates to guide replica selection at clients. Simulation results also confirm the advantage of TAP over C3 in cutting the tail latency.

KEYWORDS
key-value stores, prediction, tail latency, timeliness
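To make the contrast concrete, the following Python sketch illustrates the EWMA style of per-server queue-size estimation that the abstract attributes to C3; it is not the authors' implementation, and the class, method, and parameter names (EwmaQueueEstimator, alpha) are illustrative assumptions.

```python
# Illustrative sketch (assumed names, not C3's actual code): smooth the queue
# sizes piggybacked on server responses with an exponentially weighted moving
# average (EWMA) and rank candidate replicas by the smoothed estimate.

class EwmaQueueEstimator:
    """Keeps a smoothed per-server queue-size estimate from piggybacked samples."""

    def __init__(self, alpha: float = 0.9):
        self.alpha = alpha      # weight given to the newest sample (assumed value)
        self.estimates = {}     # server_id -> smoothed queue size

    def update(self, server_id: str, piggybacked_queue_size: float) -> None:
        """Fold a queue-size sample piggybacked on a response into the EWMA."""
        old = self.estimates.get(server_id, piggybacked_queue_size)
        self.estimates[server_id] = (
            self.alpha * piggybacked_queue_size + (1 - self.alpha) * old
        )

    def rank(self, candidates: list[str]) -> list[str]:
        """Rank candidate replicas by smoothed queue size (smaller is better)."""
        return sorted(candidates, key=lambda s: self.estimates.get(s, float("inf")))
```

When the piggybacked samples arrive with poor timeliness, such a moving average lags behind the server's true queue size; this is the staleness that motivates TAP's prediction-based estimate described in the abstract.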
INTRODUCTION

In current large-scale distributed key-value store systems, data are partitioned into small pieces, replicated, and distributed across servers for parallel access and scalability. Consequently, a single end-user request may need key-value accesses from tens or hundreds of servers. 1-3 The tail latency of these key-value accesses determines the response time of the end-user request, which is directly associated with user experience and revenue. 4,5 Nevertheless, because the performance of servers is time varying, 6,7 the tail latency is hard to guarantee and may become unexpectedly long under certain conditions. A recent study shows that the 99th percentile latency can be one order of magnitude larger than the median latency, 6 indicating that there is large room to cut the tail latency of key-value accesses. To cut the tail latency, a replica selection scheme, in which each client chooses the best possible replica server for each key-value access, is crucial. 8 Other methods for reducing the tail latency, such as duplicating or reissuing requests, 2,6,9,10 can also benefit from a good replica selection scheme.

However, for efficiency, the replica selection schemes of current classic key-value stores are very simple. For example, OpenStack Swift simply reads from an arbitrary server and retries in case of failures. 11 HBase relies on HDFS, which chooses the physically closest replica server. 12 Riak uses an external load balancer such as Nginx, 13 which employs the least-outstanding-requests (LOR) strategy. Accor...