2013
DOI: 10.1145/2534169.2486028
Speeding up distributed request-response workflows

Abstract: We found that interactive services at Bing have highly variable datacenter-side processing latencies because their processing consists of many sequential stages, parallelization across tens to thousands of servers, and aggregation of responses across the network. To improve the tail latency of such services, we use a few building blocks: reissuing laggards elsewhere in the cluster, new policies to return incomplete results, and speeding up laggards by giving them more resources. Combining these building blocks …
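The first building block the abstract names, reissuing laggards elsewhere in the cluster, can be sketched as a hedged request: send the query to one replica, and if no answer arrives within a small timeout, duplicate it to a second replica and take whichever finishes first. This is a minimal illustration of the idea, not the paper's implementation; `call_replica`, the simulated latencies, and the hedge timeout are all hypothetical.

```python
import concurrent.futures
import random
import time

def call_replica(replica_id, query):
    # Simulated server with a heavy latency tail: mostly fast, sometimes slow.
    time.sleep(random.choice([0.01, 0.01, 0.01, 0.2]))
    return f"result({query})@replica{replica_id}"

def hedged_request(query, replicas, hedge_after=0.05):
    with concurrent.futures.ThreadPoolExecutor(max_workers=2) as pool:
        first = pool.submit(call_replica, replicas[0], query)
        done, _ = concurrent.futures.wait([first], timeout=hedge_after)
        if done:
            return first.result()
        # Laggard detected: reissue on another replica and race the two copies.
        second = pool.submit(call_replica, replicas[1], query)
        done, _ = concurrent.futures.wait(
            [first, second],
            return_when=concurrent.futures.FIRST_COMPLETED)
        return done.pop().result()

print(hedged_request("q", [0, 1]))
```

The trade-off is extra load: every hedge costs a duplicate request, which is why such systems only reissue after the typical fast-path latency has clearly been exceeded.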

Cited by 57 publications (68 citation statements) · References 18 publications
“…The controller code developed in the simulator can be directly plugged into brownout-aware applications like RUBiS 3 and RUBBoS 4 . For the controller implementation, the adaptive PI controller in (14) was discretized with sample period h = 0.5 s using the method suggested in [3], and complemented by a tracking-based anti-windup solution. The parameter estimations that the feedback and feedforward schemes require (Ĝ_P(0), Ĝ_I(0), λ, α) are implemented as exponentially weighted moving averages according to …”
Section: A. The Simulator
confidence: 99%
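The quoted passage estimates its controller parameters with exponentially weighted moving averages. A minimal sketch of that estimator, with an illustrative forgetting factor `alpha` (not a value from the cited work):

```python
def ewma(samples, alpha=0.2, init=0.0):
    """Exponentially weighted moving average over a stream of samples."""
    est = init
    for x in samples:
        # Each new sample pulls the estimate toward x by a factor of alpha.
        est = alpha * x + (1 - alpha) * est
    return est

# A constant input converges to that constant as old history decays.
print(round(ewma([1.0] * 50, alpha=0.2), 3))  # → 1.0
```

Larger `alpha` tracks changes faster but passes more noise through; the cited controllers pick it to balance responsiveness against estimation jitter.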
“…Admission control means that some users would not receive any response at all, hence risking losing them to competitors, incurring long-term revenue loss. Another possibility is to assign a maximum time to each request and iteratively refine an answer until the time budget expires [9,14]. This strategy works well for pruning search queries of spurious results, but does not easily generalize to all types of cloud applications.…”
Section: Introduction
confidence: 99%
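The budget-based strategy this citation describes, iteratively refining an answer until the time budget expires, can be sketched as a loop against a deadline. `refine_step` is a hypothetical stand-in for application-specific work such as scanning more index shards:

```python
import time

def refine_step(partial):
    time.sleep(0.01)                 # simulated unit of refinement work
    return partial + [len(partial)]  # each step adds one more result

def answer_within_budget(budget_s):
    deadline = time.monotonic() + budget_s
    partial = []
    while time.monotonic() < deadline:
        partial = refine_step(partial)
    return partial  # possibly incomplete, but always delivered on time

print(len(answer_within_budget(0.05)) > 0)  # → True
```

This matches the citation's caveat: the pattern suits search-style workloads where a partial result set is still useful, but not applications whose answers are all-or-nothing.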
“…A measurement interval seeks to capture the latency of a maximum number n of packets, where n is a constant. Our passive latency measurement is based on the coordinated measurement scheme proposed by Kompella et al [19]: (i) Average, captures the central tendency of latency, which characterizes the long-term latency trend [11]; (ii) Variance, measures how far the latencies are spread out, which correlates with the latency tail: the higher the variance, the worse the long-tail problem [17], [30]. Estimating other metrics like the order statistics such as the maximum delay or the quantiles requires knowledge of the latency value of each packet, unfortunately, the coordinated measurement scheme does not fulfill this requirement as it mixes the latency values of different packets.…”
Section: A. Requirements
confidence: 99%
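The two metrics the passage tracks per measurement interval, the average (long-term trend) and the variance (tail indicator), can both be computed in one streaming pass without storing per-packet latencies, e.g. with Welford's online algorithm. The sample latencies below are illustrative:

```python
def mean_and_variance(latencies_ms):
    """One-pass (Welford) mean and population variance of a latency stream."""
    n, mean, m2 = 0, 0.0, 0.0
    for x in latencies_ms:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)  # running sum of squared deviations
    return mean, (m2 / n if n else 0.0)

mean, var = mean_and_variance([1.0, 1.2, 0.9, 1.1, 9.0])  # one tail outlier
print(round(mean, 2), round(var, 2))  # → 2.64 10.12
```

A single tail outlier barely moves the mean relative to its magnitude but inflates the variance sharply, which is exactly why the passage uses variance as its long-tail signal; order statistics like quantiles would need the per-packet values the coordinated scheme discards.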
“…We now discuss relevant characteristics of SOAs using a combination of measurements from a large cloud provider and prior reports on systems from other environments [8, 37,47,48,53,64,70].…”
Section: SOAs in Production
confidence: 99%
“…First, request execution in SOAs spans tens to hundreds of services, forming a DAG across the service topology [37]. The exact structure of the DAG is often unknown when the request first enters the system, since it depends on multiple factors like the APIs invoked at each encountered service, the supplied arguments, the content of caches, as well as the use of load balancing along the service graph.…”
Section: Introduction
confidence: 99%
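The point this last citation makes, that the request DAG is only discovered at runtime, can be illustrated with a toy traversal in which a service fans out to its backends only on a cache miss, so two requests over the same topology produce different DAGs. All service names and the cache are hypothetical:

```python
def execute(service, topology, cache, visited):
    """Traverse the service graph a request actually touches."""
    visited.append(service)
    if service in cache:  # cache hit: the entire subtree is pruned
        return
    for backend in topology.get(service, []):
        execute(backend, topology, cache, visited)

topology = {"frontend": ["search", "ads"], "search": ["index", "spell"]}

cold, warm = [], []
execute("frontend", topology, cache=set(), visited=cold)
execute("frontend", topology, cache={"search"}, visited=warm)
print(cold)  # → ['frontend', 'search', 'index', 'spell', 'ads']
print(warm)  # → ['frontend', 'search', 'ads']
```

Because the shape depends on caches, arguments, and load balancing, any latency policy has to work without knowing the full DAG up front.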