Stability of JSQ in queues with general server-job class compatibilities

Cruise, James R.; Jonckheere, Matthieu; Shneer, Seva

doi:10.1007/s11134-020-09656-w

Cited by 11 publications

(15 citation statements)

References 18 publications

Supporting

Mentioning

Contrasting

Order By: Relevance

“…However, their model is not stochastic, and is thus quite different from the model we are considering in this paper. Cruise et al [11] considers the stability of JSQ on the same model as ours while no delay guarantee is provided. In Cardinaels et al [10], redundancy policies are explored in bipartite load balancing.…”

Section: Related Workmentioning

confidence: 99%

“…Gradient Bounds. We now utilizing the explicit form of g(x) in (11) to bound (14). First by definition, it holds that for a state q,…”

Section: Bound For the First K Types Of Serversmentioning

confidence: 99%

“…where we omit the term P k ( Q) from ( 16) since g(x) is an increasing function by (11). Now to simplify the equation, we can do Taylor's expansion on ( 18) and ( 19), and apply gradient bounds of g(x).…”

Section: Bound For the First K Types Of Serversmentioning

confidence: 99%

“…It suffices to bound (56) and (57). First, note that |g (x)| ≤ 1 µ1δ for all x by the explicit form of g(x) in (11). It holds…”

Section: A Proof Of Propositionmentioning

confidence: 99%

“…To the best of our knowledge, this bipartite graph model was only introduced recently in [11], where JSQ is shown to be throughput optimal while no delay performance guarantee is provided. The bipartite graph model generalizes the load balancing model on graphs introduced in [38,8].…”

Section: Introductionmentioning

confidence: 99%

See 4 more Smart Citations

Optimal Load Balancing in Bipartite Graphs

Weng,

Zhou,

Srikant

2020

Preprint

View full text Add to dashboard Cite

Applications in cloud platforms motivate the study of efficient load balancing under job-server constraints and server heterogeneity. In this paper, we study load balancing on a bipartite graph where left nodes correspond to job types and right nodes correspond to servers, with each edge indicating that a job type can be served by a server. Thus edges represent locality constraints, i.e., each job can only be served at servers which contained certain data and/or machine learning (ML) models. Servers in this system can have heterogeneous service rates. In this setting, we investigate the performance of two policies named Join-the-Fastest-of-the-Shortest-Queue (JFSQ) and Join-the-Fastest-of-the-Idle-Queue (JFIQ), which are simple variants of Join-the-Shortest-Queue and Jointhe-Idle-Queue, where ties are broken in favor of the fastest servers. Under a "well-connected" graph condition, we show that JFSQ and JFIQ are asymptotically optimal in the mean response time when the number of servers goes to infinity. In addition to asymptotic optimality, we also obtain upper bounds on the mean response time for finite-size systems. We further show that the wellconnectedness condition can be satisfied by a random bipartite graph construction with relatively sparse connectivity. However, the above classical load balancing model may not be appropriate for certain modern cloud computing and data analytic applications due to the presence of job-server constraints. Under such constraints, a job can only be dispatched to a subset of the N servers. These constraints, often called locality constraints, are quite common in large-scale Machine Learning as a Service (MLaaS) and serverless computing services supported by cloud computing

show abstract

Section: Related Workmentioning

confidence: 99%

“…Gradient Bounds. We now utilizing the explicit form of g(x) in (11) to bound (14). First by definition, it holds that for a state q,…”

Section: Bound For the First K Types Of Serversmentioning

confidence: 99%

Section: Bound For the First K Types Of Serversmentioning

confidence: 99%

“…It suffices to bound (56) and (57). First, note that |g (x)| ≤ 1 µ1δ for all x by the explicit form of g(x) in (11). It holds…”

Section: A Proof Of Propositionmentioning

confidence: 99%

Section: Introductionmentioning

confidence: 99%

See 3 more Smart Citations

Optimal Load Balancing in Bipartite Graphs

Weng,

Zhou,

Srikant

2020

Preprint

View full text Add to dashboard Cite

show abstract

Improving the Performance of Heterogeneous Data Centers through Redundancy

Anton

Ayesta

Jonckheere

et al. 2020

Proc. ACM Meas. Anal. Comput. Syst.

View full text Add to dashboard Cite

We analyze the performance of redundancy in a multi-type job and multi-type server system. We assume the job dispatcher is unaware of the servers' capacities, and we set out to study under which circumstances redundancy improves the performance. With redundancy an arriving job dispatches redundant copies to all its compatible servers, and departs as soon as one of its copies completes service. As a benchmark comparison, we take the non-redundant system in which a job arrival is routed to only one randomly selected compatible server. Service times are generally distributed and all copies of a job are identical, i.e., have the same service requirement. In our first main result, we characterize the sufficient and necessary stability conditions of the redundancy system. This condition coincides with that of a system where each job type only dispatches copies into its least-loaded servers, and those copies need to be fully served. In our second result, we compare the stability regions of the system under redundancy to that of no redundancy. We show that if the server's capacities are sufficiently heterogeneous, the stability region under redundancy can be much larger than that without redundancy. We apply the general solution to particular classes of systems, including redundancy-d and nested models, to derive simple conditions on the degree of heterogeneity required for redundancy to improve the stability. As such, our result is the first in showing that redundancy can improve the stability and hence performance of a system when copies are non-i.i.d..

show abstract

Improving the Performance of Heterogeneous Data Centers through Redundancy

Anton

Ayesta

Jonckheere

et al. 2021

SIGMETRICS Perform. Eval. Rev.

Self Cite

View full text Add to dashboard Cite

We analyze the performance of redundancy in a multi-type job and multi-type server system where PS is implemented. We characterize the stability condition, which coincides with that of a system where each job type only dispatches copies into its least-loaded servers, and those copies need to be fully served. We then investigate the impact of redundancy in the stability condition by comparing that to the stability condition of a non-redundant system in which a job arrival is routed to only one randomly selected compatible server. We observe that if server loads are sufficiently heterogeneous redundancy can considerably improve the stability region of the system.

show abstract

Stability of JSQ in queues with general server-job class compatibilities

Cited by 11 publications

References 18 publications

Optimal Load Balancing in Bipartite Graphs

Optimal Load Balancing in Bipartite Graphs

Improving the Performance of Heterogeneous Data Centers through Redundancy

Improving the Performance of Heterogeneous Data Centers through Redundancy

Contact Info

Product

Resources

About