Load Balancing Under Strict Compatibility Constraints

Rutten, Daan; Mukherjee, Debankur

doi:10.1287/moor.2022.1258

Cited by 10 publications

(8 citation statements)

References 28 publications


“…Similar models are considered in [29,41]. The former of these two papers broadens the class of graph sequences for which the steady-state fluid limit in [28] holds. For instance, [29] considers certain sequences of spatial graphs that do not satisfy the strong connectivity conditions stated in [28]; yet the number of servers compatible with any given task class goes to infinity.…”

Section: Related Work (mentioning)

confidence: 95%

“…For the model with strict compatibility constraints, general stability conditions are provided in [7,11]. In addition, [28] assumes that every new task joins the least busy of d compatible servers chosen uniformly at random, and provides connectivity conditions such that the occupancy process has the same process-level and steady-state fluid limit as in the case where the graph is complete bipartite. Similar models are considered in [29,41].…”

Section: Related Work (mentioning)

confidence: 99%
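The dispatching rule described in the statement above (each new task joins the least busy of d compatible servers chosen uniformly at random) can be sketched as follows. This is only an illustration under stated assumptions, not the cited authors' code: the names `dispatch_jsq_d`, `compat`, and `queue_len` are hypothetical, sampling is assumed to be without replacement, and ties are broken by the lower server index.

```python
import random


def dispatch_jsq_d(task_class, compat, queue_len, d, rng=random):
    """Assign one task: sample d compatible servers, join the least busy one.

    compat maps each task class to the list of servers it may use
    (the bipartite compatibility graph); queue_len is updated in place.
    """
    servers = compat[task_class]
    # Assumption: sample without replacement; if fewer than d servers
    # are compatible, all of them become candidates.
    candidates = rng.sample(servers, min(d, len(servers)))
    # Assumption: ties go to the lower server index (an arbitrary choice).
    chosen = min(candidates, key=lambda s: (queue_len[s], s))
    queue_len[chosen] += 1
    return chosen


# Hypothetical example: two task classes, four servers.
compat = {"a": [0, 1], "b": [1, 2, 3]}
queue_len = [0, 2, 1, 0]
server = dispatch_jsq_d("b", compat, queue_len, d=2, rng=random.Random(0))
```

When d equals the number of compatible servers, the rule reduces to joining the shortest queue among all servers compatible with the task class.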

“…The former of these two papers broadens the class of graph sequences for which the steady-state fluid limit in [28] holds. For instance, [29] considers certain sequences of spatial graphs that do not satisfy the strong connectivity conditions stated in [28]; yet the number of servers compatible with any given task class goes to infinity. On the other hand, [41] extends the model considered in [28] by allowing for heterogeneous service rates and proves process-level and steady-state fluid limits in this setting.…”

Section: Related Work (mentioning)

confidence: 99%

“…For instance, [29] considers certain sequences of spatial graphs that do not satisfy the strong connectivity conditions stated in [28]; yet the number of servers compatible with any given task class goes to infinity. On the other hand, [41] extends the model considered in [28] by allowing for heterogeneous service rates and proves process-level and steady-state fluid limits in this setting. The model studied in [36,37] also allows for heterogeneous service rates.…”

Section: Related Work (mentioning)

confidence: 99%


Utility Maximizing Load Balancing Policies

2022


Consider a service system where incoming tasks are instantaneously dispatched to one out of many heterogeneous server pools. Associated with each server pool is a concave utility function that depends on the class of the server pool and its current occupancy. We derive an upper bound for the mean normalized aggregate utility in stationarity and introduce two load balancing policies that achieve this upper bound in a large-scale regime. Furthermore, the transient and stationary behavior of these asymptotically optimal load balancing policies is characterized on the scale of the number of server pools in the same large-scale regime. Funding: This work was supported by the Netherlands Organization for Scientific Research (NWO) through [Gravitation Grant NETWORKS-024.002.003] and [Gravitation Grant Vici 202.068]. Supplemental Material: The online appendix is available at https://doi.org/10.1287/stsy.2022.0103 .


“…Load balancing policies are usually designed based on Continuous-Time Markov Chains (CTMC) and Lyapunov Stability theories. They assume that jobs arrive according to Poisson process and service rates of computing instances are exponentially distributed [17], [18], [36], [37], [38]. As an example, the most classic policy JSQ [16] dispatches each new arrived job to the shortest queue available.…”

Section: Related Work (mentioning)

confidence: 99%

Learning to Schedule Multi-Server Jobs with Fluctuated Processing Speeds

Hong-xia, Deng, Chen et al. 2022

Preprint


Multi-server jobs are imperative in modern cloud computing systems. A multi-server job has multiple components and requests multiple servers for being served. How to allocate restricted computing devices to jobs is a topic of great concern, which leads to the job scheduling and load balancing algorithms thriving. However, current job dispatching algorithms require the service rates to be changeless and knowable, which is difficult to realize in production systems. Besides, for multi-server jobs, the dispatching decision for each job component follows the All-or-Nothing property under service locality constraints and resource capacity limits, which is not well supported by mainstream algorithms. In this paper, we propose a dispatching algorithm for multi-server jobs that learns the unknown service rates and simultaneously maximizes the expected Accumulative Social Welfare (Asw). We formulate the Asw as the sum of utilities of jobs and servers achieved over each time slot. The utility of a job is proportional to the valuation for being served, which is mainly impacted by the fluctuating but unknown service rates. We maximize the Asw without knowing the exact valuations, but approximate them with exploration-exploitation. From this, we bring in several evolving statistics and maximize the statistical Asw with dynamic programming. The proposed algorithm is proved to have a polynomial complexity and a State-of-the-Art regret. We validate it with extensive simulations and the results show that the proposed algorithm outperforms several benchmark policies with improvements by up to 73%, 36%, and 28%, respectively.
