Optimal Load Balancing in Bipartite Graphs

Weng, Wentao; Zhou, Xingyu; Srikant, R.

doi:10.48550/arxiv.2008.08830

Cited by 6 publications

(13 citation statements)

References 45 publications

“…The main difficulty is that their expressions involve fractions of two sums that both span all states. We would also like to generalize these results in other directions, for instance by considering open variants of the algorithm or by accounting for assignment constraints [7,21].…”

Section: Discussion (mentioning)

confidence: 95%

“…To achieve good performance without these strong assumptions, more recent works introduced speed-aware variants of the above-mentioned well-known algorithms. More specifically, [21] introduced variants of join-the-shortest-queue and join-idle-queue where the server speeds are used as a tie-breaking rule, and proved that these variants minimize the mean response time in the many-server regime; [10] proposed variants of power-of-d-choices and join-idle-queue for service systems with two server types (fast and slow) by adapting the degree of diversity and assignment probabilities to the server speeds, and proved stability, again in the many-server regime. Despite these advances, we still lack a fundamental understanding of the impact of heterogeneity on performance in service systems with a finite number of servers.…”

Section: Introduction (mentioning)

confidence: 99%

Load Balancing in Heterogeneous Server Clusters: Insights From a Product-Form Queueing Model

Boor

Comte

2021

2021 IEEE/ACM 29th International Symposium on Quality of Service (IWQOS)

Efficiently exploiting servers in data centers requires performance analysis methods that account not only for the stochastic nature of demand but also for server heterogeneity. Although several recent works proved optimality results for heterogeneity-aware variants of classical load-balancing algorithms in the many-server regime, we still lack a fundamental understanding of the impact of heterogeneity on performance in finite-size systems. In this paper, we consider a load-balancing algorithm that leads to a product-form queueing model and can therefore be analyzed exactly even when the number of servers is finite. We develop new analytical methods that exploit its product-form stationary distribution to understand the joint impact of the speeds and buffer lengths of servers on performance. These analytical results are supported and complemented by numerical evaluations that cover a large variety of scenarios.

“…RELATED WORKS Online load balancing policies are fully investigated under classic settings, where multiple identical servers with exponentially distributed service rates process continuously arriving jobs. Based on CTMC and Lyapunov Stability theories, load balancing policies such as JSQ [14], JIQ [11], Pod [12], and JFIQ [9] are proposed and analyzed on the mean response time and cross-server communication overhead. In a most recent work [9], Weng et al. proposed the JFSQ and JFIQ policies under the constraints of heterogeneous service rates and service locality.…”

Section: B Simulation Results (mentioning)

confidence: 99%

“…Based on CTMC and Lyapunov Stability theories, load balancing policies such as JSQ [14], JIQ [11], Pod [12], and JFIQ [9] are proposed and analyzed on the mean response time and cross-server communication overhead. In a most recent work [9], Weng et al. proposed the JFSQ and JFIQ policies under the constraints of heterogeneous service rates and service locality. They show that, under a well-connected bipartite graph condition, these two policies can achieve the minimum mean response time in both the many-server regime and the sub-Halfin-Whitt regime.…”

Section: B Simulation Results (mentioning)

confidence: 99%
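
The two policies named in the statements above can be illustrated with a minimal sketch. It is not taken from the cited paper: the names `compat`, `queue_len`, and `speed` are hypothetical, the bipartite graph is represented as a map from job type to its compatible servers, and the uniform-random fallback in `jfiq` when no compatible server is idle is an assumption.

```python
import random


def jfsq(job_type, compat, queue_len, speed):
    """Join-the-Fastest-of-the-Shortest-Queues (sketch).

    Among the servers compatible with job_type, keep those with the
    shortest queue, then break ties in favor of the fastest server.
    """
    servers = compat[job_type]
    shortest = min(queue_len[s] for s in servers)
    candidates = [s for s in servers if queue_len[s] == shortest]
    return max(candidates, key=lambda s: speed[s])


def jfiq(job_type, compat, queue_len, speed, rng=random):
    """Join-the-Fastest-of-the-Idle-Queues (sketch).

    Pick the fastest idle compatible server; if none is idle, fall back
    to a uniformly random compatible server (an assumption of this sketch).
    """
    servers = compat[job_type]
    idle = [s for s in servers if queue_len[s] == 0]
    if idle:
        return max(idle, key=lambda s: speed[s])
    return rng.choice(servers)


# Hypothetical example: two job types, three servers.
compat = {"a": [0, 1], "b": [1, 2]}   # bipartite compatibility graph
queue_len = [1, 1, 0]                 # current queue length per server
speed = [1.0, 2.0, 0.5]               # service rate per server

choice_a = jfsq("a", compat, queue_len, speed)  # servers 0 and 1 tie on length
choice_b = jfsq("b", compat, queue_len, speed)  # server 2 has the shortest queue
```

Here `choice_a` resolves the queue-length tie by speed, while `choice_b` shows that queue length takes priority over speed: the slower server is chosen because it is idle.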

“…Meanwhile, a series of academic studies on online load balancing, which promise long-term performance guarantees, assume that jobs arrive according to Poisson process and service rates of computing instances are exponentially distributed [9]-[13]. Under stochastic ordering assumption, policies such as Join-the-Shortest-Queue (JSQ) [14], Join-the-Idle-Queue (JIQ) [11], Power-of-d-Choices (Pod) [12], and Join-the-Fastest-of-the-Shortest-Queues (JFIQ) [9] are raised and analyzed based on Continuous-Time Markov Chains (CTMC) and Lyapunov Stability theories. However, their performance guarantees (mostly on mean response time) are established on sufficient assumptions, which are tough to satisfy in production systems.…”

Section: Introduction (mentioning)

confidence: 99%

Theoretically Guaranteed Online Workload Dispatching for Deadline-Aware Multi-Server Jobs

Hong-xia

Deng

Yin

et al. 2021

Preprint

Serverless computing is leading the way to a simplified and general-purpose programming model for the cloud. A key enabler behind serverless is efficient load balancing, which routes continuous workloads to appropriate backend resources. However, current load balancing algorithms implemented in Kubernetes-native serverless platforms are simple heuristics without performance guarantees. Although policies such as Pod or JFIQ yield asymptotically optimal mean response time, the information they depend on is usually unavailable. In addition, dispatching jobs with strict deadlines, fractional workloads, and maximum parallelism bound to limited resources online is difficult because the resource allocation decisions for jobs are intertwined. To design an online load balancing algorithm without assumptions on distributions while maximizing the social welfare, we construct several pseudo-social welfare functions and cost functions, where the latter estimate the marginal cost of provisioning services to every newly arrived job based on the present resource surplus. The proposed algorithm, named OnSocMax, works by following the solutions of several convex pseudo-social welfare maximization problems. It is proved to be α-competitive for some α at least 2. We also validate OnSocMax with simulations, and the results show that it distinctly outperforms several handcrafted benchmarks.

Job Dispatching Policies for Queueing Systems with Unknown Service Rates

Choudhury

Joshi

Wang

et al. 2021

Proceedings of the Twenty-Second International Symposium on Theory, Algorithmic Foundations, and Protocol Design for Mobile Net

In multi-server queueing systems where there is no central queue holding all incoming jobs, job dispatching policies are used to assign incoming jobs to the queue at one of the servers. Classic job dispatching policies such as join-the-shortest-queue and shortest expected delay assume that the service rates and queue lengths of the servers are known to the dispatcher. In this work, we tackle the problem of job dispatching without the knowledge of service rates and queue lengths, where the dispatcher can only obtain noisy estimates of the service rates by observing job departures. This problem presents a novel exploration-exploitation trade-off between sending jobs to all the servers to estimate their service rates, and exploiting the currently known fastest servers to minimize the expected queueing delay. We propose a bandit-based exploration policy that learns the service rates from observed job departures. Unlike the standard multi-armed bandit problem where only one out of a finite set of actions is optimal, here the optimal policy requires identifying the optimal fraction of incoming jobs to be sent to each server. We present a regret analysis and simulations to demonstrate the effectiveness of the proposed bandit-based exploration policy.

CCS Concepts: Mathematics of computing → Queueing theory; Networks → Network performance analysis.
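
The exploration-exploitation trade-off described in this abstract can be illustrated with a simple epsilon-greedy dispatcher. This is a generic stand-in, not the bandit-based policy the authors propose: the class name `RateEstimatingDispatcher`, the `epsilon` parameter, and the rate estimator (departures divided by total observed service time) are all assumptions made for illustration. Unlike the policy in the abstract, this sketch sends all exploited traffic to one server instead of learning an optimal fraction per server.

```python
import random


class RateEstimatingDispatcher:
    """Epsilon-greedy dispatcher that learns service rates (sketch)."""

    def __init__(self, n_servers, epsilon=0.1, seed=0):
        self.n = n_servers
        self.epsilon = epsilon            # probability of exploring
        self.rng = random.Random(seed)
        self.total_time = [0.0] * n_servers
        self.departures = [0] * n_servers

    def observe_departure(self, server, service_time):
        """Record one completed job and how long its service took."""
        self.total_time[server] += service_time
        self.departures[server] += 1

    def estimated_rate(self, server):
        """Jobs per unit time; unobserved servers look optimistically fast."""
        if self.departures[server] == 0 or self.total_time[server] == 0.0:
            return float("inf")
        return self.departures[server] / self.total_time[server]

    def dispatch(self):
        """Explore a random server with prob. epsilon, else exploit."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n)
        return max(range(self.n), key=self.estimated_rate)


# Hypothetical usage: server 1 completes jobs four times faster than server 0.
dispatcher = RateEstimatingDispatcher(n_servers=2, epsilon=0.0)
dispatcher.observe_departure(0, 2.0)
dispatcher.observe_departure(1, 0.5)
target = dispatcher.dispatch()
```

With `epsilon=0.0` the dispatcher never explores, so `target` is the server with the highest estimated rate; a positive `epsilon` keeps sampling the slower servers so that their estimates stay current.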
