Debankur Mukherjee scite author profile

We consider a system of N parallel single-server queues with unit exponential service rates and a single dispatcher where tasks arrive as a Poisson process of rate λ(N). When a task arrives, the dispatcher assigns it to a server with the shortest queue among d(N) randomly selected servers (1 ≤ d(N) ≤ N). This load balancing strategy is referred to as a JSQ(d(N)) scheme, marking that it subsumes the celebrated Join-the-Shortest Queue (JSQ) policy as a crucial special case for d(N) = N. We construct a stochastic coupling to bound the difference in the queue length processes between the JSQ policy and a JSQ(d(N)) scheme with an arbitrary value of d(N). We use the coupling to derive the fluid limit in the regime where λ(N)/N → λ < 1 as N → ∞ with d(N) → ∞, along with the associated fixed point. The fluid limit turns out not to depend on the exact growth rate of d(N), and in particular coincides with that for the JSQ policy. We further leverage the coupling to establish that the diffusion limit in the critical regime where (N − λ(N))/√N → β > 0 as N → ∞ with d(N)/(√N log(N)) → ∞ corresponds to that for the JSQ policy. These results indicate that the optimality of the JSQ policy can be preserved at the fluid-level and diffusion-level while reducing the overhead by nearly a factor O(N) and O(√N/log(N)), respectively.

Author contact: d.mukherjee@tue.nl
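The JSQ(d(N)) dispatching rule described in this abstract can be illustrated with a small simulation. The following is only a minimal sketch, not the authors' code: the function names, the uniformization trick (drawing all events at rate (λ + 1)N and discarding departures at idle servers), and the choice to sample the d servers without replacement are assumptions made here for illustration.

```python
import random

def simulate_jsq_d(n_servers, d, lam, horizon, seed=0):
    """Simulate JSQ(d) on n_servers unit-rate exponential servers.

    Tasks arrive as a Poisson process of total rate lam * n_servers.
    Returns the list of queue lengths at time `horizon`.
    """
    rng = random.Random(seed)
    queues = [0] * n_servers
    # Uniformization (an assumption of this sketch): events occur at rate
    # (lam + 1) * n_servers; each is an arrival with probability
    # lam / (lam + 1), otherwise a potential departure at a random server.
    event_rate = (lam + 1.0) * n_servers
    p_arrival = lam / (lam + 1.0)
    t = rng.expovariate(event_rate)
    while t <= horizon:
        if rng.random() < p_arrival:
            # Sample d distinct servers and join the shortest sampled queue.
            sampled = rng.sample(range(n_servers), d)
            target = min(sampled, key=lambda i: queues[i])
            queues[target] += 1
        else:
            # Potential departure: does nothing if the chosen server is idle.
            server = rng.randrange(n_servers)
            if queues[server] > 0:
                queues[server] -= 1
        t += rng.expovariate(event_rate)
    return queues

def tail_fraction(queues, k):
    """Fraction of servers with queue length at least k."""
    return sum(1 for q in queues if q >= k) / len(queues)
```

Calling `simulate_jsq_d(n, d, lam, horizon)` for increasing `d` and comparing `tail_fraction(queues, 2)` gives a rough empirical feel for how the occupancy approaches that of JSQ (`d = n`), although a single finite run is of course no substitute for the limit results stated above.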

Universality of load balancing schemes on the diffusion scale

Mukherjee

Borst

Leeuwaarden

et al. 2016

J. Appl. Probab.

We consider a system of N parallel queues with identical exponential service rates and a single dispatcher where tasks arrive as a Poisson process. When a task arrives, the dispatcher always assigns it to an idle server, if there is any, and to a server with the shortest queue among d randomly selected servers otherwise (1 ≤ d ≤ N). This load balancing scheme subsumes the so-called Join-the-Idle Queue (JIQ) policy (d = 1) and the celebrated Join-the-Shortest Queue (JSQ) policy (d = N) as two crucial special cases. We develop a stochastic coupling construction to obtain the diffusion limit of the queue process in the Halfin-Whitt heavy-traffic regime, and establish that it does not depend on the value of d, implying that assigning tasks to idle servers is sufficient for diffusion level optimality.

Corresponding author: d.mukherjee@tue.nl

Load balancing schemes can be broadly categorized as static (open-loop), dynamic (closed-loop), or some intermediate blend, depending on the amount of real-time feedback or state information (e.g. queue lengths or load measurements) that is used in assigning tasks. Within the category of dynamic policies, one can further distinguish between push-based and pull-based approaches, depending on whether the initiative resides with a dispatcher actively collecting feedback from the servers, or with the servers advertising their availability or load status. The use of state information naturally allows dynamic policies to achieve better performance and greater resource pooling gains, but also involves higher implementation complexity and potentially substantial communication overhead. The latter issue is particularly pertinent in large-scale data centers, which deploy thousands of servers and handle massive demands, with service requests coming in at huge rates.

In the present paper we focus on a basic scenario of N parallel queues with identical servers, exponentially distributed service requirements, and a service discipline at each individual server that is oblivious to the actual service requirements (e.g. FCFS). In this canonical case, the so-called Join-the-Shortest-Queue (JSQ) policy has several strong optimality properties, and in particular minimizes the overall mean delay among the class of non-anticipating load balancing policies that do not have any advance knowledge of the service requirements [3,16,18]. (Relaxing any of the three above-mentioned assumptions tends to break the optimality properties of the JSQ policy, and renders the delay-minimizing policy quite complex or even counter-intuitive, see for instance [5,7,17].)

In order to implement the JSQ policy, a dispatcher requires instantaneous knowledge of the queue lengths at all the servers, which may give rise to a substantial communication burden, and may not be scalable in scenarios with large numbers of servers. The latter issue has motivated consideration of so-called JSQ(d) policies, where the dispatcher assigns an incoming task to a server with the shortest queue among d servers selected uniformly at random. Me...
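The idle-first assignment rule from the abstract of this entry (send the task to an idle server if one exists, otherwise to the shortest of d sampled queues) can be written as a single dispatch function. This is a hedged sketch: the name `dispatch_idle_first`, the uniform choice among idle servers, and sampling without replacement are illustrative assumptions, not details taken from the paper.

```python
import random

def dispatch_idle_first(queues, d, rng):
    """Return the index of the server that receives the next task.

    queues: list of current queue lengths, one per server.
    d:      number of servers sampled when no server is idle (1 <= d <= N).
    rng:    a random.Random instance.
    """
    idle = [i for i, q in enumerate(queues) if q == 0]
    if idle:
        # Assumption: pick uniformly among the idle servers.
        return rng.choice(idle)
    # No idle server: fall back to the shortest queue among d sampled servers.
    sampled = rng.sample(range(len(queues)), d)
    return min(sampled, key=lambda i: queues[i])
```

With `d = 1` the fallback is a purely random assignment, which matches the JIQ special case, and with `d = len(queues)` the rule always returns a globally shortest queue, which matches JSQ.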

Asymptotically Optimal Load Balancing Topologies

Mukherjee

Borst

Leeuwaarden

2018

SIGMETRICS Perform. Eval. Rev.

We consider a system of N servers inter-connected by some underlying graph topology G_N. Tasks with unit-mean exponential processing times arrive at the various servers as independent Poisson processes of rate λ. Each incoming task is irrevocably assigned to whichever server has the smallest number of tasks among the one where it appears and its neighbors in G_N.

The above model arises in the context of load balancing in large-scale cloud networks and data centers, and has been extensively investigated in the case G_N is a clique. Since the servers are exchangeable in that case, mean-field limits apply, and in particular it has been proved that for any λ < 1, the fraction of servers with two or more tasks vanishes in the limit as N → ∞. For an arbitrary graph G_N, mean-field techniques break down, complicating the analysis, and the queue length process tends to be worse than for a clique. Accordingly, a graph G_N is said to be N-optimal or √N-optimal when the queue length process on G_N is equivalent to that on a clique on an N-scale or √N-scale, respectively. We prove that if G_N is an Erdős-Rényi random graph with average degree d(N), then with high probability it is N-optimal and √N-optimal if d(N) → ∞ and d(N)/(√N log(N)) → ∞ as N → ∞, respectively. This demonstrates that optimality can be maintained at N-scale and √N-scale while reducing the number of connections by nearly a factor N and √N/log(N) compared to a clique, provided the topology is suitably random. It is further shown that if G_N contains Θ(N) bounded-degree nodes, then it cannot be N-optimal. In addition, we establish that an arbitrary graph G_N is N-optimal when its minimum degree is N − o(N), and may not be N-optimal even when its minimum degree is cN + o(N) for any 0 < c < 1/2. Simulation experiments are conducted for various scenarios to corroborate the asymptotic results.
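The graph-based assignment rule in this abstract (compare the arrival server with its neighbors and join the least loaded) can be sketched as follows. The helper names `erdos_renyi` and `assign_task`, the adjacency-list representation, and the uniform tie-breaking are assumptions introduced here for illustration; the abstract does not specify a tie-breaking rule.

```python
import random

def erdos_renyi(n, p, rng):
    """Sample an Erdős-Rényi graph on n vertices with edge probability p.

    Returns a dict mapping each vertex to its list of neighbors.
    An average degree of roughly d corresponds to p = d / (n - 1).
    """
    neighbors = {v: [] for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                neighbors[u].append(v)
                neighbors[v].append(u)
    return neighbors

def assign_task(queues, neighbors, origin, rng):
    """Assign a task arriving at `origin` to the least loaded server among
    `origin` and its neighbors; ties are broken uniformly (an assumption)."""
    candidates = [origin] + neighbors[origin]
    shortest = min(queues[v] for v in candidates)
    return rng.choice([v for v in candidates if queues[v] == shortest])
```

For example, with `neighbors = {0: [1, 2], 1: [0], 2: [0]}` and `queues = [3, 1, 2]`, a task arriving at server 0 is forwarded to server 1, while a task arriving at server 2 stays at server 2 because its only neighbor is more heavily loaded.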
Author contacts: d.mukherjee@tue.nl; s.c.borst@tue.nl; j.s.h.v.leeuwaarden@tue.nl

arXiv:1707.05866v2 [math.PR], 6 Apr 2019

Related work. The above model has been studied in [11,28], focusing on certain fixed-degree graphs and in particular ring topologies. The results demonstrate that the flexibility to forward tasks to a few neighbors, or even just one, with possibly shorter queues significantly improves the performance in terms of the waiting time and tail distribution of the queue length. This resembles the so-called 'power-of-two' effect in the classical case of a complete graph where tasks are assigned to the shortest queue among d servers selected uniformly at random. As shown by Mitzenmacher [16,17] and Vvedenskaya et al. [31], such a 'power-of-d' scheme provides a huge performance improvement over purely random assignment, even when d = 2, in particular super-exponential tail decay, translating into far better waiting-time performance. Further related

Supermarket model on graphs

Budhiraja

Mukherjee

Wu

2019

Ann. Appl. Probab.

We consider a variation of the supermarket model in which the servers can communicate with their neighbors and where the neighborhood relationships are described in terms of a suitable graph. Tasks with unit-exponential service time distributions arrive at each vertex as independent Poisson processes with rate λ, and each task is irrevocably assigned to the shortest queue among the one where it first appears and its d − 1 randomly selected neighbors. This model has been extensively studied when the underlying graph is a clique, in which case it reduces to the well-known power-of-d scheme. In particular, results of Mitzenmacher (1996) and Vvedenskaya et al. (1996) show that as the size of the clique gets large, the occupancy process associated with the queue-lengths at the various servers converges to a deterministic limit described by an infinite system of ordinary differential equations (ODEs). In this work, we consider settings where the underlying graph need not be a clique and is allowed to be suitably sparse. We show that if the minimum degree approaches infinity (however slowly) as the number of servers N approaches infinity, and the ratio between the maximum degree and the minimum degree in each connected component approaches 1 uniformly, the occupancy process converges to the same system of ODEs as the classical supermarket model. In particular, the asymptotic behavior of the occupancy process is insensitive to the precise network topology. We also study the case where the graph sequence is random, with the N-th graph given as an Erdős-Rényi random graph on N vertices with average degree c(N). Annealed convergence of the occupancy process to the same deterministic limit is established under the condition c(N) → ∞, and under a stronger condition c(N)/ln N → ∞, convergence (in probability) is shown for almost every realization of the random graph.
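The abstract refers to the ODE limit of the classical supermarket model without writing it out. In the form usually quoted for the power-of-d scheme, with s_i the fraction of servers holding at least i tasks and s_0 = 1, it reads ds_i/dt = λ(s_{i−1}^d − s_i^d) − (s_i − s_{i+1}), with fixed point s_i = λ^((d^i − 1)/(d − 1)) for d ≥ 2. The sketch below integrates a truncated version of this system with a plain Euler scheme; the truncation level, step size, and function names are choices made here for illustration and are not taken from the paper.

```python
def supermarket_rhs(s, lam, d):
    """Right-hand side of the truncated power-of-d ODE system.

    s[i] approximates the fraction of servers with at least i + 1 tasks.
    """
    k = len(s)
    rhs = []
    for i in range(k):
        below = 1.0 if i == 0 else s[i - 1]      # s_0 = 1 by convention
        above = s[i + 1] if i + 1 < k else 0.0   # truncation: s_{k+1} = 0
        rhs.append(lam * (below ** d - s[i] ** d) - (s[i] - above))
    return rhs

def integrate(lam, d, levels=10, horizon=100.0, dt=0.01):
    """Euler integration from an empty system up to time `horizon`."""
    s = [0.0] * levels
    for _ in range(round(horizon / dt)):
        rhs = supermarket_rhs(s, lam, d)
        s = [x + dt * r for x, r in zip(s, rhs)]
    return s

def fixed_point(lam, d, levels=10):
    """Closed-form fixed point s_i = lam ** ((d**i - 1) / (d - 1)), d >= 2."""
    return [lam ** ((d ** i - 1) / (d - 1)) for i in range(1, levels + 1)]
```

Comparing `integrate(0.7, 2)` with `fixed_point(0.7, 2)` shows the trajectory settling near λ, λ^3, λ^7, and so on, which is the doubly exponential decay in queue length associated with the power-of-d effect.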
In the context of load balancing problems on graphs, [12,28] examine the performance on certain fixed-degree graphs and in particular ring topologies. Their results demonstrate that the flexibility to forward tasks to a few neighbors, or even just one, with possibly shorter queues significantly improves the performance in terms of the waiting time and tail distribution of the queue length. This is similar to the power-of-two effect in the setting of cliques, but the re-

Join-the-shortest queue diffusion limit in Halfin–Whitt regime: Tail asymptotics and scaling of extrema

Banerjee

Mukherjee

2019

Ann. Appl. Probab.

Consider a system of N parallel single-server queues with unit-exponential service time distribution and a single dispatcher where tasks arrive as a Poisson process of rate λ(N). When a task arrives, the dispatcher assigns it to one of the servers according to the Join-the-Shortest Queue (JSQ) policy. Eschenfeldt and Gamarnik (2015) established that in the Halfin-Whitt regime where (N − λ(N))/√N → β > 0 as N → ∞, the appropriately scaled occupancy measure of the system under the JSQ policy converges weakly on any finite time interval to a certain diffusion process as N → ∞. Recently, it was further established by Braverman (2018) that the convergence result extends to the steady state as well, i.e., the stationary occupancy measure of the system converges weakly to the steady state of the diffusion process as N → ∞, proving the interchange of limits result.

In this paper we perform a detailed analysis of the steady state of the above diffusion process. Specifically, we establish precise tail asymptotics of the stationary distribution and the scaling of extrema of the process on large time intervals. Our results imply that the asymptotic steady-state scaled number of servers with queue length two or larger exhibits an exponential tail, whereas that for the number of idle servers turns out to be Gaussian. From the methodological point of view, the diffusion process under consideration goes beyond the state-of-the-art techniques in the study of the steady state of diffusion processes. The lack of any closed-form expression for the steady state and the intricate interdependency of the process dynamics on its local times make the analysis significantly challenging. We develop a technique involving the theory of regenerative processes that provides a tractable form for the stationary measure, and in conjunction with several sharp hitting time estimates, acts as a key vehicle in establishing the results. The technique and the intermediate results might be of independent interest, and can possibly be used in understanding the bulk behavior of the process.
