Large-scale heterogeneous service systems with general packing constraints

Stolyar, Alexander L.

doi:10.1017/apr.2016.79

Cited by 16 publications

(10 citation statements)

References 19 publications


“…For example, it is shown in [11] that the asymptotic optimality of PULL extends to (single-router) models, where server processing speed depends on its queue length. Another example is recent paper [10], which proposes and studies an algorithm, which can be viewed as a version of PULL, for (single-router) heterogeneous service systems with packing constraints at the servers.…”

Section: Introduction (mentioning)

confidence: 99%

Pull-based load distribution among heterogeneous parallel servers: the case of multiple routers

Stolyar

2016

Queueing Syst


The model is a service system consisting of several large server pools. A server's processing speed and buffer size (which may be finite or infinite) depend on the pool. The input flow of customers is split equally among a fixed number of routers, which must assign customers to the servers immediately upon arrival. We consider an asymptotic regime in which the total customer arrival rate and the pool sizes scale to infinity simultaneously, in proportion to a scaling parameter n, while the number of routers remains fixed.

We define and study a multi-router generalization of the pull-based customer assignment (routing) algorithm PULL, introduced in [11] for the single-router model. Under the PULL algorithm, when a server becomes idle it sends a "pull-message" to a router selected uniformly at random. Each router operates independently: it assigns an arriving customer to a server according to a pull-message chosen uniformly at random among those available at this router, if there is any, or otherwise to a server selected uniformly at random in the entire system.

Under Markov assumptions (Poisson arrival process and independent exponentially distributed service requirements), and under sub-critical system load, we prove asymptotic optimality of PULL: as n → ∞, the steady-state probability of an arriving customer experiencing blocking or waiting vanishes. Furthermore, PULL has an extremely low router-server message exchange rate of one message per customer. These results generalize some of the single-router results in [11].

AMS 2000 Subject Classification: 90B15, 60K25

• An algorithm should be oblivious of the server types as much as possible.
• An algorithm should allow a distributed implementation, where assignment (routing) of jobs to servers is done by multiple routers, each handling a fraction of demand. This is because having a single router handle a massive demand may be infeasible or impractical (see [7]).
• The router-server signaling overhead should be small.

Motivated by the challenges described above, in this paper we consider the following model. It is a service system consisting of several large server pools. The system is heterogeneous in that a server's processing speed and buffer size depend on the pool; the buffer size is the maximum number of jobs (or customers) that can be queued at the server, and it can be finite or infinite. There is a finite number of routers. The flow of customer arrivals into the system is split equally among the routers, which must assign (or route) "their" customers to the servers immediately upon arrival. This model is a generalization of the single-router model in [11].

We define and study (two versions of) a pull-based customer assignment (routing) algorithm PULL. (One of the algorithm versions is a generalization of the single-router PULL algorithm in [11].) Under this algorithm, when a server becomes idle it sends a "pull-message" to a router selected uniformly at random; each router operates independently: it assigns an arriving customer to a server according to a randomly uniformly chosen ava...
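The routing rule described in this abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the names `Router`, `on_server_idle`, and `route_customer` are hypothetical, and the sketch assumes that a pull-message is discarded once a router uses it (consistent with the stated rate of one message per customer). It models only the two decisions in the text and ignores service times, buffers, and pool types.

```python
import random


class Router:
    """A router keeps the pull-messages (server ids) it has received."""

    def __init__(self):
        self.pull_messages = []


def on_server_idle(server_id, routers, rng):
    """An idle server sends one pull-message to a uniformly chosen router."""
    rng.choice(routers).pull_messages.append(server_id)


def route_customer(router, num_servers, rng):
    """Choose a server for a customer arriving at this router.

    If the router holds pull-messages, pick one uniformly at random and
    consume it (assumption: a used message is removed). Otherwise fall
    back to a uniformly random server in the whole system.
    """
    msgs = router.pull_messages
    if msgs:
        i = rng.randrange(len(msgs))
        msgs[i], msgs[-1] = msgs[-1], msgs[i]  # swap-remove the chosen message
        return msgs.pop()
    return rng.randrange(num_servers)


if __name__ == "__main__":
    rng = random.Random(0)
    routers = [Router() for _ in range(3)]
    for server_id in range(5):  # servers 0..4 become idle
        on_server_idle(server_id, routers, rng)
    print(route_customer(routers[0], num_servers=10, rng=rng))
```

Each router consults only its own message list, which is what lets the routers operate independently in this sketch.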


“…We recall from [34], [35] that an absolutely continuous process y( (19) and ϕ(s) ∈ F (y(s)) for almost every s ∈ I. We say that the DI corresponds to drift function f :…”

Section: Appendix F Review Of Differential Inclusions (mentioning)

confidence: 99%

“…This completely changes the analysis and the optimality results. Another algorithm similar to the RAS was analyzed in [19] for a cloud-based systems. However, there the requests can be placed to any server as long as the server has enough bandwidth.…”

Section: Introduction (mentioning)

confidence: 99%

Asymptotics of Replication and Matching in Large Caching Systems

Mukhopadhyay, Hegde, Lelarge

2019

IEEE/ACM Trans. Networking


“…We do not consider such a scaling regime in this paper since our focus is on a scenarios where only a fixed number of highly popular contents are present at any instant. An online matching policy similar to the proposed matching policy was considered in [10] for cloud computing systems. However, the setting there is completely different as the servers do not have any memory restrictions.…”

Section: Introduction (mentioning)

confidence: 99%

Optimal Content Replication and Request Matching in Large Caching Systems

Mukhopadhyay, Hegde, Lelarge

2018

IEEE INFOCOM 2018 - IEEE Conference on Computer Communications


We consider models of content delivery networks in which the servers are constrained by two main resources: memory and bandwidth. In such systems, the throughput crucially depends on how contents are replicated across servers and how the requests of specific contents are matched to servers storing those contents. In this paper, we first formulate the problem of computing the optimal replication policy which if combined with the optimal matching policy maximizes the throughput of the caching system in the stationary regime. It is shown that computing the optimal replication policy for a given system is an NP-hard problem. A greedy replication scheme is proposed and it is shown that the scheme provides a constant factor approximation guarantee. We then propose a simple randomized matching scheme which avoids the problem of interruption in service of the ongoing requests due to re-assignment or repacking of the existing requests in the optimal matching policy. The dynamics of the caching system is analyzed under the combination of proposed replication and matching schemes. We study a limiting regime, where the number of servers and the arrival rates of the contents are scaled proportionally, and show that the proposed policies achieve asymptotic optimality. Extensive simulation results are presented to evaluate the performance of different policies and study the behavior of the caching system under different service time distributions of the requests.
