The model is a service system consisting of several large server pools. A server's processing speed and buffer size (which may be finite or infinite) depend on the pool. The input flow of customers is split equally among a fixed number of routers, which must assign customers to the servers immediately upon arrival. We consider an asymptotic regime in which the total customer arrival rate and the pool sizes scale to infinity simultaneously, in proportion to a scaling parameter n, while the number of routers remains fixed. We define and study a multi-router generalization of the pull-based customer assignment (routing) algorithm PULL, introduced in [11] for the single-router model. Under the PULL algorithm, when a server becomes idle it sends a "pull-message" to a router selected uniformly at random; each router operates independently: it assigns an arriving customer to a server according to a pull-message chosen uniformly at random among those available at this router, if there are any, or otherwise to a server selected uniformly at random in the entire system. Under Markov assumptions (Poisson arrival process and independent exponentially distributed service requirements), and under sub-critical system load, we prove asymptotic optimality of PULL: as n → ∞, the steady-state probability that an arriving customer experiences blocking or waiting vanishes. Furthermore, PULL has an extremely low router-server message exchange rate of one message per customer. These results generalize some of the single-router results in [11].
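The assignment rule described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the formal definition of the algorithm: the names `Router`, `assign`, and `on_server_idle` are hypothetical, servers are represented by integer ids, and service times, buffers, blocking, and the handling of a pull-message whose server has since received a customer are not modeled.

```python
import random


class Router:
    """Hypothetical router that stores the pull-messages it has received."""

    def __init__(self):
        # Ids of servers that sent a pull-message to this router.
        self.pull_messages = []

    def assign(self, num_servers, rng):
        """Return the id of the server that receives an arriving customer."""
        if self.pull_messages:
            # Pick one available pull-message uniformly at random and use it up.
            i = rng.randrange(len(self.pull_messages))
            last = len(self.pull_messages) - 1
            self.pull_messages[i], self.pull_messages[last] = (
                self.pull_messages[last],
                self.pull_messages[i],
            )
            return self.pull_messages.pop()
        # No pull-message at this router: pick any server uniformly at random.
        return rng.randrange(num_servers)


def on_server_idle(server_id, routers, rng):
    """An idle server sends one pull-message to a uniformly chosen router."""
    rng.choice(routers).pull_messages.append(server_id)
```

In this sketch each router keeps only its own list of pull-messages, which mirrors the statement that the routers operate independently; one pull-message is produced per idle period of a server, so at most one message is consumed per assigned customer.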
AMS 2000 Subject Classification: 90B15, 60K25

• An algorithm should be as oblivious of the server types as possible.
• An algorithm should allow a distributed implementation, in which the assignment (routing) of jobs to servers is done by multiple routers, each handling a fraction of the demand. This is because having a single router handle a massive demand may be infeasible or impractical (see [7]).
• The router-server signaling overhead should be small.

Motivated by the challenges described above, in this paper we consider the following model. It is a service system consisting of several large server pools. The system is heterogeneous in that a server's processing speed and buffer size depend on the pool; the buffer size is the maximum number of jobs (or customers) that can be queued at the server, and it can be finite or infinite. There is a finite number of routers. The flow of customer arrivals into the system is split equally among the routers, which must assign (or route) "their" customers to the servers immediately upon arrival. This model is a generalization of the single-router model in [11].

We define and study (two versions of) a pull-based customer assignment (routing) algorithm PULL. (One of the algorithm versions is a generalization of the single-router PULL algorithm in [11].) Under this algorithm, when a server becomes idle it sends a "pull-message" to a router selected uniformly at random; each router operates independently: it assigns an arriving customer to a server according to a randomly uniformly chosen ava...