Control of systems with flexible multi-server pools: a shadow routing approach

Stolyar, Alexander L.; Tezcan, Tolga

doi:10.1007/s11134-010-9183-0

Cited by 41 publications

(50 citation statements)

References 39 publications

“…The design and performance of flexible server systems has been studied in [6, 31, 32, 33]. In a flexible server system, traditionally, when a server becomes available it chooses the queue from which to take its next job according to some policy.…”

Section: Flexible Server Systems (mentioning)

confidence: 99%

Queueing with redundant requests: exact analysis

et al. 2016


Recent computer systems research has proposed using redundant requests to reduce latency. The idea is to run a request on multiple servers and wait for the first completion (discarding all remaining copies of the request). However, there is no exact analysis of systems with redundancy. This paper presents the first exact analysis of systems with redundancy. We allow for any number of classes of redundant requests, any number of classes of non-redundant requests, any degree of redundancy, and any number of heterogeneous servers. In all cases we derive the limiting distribution of the state of the system. In small (two or three server) systems, we derive simple forms for the distribution of response time of both the redundant classes and non-redundant classes, and we quantify the "gain" to redundant classes and "pain" to non-redundant classes caused by redundancy. We find some surprising results. First, the response time of a fully redundant class follows a simple exponential distribution and that of the non-redundant class follows a generalized hyperexponential. Second, fully redundant classes are "immune" to any pain caused by other classes becoming redundant. We also compare redundancy with other approaches for reducing latency, such as optimal probabilistic splitting of a class among servers (Opt-Split) and join-the-shortest-queue (JSQ) routing of a class. We find that, in many cases, redundancy outperforms JSQ and Opt-Split with respect to overall response time, making it an attractive solution.

“…This was also the motivation for the approach taken in Atar and Shwartz [4], that relies on partial sampling from the service time distribution, demonstrating how nearly optimal blind policies can be constructed based on the size of the server population. Recent work of Stolyar and Tezcan [11] provides an alternative look, proposing a robust routing scheme for a multi-buffer multi-pool setting in the Halfin-Whitt regime, which optimally balances load on the server pools without the knowledge of the input rates.…”

Section: Introduction (mentioning)

confidence: 99%

A blind policy for equalizing cumulative idleness

2011


We consider a system with a single queue and multiple server pools of heterogeneous exponential servers. The system operates under a policy that always routes a job to the pool with the longest cumulative idleness among pools with available servers, in an attempt to achieve fairness toward servers. It is easy to find examples of a system with a fixed number of servers for which fairness is not achieved by this policy in any reasonable sense. Our main result shows that in the many-server regime of Halfin and Whitt, the policy does attain equalization of cumulative idleness, and that the equalization time, defined within any given precision level, remains bounded in the limit. An important feature of this policy is that it acts 'blindly', in that it requires no information on the service or arrival rates.
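The routing rule described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Pool` class, the `accrue_idleness` and `route` helpers, and the choice to measure cumulative idleness as accumulated idle server-time are all assumptions made for illustration. The only part taken from the abstract is the rule itself: among pools with an available server, pick the one with the longest cumulative idleness, using no arrival or service rates.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Pool:
    """Hypothetical server pool; field names are illustrative only."""
    name: str
    servers: int                      # total servers in the pool
    busy: int = 0                     # servers currently serving a job
    cumulative_idleness: float = 0.0  # assumed: idle server-time accrued so far

    @property
    def idle(self) -> int:
        return self.servers - self.busy


def accrue_idleness(pools: List[Pool], dt: float) -> None:
    """Advance time by dt, crediting each pool with idle_servers * dt."""
    for pool in pools:
        pool.cumulative_idleness += pool.idle * dt


def route(pools: List[Pool]) -> Optional[Pool]:
    """Send one job to the available pool with the longest cumulative idleness.

    Returns None when every server is busy (the job would wait in the queue).
    The rule is 'blind': it reads no arrival or service rates.
    """
    candidates = [p for p in pools if p.idle > 0]
    if not candidates:
        return None
    chosen = max(candidates, key=lambda p: p.cumulative_idleness)
    chosen.busy += 1
    return chosen


if __name__ == "__main__":
    pools = [Pool("A", servers=2), Pool("B", servers=3, busy=3), Pool("C", servers=1)]
    accrue_idleness(pools, dt=1.0)
    pools[2].cumulative_idleness += 5.0  # pretend C has been idle much longer
    first = route(pools)
    print(first.name if first else "queue")
```

In this sketch, pool B is skipped because it has no free server, regardless of its idleness history; how ties are broken and how idleness is normalized across pools of different sizes are left open here.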


“…Shadow routing approach has been applied previously to large-scale service systems, but without packing constraints [14]. A distinct feature of our work, and one of its main contributions, is that we demonstrate that packing constraints can be incorporated into the shadow routing framework, and moreover, it can be done in a computationally efficient way, amenable to practical implementations.…”

Section: Introduction (mentioning)

confidence: 85%

Shadow-Routing Based Dynamic Algorithms for Virtual Machine Placement in a Network Cloud

Guo, Stolyar, Walid

2018

IEEE Trans. Cloud Comput.

Self Cite


We consider a shadow routing based approach to the problem of real-time adaptive placement of virtual machines (VM) in large data centers (DC) within a network cloud. Such placement in particular has to respect vector packing constraints on the allocation of VMs to host physical machines (PM) within a DC, because each PM can potentially serve multiple VMs simultaneously. Shadow routing is attractive in that it allows a large variety of system objectives and/or constraints to be treated within a common framework (as long as the underlying optimization problem is convex). A perhaps even more attractive feature is that the corresponding algorithm is very simple to implement, runs continuously, and adapts automatically to changes in the VM demand rates, changes in system parameters, etc., without the need to re-solve the underlying optimization problem "from scratch". In this paper we focus on the minmax-DC-load problem. Namely, we propose a combined VM-to-DC routing and VM-to-PM assignment algorithm, referred to as the Shadow scheme, which minimizes the maximum of appropriately defined DC utilizations. We prove that the Shadow scheme is asymptotically optimal (as one of its parameters goes to 0). Simulation confirms good performance and high adaptivity of the algorithm. Favorable performance is also demonstrated in comparison with a baseline algorithm based on a VMware implementation [7], [8]. We also propose a simplified, "more distributed" version of the Shadow scheme, which performs almost as well in simulations.
