2018 IEEE 38th International Conference on Distributed Computing Systems (ICDCS) 2018
DOI: 10.1109/icdcs.2018.00096

ROSE: Cluster Resource Scheduling via Speculative Over-Subscription

Abstract: A long-standing challenge in cluster scheduling is to achieve a high degree of utilization of heterogeneous resources in a cluster. In practice there exists a substantial disparity between perceived and actual resource utilization. A scheduler might regard a cluster as fully utilized if a large resource request queue is present, but the actual resource utilization of the cluster can be in fact very low. This disparity results in the formation of idle resources, leading to inefficient resource usage an…

Citations: cited by 39 publications (24 citation statements)
References: 17 publications
“…The procedure allows for quick resource adjustment used in Perph and makes it free from RM temporarily. This is largely backed by our previous work [30]. To synchronize resource usage, NM will coordinate the oversubscribed and preempted resource slices with RM for timely updating.…”
Section: B Adaptive Isolation and Execution
type: mentioning
confidence: 90%
“…In the future, we intend to federate individual agents and coordinate the model learning. We also plan to integrate Perph mechanism with our previous work on resource over-subscription [30] to supervise the workload co-location considering both LRA's runtime performance and batch job's throughput.…”
Section: Discussion
type: mentioning
confidence: 99%
“…Unfortunately, if the forecast on the next following task is incorrect, the reservation becomes a kind of resource waste. Introducing overbooking technology [37] can cope with the problem and improve the tolerance of offloading architecture for inaccurate estimation.…”
Section: Scalability and Availability
type: mentioning
confidence: 99%
“…Current cluster managers [5] [6] [7] [8] [9] are designed for short-running tasks within batch jobs, whose performance is minimally affected when launching additional tasks. The central resource manager (RM) is application-agnostic and completely unaware of runtime QoS requirements of interactive and latency-sensitive applications; RM is only responsible for resource allocation among jobs but leaves all application-specific logic to application managers.…”
Section: Renyu Yang Is the Corresponding Author
type: mentioning
confidence: 99%