Common schedulers designed for long-running services perform task-level optimization and fail to accommodate short-lived batch processing (BP) jobs. Many efficient job-level scheduling strategies have therefore been proposed for BP jobs. However, existing strategies perform time-consuming objective optimization, which introduces non-negligible scheduling delay. Moreover, they tend to concentrate the tasks of a BP job on few nodes to reduce monetary cost and synchronization overhead, which can easily cause resource contention due to task co-location. To address these problems, this paper proposes TEBAS, a time-efficient balance-aware scheduling strategy. Based on the observation that the computing tasks of a BP job commonly share similar features, TEBAS spreads all tasks of a BP job across the cluster according to the resource specification of a single task. Experimental results show the effectiveness of TEBAS in terms of scheduling efficiency and load balancing.
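The spreading idea can be illustrated with a minimal sketch. This is not the TEBAS algorithm, whose details the abstract does not give; it is a generic greedy least-loaded placement, assuming every task of a job has the same CPU and memory request. The names `Node`, `load_after`, and `spread_job` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    # Hypothetical node model: free and total capacity for two resources.
    name: str
    cpu_free: float
    mem_free: float
    cpu_total: float
    mem_total: float

    def load_after(self, cpu: float, mem: float) -> float:
        """Utilization of the more constrained resource if one task were placed here."""
        cpu_used = self.cpu_total - self.cpu_free + cpu
        mem_used = self.mem_total - self.mem_free + mem
        return max(cpu_used / self.cpu_total, mem_used / self.mem_total)


def spread_job(nodes: list[Node], n_tasks: int, cpu: float, mem: float) -> list[str]:
    """Place n_tasks identical tasks, one at a time, on the least-loaded feasible node.

    Because all tasks are assumed to share one resource specification, a single
    (cpu, mem) pair drives every placement decision and no per-task
    optimization is needed.
    """
    placement = []
    for _ in range(n_tasks):
        # Keep only nodes that still have room for one more task.
        feasible = [n for n in nodes if n.cpu_free >= cpu and n.mem_free >= mem]
        if not feasible:
            raise RuntimeError("no node can fit another task")
        # Pick the node whose utilization would be lowest after placement;
        # ties go to the first such node in the list.
        best = min(feasible, key=lambda n: n.load_after(cpu, mem))
        best.cpu_free -= cpu
        best.mem_free -= mem
        placement.append(best.name)
    return placement
```

With three identical empty nodes, six identical tasks end up two per node rather than packed onto one node, which is the co-location pattern the abstract identifies as a source of contention.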
The past decade has witnessed a remarkable increase in deep learning (DL) workloads, which require GPU resources to accelerate training. However, existing coarse-grained scheduling mechanisms are agnostic to information other than the number of GPUs or the amount of GPU memory, which degrades the performance of DL tasks. Moreover, existing balance-aware DL task scheduling strategies commonly assume that a DL task consumes resources as soon as it starts; this assumption fails to reduce resource contention and further limits execution efficiency. To address these problems, this article proposes a fine-grained and balance-aware scheduling model (FBSM) that considers the resource consumption characteristics of DL tasks. Based on FBSM, we propose a customized GPU sniffer (GPU-S) module and a balance-aware scheduler (BAS) module, which together form a scheduling system called KubFBS. Experimental results demonstrate that KubFBS accelerates the execution of DL tasks while improving the load balancing capability of the cluster.