Proceedings of the 47th International Conference on Parallel Processing 2018
DOI: 10.1145/3225058.3225070
Learning Driven Parallelization for Large-Scale Video Workload in Hybrid CPU-GPU Cluster

Cited by 5 publications (3 citation statements, published 2019–2022) · References 24 publications
“…When considering large-scale video workloads in hybrid CPU-GPU clusters, performance degradation often comes from the uncertainty and variability of workloads and the unbalanced use of heterogeneous resources. To accommodate this, Zhang et al. [256] use two deep Q-networks to build a two-level task scheduler, in which the cluster-level scheduler selects proper execution nodes for mutually independent video tasks and the node-level scheduler assigns interrelated video subtasks to appropriate computing units.…”
Section: Data Center Management
confidence: 99%
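The two-level structure described above can be sketched as follows. This is a minimal, hypothetical illustration: small tabular Q-functions with epsilon-greedy selection stand in for the paper's two deep Q-networks, and the state descriptors, action sets, and reward signal are illustrative assumptions, not details from the original work.

```python
import random

class QScheduler:
    """Epsilon-greedy action selection over a small tabular Q-function
    (a stand-in for one of the two deep Q-networks)."""
    def __init__(self, actions, epsilon=0.1, alpha=0.5, gamma=0.9):
        self.q = {}                      # (state, action) -> estimated value
        self.actions = actions
        self.epsilon, self.alpha, self.gamma = epsilon, alpha, gamma

    def choose(self, state):
        if random.random() < self.epsilon:
            return random.choice(self.actions)     # explore
        return max(self.actions,                   # exploit current estimates
                   key=lambda a: self.q.get((state, a), 0.0))

    def update(self, state, action, reward, next_state):
        # Standard Q-learning update from execution feedback.
        best_next = max(self.q.get((next_state, a), 0.0) for a in self.actions)
        old = self.q.get((state, action), 0.0)
        self.q[(state, action)] = old + self.alpha * (reward + self.gamma * best_next - old)

random.seed(0)

# Cluster level: pick a worker node for a mutually independent video task.
cluster = QScheduler(actions=["node0", "node1", "node2"])
# Node level: assign an interrelated video subtask to a computing unit.
node = QScheduler(actions=["cpu", "gpu"])

task_state = "high_load"                 # illustrative workload descriptor
chosen_node = cluster.choose(task_state)
chosen_unit = node.choose("decode_subtask")

# In practice the reward would reflect measured performance
# (e.g. negative completion time); here it is a placeholder.
cluster.update(task_state, chosen_node, reward=1.0, next_state="high_load")
node.update("decode_subtask", chosen_unit, reward=1.0, next_state="decode_subtask")
```

The two schedulers keep separate value estimates but cooperate in sequence: the cluster-level decision fixes the node, and the node-level decision then places each subtask on a CPU or GPU within it.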
“…Applying ML to system design has a twofold meaning: (1) reducing the burden on human experts who design systems manually, and (2) closing the positive feedback loop, i.e., architecture/systems for ML and, simultaneously, ML for architecture/systems, encouraging improvements on both sides. These applications include predictive performance modeling [18,35,44,45,52,56,90], efficient design space exploration [36,38,49,92], cache replacement [5,70,80], prefetching [8,28,93], branch prediction [25,37], NoC design [21,63,85], power and resource management [4,31], task allocation [51,94], malware detection [15,59], compiler design [53,76], and so on.…”
Section: Related Work
confidence: 99%
“…Recent work has also explored multi-level scheduling in hybrid CPU-GPU clusters. Zhang et al. [84] proposed a deep reinforcement learning (DRL) framework that divides video workloads in two stages: first at the cluster level (selecting a worker node) and then at the node level (CPU vs. GPU). The two DRL models act separately but still work together to optimize overall throughput.…”
Section: Task Allocation and Resource Management
confidence: 99%