2020
DOI: 10.48550/arxiv.2012.09646
Preprint

DAG-based Scheduling with Resource Sharing for Multi-task Applications in a Polyglot GPU Runtime

Abstract: GPUs are readily available in cloud computing and personal devices, but their use for data processing acceleration has been slowed down by their limited integration with common programming languages such as Python or Java. Moreover, using GPUs to their full capabilities requires expert knowledge of asynchronous programming. In this work, we present a novel GPU runtime scheduler for multi-task GPU computations that transparently provides asynchronous execution, space-sharing, and transfer-computation overlap w…

Cited by 1 publication (1 citation statement)
References 17 publications
“…CUDA Graph [5] binds, but not fuses, GPU kernels to reduce kernel launch overhead, which still suffers from off-chip memory traffic. Furthermore, it results in high GPU memory consumption to store all the graph metadata of every kernel [35]. AStitch does not have these problems and explores a larger optimization scope beyond CUDA Graph.…”
Section: Related Work (mentioning)
confidence: 99%