2018 IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS) 2018
DOI: 10.1109/rtas.2018.00028

S^3DNN: Supervised Streaming and Scheduling for GPU-Accelerated Real-Time DNN Workloads

Citations: cited by 63 publications (34 citation statements)
References: 28 publications
“…While the well-defined parallel model and the simple kernel software make attractive the use of traditional WCET analysis [42], recent studies showed how GPUs hide many details that can negatively affect the execution time [2] and that it is necessary to develop dedicated WCET analyses [28]. In the opposite view, other works noticed a substantial improvement in the real-time capability of the heterogeneous solution [69] [67]. A hybrid static and measurement-based solution to WCET estimation for GPU was proposed by Betts et al [8] in 2013, while the first pWCET approach was presented in 2014 by Berezovskyi et al [6] and its extension [5] in 2016.…”
Section: Heterogeneous Hardware and Predictability (mentioning)
confidence: 99%
“…Due to increased interest in GPU for accelerating parallel real-time applications, many real-time scheduling frameworks for GPU have been proposed in recent years [27,45,21,37], with a particular focus on DNN acceleration [76,69]. We first review works concerned with kernel scheduling, leaving more directly-related frameworks focusing on memory management to Section 2.3.3.…”
Section: Real-time Framework for GPU (mentioning)
confidence: 99%
“…We further assume that only one GPU kernel is executed at a time. While recent work has shown that co-scheduling multiple kernels can improve GPU resource utilization [76,39], it also complicates the issue of timing analysis. For this reason, we reserve such an extension to future work.…”
Section: System Model and Assumptions (mentioning)
confidence: 99%
“…The operation-to-device partitioning does not require modifications to the TensorFlow internals, and it can be performed with the default Python API for TensorFlow. Zhou et al. [17] proposed a pipeline scheduling solution aimed at optimizing the execution of DNN workload on GPUs, while Yang et al. [18] identified a combination of techniques to support multiple cameras with an improved throughput in the context of automated-driving systems. In the context of mobile devices, Lane et al. [19] proposed two runtime algorithms to decompose a DNN model across available processors with the purpose of improving performance and energy-efficiency.…”
Section: Scheduling of DNN (mentioning)
confidence: 99%