RTGPU: Real-Time GPU Scheduling of Hard Deadline Parallel Tasks With Fine-Grain Utilization

Zou, An; Li, Jing; Gill, Christopher; Zhang, Xuan

doi:10.1109/tpds.2023.3235439

Cited by 10 publications

(1 citation statement)

References 35 publications

“…Persistent kernels can be executed multiple times without launching overheads. Although persistent kernel has already been recognized as an effective approach to improve the real-time performance of GPU kernels, [14,6,6,33,32], its application to practical applications remains a logistical challenge, as without careful analysis incorporating this technique can result in high register/memory usage and resource contention.…”

Section: Introduction (mentioning)

confidence: 99%

A GPU optimization workflow for real-time execution of ultra-high frame rate computer vision applications

Nourazar, Booth, Goossens

2023

J Real-Time Image Proc

This work proposes a GPU optimization methodology for real-time execution of ultra-high frame rate applications with small frame sizes. While the use of GPUs for offline processing is well-established, real-time execution remains challenging due to the lack of real-time execution guarantees, especially for embedded GPUs. Our methodology introduces guidelines and a workflow by focusing on: (a) controlling latency by minimizing CPU-GPU interactions; (b) computation pruning; and (c) inter-/intra-kernel optimizations. Furthermore, our approach takes advantage of multi-frame processing to attain significantly higher throughput at the cost of increased latency when the application permits such trade-offs. To evaluate our optimization methodology, we applied it to the monitoring and control of laser powder bed fusion machines, a widely used metal additive manufacturing technique. Results show that in the considered application, the required performance could be obtained on a Jetson Xavier AGX platform, and by sacrificing latency, significantly higher throughput was achieved.
