2019
DOI: 10.1007/978-3-030-29400-7_16

Enhancing the Programmability and Performance Portability of GPU Tensor Operations

Cited by 7 publications (5 citation statements)
References 10 publications
“…Other authors have proposed the idea of unified semantics among different GPGPU APIs targeting OpenCL and CUDA as compilation backends [19], and proposing a framework offering a unified specification with easy-to-use abstractions for managing compute and data resources. Narrowing the application target to tensor operations, other approaches have investigated an abstraction layer for deep neural networks, capable of generating CUDA, OpenCL and Vulkan code [20].…”
Section: Related Work (mentioning)
confidence: 99%
“…Scalability reflects the ability to support a massive amount of data [202], [203]. Portability refers to the flexibility of the workload to be transportable across core, edge, and endpoint deployments [204]. Timing describes analyzing streaming databases in a real-time or near-realtime manner by involving advanced computing technologies such as DL accelerators [205], [206].…”
Section: F Communication Infrastructures Protocols and Investments (mentioning)
confidence: 99%
“…Upcoming SoC-FPGAs platforms (e.g., Xilinx Versal) combine these heterogeneous resources, but challenges remain with respect to hardware support for safety-critical systems such as predictable interconnects, avoidance of temporal interference in memory and safety monitors. For example, while the portability to different GPU architectures and programming interfaces was addressed in prior work [195], portability to other resource types and the simultaneous usage of heterogeneous computing resources is also considered a challenge, with few works currently addressing this challenge [196].…”
Section: Heterogeneous Computing Platforms (mentioning)
confidence: 99%