Proceedings of the ACM International Conference on Parallel Architectures and Compilation Techniques 2020
DOI: 10.1145/3410463.3414637

Bandwidth-Aware Loop Tiling for DMA-Supported Scratchpad Memory

Abstract: Scratchpad Memory (SPM) is widely used in emerging domain-specific architectures and accelerators to improve energy efficiency and time predictability. Typically, SPM-based architectures use DMA for fetching data from off-chip memory and global load instructions for loading fine-grained data directly into registers. For such architectures, neither capacity-only nor bandwidth-only loop tiling can efficiently use the bandwidth and SPM. This paper introduces a bandwidth-aware loop tiling approach that enables a…

Cited by 7 publications (1 citation statement)
References 60 publications

“…FlexTensor [66] is a framework that makes use of tiling and vectorization techniques to automatically explore and optimize tensor schedules on heterogeneous (CPU, GPU, and FPGA) systems. Wu et al [62] proposed a runtime tiling framework for scratchpad memories to achieve a balance between bandwidth and space utilization. Rawat et al [50] presented an automatic model-driven approach for code generation on GPUs guided by available resource information to adaptively implement fusion and time tiling.…”
Section: Related Work (mentioning)
confidence: 99%