2019
DOI: 10.1145/3310332

Efficient Data Supply for Parallel Heterogeneous Architectures

Abstract: Decoupling techniques have been proposed to reduce the amount of memory latency exposed to high-performance accelerators as they fetch data. Although decoupled access-execute (DAE) and more recent decoupled data supply approaches offer promising single-threaded performance improvements, little work has considered how to extend them into parallel scenarios. This article explores the opportunities and challenges of designing parallel, high-performance, resource-efficient decoupled data supply systems. We propose …

Citations: cited by 3 publications (2 citation statements)
References: 57 publications

“…area-constrained environments tend to use simple in-order cores. (3) Techniques that rely on ISA extensions [23] or ISA-specific instructions have limited applicability and portability problems, especially in the context of heterogeneous-ISA architectures [7,34]. (4) Hardware-only techniques like Slipstream [52,54] or hardware prefetching [1,26] often require costly structures for book-keeping, detection, and prediction.…”
Section: Background and Motivation
Citation type: mentioning
confidence: 99%
“…This also allows a process to decide at runtime which MAPLE unit to target. As we introduced before, previous approaches [22,23,49] do not offer this software programmability of decoupling hardware resources, as these are tightly connected to specific cores.…”
Section: Communicating With Maple Units
Citation type: mentioning
confidence: 99%