2017
DOI: 10.1007/978-3-319-64203-1_18
Runtime-Assisted Shared Cache Insertion Policies Based on Re-reference Intervals

Abstract: Processor speed is improving at a faster rate than the speed of main memory, which makes memory accesses increasingly expensive. One way to address this problem is to reduce the miss ratio of the processor's last-level cache by improving its replacement policy. We approach the problem by co-designing the runtime system and hardware, exploiting the semantics of applications written in data-flow task-based programming models to provide the hardware with information about task types and task data-depen…
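The abstract describes guiding last-level-cache insertion using re-reference interval predictions informed by runtime task information. As a rough illustration of the underlying RRIP idea only (not the paper's actual hardware/runtime design), here is a minimal sketch of a single cache set with SRRIP-style 2-bit re-reference prediction values; the `insert_rrpv` hint stands in for runtime-supplied per-task information, and all names and parameters are illustrative assumptions:

```python
MAX_RRPV = 3  # 2-bit re-reference prediction values (SRRIP-style)

class RRIPSet:
    """One set of a set-associative cache with RRIP replacement."""

    def __init__(self, ways=4):
        self.ways = ways
        self.lines = {}  # tag -> RRPV (0 = re-reference predicted soon)

    def access(self, tag, insert_rrpv=MAX_RRPV - 1):
        """Access a line. On a hit, promote it to RRPV 0; on a miss,
        evict a line whose RRPV equals MAX_RRPV (aging all lines until
        one qualifies) and insert with insert_rrpv -- the hint a runtime
        system could supply per task type. Returns True on a hit."""
        if tag in self.lines:
            self.lines[tag] = 0  # hit: predict near-immediate re-reference
            return True
        if len(self.lines) >= self.ways:
            # Age all lines until some line reaches the maximum RRPV.
            while not any(v == MAX_RRPV for v in self.lines.values()):
                for t in self.lines:
                    self.lines[t] += 1
            victim = next(t for t, v in self.lines.items() if v == MAX_RRPV)
            del self.lines[victim]
        self.lines[tag] = insert_rrpv
        return False
```

For example, a block a runtime expects to be reused can be inserted with a near prediction (`insert_rrpv=0`), so it survives a burst of streaming accesses that are inserted with distant predictions and evict one another instead.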

Cited by 7 publications (7 citation statements)
References 23 publications
“…The runtime system can transparently manage GPUs [7], [76], FPGA accelerators [18], [85], multi-node clusters [20], [27], [28], heterogeneous memories [4], [63], scratchpad memories [5], NUMA [81], [82] and cache coherent NUMA [21], [23] systems. Adding hardware support, the runtime system can guide cache replacement [37], [65], cache coherence deactivation [22], cache prefetching [47], [75], cache communication mechanisms in producer-consumer task relationships [64], [66], reliability and resilience [51]- [53], value approximation [19], and DVFS to accelerate critical tasks [26].…”
Section: Task Dataflow Programming Models
confidence: 99%
“…Recent works (Dimić et al. 2017; Manivannan et al. 2016; Pan and Pai 2015) leverage information about tasks and their working sets in task-based programming models to improve the efficiency of the shared LLC. However, like other local schemes, they only address cache inefficiency at a single level of the cache hierarchy.…”
Section: Related Work
confidence: 99%
“…Thus, the runtime system contains by design the information about the parallel code and the underlying hardware, and therefore can serve as an interface between these two layers. The usefulness of runtime system-level information in the context of HPC applications and hardware has been extensively studied [1,4,5,7,10,11,19,24,38].…”
Section: Introduction
confidence: 99%