Proceedings of the International Conference on Supercomputing 2017
DOI: 10.1145/3079079.3079092

Design and implementation of bandwidth-aware memory placement and migration policies for heterogeneous memory systems

Cited by 20 publications
(10 citation statements)
References 24 publications
“…Dynamic data placement has been employed to enable high performance on heterogeneous memory [2,12,22,24,46,57,58,61,76,78,79,82,86,87]. Most of those solutions are application agnostic, which means that they track page (or data) access frequency [2,12,22,24,78,79,82,87] or manage DRAM as a hardware cache for PMM [46,57,76,86] without the knowledge of data semantics. However, the data semantics gives critical indications on memory access patterns, which is useful to direct data placement and avoid unnecessary data movement.…”
Section: Dynamic Data Placement Based On Data Semantics
mentioning
confidence: 99%
“…Effectively placing data objects of an SpTCSeq in DRAM and PMM for high performance is critical to use PMM to address the memory capacity problem faced by SpTCSeq. To decide data placement on HM, the traditional solutions track page (or data) access frequency [2,12,22,24,58,61,78,79,82,87] or manage DRAM as a hardware cache for PMM [46,57,76,86], and then reactively place frequently accessed data objects into DRAM subject to the DRAM capacity constraint. However, those solutions are application-agnostic, and cause unnecessary and frequent data movement because of short-term variance in memory access patterns.…”
Section: Introduction
mentioning
confidence: 99%
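The frequency-tracking approach this statement describes can be illustrated with a minimal sketch. It assumes a recorded trace of page accesses and a DRAM capacity expressed in pages; the function name `place_by_frequency` and the sample trace are hypothetical and do not come from any of the cited systems.

```python
from collections import Counter


def place_by_frequency(accesses, dram_capacity_pages):
    """Split pages into (dram, pmm) sets by observed access frequency.

    accesses: iterable of page ids, one entry per access (hypothetical trace).
    dram_capacity_pages: how many pages fit in DRAM (assumed known).
    """
    counts = Counter(accesses)
    # Rank pages from hottest to coldest; ties are broken by page id
    # so the result is deterministic.
    ranked = sorted(counts, key=lambda page: (-counts[page], page))
    dram = set(ranked[:dram_capacity_pages])
    pmm = set(ranked[dram_capacity_pages:])
    return dram, pmm


# Hypothetical trace: page 3 is accessed three times, page 2 twice.
trace = [1, 2, 2, 3, 3, 3, 4]
dram, pmm = place_by_frequency(trace, dram_capacity_pages=2)
```

Because the ranking depends only on past counts, a short burst of accesses can reorder it and trigger page movement, which is the weakness the citing authors point out.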
“…Finally, a third placement policy is based on the observation that, with a pure fill DRAM first strategy, bandwidth-intensive workloads might saturate DRAM bandwidth while not taking advantage of the available PM bandwidth. Hence, a bandwidth balance strategy tries to distribute hot pages across DRAM and PM, in some appropriate ratio, with the goal of maximizing the aggregate bandwidth that applications can attain when accessing different pages in parallel [60], [30].…”
Section: DRAM+PM Memory Hierarchies
mentioning
confidence: 99%
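A minimal sketch of the bandwidth-balance idea in this statement follows. The statement only says the ratio is "appropriate"; the sketch assumes the hot pages are split in proportion to each tier's bandwidth, which equalizes per-tier transfer time when all hot pages are accessed equally often and in parallel. The function name and the bandwidth figures are hypothetical.

```python
def bandwidth_balanced_split(hot_pages, dram_bw, pm_bw):
    """Split hot pages between DRAM and PM in proportion to bandwidth.

    hot_pages: list of page ids considered hot (assumed equally hot).
    dram_bw, pm_bw: sustained bandwidths in the same unit (assumed known).
    """
    if dram_bw <= 0 or pm_bw <= 0:
        raise ValueError("bandwidths must be positive")
    # Share of the aggregate bandwidth contributed by DRAM.
    dram_share = dram_bw / (dram_bw + pm_bw)
    n_dram = round(len(hot_pages) * dram_share)
    return hot_pages[:n_dram], hot_pages[n_dram:]


# Hypothetical figures: DRAM provides four times the bandwidth of PM.
pages = list(range(10))
in_dram, in_pm = bandwidth_balanced_split(pages, dram_bw=80.0, pm_bw=20.0)
```

With these assumed figures, 8 of the 10 hot pages go to DRAM and 2 to PM, whereas a fill-DRAM-first policy would leave the PM bandwidth unused until DRAM is full.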
“…In contrast to uniform-workers, BWAP takes the asymmetric BWs of every node into account to determine and enforce an optimized application-specific weighted interleaving. Our proposal is inspired by recent research for hybrid memory systems [11], [23], [43]. These works have shown that, when a CPU (or GPU [23]) is served by different memory technologies (such as NVRAM or DRAM) with differing BWs, an optimal placement is one that (proportionally) place fewer pages at the lower-BW memories.…”
mentioning
confidence: 99%
“…The same NUMA memory node may be accessible through different BWs by different threads, depending on each thread's location within the NUMA topology. This implies that optimizing page interleaving from the perspective of a given worker node (as done by the recent proposals for hybrid systems [11], [23], [43]) will not always yield the best overall performance. Instead, the optimization problem needs to consider a complex W × N BW matrix, where W and N denote the number of worker nodes and total nodes, respectively.…”
mentioning
confidence: 99%