Temporal Exposure Reduction Protection for Persistent Memory

Xu, Yuanchao; Ye, Chencheng; Shen, Xipeng; Solihin, Yan

doi:10.1109/hpca53966.2022.00071

Cited by 9 publications

(4 citation statements)

References 48 publications


“…While sharing some conceptual similarities with the consolidation of small chunks into larger ones, GMLake adopts a stitching-based technique, which minimizes the need for frequent data movement and copying, resulting in a significant enhancement of memory efficiency. Beyond conventional memory systems, recent research has also explored the defragmentation for persistent memories [84].…”

Section: Related Work and Discussion (mentioning)

confidence: 99%

GMLake: Efficient and Transparent GPU Memory Defragmentation for Large-scale DNN Training with Virtual Memory Stitching

Guo, Zhang, et al. 2024

Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems


Large-scale deep neural networks (DNNs), such as large language models (LLMs), have revolutionized the artificial intelligence (AI) field and become increasingly popular. However, training or fine-tuning such models requires substantial computational power and resources, where the memory capacity of a single acceleration device like a GPU is one of the most important bottlenecks. Owing to the prohibitively large overhead (e.g., 10×) of GPUs' native memory allocator, DNN frameworks like PyTorch and TensorFlow adopt a caching allocator that maintains a memory pool with a splitting mechanism for fast memory (de)allocation. Unfortunately, the caching allocator's efficiency degrades quickly for popular memory reduction techniques such as recomputation, offloading, distributed training, and low-rank adaptation. The primary reason is that those memory reduction techniques introduce frequent and irregular memory …


“…However, the fragmentation problem is always accompanied with the data deduplication. We can explore defragmentation technologies [39] to mitigate this problem in an offline manner. On the other hand, since the random read performance of NVMs is much higher than that of SSDs/HDDs (Table 1), the fragmentation is no longer a critical problem for NVM-based storage systems.…”

Section: Discussion (mentioning)

confidence: 99%

I/O Causality Based In-Line Data Deduplication for Non-Volatile Memory Enabled Storage Systems

Liu, Jin, et al. 2024

IEEE Trans. Comput.


Data deduplication technologies are widely exploited to reduce capacity demands for storage. Previous chunk-based offline deduplication technologies often cause serious performance overhead due to data chunking and indexing. Particularly, they are not efficient for non-volatile memory (NVM) based storage systems because they cannot fully exploit the byte-addressability feature of NVMs for fine-grained deduplication. In this paper, we propose I/O Causality based In-line Deduplication (ICID) to maximize the deduplication ratio for NVM-based storage systems. Unlike previous inline deduplication schemes that use hash indexes to identify duplicate data slices, ICID records memory-copy operations in a B-tree structure to achieve causality-based inline deduplication. We propose two novel techniques to manage memory-copy records in the B-tree efficiently. First, to speed up the B-tree lookup, we group memory-copy records targeted to the same page in a B-tree node to improve data locality. Second, we exploit the spatial locality of memory accesses to identify outdated memory-copy records, and delete them in time to reduce memory consumption of the B-tree. We evaluate ICID in a system equipped with Intel Optane DC Persistent Memory Modules. For a typical KV store, LevelDB, our experimental results show that ICID achieves up to 16× higher deduplication ratio and reduces the time cost of data deduplication by 47% on average compared with state-of-the-art deduplication schemes.


“…Some work proposed faster Merkle tree mechanisms to verify the integrity of PM [10,23,24,24,27,97,98]. Another branch of work reduces the exposure window of PM to reduce the attack surface of PM corruptions [88][89][90], even more so as cross-process attacks are feasible [58].…”

Section: Related Work (mentioning)

confidence: 99%

FFCCD

Solihin et al. 2022

Proceedings of the 49th Annual International Symposium on Computer Architecture

Self Cite


Persistent Memory (PM) is increasingly supplementing or substituting DRAM as main memory. Prior work has focused on reusability and memory leaks of persistent memory but has not addressed a problem amplified by persistence: persistent memory fragmentation, which refers to the continuous worsening of fragmentation of persistent memory throughout its usage. This paper reveals the challenges and proposes the first systematic crash-consistent solution, Fence-Free Crash-consistent Concurrent Defragmentation (FFCCD). FFCCD reuses the persistent pointer format, root nodes, and typed allocation provided by the persistent memory programming model to enable concurrent defragmentation on PM. FFCCD introduces architecture support for concurrent defragmentation that enables a fence-free design and a fast read barrier, reducing two major overheads of defragmenting persistent memory. The technique is effective (28-73% fragmentation reduction) and fast (4.1% execution time overhead). CCS Concepts: • Hardware → Non-volatile memory; • Software and its engineering → Garbage collection.
