Persistent memory is a new tier of memory that functions as a hybrid of traditional storage systems and main memory, combining the benefits of both: the data persistence of storage and the fast load/store interface of memory. Most previous persistent memory designs impose careful control over the order of writes arriving at persistent memory, which prevents caches and memory controllers from optimizing system performance through write coalescing and reordering. We identify that such write-order control can be relaxed by employing undo+redo logging for data in persistent memory systems. However, traditional software logging mechanisms are too expensive to adopt in persistent memory due to their performance and energy overheads, and previously proposed hardware logging schemes are inefficient and do not fully address the issues of software logging. To address these challenges, we propose a hardware undo+redo logging scheme that maintains data persistence by leveraging the write-back, write-allocate policies used in commodity caches. Furthermore, we develop a cache force-write-back mechanism in hardware to significantly reduce the performance and energy overheads of forcing data into persistent memory. Our evaluation across persistent memory microbenchmarks and real workloads demonstrates that our design significantly improves system throughput and reduces both dynamic energy and memory traffic. It also provides strong consistency guarantees compared with software approaches.
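To make the logging idea concrete, the sketch below is a minimal software analogue of undo+redo logging for a single persistent update: the undo (old) and redo (new) values are appended to a log and persisted before the data itself is modified in place, which is what allows write ordering for the data to be relaxed. The helper pm_persist, the fixed-size log, and tx_update are illustrative assumptions only; the abstract's mechanism performs the equivalent steps in hardware alongside write-back, write-allocate caches.

```c
/*
 * Conceptual software analogue of undo+redo logging for one persistent
 * update. All names here (pm_persist, tx_update, the log layout) are
 * hypothetical; the hardware scheme in the abstract does this work
 * transparently next to the cache hierarchy.
 */
#include <stdint.h>
#include <stdio.h>

typedef struct {
    uintptr_t addr;   /* address of the updated word  */
    uint64_t  undo;   /* old value (for rollback)     */
    uint64_t  redo;   /* new value (for replay)       */
} log_entry_t;

#define LOG_CAP 1024
static log_entry_t pm_log[LOG_CAP];
static size_t      pm_log_len;

/* Hypothetical stand-in for making data durable (on real persistent
 * memory this would be a cache-line write-back plus ordering fence). */
static void pm_persist(const void *p, size_t n) { (void)p; (void)n; }

/* Transactionally update one 64-bit word in "persistent" memory. */
static void tx_update(uint64_t *target, uint64_t new_val)
{
    if (pm_log_len == LOG_CAP)          /* keep the sketch safe */
        return;

    /* 1. Append undo (old value) and redo (new value) to the log. */
    log_entry_t e = { (uintptr_t)target, *target, new_val };
    pm_log[pm_log_len++] = e;

    /* 2. Persist the log entry BEFORE the in-place update, so a crash
     *    can be recovered either by rolling back (undo) or by
     *    replaying (redo).                                           */
    pm_persist(&pm_log[pm_log_len - 1], sizeof e);

    /* 3. Update the data in place; caches may now coalesce and
     *    reorder these data writes freely.                           */
    *target = new_val;
}

int main(void)
{
    uint64_t account = 100;
    tx_update(&account, 250);
    printf("account = %llu, log entries = %zu\n",
           (unsigned long long)account, pm_log_len);
    return 0;
}
```

Because both the old and new values are durable before the data write, recovery never depends on the order in which the data itself reaches persistent memory, which is the property the hardware scheme exploits.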
Demand for server memory capacity and performance is rapidly increasing due to the expanding working set sizes of modern applications, such as big data analytics, in-memory computing, deep learning, and server virtualization. One promising technique to tackle these requirements is memory networking, whereby a server memory system consists of multiple 3D die-stacked memory nodes interconnected by a high-speed network. However, current memory network designs face substantial scalability and flexibility challenges. These include (1) maintaining high throughput and low latency in large-scale memory networks at low hardware cost, (2) efficiently interconnecting an arbitrary number of memory nodes, and (3) supporting flexible expansion and reduction of the memory network scale without major modification of the memory network design or physical implementation. To address these challenges, we propose String Figure, a high-throughput, elastic, and scalable memory network architecture. String Figure consists of (1) an algorithm that generates random topologies achieving high network throughput and near-optimal path lengths in large-scale memory networks, (2) a hybrid routing protocol that combines computation with lookup tables to reduce the overhead of each in routing, and (3) a set of network reconfiguration mechanisms that allow both static and dynamic network expansion and reduction. Our experiments using RTL simulation demonstrate that String Figure can interconnect over one thousand memory nodes with shortest path lengths within five hops across various traffic patterns and real workloads.
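The hybrid routing idea can be illustrated with a small software sketch: most destinations are resolved by a computed rule, while a small per-node table handles exceptions, so neither pure computation nor a full routing table has to carry the whole load. The ring-based computed rule, the table size, and every identifier below are illustrative assumptions and do not reflect String Figure's actual topology construction or routing protocol.

```c
/*
 * Toy illustration of a hybrid routing decision that mixes computation
 * with a small lookup table. The topology and rule are assumptions
 * chosen only to show the division of labor.
 */
#include <stdint.h>
#include <stdio.h>

#define NUM_NODES   64
#define TABLE_SLOTS 8          /* small per-node exception table */

typedef struct {
    uint16_t dst;              /* destination node id            */
    uint16_t next_hop;         /* precomputed next hop           */
    uint8_t  valid;
} route_entry_t;

static route_entry_t exception_table[TABLE_SLOTS];

/* Computed rule: step around a ring of node ids toward the
 * destination (illustrative stand-in for the algorithmic part).   */
static uint16_t computed_next_hop(uint16_t here, uint16_t dst)
{
    uint16_t cw = (uint16_t)((dst - here + NUM_NODES) % NUM_NODES);
    return (cw <= NUM_NODES / 2)
        ? (uint16_t)((here + 1) % NUM_NODES)               /* clockwise  */
        : (uint16_t)((here + NUM_NODES - 1) % NUM_NODES);  /* counter-cw */
}

/* Hybrid lookup: consult the small exception table first; otherwise
 * fall back to the computed rule, keeping stored state tiny.       */
static uint16_t route(uint16_t here, uint16_t dst)
{
    for (int i = 0; i < TABLE_SLOTS; i++)
        if (exception_table[i].valid && exception_table[i].dst == dst)
            return exception_table[i].next_hop;
    return computed_next_hop(here, dst);
}

int main(void)
{
    exception_table[0] = (route_entry_t){ .dst = 42, .next_hop = 7, .valid = 1 };
    printf("next hop to 42: %u\n", route(3, 42));   /* table hit      */
    printf("next hop to 10: %u\n", route(3, 10));   /* computed route */
    return 0;
}
```

The point of the split is that the table only needs to cover destinations the computed rule handles poorly, so per-node routing state can stay small as the network scales.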