In this paper, we describe CACTI-IO, an extension to CACTI that includes power, area, and timing models for the IO and PHY of the off-chip memory interface for various server and mobile configurations. CACTI-IO enables design space exploration of the off-chip IO along with the dynamic random access memory and cache parameters. We describe the models added and four case studies that use CACTI-IO to study the tradeoffs between memory capacity, bandwidth (BW), and power. The case studies show that CACTI-IO helps to: 1) provide IO power numbers that can be fed into a system simulator for accurate power calculations; 2) optimize off-chip configurations including the bus width, number of ranks, memory data width, and off-chip bus frequency, especially for novel buffer-based topologies; and 3) enable architects to quickly explore new interconnect technologies, including 3-D interconnect. We find that buffers on board and 3-D technologies offer an attractive design space involving power, BW, and capacity when appropriate interconnect parameters are deployed.
Index Terms: CACTI, CACTI-IO, dynamic random access memory (DRAM), IO, memory interface, power and timing models.
Historically, server designers have opted for simple memory systems by picking one of a few commoditized DDR memory products. We are already witnessing a major upheaval in the off-chip memory hierarchy, with the introduction of many new memory products: buffer-on-board, LRDIMM, HMC, HBM, and NVMs, to name a few. Given the plethora of choices, it is expected that different vendors will adopt different strategies for their high-capacity memory systems, often deviating from DDR standards and/or integrating new functionality within memory systems. These strategies will likely differ in their choice of interconnect and topology, with a significant fraction of memory energy being dissipated in I/O and data movement. To make the case for memory interconnect specialization, this paper makes three contributions. First, we design a tool that carefully models I/O power in the memory system, explores the design space, and gives the user the ability to define new types of memory interconnects/topologies. The tool is validated against SPICE models and is integrated into version 7 of the popular CACTI package. Our analysis with the tool shows that several design parameters have a significant impact on I/O power. Second, we use the tool to help craft novel specialized memory system channels. We introduce a new relay-onboard chip that partitions a DDR channel into multiple cascaded channels. We show that this simple change to the channel topology can improve performance by 22% and lower cost by up to 65% for DDR DRAM. This new architecture does not require any changes to DIMMs, and it efficiently supports hybrid DRAM/NVM systems. Finally, as an example of a more disruptive architecture, we design a custom DIMM and parallel bus that moves away from the DDR3/DDR4 standards. To reduce energy and improve performance, the baseline data channel is split into three narrow parallel channels and the on-DIMM interconnects are operated at a lower frequency. In addition, this allows us to design a two-tier error protection strategy that reduces data transfers on the interconnect. This architecture yields a performance improvement of 18% and a memory power reduction of 23%. The cascaded channel and narrow channel architectures serve as case studies for the new tool and show the potential benefit of reorganizing basic memory interconnects.