Power dissipation and off-chip bandwidth restrictions are critical challenges that limit microprocessor performance. Ternary content addressable memories (TCAM) hold the potential to address both problems in the context of a wide range of data-intensive workloads that benefit from associative search capability. Power dissipation is reduced by eliminating instruction processing and data movement overheads present in a purely RAM-based system. Bandwidth demand is lowered by processing data directly on the TCAM chip, thereby decreasing off-chip traffic. Unfortunately, CMOS-based TCAM implementations are severely power- and area-limited, which restricts the capacity of commercial products to a few megabytes, and confines their use to niche networking applications. This paper explores a novel resistive TCAM cell and array architecture that has the potential to scale TCAM capacity from megabytes to gigabytes. High-density resistive TCAM chips are organized into a DDR3-compatible DIMM, and are accessed through a software library with zero modifications to the processor or the motherboard. On applications that do not benefit from associative search, the TCAM DIMM is configured to provide ordinary RAM functionality. By tightly integrating TCAM with conventional virtual memory, and by allowing a large fraction of the physical address space to be made content-addressable on demand, the proposed memory system improves average performance by 4× and average energy consumption by 10× on a set of evaluated data-intensive applications.
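For readers unfamiliar with associative search, the following is a minimal software sketch of what a ternary match does. It is not the paper's library or hardware interface: the names `ternary_match` and `tcam_search`, the string encoding of entries, and the lowest-index-wins priority rule are all assumptions made for illustration. A real TCAM compares the key against every stored entry in parallel, whereas this loop is sequential.

```python
from typing import List, Optional


def ternary_match(key: str, pattern: str) -> bool:
    """Return True if every non-wildcard pattern bit equals the key bit.

    Hypothetical encoding: '0' and '1' must match exactly, 'X' is "don't care".
    """
    if len(key) != len(pattern):
        raise ValueError("key and pattern must have the same width")
    return all(p == "X" or p == k for k, p in zip(key, pattern))


def tcam_search(key: str, entries: List[str]) -> Optional[int]:
    """Return the index of the first matching entry, or None if none match.

    Assumption: the lowest-index match wins, as a priority encoder would pick.
    """
    for index, pattern in enumerate(entries):
        if ternary_match(key, pattern):
            return index
    return None


# Illustrative table of 4-bit ternary entries (made-up values).
entries = ["10XX", "1X01", "0000"]
first_hit = tcam_search("1001", entries)
```

The search takes a data value and returns a location, which is the inverse of a RAM lookup; this is the capability the abstract refers to as content addressability.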
Conventional off-chip voltage regulators are typically bulky and slow, and are inefficient at exploiting system and workload variability using Dynamic Voltage and Frequency Scaling (DVFS). On-die integration of voltage regulators has the potential to increase the energy efficiency of computer systems by enabling power control at a fine granularity in both space and time. The energy conversion efficiency of on-chip regulators, however, is typically much lower than that of off-chip regulators, which results in significant energy losses. Fine-grained power control and high voltage regulator efficiency are difficult to achieve simultaneously, with either emerging on-chip or conventional off-chip regulators. A voltage conversion framework that relies on a hierarchy of off-chip switching regulators and on-chip linear regulators is proposed to enable fine-grained power control with a regulator efficiency greater than 90%. A DVFS control policy that is based on a reinforcement learning (RL) approach is developed to exploit the proposed framework. Per-core RL agents learn and improve their control policies independently, while retaining the ability to coordinate their actions to accomplish system-level power management objectives. When evaluated on a mix of 14 parallel and 13 multiprogrammed workloads, the proposed voltage conversion framework achieves 18% greater energy efficiency than a conventional framework that uses on-chip switching regulators. Moreover, when the RL-based DVFS control policy is used to control the proposed voltage conversion framework, the system achieves a 21% higher energy efficiency than a baseline oracle policy with coarse-grained power control capability.
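As a rough illustration of why a two-stage hierarchy can stay above 90% efficiency, the sketch below uses the textbook approximation that an ideal linear regulator's efficiency is about `v_out / v_in` (quiescent current neglected). The function names and all numeric values are assumptions for illustration; they are not taken from the paper, and the paper's actual regulator model may differ.

```python
import math


def linear_regulator_efficiency(v_in: float, v_out: float) -> float:
    """Ideal linear-regulator efficiency, neglecting quiescent current."""
    if not 0 < v_out <= v_in:
        raise ValueError("require 0 < v_out <= v_in")
    return v_out / v_in


def cascaded_efficiency(switching_eff: float, v_in: float, v_out: float) -> float:
    """End-to-end efficiency of an off-chip switching stage feeding an
    on-chip linear stage (the two efficiencies multiply)."""
    if not 0 < switching_eff <= 1:
        raise ValueError("switching_eff must be in (0, 1]")
    return switching_eff * linear_regulator_efficiency(v_in, v_out)


# Hypothetical operating point: a 95%-efficient off-chip stage delivers
# 1.00 V, and the on-chip linear stage trims it to 0.96 V for one core.
overall = cascaded_efficiency(0.95, 1.00, 0.96)
```

Under this approximation, the linear stage is only efficient when its input voltage sits close to the requested output, which suggests why the off-chip stage matters in such a hierarchy.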
This paper explores the use of MOS current-mode logic (MCML) as a fast and low-noise alternative to static CMOS circuits in microprocessors, thereby improving the performance, energy efficiency, and signal integrity of future computer systems. The power and ground noise generated by an MCML circuit is typically 10-100× smaller than the noise generated by a static CMOS circuit. Unlike static CMOS, whose dominant dynamic power is proportional to the clock frequency, MCML circuits dissipate a constant power independent of clock frequency. Although these traits make MCML highly energy efficient when operating at high speeds, the constant static power of MCML poses a challenge for a microarchitecture that operates at a modest clock rate and with a low activity factor. To address this challenge, a single-core microarchitecture for MCML is explored that exploits the C-slow retiming technique, and operates at a high frequency with low complexity to save energy. This design principle contrasts with the contemporary multicore design paradigm for static CMOS that relies on a large number of gates operating in parallel at modest speeds. The proposed architecture generates 10-40× lower power and ground noise, and operates within 13% of the performance (i.e., 1/ExecutionTime) of a conventional, eight-core static CMOS processor while exhibiting 1.6× lower energy and 9% less area. Moreover, the operation of an MCML processor is robust under both systematic and random variations in transistor threshold voltage and effective channel length.
Index Terms: Architecture-circuit codesign, energy efficient, low noise, microprocessors, MOS current-mode logic (MCML).
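The frequency argument in the abstract above can be sketched with first-order per-gate power models. This is an illustrative approximation only: it uses the standard CMOS dynamic power expression (activity × C × Vdd² × f) and models an MCML gate as a constant tail current drawn from the supply. The function names and every numeric value are hypothetical and are not taken from the paper.

```python
import math


def cmos_dynamic_power(activity: float, cap_f: float, vdd: float, freq_hz: float) -> float:
    """First-order static-CMOS dynamic power: grows linearly with frequency."""
    return activity * cap_f * vdd ** 2 * freq_hz


def mcml_static_power(tail_current_a: float, vdd: float) -> float:
    """First-order MCML gate power: constant, independent of frequency."""
    return tail_current_a * vdd


def crossover_frequency(activity: float, cap_f: float, vdd: float, tail_current_a: float) -> float:
    """Frequency at which the two models dissipate equal power.

    Derived from activity * C * Vdd^2 * f = I_tail * Vdd, assuming both
    gates share the same supply voltage.
    """
    return tail_current_a / (activity * cap_f * vdd)


# Hypothetical per-gate parameters, chosen only to exercise the formulas.
ACTIVITY, CAP_F, VDD, I_TAIL = 0.1, 10e-15, 1.0, 5e-6
f_cross = crossover_frequency(ACTIVITY, CAP_F, VDD, I_TAIL)
```

Below the crossover frequency the constant MCML power dominates, and above it the CMOS dynamic power does. This is consistent with the abstract's motivation for running a simple MCML core at a high clock rate rather than at a modest one.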