Massive data centers and cloud computing infrastructures enable abundant data exchange over the Internet. As the demand for data transfer has increased, the number of data-generating devices and their power consumption have grown exponentially in the past few years. According to recent reports, the overall energy usage of information infrastructure in 2018 was estimated at 198 terawatt hours, or almost 1% of global electricity demand. [1] This fraction will continue to increase substantially in the era of abundant data, resulting in an unmanageable level of power consumption. [1] The overall power consumption of a computing system is determined by the data processing procedure as well as by the power consumption of each individual device and the device density. The development of low-power devices, through either new device structures or new materials, has been reviewed in the existing literature. [2-6] It has therefore become increasingly important to develop highly efficient processing units that minimize the overall power consumption of computing systems. Dennard predicted that switching power would fall as transistor feature sizes scaled down, [7] but this trend ended around 2004. The reduction of the transistor's operating voltage and current stagnated despite its ever-shrinking size, because of increasing leakage current. Rising operating frequencies and transistor densities further exacerbate the power consumption of a computing system, generating a large amount of heat and impeding power efficiency. Another power-consumption challenge is the intensive data transfer between the processing and memory units in the conventional Von Neumann (Princeton) computing architecture, the so-called "memory wall."
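The Dennard-scaling argument above can be made concrete with the standard dynamic-power relation P ≈ αCV²f (activity factor, switched capacitance, supply voltage, clock frequency). The sketch below illustrates why, under ideal scaling, power per transistor fell even as frequency rose; all numeric values are illustrative assumptions, not measurements of any real process node.

```python
# Dynamic (switching) power of a CMOS circuit: P = alpha * C * V^2 * f.
# All numeric values are assumed for illustration only.

def dynamic_power(alpha, capacitance_f, voltage_v, frequency_hz):
    """Return dynamic switching power in watts."""
    return alpha * capacitance_f * voltage_v ** 2 * frequency_hz

# Under ideal Dennard scaling by a factor k, both C and V shrink by 1/k,
# so even if frequency rises by k, power per transistor drops by k^2.
p_old = dynamic_power(0.2, 1.0e-9, 1.2, 2e9)  # hypothetical 1.2 V, 2 GHz node
p_new = dynamic_power(0.2, 0.5e-9, 0.6, 4e9)  # the same design ideally scaled by k = 2
print(p_old, p_new)  # p_new is one quarter of p_old
```

Because transistor density rises by k² while per-transistor power falls by k², the power density of the chip stays constant; it was the end of voltage scaling (due to leakage, as noted above) that broke this balance.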
[8] In this computing paradigm, the processing unit can handle only one task at a time and must wait for the memory to update its results, because both data and instructions are stored in the same memory space; this greatly limits throughput and causes idle power consumption. Although mechanisms such as caching and branch prediction can partially alleviate these issues, the "memory wall" still poses a grand challenge for massive data interchange in modern processor technology. The prevalent synchronous clock design exacerbates the power problem, because the updates of all state-holding units are driven by a uniformly distributed global clock. Regardless of whether a unit has useful work to do, all units are forced to respond to the arrival of each clock edge; unassigned units thus run idly and consume unnecessary power. In contrast to synchronous clock distribution, asynchronous designs coordinate units through handshaking rather than a global clock. Limited parallelism also constrains the power efficiency of high-performance computers. To combat clock skew caused by uneven signal propagation, a global clock with a high slew rate and an elaborate frequency margin is required, which sacrifices area and power. [9] As a result, a synchronous
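The handshaking alternative mentioned above can be illustrated with a minimal four-phase request/acknowledge protocol, in which each side advances only when its partner signals readiness, so no global clock is needed. The thread-based simulation below is a hedged sketch of the protocol's phases, not a hardware-accurate model; all names are illustrative.

```python
# Minimal sketch of a four-phase (return-to-zero) request/acknowledge
# handshake between a sender and a receiver, simulated with threads.
import threading

req = threading.Event()   # request line raised by the sender
ack = threading.Event()   # acknowledge line raised by the receiver
channel = {}              # shared "wire" holding the current data value
received = []             # values latched by the receiver

def sender(values):
    for v in values:
        channel["data"] = v
        req.set()                # phase 1: data valid, raise request
        ack.wait()               # phase 2: wait for acknowledge
        req.clear()              # phase 3: drop request
        while ack.is_set():      # phase 4: wait for acknowledge to drop
            pass

def receiver(n):
    for _ in range(n):
        req.wait()               # wait for a request
        received.append(channel["data"])  # latch the data
        ack.set()                # acknowledge receipt
        while req.is_set():      # wait for the request to drop
            pass
        ack.clear()              # return acknowledge to zero

t_rx = threading.Thread(target=receiver, args=(3,))
t_tx = threading.Thread(target=sender, args=([1, 2, 3],))
t_rx.start(); t_tx.start()
t_tx.join(); t_rx.join()
print(received)  # [1, 2, 3]
```

Because each transfer completes only when both parties have observed each other's transitions, idle units simply never raise a request and consume no switching activity, in contrast to the clocked design described above.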