A highly energy-efficient reconfigurable accelerator called CMA (Cool Mega-Array) is proposed. It consists of a large Processing Element (PE) array without memory elements for holding ALU results or configuration data, a small, simple programmable microcontroller for data management, and a data memory. Unlike traditional coarse-grained reconfigurable processors, the power consumed by hardware context switching, by storing intermediate data in registers, and by the associated clock distribution is eliminated from the PE array, which occupies a large area of the chip. Configuration registers are gathered in a small area of the microcontroller. The data flow graph mapped onto the PE array is static during execution. Various application programs can be implemented by making the best use of the microcontroller's flexible data management instructions. When the delay time in the PE array is longer than the data handling time with the microcontroller, the supply voltage for the PE array is scaled to reduce the power consumption without degrading the performance. In the opposite case, wave pipelining is applied to enhance PE array performance. A prototype chip, CMA-1, with an 8 × 8 PE array and 24-bit data width was fabricated in 65-nm CMOS technology in an area of 2.1 × 4.2 mm², and achieves a sustained performance of 2.4 GOPS at 11.2 mW. This energy efficiency is comparable to that of the most energy-efficient accelerators that have been reported.
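To make the energy-efficiency figure in the abstract above easier to compare with other accelerators, the reported throughput and power can be converted to operations per unit power and energy per operation. The following is a minimal sketch of that arithmetic only; the function names are hypothetical and the inputs are the two numbers quoted for CMA-1.

```python
def mops_per_mw(gops: float, milliwatts: float) -> float:
    """Throughput per unit power in MOPS/mW (numerically equal to GOPS/W)."""
    return gops * 1000.0 / milliwatts


def picojoules_per_op(gops: float, milliwatts: float) -> float:
    """Average energy per operation in pJ: (mW * 1e-3 W) / (GOPS * 1e9 op/s) * 1e12."""
    return milliwatts / gops


# Figures quoted for the CMA-1 prototype: 2.4 GOPS sustained at 11.2 mW.
efficiency = mops_per_mw(gops=2.4, milliwatts=11.2)
energy = picojoules_per_op(gops=2.4, milliwatts=11.2)

print(f"CMA-1 efficiency: {efficiency:.1f} MOPS/mW")
print(f"CMA-1 energy per operation: {energy:.2f} pJ")
```

Under these figures the prototype works out to roughly 214 MOPS/mW, or a little under 5 pJ per operation.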
A Dynamically Reconfigurable Processor Array (DRPA) consists of a number of PEs, and the interconnection of its PE array has a large effect on total area, energy, and performance. However, no study of DRPAs has focused on their PE network. In this paper, we designed four types of DRPAs based on MuCCRA, developed in the MuCCRA (Multi-Core Configurable Reconfigurable Architecture) project. They have different topologies: direct interconnection, island-style interconnection, and two kinds of hybrid interconnection. They are evaluated using three different PE array sizes. As a result, a MuCCRA with a hybrid network, which has the highest degree of connectivity, is able to execute DCT faster than the other MuCCRAs, but requires 20% more area than a MuCCRA with a direct network. In terms of energy consumption, a MuCCRA with direct links consumes up to 7 times more energy than island-style-interconnected MuCCRAs.
Partially Fixed Configuration Mapping (PFCM) is a context mapping technique for Dynamically Reconfigurable Processor Arrays (DRPAs) that focuses on reducing power consumption. It assigns operations to Processing Elements (PEs) so as to keep as much of the previous context's configuration as possible. This reduces the part of the datapath structure on the PE array that changes, as well as its switching frequency. Preliminary evaluation results show that it can reduce the computing power by 6.7% to 11.3%. The demonstration shows the power reduction directly on the real chip MuCCRA-3, a prototype DRPA, executing signal processing applications with and without PFCM applied. The design environment for using PFCM is also exhibited.
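The idea of keeping the previous context's configuration where possible can be illustrated with a toy placement routine. This is not the PFCM algorithm itself, which the abstract does not specify; it is a simplified greedy sketch that assumes each PE's configuration is just an opcode string and ignores routing, data dependencies, and the interconnect. All names (`map_context`, `changed_pes`, the opcodes) are hypothetical.

```python
from typing import Dict, List


def map_context(prev_config: Dict[int, str], ops: List[str], num_pes: int) -> Dict[int, str]:
    """Greedily place ops on PEs, reusing PEs whose previous opcode matches."""
    if len(ops) > num_pes:
        raise ValueError("more operations than PEs")
    assignment: Dict[int, str] = {}
    unplaced: List[str] = []
    # Pass 1: keep a PE's previous configuration whenever an op can reuse it.
    for op in ops:
        match = next(
            (pe for pe in range(num_pes)
             if pe not in assignment and prev_config.get(pe) == op),
            None,
        )
        if match is None:
            unplaced.append(op)
        else:
            assignment[match] = op
    # Pass 2: place the remaining ops on the lowest-numbered free PEs.
    free = [pe for pe in range(num_pes) if pe not in assignment]
    for pe, op in zip(free, unplaced):
        assignment[pe] = op
    return assignment


def changed_pes(prev_config: Dict[int, str], new_config: Dict[int, str]) -> int:
    """Count PEs whose opcode in the new context differs from the previous one."""
    return sum(1 for pe, op in new_config.items() if prev_config.get(pe) != op)


# Hypothetical 4-PE array: previous context and the ops of the next context.
prev = {0: "add", 1: "mul", 2: "sub", 3: "add"}
next_ops = ["mul", "add", "shl"]

reuse = map_context(prev, next_ops, num_pes=4)
naive = {pe: op for pe, op in enumerate(next_ops)}  # in-order placement for comparison

print("reuse-aware changes:", changed_pes(prev, reuse))
print("in-order changes:", changed_pes(prev, naive))
```

In this toy case the reuse-aware placement reconfigures one PE, while in-order placement reconfigures three; a real mapper would also have to respect the interconnect and the data flow between PEs.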