A Coarse-Grained Reconfigurable Architecture (CGRA) is a processing platform built from an interconnected array of coarse-grained computation units. CGRAs are a well-researched topic, and their design space is large. A typical CGRA requires many processing elements and a configuration cache for reconfiguring its processing element array. However, such a structure consumes significant area and power. Designing a cost-effective CGRA has therefore been a serious concern for the reliability of CGRA-based embedded systems. Reusable Context Pipelining is a general approach to reducing power and improving performance in CGRAs: it closes the power-performance gap between low-power-oriented spatial mapping and high-performance-oriented temporal mapping. By focusing on the processing elements, power and area can be reduced. The processing element consists of an arithmetic and logic unit, an array multiplier, saturation arithmetic logic, and a multiplexer. These components are designed for the processing element used in the CGRA. Each component of the processing element is simulated using Xilinx ISE. Finally, the simulation results and the final transient response of the schematic design of the processing element are presented.
Keywords: Coarse-Grained Reconfigurable Architecture (CGRA), Reusable Context Pipelining, processing element.
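To make the composition of the processing element concrete, the following is a minimal behavioral sketch in Python, not the Xilinx ISE design itself. The 16-bit signed operand width, the opcode names, and the ordering (ALU or multiplier, then multiplexer, then saturation) are assumptions made for illustration only; the source does not specify them.

```python
# Behavioral sketch of a processing element (PE) datapath.
# Assumptions (not from the paper): 16-bit signed operands, the opcode names,
# and the ordering ALU/multiplier -> multiplexer -> saturation.

WIDTH = 16
MAX_VAL = (1 << (WIDTH - 1)) - 1   # largest signed 16-bit value
MIN_VAL = -(1 << (WIDTH - 1))      # smallest signed 16-bit value


def saturate(value: int) -> int:
    """Clamp a wide result into the signed WIDTH-bit range."""
    return max(MIN_VAL, min(MAX_VAL, value))


def alu(op: str, a: int, b: int) -> int:
    """Hypothetical ALU with a few arithmetic and logic operations."""
    if op == "add":
        return a + b
    if op == "sub":
        return a - b
    if op == "and":
        return a & b
    if op == "or":
        return a | b
    raise ValueError(f"unknown ALU op: {op}")


def multiplier(a: int, b: int) -> int:
    """Stand-in for the array multiplier (full-precision product)."""
    return a * b


def processing_element(op: str, a: int, b: int) -> int:
    """Select the ALU or multiplier output (the multiplexer), then saturate."""
    result = multiplier(a, b) if op == "mul" else alu(op, a, b)
    return saturate(result)
```

In this sketch, a result that exceeds the representable range is clamped rather than wrapped around, which is the behavior saturation arithmetic is meant to provide in signal-processing workloads.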
I. INTRODUCTION
Nowadays, stream-based applications, such as multimedia, telecommunications, signal processing, and data encryption, are the dominant workloads in many electronic systems. The real-time constraints of these applications, especially on portable devices, often impose stringent energy and performance requirements. Many military applications, including real-time synthetic aperture radar imaging, automatic target recognition, surveillance video processing, optical inspection, and cognitive radio systems, have similar needs. General purpose processors (GPPs) are widely used in conventional data-path oriented applications due to their flexibility and ease of use. However, because they execute software sequentially, they cannot meet the increasing requirements on performance, cost, and energy in the data streaming application domain. Application-specific integrated circuits (ASICs) have therefore become a customized solution for meeting these ever-increasing demands for highly repetitive parallel computations. ASICs are reported to be potentially two to three orders of magnitude more efficient than processors in terms of combined computational performance, energy consumption, and cost. However, their long design cycle and high Non-Recurring Engineering (NRE) cost are an obstacle to meeting stringent cost and time-to-market requirements. Reconfigurable architectures (RAs) have long been proposed as a way to balance the flexibility of GPPs against the performance of ASICs [1]. The hardware-based RA implementation is able to exploit the spatial parallelism of the computing tasks in targeted applications, meanwhi...