Abstract-As the demand for high performance computing increases, new approaches have to be found to automate the design of embedded processors. Simultaneously, new tools have to be developed to short the execution time consumption, and simpler design resulting in time to market. These are to be applied for the system architecture to achieve rapid exploration in on power consumption, chip area, and performance constraints. This enables interest in Application Specific Instruction Processors (ASIPs) design and application considerably. It has higher flexibility as compared to dedicated hardware. The current case study focuses on an ASIP design methodology considering the classical parameters computational performance and area as well as energy consumption simultaneously. In this paper, the clock gating is analyzed and designed. Further it is optimized using Fast genetic algorithm (FastGA). The optimization result is shown for ICORE (ISS-core) ASIP for DVB-T acquisition and tracking algorithms. Observation shows a potential of about one order of magnitude in savings of energy for optimization.
Introduction: The application of specific instructions significantly improves energy, performance, and code size of configurable processors. The design of these instructions is performed by the conversion of patterns related to application-specific operations into effective complex instructions. This research was presented at the icitkm Conference, University of Delhi, India in 2017.Methods: Static analysis was a prominent research method during late the 1980’s. However, end-to-end measurements consist of a standard approach in industrial settings. Both static analysis tools perform at a high-level in order to determine the program structure, which works on source code, or is executable in a disassembled binary. It is possible to work at a low-level if the real hardware timing information for the executable task has the desired features.Results: We experimented, tested and evaluated using a H.264 encoder application that uses nine cis, covering most of the computation intensive kernels. Multimedia applications are frequently subject to hard real time constraints in the field of computer vision. The H.264 encoder consists of complicated control flow with more number of decisions and nested loops. The parameters evaluated were different numbers of A partitions (300 slices on a Xilinx Virtex 7each), reconfiguration bandwidths, as well as relations of cpu frequency and fabric frequency fCPU/ffabric. ffabric remains constant at 100MHz, and we selected a multiplicity of its values for fCPU that resemble realistic units. Note that while we anticipate the wcet in seconds (wcetcycles/ f CPU) to be lower (better) with higher fCPU, the wcet cycles increase (at a constant ffabric) because hardware cis perform less computations on the reconfigurable fabric within one cpu cycle.Conclusions: The method is similar to tree hybridization and path-based methods which are less precise, and to the global ipet method, which is more precise. Optimization is evaluated with the Discrete Particle Swarm Optimization (dpso) algorithm for wcet. For several real-world applications involving embedded processors, the proposed technique develops improved instruction sets in comparison to native instruction sets.Originality: For wcet estimation, flow analysis, low-level analysis and calculation phases of the program need to be considered. Flow analysis phase or the high-level of analysis helps to extract the program’s dynamic behavior that gives information on functions being called, number of loop iteration, dependencies among if-statements, etc. This is due to the fact that the analysis is unaware of the execution path corresponding to the longest execution time.Limitations: This path is executed within a kernel iteration that relies upon the nature of mb, either i-mb or p-mb, determined by the motion estimation kernel, that is, its’ input depends on the i-mb and p-mb paths ,which also contain separate cis leading to the instability of the worst-case path, that is, adding more partitions to the current worst-case path can result in the other path becoming the worst case. The pipeline stalls for the reconfiguration delay and continues when entering the kernel once the reconfiguration process finishes.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2025 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.