The Static Random-Access Memory (SRAM) modules used for embedded microprocessor devices consume a large portion of the whole system's power. The memory module consumes static power on keeping awake and dynamic power on memory accesses. The power dissipation of the instruction memory can be limited by using code compression methods, which reduce the memory size. The compression may require the use of variable length instruction formats in the processor. The powerefficient design of variable length instruction fetch and decode units is challenging for static multiple-issue processors, because such architectures have simple hardware to begin with, as they aim for very low power consumption on embedded platforms. The power saved by using these compression approaches, which necessitate more complex logic, is easily lost on inefficient processor design.This thesis proposes an implementation for instruction template-based compression, its decompression and two instruction fetch design alternatives for variable length instruction encoding on Transport Triggered Architecture (TTA), a static multiple-issue exposed data path architecture. Both of the new fetch and decode units are integrated into the TTA-based Co-design Environment (TCE), which is a toolset for rapid designing and prototyping of processors based on TTA.The hardware description of the fetch units is verified on a register transfer level and benchmarked using the CHStone test suite. Furthermore, the fetch units are synthesized on a 40 nm standard cell Application Specific Integrated Circuit (ASIC) technology library for area, performance and power consumption measurements. The power cost of the variable length instruction support is compared to the power savings from memory reduction, which is evaluated using HP Labs' CACTI tool.The compression approach reaches an average program size reduction of 44% at best with a set of test programs, and the total power consumption of the system is reduced. The thesis shows that the proposed variable length fetch designs are sufficiently low-power oriented for TTA processors to benefit from the code compression.
Variable length encoding can considerably decrease code size in VLIW processors by decreasing the amount of bits wasted on encoding No Operation(NOP)s. A processor may have different instruction templates where different execution slots are implicitly NOPs, but all combinations of NOPs may not be supported by the instruction templates. The efficiency of the NOP encoding can be improved by the compiler trying to place NOPs in such way that the usage of implicit NOPs is maximized. Two different methods of optimizing the use of the implicit NOP slots are evaluated: prioritizing function units that have fewer implicit NOPs associated to them, and a post-pass to the instruction scheduler which utilizes the slack of the schedule by rescheduling operations with slack into different instruction words so that the available instruction templates are better utilized. The postpass optimizer saved an average of 2.5 % and at best of 9.1 % instruction memory, without performance loss. Prioritizing function units gave best case instruction memory savings of 12.7 % but the average savings were only 1.0 % and there was in average 5.7 % slowdown for the program.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.