Designing efficient, application-specialized hardware accelerators requires assessing trade-offs between a hardware module's performance and resource requirements. To facilitate hardware design space exploration, we describe Aetherling, a system for automatically compiling data-parallel programs into statically scheduled, streaming hardware circuits. Aetherling contributes a space-and time-aware intermediate language featuring data-parallel operators that represent parallel or sequential hardware modules, and sequence data types that encode a module's throughput by specifying when sequence elements are produced or consumed. As a result, well-typed operator composition in the space-time language corresponds to connecting hardware modules via statically scheduled, streaming interfaces.We provide rules for transforming programs written in a standard data-parallel language (that carries no information about hardware implementation) into equivalent spacetime language programs. We then provide a scheduling algorithm that searches over the space of transformations to quickly generate area-efficient hardware designs that achieve a programmer-specified throughput. Using benchmarks from the image processing domain, we demonstrate that Aetherling enables rapid exploration of hardware designs with different throughput and area characteristics, and yields results that require 1.8-7.9× fewer FPGA slices than those of prior hardware generation systems.
No abstract
With the slowing of Moore’s law, computer architects have turned to domain-specific hardware specialization to continue improving the performance and efficiency of computing systems. However, specialization typically entails significant modifications to the software stack to properly leverage the updated hardware. The lack of a structured approach for updating both the compiler and the accelerator in tandem has impeded many attempts to systematize this procedure. We propose a new approach to enable flexible and evolvable domain-specific hardware specialization based on coarse-grained reconfigurable arrays (CGRAs). Our agile methodology employs a combination of new programming languages and formal methods to automatically generate the accelerator hardware and its compiler from a single source of truth. This enables the creation of design-space exploration frameworks that automatically generate accelerator architectures that approach the efficiencies of hand-designed accelerators, with a significantly lower design effort for both hardware and compiler generation. Our current system accelerates dense linear algebra applications, but is modular and can be extended to support other domains. Our methodology has the potential to significantly improve the productivity of hardware-software engineering teams and enable quicker customization and deployment of complex accelerator-rich computing systems.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.