We present Synchroscalar, a tile-based architecture for embedded processing that is designed to provide the flexibility of DSPs while approaching the power efficiency of ASICs. We achieve this goal by providing high parallelism and voltage scaling while minimizing control and communication costs. Specifically, Synchroscalar uses columns of processor tiles organized into statically-assigned frequency-voltage domains to minimize power consumption. Furthermore, while columns use SIMD control to minimize overhead, data-dependent computations can be supported by extremely flexible statically-scheduled communication between columns. We provide a detailed evaluation of Synchroscalar including SPICE simulation, wire and device models, synthesis of key components, cycle-level simulation, and compiler- and hand-optimized signal processing applications. We find that the goal of meeting, not exceeding, performance targets with data-parallel applications leads to designs that depart significantly from our intuitions derived from general-purpose microprocessor design. In particular, synchronous design and substantial global interconnect are desirable in the low-frequency, low-power domain. This global interconnect supports parallelization and reduces processor idle time, which are critical to energy-efficient implementations of high-bandwidth signal processing. Overall, Synchroscalar provides programmability while achieving power efficiencies within 8-30X of known ASIC implementations, which is 10-60X better than conventional DSPs. In addition, frequency-voltage scaling in Synchroscalar provides between 3-32% power savings in our application suite.
Abstract. Embedded devices have hard performance targets and severe power and area constraints that depart significantly from our design intuitions derived from general-purpose microprocessor design. This paper describes our initial experiences in designing Synchroscalar, a tile-based embedded architecture targeted for multi-rate signal processing applications. We present a preliminary design of the Synchroscalar architecture and some design space exploration in the context of important signal processing kernels. In particular, we find that synchronous design and substantial global interconnect are desirable in the low-frequency, low-power domain. This global interconnect enables parallelization and reduces processor idle time, which are critical to energy-efficient implementations of high-bandwidth signal processing. Furthermore, statically-scheduled communication and SIMD computation keep control overheads low and energy efficiency high.