Spatially-tiled architectures, such as Coarse-Grained Reconfigurable Arrays (CGRAs), are powerful architectures for accelerating applications in the digital-signal processing, embedded, and scientific computing domains. In contrast to Field-Programmable Gate Arrays (FPGAs), another common accelerator, they typically time-multiplex their processing elements and are word-oriented rather than bit-oriented. These differences lead us to re-examine some of the traditional architecture choices made for FPGAs as we move to these coarser-granularity architectures. In this paper we study the efficiency of time-multiplexing global interconnect as architectures scale from single-bit to multi-bit datapaths. Using the Mosaic infrastructure, we analyzed the design trade-offs involved in static vs. time-multiplexed routing for global interconnect channels, as well as the benefit of including a dedicated bit-wide control interconnect to supplement the word-wide datapath of a CGRA. We show that a time-multiplexed interconnect is beneficial in these coarse-grained systems, reducing the area-energy product to 0.32× that of a fully static interconnect. We also show that for our benchmarks, which include single-bit control logic, providing both word-wide and bit-wide interconnect resources further reduces the area-energy product to 0.94× that of an exclusively word-wide interconnect.
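The area-energy figures quoted above are ratios against a baseline design. As a minimal sketch of how such a ratio is formed, assuming the metric is simply area multiplied by energy, the following Python computes a normalized area-energy product. The function names and all numeric inputs are hypothetical and are not taken from the paper.

```python
def area_energy_product(area, energy):
    """Area-energy product of one design point (units are arbitrary here)."""
    return area * energy


def normalized_aep(candidate, baseline):
    """Candidate's area-energy product as a fraction of the baseline's.

    Each argument is an (area, energy) pair. A result below 1.0 means the
    candidate improves on the baseline under this metric.
    """
    return area_energy_product(*candidate) / area_energy_product(*baseline)


# Hypothetical design points, chosen only to illustrate the arithmetic.
baseline = (2.0, 10.0)   # (area, energy) of a reference interconnect
candidate = (1.5, 8.0)   # (area, energy) of an alternative interconnect

ratio = normalized_aep(candidate, baseline)
print(f"normalized area-energy product: {ratio:.2f}x")
```

Because the metric is a product, a design can trade a modest area increase for a larger energy saving (or the reverse) and still come out ahead.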
Efficient storage in spatial processors is increasingly important as such devices get larger and support more concurrent operations. Unlike sequential processors that rely heavily on centralized storage, e.g. register files and embedded memories, spatial processors require many small storage structures to efficiently manage values that are distributed throughout the processor's fabric. The goal of this work is to determine the advantages and disadvantages of different architectural structures for storing values on-chip when optimizing for energy efficiency as well as area. Examination of applications for coarse-grained reconfigurable arrays (CGRAs) shows that most values are short-lived; they are produced and consumed quickly, but the distribution of value lifetimes has a reasonably long tail. We take advantage of this distribution to optimize register storage structures for managing short-, medium-, and long-lived values. We show that using a combination of register storage structures, each tailored for values with different lifetimes, reduces the overall area-energy product to 0.69× that of the baseline architecture, without loss of performance. Finally, we provide energy profiles, characteristics, and comparisons of each register structure to help architects guide future design choices.
Coarse-Grained Reconfigurable Arrays (CGRAs) are typically very efficient for a single task. However, all functional units are required to operate in lock step, wasting resources and making complex programming flows difficult. Massively Parallel Processor Arrays (MPPAs) excel at executing unrelated tasks simultaneously, but limit the amount of resources dedicated to a single task. We propose an architecture with an MPPA's design flexibility and a CGRA's throughput, capable of processing and transferring data in a pre-compiled schedule, with dynamic transfers between components. Alternative interconnect strategies are compared for silicon area cost and power utilization.
When utilizing reconfigurable hardware, many applications require more memory than is available in a single hardware block. While FPGAs have tools and mechanisms for building logically larger memories, doing so on word-oriented devices like Massively Parallel Processor Arrays (MPPAs) often requires developer intervention. We examine building larger memories on the Ambric MPPA. Building an efficient structure requires low-level development and analysis of the latency and bandwidth effects of network and protocol choices. We build a network that requires only five instructions per transaction after optimization. The resource use and performance suggest architectural enhancements that should be considered for future devices.