2013
DOI: 10.1145/2518037.2491464

Exploring the Tradeoffs between Programmability and Efficiency in Data-Parallel Accelerators

Abstract: We present a taxonomy and modular implementation approach for data-parallel accelerators, including the MIMD, vector-SIMD, subword-SIMD, SIMT, and vector-thread (VT) architectural design patterns. We have developed a new VT microarchitecture, Maven, based on the traditional vector-SIMD microarchitecture that is considerably simpler to implement and easier to program than previous VT designs. Using an extensive design-space exploration of full VLSI implementations of many accelerator design points, we evaluate …
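As a purely illustrative sketch (not code from the paper), the two plain-C loops below show the kinds of data-level parallelism the compared design patterns target: a regular loop that maps cleanly onto lockstep vector-SIMD or subword-SIMD hardware, and an irregular loop with gathered loads and a data-dependent branch, which is where SIMT and vector-thread designs such as Maven aim to stay both efficient and easy to program. Function names and the cutoff example are hypothetical.

#include <stddef.h>

/* Regular DLP: every element follows the same control path, so a
 * vector-SIMD or subword-SIMD machine can execute the loop in lockstep. */
void saxpy(size_t n, float a, const float *x, float *y) {
    for (size_t i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}

/* Irregular DLP: indexed (gather) loads and a data-dependent branch. */
void threshold_gather(size_t n, const int *idx, const float *src,
                      float *dst, float cutoff) {
    for (size_t i = 0; i < n; i++) {
        float v = src[idx[i]];            /* gather: data-dependent address */
        dst[i] = (v > cutoff) ? v : 0.0f; /* divergent control flow */
    }
}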

Cited by 18 publications (14 citation statements)
References 8 publications
“…[22,21] propose vector-thread architectures, a hybrid of SIMD and SIMT (Single Instruction Multiple Thread) designed specifically to improve parallel loops with irregular data access and control flow. Qualcomm Hexagon [7] is a VLIW DSP with hardware multi-threading and SIMD functional units that is optimized for mobile heterogeneous computing.…”
Section: Related Work (mentioning)
confidence: 99%
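To make the SIMD-versus-VT contrast in this statement concrete, here is a minimal, hypothetical C sketch of how a lockstep SIMD machine handles a data-dependent branch: both arms are evaluated for every element and a per-lane mask selects the result, so cycles are spent on inactive lanes. Vector-thread and SIMT designs instead let each microthread follow its own control path. Names and the doubling example are placeholders, not from the cited work.

#include <stddef.h>

/* Scalar emulation of masked (predicated) SIMD execution of a branch. */
void threshold_masked(size_t n, const float *src, float *dst, float cutoff) {
    for (size_t i = 0; i < n; i++) {
        int active = src[i] > cutoff;       /* per-lane predicate */
        float taken = src[i] * 2.0f;        /* "then" arm, always computed */
        float not_taken = 0.0f;             /* "else" arm, always computed */
        dst[i] = active ? taken : not_taken; /* blend under the mask */
    }
}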
“…Recent industry and academic efforts have focused on processor customization as a solution to improve performance and energy efficiency, also taking advantage of rising transistor counts. It has been well established that customizing processor data paths and data storage elements to suit the data flow of specific applications, which subsequently reduces overheads due to instruction fetching and decoding, can lead to improved performance and energy efficiency [16,27,5,15,8,14,22]. Most of these heterogeneous architectures operate on the principle of executing sequential code on a general-purpose core and offloading computation with data-level parallelism onto specialized, energy-efficient functional units.…”
Section: Introduction (mentioning)
confidence: 99%
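The offload principle described in that statement can be sketched as follows; accel_map() is a hypothetical stand-in (here just a host-side loop), not an interface from any of the cited papers. Sequential control stays on the general-purpose core, and only the data-parallel body is handed to the "accelerator".

#include <stddef.h>

typedef void (*dlp_kernel_t)(size_t i, void *args);

/* Placeholder for a real accelerator launch: iterate on the host. */
static void accel_map(dlp_kernel_t kernel, size_t n, void *args) {
    for (size_t i = 0; i < n; i++)
        kernel(i, args);
}

struct scale_args { float a; const float *x; float *y; };

static void scale_body(size_t i, void *p) {
    struct scale_args *s = p;
    s->y[i] = s->a * s->x[i];
}

void scale_offload(size_t n, float a, const float *x, float *y) {
    struct scale_args args = { a, x, y };
    accel_map(scale_body, n, &args);   /* only this region is "offloaded" */
}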
“…Such compute-intensive workloads are often targeted by single instruction multiple data (SIMD) architectures [73], [33], [53], [52], [49] to exploit the data parallelism that is often inherent in these applications. Accelerator-rich platforms [55], [83], [26], [14], in particular, are well-suited for targeting these workloads.…”
Section: Introduction (mentioning)
confidence: 99%
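As one concrete, widely available instance of the subword-SIMD pattern from the paper's taxonomy, the following sketch uses x86 SSE intrinsics to add two float arrays four elements per instruction, with a scalar loop for the tail. This is purely illustrative and not code from the cited works.

#include <immintrin.h>   /* x86 SSE intrinsics: one concrete subword-SIMD ISA */
#include <stddef.h>

void vadd_sse(size_t n, const float *a, const float *b, float *c) {
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(c + i, _mm_add_ps(va, vb));
    }
    for (; i < n; i++)       /* scalar cleanup for leftover elements */
        c[i] = a[i] + b[i];
}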
“…The VIPERS soft VP is a general-purpose accelerator that can achieve a 44× speedup compared to the Nios II scalar processor [7]; it increases … A major challenge with these VPs is slow memory accesses. Comprehensive explorations of MIMD, vector-SIMD, and vector-thread architectures in handling regular and irregular DLP confirm that vector-based microarchitectures are more area- and energy-efficient than their scalar counterparts, even for irregular DLP [14]. Lo et al. [15] introduced an improved SIMD architecture targeted at video processing.…”
Section: Introduction (mentioning)
confidence: 99%
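A brief, hypothetical illustration of the memory-access challenge noted above: unit-stride streams suit the wide, contiguous transfers vector memory units are built around, whereas large-stride accesses do not, so memory rather than compute becomes the bottleneck. The example walks one column of a row-major matrix; the function name is a placeholder.

#include <stddef.h>

/* Strided access: each load is `cols` floats apart, defeating wide,
 * contiguous vector memory transfers. */
void copy_column(size_t rows, size_t cols, const float *m,
                 float *out, size_t col) {
    for (size_t r = 0; r < rows; r++)
        out[r] = m[r * cols + col];   /* stride = cols elements */
}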