Proceedings. 31st Annual International Symposium on Computer Architecture, 2004.
DOI: 10.1109/isca.2004.1310763

The vector-thread architecture

Abstract: The vector-thread (VT) architectural paradigm unifies the vector and multithreaded compute models. The VT abstraction provides the programmer with a control processor and a vector of virtual processors (VPs). The control processor can use vector-fetch commands to broadcast instructions to all the VPs or each VP can use thread-fetches to direct its own control flow. A seamless intermixing of the vector and threaded control mechanisms allows a VT architecture to flexibly and compactly encode application parallel…
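
The two control mechanisms named in the abstract can be illustrated with a toy software model. This is only a sketch under stated assumptions, not the paper's ISA or hardware: the class names `ControlProcessor` and `VirtualProcessor`, the `Block` type, and the Collatz workload are all hypothetical, and the sequential Python loop stands in for what real hardware would do in parallel. A vector-fetch hands the same block to every VP; a block that returns another block models a thread-fetch, so each VP follows its own control flow until it stops.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical "block": a function that updates one VP's registers and may
# return the next block for that VP to fetch (a thread-fetch), or None to stop.
Block = Callable[["VirtualProcessor"], Optional["Block"]]


class VirtualProcessor:
    """One element of the VP vector, with its own private registers."""

    def __init__(self, index: int) -> None:
        self.index = index
        self.regs: Dict[str, int] = {}

    def run(self, block: Optional[Block]) -> None:
        # Follow thread-fetches until this VP stops itself.
        while block is not None:
            block = block(self)


class ControlProcessor:
    """Owns the VP vector and broadcasts blocks to it."""

    def __init__(self, num_vps: int) -> None:
        self.vps: List[VirtualProcessor] = [
            VirtualProcessor(i) for i in range(num_vps)
        ]

    def vector_fetch(self, block: Block) -> None:
        # Broadcast: every VP starts from the same block.
        # (Modeled sequentially here; this is an assumption of the sketch.)
        for vp in self.vps:
            vp.run(block)


def init(vp: VirtualProcessor) -> Optional[Block]:
    # Pure vector-style block: same work on every VP, no thread-fetch.
    vp.regs["n"] = vp.index + 1
    vp.regs["steps"] = 0
    return None


def collatz_step(vp: VirtualProcessor) -> Optional[Block]:
    # Data-dependent loop: each VP thread-fetches this block again
    # until its own value reaches 1, so trip counts differ per VP.
    n = vp.regs["n"]
    if n == 1:
        return None
    vp.regs["n"] = n // 2 if n % 2 == 0 else 3 * n + 1
    vp.regs["steps"] += 1
    return collatz_step


cp = ControlProcessor(num_vps=4)
cp.vector_fetch(init)
cp.vector_fetch(collatz_step)
steps = [vp.regs["steps"] for vp in cp.vps]
print(steps)
```

The second vector-fetch starts all four VPs on the same block, but each one runs a different number of iterations, which is the intermixing of vector and threaded control that the abstract describes.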

Cited by 78 publications (62 citation statements)
References 10 publications

“…To evaluate data-parallel solutions, we used the Hwacha data-parallel accelerator with Rocket as its scalar control processor. The Hwacha data-parallel accelerator integrates ideas from both vector-thread [6,7] and conventional data-parallel processors to achieve high performance and energy efficiency. TFJ was used to generate optimized implementations for Rocket and Hwacha.…”
Section: Rocket-Hwacha Vector Processor (mentioning)
confidence: 99%
“…In order to scale the number of cores in a CMP above this barrier, and into the numbers of cores proposed for tiled architectures [4,6,19,28,29], it is necessary to resort to scalable (i.e., point-to-point) interconnect types. Such interconnects are suitable not only because their peak bandwidth naturally scales with the number of cores, but also because, due to the short-length wires and low radix, their area overhead is a fixed, independent fraction of the number of cores.…”
Section: Current CMPs and Coherence Mechanisms (mentioning)
confidence: 99%
“…There have been several proposals for tiled CMP architectures [4,6,19,28,29]. Most of these have focused on novel execution paradigms to exploit ILP and DLP in single-threaded applications.…”
Section: Related Work (mentioning)
confidence: 99%
“…Of course, if a longer clock period is employed a smaller number of larger tiles may be used. Such tile-based systems may implement arrays of homogeneous processor/cache tiles [9], [10], finer-grain computing fabrics [14] or networks of heterogeneous IP blocks. Such approaches provide highly reconfigurable platforms for a wide range of performance hungry applications.…”
Section: Introduction (mentioning)
confidence: 99%