2015 IEEE 26th International Conference on Application-Specific Systems, Architectures and Processors (ASAP) 2015
DOI: 10.1109/asap.2015.7245732

Mixed-length SIMD code generation for VLIW architectures with multiple native vector-widths

Abstract: The degree of data-level parallelism (DLP) in applications is not fixed; it varies with the computational characteristics of each application. In contrast, most processors today include single-width SIMD (vector) hardware to exploit DLP. Single-width SIMD architectures, however, may not be optimal for applications with varying DLP, and they may cause performance and energy inefficiency. We propose the usage of VLIW processors with multiple native vector-widths to better serve applications with ch…

Cited by 2 publications
(3 citation statements)
References 19 publications
“…SHAVE is a VLIW processor containing a set of functional units which are fed with operands from three different register files [21]. The processor contains optimized functional units such as a branch and repeat unit (BRU), a compare and move unit (CMU), arithmetic units, and Fig.…”
Section: Myriad 2 Architecture (mentioning)
confidence: 99%
“…One such effort is Myriad 2 platform from Movidius [20]. It is a low-power multi-processor system on chip (MPSoC) that uses an array of very long instruction word (VLIW) processors with vector and single instruction multiple data (SIMD) execution capabilities [21]. Each processor supports two load and store units (LSUs) to overlap latency of memory operations.…”
Section: Introduction (mentioning)
confidence: 99%
“…For video applications, the frame rate can be increased by customizing the architecture with parallel execution of the input data set [14,16]. To perceive motion in the video, refreshing of the frames should take place very quickly.…”
Section: Image Quality Enhancement (mentioning)
confidence: 99%