2019 IEEE International Conference on Artificial Intelligence Circuits and Systems (AICAS)
DOI: 10.1109/aicas.2019.8771481
Sub-Word Parallel Precision-Scalable MAC Engines for Efficient Embedded DNN Inference

Abstract: To enable energy-efficient embedded execution of Deep Neural Networks (DNNs), the critical sections of these workloads, their multiply-accumulate (MAC) operations, need to be carefully optimized. The SotA pursues this through runtime precision-scalable MAC operators, which can support the varying precision needs of DNNs in an energy-efficient way. Yet, to implement the adaptable precision MAC operation, most SotA solutions rely on separately optimized low precision multipliers and a precision-variable accumula…

Cited by 32 publications (16 citation statements). References 13 publications.
“…The concepts of Sum Apart (SA) and Sum Together (ST) were introduced at the PE level by Mei et al. [16] to qualify two opposite ways of accumulating subword-parallel computations: SA keeps the parallel-generated products separate, while ST sums them together to form one single output result. These concepts can be applied to differentiate algorithm-level characteristics of neural-network workloads.…”
Section: SA and ST at Algorithm Level
confidence: 99%
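The SA/ST distinction maps directly onto how subword products are accumulated. A minimal Python sketch of the two accumulation styles, assuming two 4-bit subwords packed into each 8-bit operand (the packing width and all function names here are illustrative, not taken from [16]):

```python
def subword_products(a, b, bits=4):
    """Split each operand into two `bits`-wide subwords and
    multiply them pairwise, as a subword-parallel PE would."""
    mask = (1 << bits) - 1
    a_lo, a_hi = a & mask, (a >> bits) & mask
    b_lo, b_hi = b & mask, (b >> bits) & mask
    return a_lo * b_lo, a_hi * b_hi

def mac_sa(a, b, acc=(0, 0), bits=4):
    """Sum Apart (SA): keep the parallel products as separate
    running accumulations, one per subword lane."""
    p_lo, p_hi = subword_products(a, b, bits)
    return acc[0] + p_lo, acc[1] + p_hi

def mac_st(a, b, acc=0, bits=4):
    """Sum Together (ST): reduce the parallel products into a
    single accumulated output."""
    p_lo, p_hi = subword_products(a, b, bits)
    return acc + p_lo + p_hi
```

SA fits workloads whose subword lanes produce independent outputs (for example, separate output pixels or channels), while ST fits dot-product-style reductions where every partial product feeds a single accumulator.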
“…The Sum Together (ST) version of the SWP MAC unit, introduced by Mei et al. [16], is also a 2D symmetric scalable architecture based on an array multiplier. But unlike SWP SA, SWP ST adds all subword results together by activating the array multiplier in an opposite-diagonal pattern, as shown in Fig.…”
Section: G. Subword-Parallel ST (ST)
confidence: 99%
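To illustrate the opposite-diagonal activation, the following sketch emulates an 8x8 array multiplier gated into 2x4-bit ST mode: only the anti-diagonal partial-product blocks (lo x hi and hi x lo) stay active, and since both carry the same bit weight, the array's internal adders merge the two subword products at no extra cost. The emulation and the operand-packing convention are assumptions for illustration, not the exact gating of the cited design:

```python
def array_mult_st(a, b, bits=4):
    """Emulate an array multiplier in Sum-Together (ST) mode.

    Only the anti-diagonal partial-product blocks (a_lo*b_hi and
    a_hi*b_lo) are active; both have weight 2**bits, so the
    multiplier's adder tree sums them into one result.
    """
    mask = (1 << bits) - 1
    a_lo, a_hi = a & mask, (a >> bits) & mask
    b_lo, b_hi = b & mask, (b >> bits) & mask
    return (a_lo * b_hi + a_hi * b_lo) << bits

# Packing x = (x1, x0) into a and y = (y0, y1) into b makes the
# active blocks compute x0*y0 + x1*y1, a two-element dot product:
x0, x1, y0, y1 = 3, 5, 7, 2
a = (x1 << 4) | x0
b = (y0 << 4) | y1
assert array_mult_st(a, b) >> 4 == x0 * y0 + x1 * y1
```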
“…It has also highlighted that fewer scalability levels can be a good trade-off thanks to lower circuit overheads. Future work could propose a more extensive analysis and cover additional configurable or low-precision design techniques [8]–[10].…”
Section: Discussion
confidence: 99%