2021
DOI: 10.1109/access.2021.3053259

Power-Efficient Deep Convolutional Neural Network Design Through Zero-Gating PEs and Partial-Sum Reuse Centric Dataflow

Abstract: Convolutional neural networks (CNNs) have shown great success in areas such as object detection and pattern recognition, but at the cost of extremely high computational complexity and significant external memory access, which makes state-of-the-art deep CNNs difficult to implement on resource-constrained portable/wearable devices with limited battery capacity. To address this design challenge, a power-efficient CNN design through zero-gating processing elements (PEs) and partial-sum reuse centric dataflow is…

Cited by 5 publications (4 citation statements)
References 22 publications
“…We kept the PEs as simple as possible to maximize area efficiency. A zero detection and gating circuit, inspired by works like [7], [15], is included before the multiplier in order to avoid unnecessary switching when any input value is zero. Even though our work focuses on dense CNNs rather than sparse networks, this technique is inexpensive and can save power even at low sparsity levels (see Section III).…”
Section: A Systolic Array GEMM Engine
confidence: 99%
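The zero-gating technique quoted above can be illustrated with a minimal Python sketch (all names are hypothetical, not from the paper): when either operand of a multiply-accumulate is zero, the product is necessarily zero, so the PE can gate off its multiplier and pass the partial sum through unchanged, saving switching activity.

```python
def pe_mac(activation, weight, psum, stats):
    """One multiply-accumulate step of a hypothetical zero-gating PE.

    If either operand is zero the product is zero, so the multiplier can
    be gated off and the partial sum reused as-is. `stats` counts skipped
    vs. performed multiplies, purely for illustration.
    """
    if activation == 0 or weight == 0:
        stats["gated"] += 1          # multiplier inputs hold their value
        return psum                  # partial sum passes through unchanged
    stats["active"] += 1
    return psum + activation * weight


# Toy dot product with ReLU-sparse activations (zeros are common after ReLU).
activations = [3, 0, 0, 2, 0, 1]
weights = [1, 4, -2, 5, 7, -1]
stats = {"gated": 0, "active": 0}
psum = 0
for a, w in zip(activations, weights):
    psum = pe_mac(a, w, psum, stats)

print(psum)            # 3*1 + 2*5 + 1*(-1) = 12
print(stats["gated"])  # 3 of 6 multiplies skipped
```

Even at the modest 50% activation sparsity in this toy example, half the multiplies are gated, which is the low-sparsity power-saving effect the citing authors describe.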
“…DNN accelerators have been developed with various design approaches [7][8][9][10][11][12][13][14][15][16][17][18][19][20][21][22][23][24][25][26]. Due to the data-centric property in recent ASIC-based DNN accelerators, in which a significantly large amount of data should be processed and transferred in and out of the accelerator chips, memory plays an important role.…”
Section: DNN Accelerators
confidence: 99%
“…Due to the data-centric property in recent ASIC-based DNN accelerators, in which a significantly large amount of data should be processed and transferred in and out of the accelerator chips, memory plays an important role. The typical on-chip global memory architectures can be simply classified into two types, i.e., those which use a unified buffer, such as those in [13,16,26], and those which use separate buffers for input feature maps, filter weights, and partial sums, such as those in [15,17]. Using a multi-bank-based unified global buffer can flexibly change the volume of the on-chip ifmaps, weights, and psums in different layers, while using separated buffers can transact different types of data in parallel.…”
Section: DNN Accelerators
confidence: 99%
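The trade-off described in the citation above can be sketched in a few lines of Python (class names and capacities are illustrative assumptions, not from any of the cited designs): a multi-bank unified buffer can re-partition its capacity among ifmaps, weights, and psums per layer, while separate buffers have fixed per-type capacities but allow the three data types to be accessed in parallel.

```python
class UnifiedBuffer:
    """One multi-bank on-chip buffer; capacity is re-partitioned per layer."""

    def __init__(self, total_kb):
        self.total_kb = total_kb

    def fits(self, ifmap_kb, weight_kb, psum_kb):
        # Flexible: any split is acceptable as long as it fits the total.
        return ifmap_kb + weight_kb + psum_kb <= self.total_kb


class SeparateBuffers:
    """Dedicated per-type buffers; sizes are fixed at design time, but each
    data type has its own port and can be transacted in parallel."""

    def __init__(self, ifmap_kb, weight_kb, psum_kb):
        self.caps = {"ifmap": ifmap_kb, "weight": weight_kb, "psum": psum_kb}

    def fits(self, ifmap_kb, weight_kb, psum_kb):
        need = {"ifmap": ifmap_kb, "weight": weight_kb, "psum": psum_kb}
        return all(need[k] <= self.caps[k] for k in need)


unified = UnifiedBuffer(total_kb=192)
separate = SeparateBuffers(ifmap_kb=64, weight_kb=64, psum_kb=64)

# A weight-heavy layer: the unified buffer absorbs the skewed working set,
# while the fixed separate buffers overflow on the weight partition.
print(unified.fits(32, 128, 32))   # True: 32 + 128 + 32 <= 192
print(separate.fits(32, 128, 32))  # False: 128 KB of weights > 64 KB bank
```

This is exactly the flexibility-versus-parallelism distinction the citing authors draw between unified-buffer designs [13, 16, 26] and separate-buffer designs [15, 17].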