2021
DOI: 10.1109/jetcas.2021.3129415

A Survey on the Optimization of Neural Network Accelerators for Micro-AI On-Device Inference

Abstract: Deep neural networks (DNNs) are being prototyped for a variety of artificial intelligence (AI) tasks including computer vision, data analytics, robotics, etc. The efficacy of DNNs coincides with the fact that they can provide state-of-the-art inference accuracy for these applications. However, this advantage comes from the high computational complexity of the DNNs in use. Hence, it is becoming increasingly important to scale these DNNs so that they can fit on resource-constrained hardware and edge devices. The …

Cited by 34 publications (11 citation statements)
References 100 publications

“…(2) We use the ESC-10 dataset [59] for audio anomaly detection and deploy a transformer-based model [60]. (3) We use the UCI HAR dataset [42] for motion signal-based human activity recognition and deploy an LSTM-based model. (4) We use the MoCap dataset [61] to train a motion signal-based user identification (12 users) model, using an LSTM-based architecture, and deploy it as the inference workload.…”
Section: Implementation and Configurations (mentioning)
confidence: 99%
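
The configuration above pairs each sensing modality with a compact recurrent model. As a rough illustration of what such a deployment workload can look like, the sketch below builds an LSTM classifier shaped for UCI HAR-style input (windows of 128 time steps over 9 inertial channels, 6 activity classes); the layer sizes are assumptions for illustration, not taken from the cited paper.

```python
# Minimal sketch of an LSTM-based human-activity-recognition classifier,
# assuming UCI HAR-style input: (batch, 128 time steps, 9 inertial channels)
# and 6 activity classes. Hidden size and depth are illustrative choices.
import torch
import torch.nn as nn

class HARLSTM(nn.Module):
    def __init__(self, n_channels=9, hidden=64, n_classes=6):
        super().__init__()
        self.lstm = nn.LSTM(input_size=n_channels, hidden_size=hidden,
                            num_layers=2, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x):             # x: (batch, time, channels)
        out, _ = self.lstm(x)         # out: (batch, time, hidden)
        return self.head(out[:, -1])  # classify from the last time step

logits = HARLSTM()(torch.randn(4, 128, 9))  # -> shape (4, 6)
```
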
“…The increased computing power of mobile devices and the growing demand for real-time sensor data analytics have created a trend of mobile-centric artificial intelligence (AI) [2], [3], [4], [5]. It is estimated that over 80% of enterprise IoT projects will incorporate AI by 2022.…”
Section: Introduction (mentioning)
confidence: 99%
“…Moreover, these time-critical applications should often operate on an edge device whose computing power is limited. To port these applications to an edge device, network compression, which optimizes the neural network to satisfy both the time constraints and the performance requirements, is essential [1, 2, 3, 4, 5]. Network compression consists of network simplification, which simplifies the network architecture, and parameter quantization, which reduces the bit width of parameters below that of floating point.…”
Section: Introduction (mentioning)
confidence: 99%
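
To make the parameter-quantization step described above concrete, here is a minimal sketch of symmetric per-tensor post-training quantization of a float32 weight tensor to int8. The scheme (single per-tensor scale, symmetric clipping at ±127) is a common textbook choice assumed for illustration, not the specific method of the works cited in the quote.

```python
# Toy symmetric per-tensor quantization: store weights as int8 plus one
# float scale, reconstructing approximate float32 values on demand.
import numpy as np

def quantize_int8(w: np.ndarray):
    scale = np.abs(w).max() / 127.0               # map largest |weight| to 127
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale           # approximate reconstruction

w = np.random.randn(256, 64).astype(np.float32)
q, s = quantize_int8(w)
err = np.abs(w - dequantize(q, s)).max()          # bounded by ~scale/2
print(f"int8: {q.nbytes} B vs float32: {w.nbytes} B, max abs error {err:.4f}")
```

The 4x storage reduction is what lets such models fit the memory budgets of edge devices; the price is the bounded rounding error printed above.
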
“…Neural networks require massive computational and memory resources, which make them hard to deploy in such environments, even with lightweight architectures. Thus, optimising DNNs has been a major research topic (Chung and Abdelrahman 2020; Lee et al. 2020; Mazumder et al. 2021) and has made a series of progress, such as accelerating multi-DNN workloads based on heterogeneous dataflow (Kwon et al. 2021), compressing DNNs with vectorized weight quantization (Gong et al. 2021), the delay-aware DNN inference technique (Li et al. 2021), the error compensation for low-voltage DNN accelerators (Ji et al. 2021), pruning of redundancy for DNNs (…; Ahn and Kim 2022; Camci et al. 2022), etc. Although the performance of DNNs has been greatly improved with the support of various strategies, including these optimization techniques as well as high-efficiency libraries like oneDNN, there is still much room for further performance improvement.…”
Section: Introduction (mentioning)
confidence: 99%
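
Among the techniques this passage lists, magnitude pruning is perhaps the simplest redundancy-removal strategy. The sketch below zeroes out the globally smallest-magnitude weights across all layers; the sparsity level is an illustrative assumption, not a value from the cited works.

```python
# Toy global magnitude pruning: rank all weights by |value| and zero out
# the smallest fraction, leaving a sparse network of the same shape.
import numpy as np

def magnitude_prune(weights, sparsity=0.5):
    """Zero the smallest-magnitude `sparsity` fraction of weights globally."""
    all_mags = np.concatenate([np.abs(w).ravel() for w in weights])
    threshold = np.quantile(all_mags, sparsity)   # global magnitude cut-off
    return [np.where(np.abs(w) < threshold, 0.0, w) for w in weights]

layers = [np.random.randn(128, 64), np.random.randn(64, 10)]
pruned = magnitude_prune(layers, sparsity=0.7)
kept = sum((w != 0).sum() for w in pruned) / sum(w.size for w in layers)
print(f"fraction of weights kept: {kept:.2f}")    # ~0.30 at 70% sparsity
```
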