2022
DOI: 10.1155/2022/8039281

Optimizing the Deep Neural Networks by Layer-Wise Refined Pruning and the Acceleration on FPGA

Abstract: To accelerate the practical applications of artificial intelligence, this paper proposes a highly efficient layer-wise refined pruning method for deep neural networks at the software level and accelerates the inference process at the hardware level on a field-programmable gate array (FPGA). The refined pruning operation is based on the channel-wise importance indexes of each layer and the layer-wise input sparsity of the convolutional layers. The method utilizes the characteristics of the native networks without int…
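The abstract names two signals that drive the pruning decision: channel-wise importance indexes per layer and the layer-wise input sparsity of convolutional layers. The exact importance index is cut off in the visible text, so the following is only a minimal sketch of structured channel pruning, assuming the index is the L1 norm of each output channel's filter weights; the per-layer keep ratio stands in for whatever the paper derives from input sparsity.

```python
import numpy as np

def channel_importance(weights: np.ndarray) -> np.ndarray:
    """Assumed importance index: L1 norm of each output channel's filters.
    weights has shape (C_out, C_in, kH, kW)."""
    return np.abs(weights).reshape(weights.shape[0], -1).sum(axis=1)

def prune_layer(weights: np.ndarray, keep_ratio: float) -> np.ndarray:
    """Zero out the least-important output channels of one conv layer
    (structured, channel-wise pruning)."""
    scores = channel_importance(weights)
    n_keep = max(1, int(round(keep_ratio * len(scores))))
    keep = np.argsort(scores)[-n_keep:]  # indexes of surviving channels
    mask = np.zeros(len(scores), dtype=bool)
    mask[keep] = True
    pruned = weights.copy()
    pruned[~mask] = 0.0
    return pruned

# Hypothetical conv layer; in the paper the keep ratio would vary per layer.
rng = np.random.default_rng(0)
conv_w = rng.normal(size=(64, 32, 3, 3))
pruned_w = prune_layer(conv_w, keep_ratio=0.3)
zeroed = int((np.abs(pruned_w).sum(axis=(1, 2, 3)) == 0).sum())
print(f"zeroed channels: {zeroed} / {conv_w.shape[0]}")
```

Channels are zeroed rather than physically removed to keep the sketch short; a real implementation would also drop the matching input channels of the following layer to realize the FLOP savings.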

Cited by 20 publications (7 citation statements). References 38 publications.

“…In terms of the dataset ImageNet100 (Li et al., 2022), it is a subset of the ImageNet Large Scale Visual Recognition Challenge (ILSVRC 2012) dataset for evaluating the performance of DNNs, comprising 100 classes with 129,026 items randomly selected from ILSVRC 2012. For the experiments in this research, ImageNet100 is divided into three parts: the training set, validation set, and test set, in a proportion of 16:4:5.…”
Section: Methods
confidence: 99%
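The 16:4:5 proportion quoted above fixes the split sizes only up to rounding; a quick worked check (the handling of the one-item rounding remainder is an assumption):

```python
# 16:4:5 split of the 129,026 ImageNet100 items quoted above.
total = 129_026
parts = {"train": 16, "val": 4, "test": 5}
denom = sum(parts.values())  # 25
sizes = {k: total * v // denom for k, v in parts.items()}
sizes["test"] += total - sum(sizes.values())  # assign rounding remainder to test
print(sizes)  # {'train': 82576, 'val': 20644, 'test': 25806}
```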
“…In many real-world applications, object detection must be performed in a timely and power-saving manner under computational resource constraints. Many other vision tasks have built lightweight models using methods such as weight quantization [16], [17], network compression [18], and computationally efficient architecture design [19], [20], [21]. For some vision tasks, lightweight networks aim to achieve the best tradeoff between accuracy and efficiency, showing their superiority by reducing model size and FLOPs with little performance drop [22].…”
Section: B. Lightweight Object Detection Model
confidence: 99%
“…References [117][118][119] all adopted mixed-precision quantization, applying different quantization strategies according to different data accuracy requirements to achieve lower latency and higher inference accuracy. In [120], layer-wise refined pruning was used to optimize VGG13BN and ResNet101, achieving less than 1% accuracy loss and a large speedup while pruning away more than 70% of the parameters and floating-point operations. Some scholars combined pruning and quantization: they compressed the model with a hybrid pruning method, reduced the data bit width to 8 bits through quantization, and designed an FPGA accelerator that makes the CNN more flexible, more configurable, and higher performing.…”
Section: The CNN Accelerator Based on FPGA
confidence: 99%
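The 8-bit quantization step described in the last statement can be illustrated with a minimal symmetric, per-tensor quantization sketch; the cited accelerators' actual schemes (per-channel scales, mixed precision, calibration) are not specified here, so every detail below is an assumption.

```python
import numpy as np

def quantize_int8(x: np.ndarray):
    """Symmetric per-tensor quantization of float32 data to int8."""
    scale = float(np.max(np.abs(x))) / 127.0
    if scale == 0.0:  # all-zero tensor: any scale works
        scale = 1.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

# Hypothetical weight tensor; real accelerators quantize activations too.
rng = np.random.default_rng(0)
w = rng.normal(scale=0.1, size=(64, 32)).astype(np.float32)
q, s = quantize_int8(w)
err = float(np.abs(dequantize(q, s) - w).max())
print(f"max abs quantization error: {err:.5f}")
```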