2022 IEEE International Conference on Image Processing (ICIP) 2022
DOI: 10.1109/icip46576.2022.9897752

Efficient Inference Of Image-Based Neural Network Models In Reconfigurable Systems With Pruning And Quantization

Abstract: Neural networks (NN) for image processing in embedded systems expose two conflicting requirements: computing power needs that grow as models become more complex, and a constrained resource budget. To alleviate this problem, model compression based on quantization and pruning techniques is common. The derived models then need to fit on reconfigurable systems such as FPGAs for the embedded system to work properly. In this paper, we present HLSinf, an open source framework for the development of custom NN ac…
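The compression pipeline the abstract names (pruning followed by quantization) can be illustrated with a short, framework-agnostic sketch. The snippet below uses PyTorch as a stand-in; the layer sizes, the 50% sparsity target, and int8 dynamic quantization are illustrative assumptions, not the configuration used with HLSinf.

```python
# Minimal sketch of pruning + quantization for model compression.
# Layer shapes, sparsity amount, and quantization scheme are assumed for illustration.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(16, 10),
)

# Pruning: zero out the 50% smallest-magnitude weights of the convolution.
prune.l1_unstructured(model[0], name="weight", amount=0.5)
prune.remove(model[0], "weight")  # make the induced sparsity permanent

# Quantization: dynamic int8 quantization of the linear layer
# (convolutions would need the static post-training quantization flow instead).
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

print(quantized(torch.randn(1, 3, 32, 32)).shape)  # torch.Size([1, 10])
```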

Cited by 3 publications (2 citation statements); references 17 publications.
“…Inference time has been measured from the host. To carry out the tests we used the HLSinf [20] accelerator configured to use FP32 data type. Although it is advisable to use fixed point data types in FPGAs, the accelerator achieves better performance with FP32.…”
Section: Design Evaluation (mentioning)
confidence: 99%
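The citing authors' remark about FP32 versus fixed point can be made concrete with a small numeric sketch: fixed-point arithmetic is usually preferred on FPGAs because it maps to cheaper DSP/LUT resources, but it introduces a quantization error that FP32 avoids. The bit widths below (16 bits total, 8 fractional, i.e. a Q8.8 format) are assumptions for illustration and are not HLSinf's actual data-type configuration.

```python
# Illustration of fixed-point quantization error versus FP32.
# The Q8.8 format (8 integer bits, 8 fractional bits) is an assumed example,
# not the data type actually used by the HLSinf accelerator.
import numpy as np

rng = np.random.default_rng(0)
weights_fp32 = rng.standard_normal(1000).astype(np.float32)

FRAC_BITS = 8                 # assumed number of fractional bits
scale = 2 ** FRAC_BITS

# Round to the nearest representable Q8.8 value and clamp to the int16 range.
fixed = np.clip(np.round(weights_fp32 * scale), -32768, 32767).astype(np.int16)
weights_fixed = fixed.astype(np.float32) / scale

max_err = np.max(np.abs(weights_fp32 - weights_fixed))
print(f"max quantization error: {max_err:.6f}")  # bounded by half an LSB, ~1/512
```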
“…The AI hardware accelerator, AI-Inference library, and an acceleration runtime method created in this work comprise the SELENE Accelerator Framework (SAF). It works as follows: first, the European Distributed Deep Learning (EDDL) [23] inference library initializes the HLSInf [24] HW accelerator using the generated JSON configuration file. Next, the inference input data (i.e., the data to be processed) is loaded in the main memory shared with the accelerator.…”
(mentioning)
confidence: 99%
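As a purely illustrative outline of the initialization-then-inference flow this citation describes (a JSON configuration file initializes the accelerator, the input is placed in memory shared with it, and inference is launched), the sketch below uses invented helper names (init_accelerator, load_input_to_shared_memory, run_inference) and file names; it is not the EDDL or HLSinf API.

```python
# Hypothetical sketch of the described SAF flow. Function and file names are
# invented for illustration and do not correspond to the real EDDL / HLSinf interfaces.
import json

def init_accelerator(config: dict) -> None:
    """Placeholder for initializing the accelerator from a parsed JSON config."""
    print(f"initializing accelerator with {len(config)} configuration keys")

def load_input_to_shared_memory(path: str) -> bytes:
    """Placeholder for copying inference inputs into accelerator-visible memory."""
    with open(path, "rb") as f:
        return f.read()

def run_inference(data: bytes) -> None:
    """Placeholder for launching inference on the accelerator."""
    print(f"running inference on {len(data)} bytes of input")

if __name__ == "__main__":
    with open("accelerator_config.json") as f:   # hypothetical config file
        config = json.load(f)
    init_accelerator(config)
    data = load_input_to_shared_memory("input.bin")  # hypothetical input file
    run_inference(data)
```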