2021
DOI: 10.3390/s21061955

Towards an Efficient CNN Inference Architecture Enabling In-Sensor Processing

Abstract: The astounding development of optical sensing imaging technology, coupled with the impressive improvements in machine learning algorithms, has increased our ability to understand and extract information from scenic events. In most cases, convolutional neural networks (CNNs) are adopted to infer knowledge due to their surprising success in automation, surveillance, and many other application domains. However, the convolution operations’ overwhelming computation demand has somewhat limited their use in rem…

Cited by 10 publications (6 citation statements)
References 46 publications
“…To bring any of our CNN-based systems to the real world, we will embed it into a programmable logic device. Although it is well-known that CNN-based algorithms are computationally intense and require vast computational resources and dynamic power for computation of convolutional operations, in recent years, some programmable devices have been specifically developed to run these kinds of algorithms in real-time [64]. In this respect, some researchers have already successfully tested a variety of hardware implementation methods for different CNN-based structures, mostly based on field-programmable gate arrays (FPGA) architectures [65,66].…”
Section: Discussion (mentioning)
confidence: 99%
“…This behaviour is common for all the platforms and algorithms, except for the ConvDirect on the Xavier, where the scalability is limited by the improper use of cache memories in a multithreaded scenario. Leaving apart this outlying result, the algorithm … Focusing on energy efficiency, we observe different trends depending on the selected NVP model, number of threads, and platform. The first observation is that the best energy efficiency is not always obtained by increasing the number of threads.…”
Section: Performance and Energy Efficiency Scalability (mentioning)
confidence: 94%
“…ARM Cortex-M CPUs) or low-power processors (e.g. ARM Cortex-A CPUs), the optimisation of this operator is strongly focused on reducing its energy consumption [5].…”
Section: Introduction (mentioning)
confidence: 99%
“…The errors of each set of training data are summed up. In stochastic gradient methods [70][71][72][73][74][75][76][77], the cost and sum of errors is used to update current model parameters to reduce the distance from the optimal point in the parameter space. The equation of binary cross entropy is shown as follows:…”
Section: Expansion Joint Device Recognition (mentioning)
confidence: 99%