2021
DOI: 10.3390/fi13120307

An Efficient Deep Convolutional Neural Network Approach for Object Detection and Recognition Using a Multi-Scale Anchor Box in Real-Time

Abstract: Deep learning is a relatively new branch of machine learning in which computers are taught to recognize patterns in massive volumes of data. It primarily describes learning at various levels of representation, which aids in understanding data that includes text, voice, and visuals. Convolutional neural networks have been used to solve challenges in computer vision, including object identification, image classification, semantic segmentation and a lot more. Object detection in videos involves confirming the pre…

Citations: cited by 12 publications (6 citation statements)
References: 18 publications
“…The FLOPS is related to the device’s energy consumption (the higher the FLOPS, the higher the energy consumption). The floating-point operation numbers are computed as follows [63]: For a convolution layer with n filters of size applied to feature maps (W: width; H: height; C: channels), with P: number of parameters: For a max-pooling layer or an upsampling layer with a window of size on feature maps (W: width; H: height; C: channels): …”
Section: Results (mentioning)
confidence: 99%
“…The FLOPS is related to the device's energy consumption (the higher the FLOPS, the higher the energy consumption). The floating-point operation numbers are computed as follows [63]:…”
Section: Lightweight Measures (mentioning)
confidence: 99%
“…Object detection models could fail to detect when facing the varying sizes of the objects that have low resolution. RPN in the Mask R-CNN uses multi-scale anchor box to enhance the detection accuracy by extracting features at the multiple convolution levels of the object [37, 38]. The ROI head generated mask predictions, classifications, and bounding box predictions.…”
Section: Materials and Methods (mentioning)
confidence: 99%
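For readers unfamiliar with the term, the multi-scale anchor boxes mentioned in the statement above can be illustrated with a minimal sketch. It is not taken from the cited paper or from Mask R-CNN; the function name `make_anchors`, the scale values, and the aspect ratios are all assumptions chosen for illustration. It shows the common idea of placing several candidate boxes of different sizes and shapes around one location, so that objects of varying size have a reasonably matching box.

```python
import math

def make_anchors(cx, cy, scales=(32, 64, 128), ratios=(0.5, 1.0, 2.0)):
    """Return anchors as (x_min, y_min, x_max, y_max) centred on (cx, cy).

    Hypothetical helper: scales are box side lengths in pixels for a square
    anchor; ratios are width / height. Both defaults are illustrative only.
    """
    anchors = []
    for s in scales:
        for r in ratios:
            # Keep the area close to s * s while changing the shape.
            w = s * math.sqrt(r)
            h = s / math.sqrt(r)
            anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return anchors

# One location yields len(scales) * len(ratios) candidate boxes.
boxes = make_anchors(100.0, 100.0)
```

In a region proposal network this generation step would typically be repeated at every feature-map location, and each anchor would then be scored and refined; that part is omitted here.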
“…Multi-scale feature fusion network is widely used in computer vision tasks such as target detection and image classification. Varadarajan et al. (2021) designed an object detection network composed of 22 convolution layers.…”
Section: Introduction (mentioning)
confidence: 99%
“…Multi-scale feature fusion network is widely used in computer vision tasks such as target detection and image classification. Varadarajan et al (2021) designed an object detection network composed of 22 convolution layers. By using multi-scale feature fusion technology, the network can well identify objects of different sizes and shapes from images.…”
Section: Introduction (mentioning)
confidence: 99%