Target detection based on deep learning is developing rapidly, but small target detection remains a challenge. In this paper, we propose a simple and efficient network for small target detection. We improve detection performance on small targets in three ways. First, because contextual information is important for detecting small targets, we use a "dilated module" to expand the receptive field without loss of resolution or coverage. Second, we apply feature fusion across different dilated modules to improve the network's ability to detect small targets. Finally, we use a "passthrough module" to obtain finer-grained information from an earlier layer and combine it with the semantic information from a deeper layer. To improve detection speed, we use 1 × 1 convolutions to reduce the dimension of the network. We composed two small-vehicle datasets, based on the VEDAI dataset and the DOTA dataset respectively, and analyzed the distribution of small targets in each. To evaluate the proposed network, we trained the model on these datasets and compared it with state-of-the-art target detection algorithms. Our approach achieved 80.16% average precision (AP) on the VEDAI dataset and 88.63% AP on the DOTA dataset, at 75.4 frames per second (FPS). The AP of our network is much better than that of tiny YOLO V3 and nearly the same as that of YOLO V3, while its FPS is almost the same as that of tiny YOLO V3.
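The claim that dilation expands the receptive field without loss of resolution can be checked with a small calculation. The sketch below is not the authors' dilated module: the abstract does not give its layer configuration, so the 3 × 3 kernels and the dilation rates 1, 2, 4 are assumptions chosen for illustration, and the helper names `effective_kernel` and `receptive_field` are hypothetical.

```python
def effective_kernel(kernel, dilation):
    """Span covered by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel + (kernel - 1) * (dilation - 1)


def receptive_field(layers):
    """Receptive field of a stack of conv layers.

    layers: list of (kernel, dilation, stride) tuples, input to output.
    """
    rf = 1    # a single output pixel sees one input pixel before any layer
    jump = 1  # distance in input pixels between adjacent feature-map pixels
    for kernel, dilation, stride in layers:
        rf += (effective_kernel(kernel, dilation) - 1) * jump
        jump *= stride
    return rf


# Assumed configurations: three 3x3 layers, all with stride 1,
# so the feature-map resolution is unchanged in both cases.
plain = receptive_field([(3, 1, 1), (3, 1, 1), (3, 1, 1)])
dilated = receptive_field([(3, 1, 1), (3, 2, 1), (3, 4, 1)])

print(plain, dilated)  # 7 15
```

With the same number of layers and parameters, the dilated stack sees a 15 × 15 window instead of 7 × 7, which is the kind of added context the abstract refers to.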
Target detection is one of the most important research directions in computer vision, and a variety of target detection algorithms have been proposed recently. Because targets in a scene vary in size, a detector must be able to detect targets at different scales. To improve detection performance on targets of different sizes, we propose a multi-scale target detection algorithm based on an improved YOLO (You Only Look Once) V3. The main contributions of our work are: (1) a mathematical derivation method based on Intersection over Union (IOU) is proposed to select the number and the aspect ratio dimensions of the candidate anchor boxes for each scale of the improved YOLO V3; (2) to further improve the detection performance of the network, the detection scales of YOLO V3 are extended from 3 to 4, and a feature fusion target detection layer downsampled by 4× is established to detect small targets; (3) to avoid vanishing gradients and enhance feature reuse, the six convolutional layers in front of the output detection layer are transformed into two residual units. Experimental results on the PASCAL VOC and KITTI datasets show that the proposed method performs better than other state-of-the-art target detection algorithms.
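The abstract does not spell out its IOU-based derivation, so the following is only a minimal sketch of the underlying quantity and of one common way it is used with anchor boxes: comparing a ground-truth box with each candidate anchor by width and height alone. The function names and the anchor values are illustrative assumptions, not the paper's method or results.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def wh_iou(wh_a, wh_b):
    """IOU of two boxes that share a corner, using only (width, height)."""
    inter = min(wh_a[0], wh_b[0]) * min(wh_a[1], wh_b[1])
    union = wh_a[0] * wh_a[1] + wh_b[0] * wh_b[1] - inter
    return inter / union


def best_anchor(box_wh, anchors):
    """Index of the anchor whose shape overlaps the box the most."""
    return max(range(len(anchors)), key=lambda i: wh_iou(box_wh, anchors[i]))


# Illustrative anchor shapes in pixels (assumed, not taken from the paper).
anchors = [(10, 13), (16, 30), (33, 23)]
print(best_anchor((12, 25), anchors))  # 1
```

A selection procedure built on this measure would score candidate anchor sets by how well they overlap the ground-truth boxes of a dataset; how the paper turns that into a choice of anchor count and aspect ratios per scale is not stated in the abstract.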