Due to the advantages of small size, lightweight, and simple operation, the unmanned aerial vehicle (UAV) has been widely used, and it is also becoming increasingly convenient to capture high-resolution aerial images in a variety of environments. Existing target-detection methods for UAV aerial images lack outstanding performance in the face of challenges such as small targets, dense arrangement, sparse distribution, and a complex background. In response to the above problems, some improvements on the basis of YOLOv5l have been made by us. Specifically, three feature-extraction modules are proposed, using asymmetric convolutions. They are named the Asymmetric ResNet (ASResNet) module, Asymmetric Enhanced Feature Extraction (AEFE) module, and Asymmetric Res2Net (ASRes2Net) module, respectively. According to the respective characteristics of the above three modules, the residual blocks in different positions in the backbone of YOLOv5 were replaced accordingly. An Improved Efficient Channel Attention (IECA) module was added after Focus, and Group Spatial Pyramid Pooling (GSPP) was used to replace the Spatial Pyramid Pooling (SPP) module. In addition, the K-Means++ algorithm was used to obtain more accurate anchor boxes, and the new EIOU-NMS method was used to improve the postprocessing ability of the model. Finally, ablation experiments, comparative experiments, and visualization of results were performed on five datasets, namely CIFAR-10, PASCAL VOC, VEDAI, VisDrone 2019, and Forklift. The effectiveness of the improved strategies and the superiority of the proposed method (YOLO-UAV) were verified. Compared with YOLOv5l, the backbone of the proposed method increased the top-one accuracy of the classification task by 7.20% on the CIFAR-10 dataset. The mean average precision (mAP) of the proposed method on the four object-detection datasets was improved by 5.39%, 5.79%, 4.46%, and 8.90%, respectively.
Pavement disease detection is an important task for ensuring road safety. Manual visual detection requires a significant amount of time and effort. Therefore, an automated road disease identification technique is required to guarantee that city tasks are performed. However, due to the irregular shape and large-scale differences in road diseases, as well as the imbalance between the foreground and background, the task is challenging. Because of this, we created the deep convolution neural network—DASNet, which can be used to identify road diseases automatically. The network employs deformable convolution instead of regular convolution as the feature pyramid’s input, adds the same supervision signal to the multi-scale features before feature fusion, decreases the semantic difference, extracts context information by residual feature enhancement, and reduces the information loss of the pyramid’s top-level feature map. Considering the unique shape of road diseases, imbalance problems between the foreground and background are common, therefore, we introduce the sample weighted loss function. In order to prove the superiority and effectiveness of this method, it is compared to the latest method. A large number of experiments show that this method is superior in accuracy to other methods, specifically, under the COCO evaluation metric, compared with the Faster RCNN baseline, the proposed method obtains a 41.1 mAP and 3.4 AP improvement.
Target detection based on unmanned aerial vehicle (UAV) images has increasingly become a hot topic with the rapid development of UAVs and related technologies. UAV aerial images often feature a large number of small targets and complex backgrounds due to the UAV’s flying height and shooting angle of view. These characteristics make the advanced YOLOv4 detection method lack outstanding performance in UAV aerial images. In light of the aforementioned problems, this study adjusted YOLOv4 to the image’s characteristics, making the improved method more suitable for target detection in UAV aerial images. Specifically, according to the characteristics of the activation function, different activation functions were used in the shallow network and the deep network, respectively. The loss for the bounding box regression was computed using the EIOU loss function. Improved Efficient Channel Attention (IECA) modules were added to the backbone. At the neck, the Spatial Pyramid Pooling (SPP) module was replaced with a pyramid pooling module. At the end of the model, Adaptive Spatial Feature Fusion (ASFF) modules were added. In addition, a dataset of forklifts based on UAV aerial imagery was also established. On the PASCAL VOC, VEDAI, and forklift datasets, we ran a series of experiments. The experimental results reveal that the proposed method (YOLO-DRONE, YOLOD) has better detection performance than YOLOv4 for the aforementioned three datasets, with the mean average precision (mAP) being improved by 3.06%, 3.75%, and 1.42%, respectively.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.