Remote sensing images exhibit large variations in spatial resolution, contain small and densely packed objects, and show objects whose direction of motion cannot be determined, all of which make object detection in such images very challenging. In this paper, we propose a single-stage detection network based on YOLOv5. The method adds an MS Transformer module at the end of the original feature extraction network to strengthen feature extraction, and integrates the Convolutional Block Attention Module (CBAM) to locate attention regions in dense scenes. In addition, the YOLOv5 detection network is adapted to rotated-box detection by incorporating a rotation angle into both the prior (anchor) box design and the bounding box regression formulation. Finally, a weighted combination of two hard sample mining methods is used to improve the focal loss function and thereby the detection accuracy. On the DOTA dataset, the improved algorithm achieves an average accuracy of 77.01%, which is higher than that of the previous detection algorithms and 8.83% higher than that of YOLOv5. The experimental results show that the algorithm has higher detection accuracy than other algorithms in remote sensing scenes.
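For orientation, the standard binary focal loss that this abstract takes as its starting point can be sketched as follows. This is only the baseline formulation, FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t); the weighted combination of two hard sample mining methods described above is not specified in the abstract and is not reproduced here. The function name and the default values alpha = 0.25 and gamma = 2.0 are assumptions chosen for illustration.

```python
import math


def focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-12):
    """Standard binary focal loss for a single prediction (illustrative sketch).

    p: predicted probability of the positive class, in [0, 1]
    y: ground-truth label, 1 (positive) or 0 (negative)
    """
    # p_t is the probability the model assigns to the true class.
    p_t = p if y == 1 else 1.0 - p
    # alpha_t balances positive and negative examples.
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # Clamp to avoid log(0).
    p_t = min(max(p_t, eps), 1.0)
    # (1 - p_t)^gamma shrinks the loss of well-classified (easy) examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


easy = focal_loss(0.95, 1)  # confident and correct: small loss
hard = focal_loss(0.10, 1)  # confident and wrong: much larger loss
plain_ce = focal_loss(0.5, 1, alpha=1.0, gamma=0.0)  # reduces to cross-entropy
```

With gamma = 0 and alpha = 1 the expression reduces to ordinary cross-entropy, which makes the down-weighting of easy examples easy to check in isolation.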
Infrastructure along highways comprises various facilities and equipment, such as bridges, culverts, traffic signs, and guardrails. New technologies such as artificial intelligence, big data, and the Internet of Things are driving the digital transformation of highway infrastructure towards the goal of intelligent roads. Drones have emerged as a promising application of intelligent technology in this field: they can help achieve fast and precise detection, classification, and localization of infrastructure along highways, which can significantly improve efficiency and ease the burden on road management staff. Because roadside infrastructure is exposed outdoors for long periods, it is easily damaged and obscured by objects such as sand and rocks. Moreover, the high resolution of images taken by Unmanned Aerial Vehicles (UAVs), the variable shooting angles, the complex backgrounds, and the high percentage of small targets mean that existing target detection models, used directly, cannot meet the requirements of practical industrial applications. In addition, large and comprehensive UAV image datasets of highway infrastructure are lacking. To address this, a multi-class infrastructure detection model combining multi-scale feature fusion and an attention mechanism is proposed. The backbone network of the CenterNet model is replaced with ResNet50, and an improved feature fusion part enables the model to generate fine-grained features that improve the detection of small targets; furthermore, an attention mechanism is added so that the network focuses more on valuable regions with higher attention weights. As there is no publicly available dataset of infrastructure along highways captured by UAVs, we filter and manually annotate a laboratory-captured highway dataset to generate a highway infrastructure dataset. The experimental results show that the model has a mean Average Precision (mAP) of 86.7%, an improvement of 3.1 percentage points over the baseline model, and the new model performs significantly better than other detection models overall.
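The mAP figure reported above is the mean of per-class Average Precision values. A minimal sketch of one common way to compute it is given below, using all-point interpolation of the precision-recall curve. The abstract does not state which protocol (IoU threshold, interpolation scheme) the authors used, so this choice is an assumption, as are the function names and the toy inputs. Matching detections to ground-truth boxes is assumed to have been done already.

```python
def average_precision(scores, is_true_positive, num_ground_truth):
    """Area under the precision-recall curve for one class (all-point interpolation)."""
    if num_ground_truth <= 0:
        raise ValueError("num_ground_truth must be positive")
    # Rank detections from most to least confident.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    tp = fp = 0
    recalls, precisions = [], []
    for i in order:
        if is_true_positive[i]:
            tp += 1
        else:
            fp += 1
        recalls.append(tp / num_ground_truth)
        precisions.append(tp / (tp + fp))
    # Replace each precision by the maximum precision at any higher recall.
    for k in range(len(precisions) - 2, -1, -1):
        precisions[k] = max(precisions[k], precisions[k + 1])
    # Sum rectangle areas over the recall steps.
    ap, prev_recall = 0.0, 0.0
    for r, p in zip(recalls, precisions):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def mean_average_precision(per_class_ap):
    """mAP is the unweighted mean of the per-class AP values."""
    return sum(per_class_ap) / len(per_class_ap)


# Toy example: three detections, two ground-truth objects.
ap_toy = average_precision([0.9, 0.8, 0.7], [True, False, True], 2)
map_toy = mean_average_precision([ap_toy, 1.0])
```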
Aerial images from UAVs are usually taken from a top-down perspective, with large changes in spatial resolution and small targets to be detected, and detection methods designed for natural scenes perform poorly on the arbitrarily oriented objects in remote sensing images, which makes them difficult to apply to road technical condition assessment. To address this, this paper proposes a lightweight network architecture algorithm based on MobileNetv3-YOLOv5s (MR-YOLO). First, the MobileNetv3 structure is introduced to replace part of the YOLOv5s backbone for feature extraction, reducing the model size and computation and improving detection speed; meanwhile, the Cross Stage Partial Network (CSPNet) is introduced to maintain accuracy while reducing computation. The focal loss function is modified to improve localization accuracy while speeding up bounding box regression. Finally, a rotation angle is added to the prior (anchor) box design and the bounding box regression formula of YOLOv5, making the network suitable for road technical condition assessment. Extensive comparison and ablation experiments on the Xinjiang Altay highway dataset verify the feasibility of the algorithm: MR-YOLO reaches an accuracy of 91.1%, an average accuracy of 92.4%, and a detection speed of 96.8 FPS. Compared with YOLOv5s, both the P value (precision) and the mAP of the proposed algorithm are improved. The proposed algorithm thus improves detection accuracy and detection speed while greatly reducing the number of model parameters and the amount of computation.
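To make the idea of a rotated bounding box concrete, the sketch below converts a box given as (center, width, height, angle) into its four corner points. The abstract does not give the authors' angle convention or regression formula, so the parameterization here (angle in radians, counter-clockwise in a y-up coordinate system) and the function name are assumptions for illustration only.

```python
import math


def rotated_box_corners(cx, cy, w, h, theta):
    """Return the four corners of a w-by-h box centered at (cx, cy), rotated by theta.

    theta is in radians, counter-clockwise for a y-up axis. In image
    coordinates (y pointing down) the same formula appears clockwise.
    """
    c, s = math.cos(theta), math.sin(theta)
    # Corner offsets of the axis-aligned box, relative to its center.
    half = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    # Apply the 2D rotation matrix to each offset, then translate to the center.
    return [(cx + dx * c - dy * s, cy + dx * s + dy * c) for dx, dy in half]


def polygon_area(points):
    """Shoelace formula; used here to check that rotation preserves area."""
    n = len(points)
    twice_area = sum(
        points[i][0] * points[(i + 1) % n][1] - points[(i + 1) % n][0] * points[i][1]
        for i in range(n)
    )
    return abs(twice_area) / 2.0


axis_aligned = rotated_box_corners(0.0, 0.0, 2.0, 1.0, 0.0)
quarter_turn = rotated_box_corners(0.0, 0.0, 2.0, 1.0, math.pi / 2)
tilted_area = polygon_area(rotated_box_corners(5.0, 3.0, 2.0, 1.0, 0.3))
```

A detector that regresses the fifth parameter theta alongside the usual center and size can describe elongated, arbitrarily oriented objects with far less background inside the box than an axis-aligned box would include.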
Point cloud processing based on deep learning is developing rapidly. However, previous networks failed to extract inter-feature interactions and geometric information simultaneously. In this paper, we propose a novel point cloud analysis module, CGR-block, which learns point cloud features with two units: a correlated feature extractor and a geometric feature fusion unit. CGR-block provides an efficient way to extract geometric pattern tokens and to enable deep information interaction among point features on unordered 3D point clouds. In addition, we introduce a residual mapping branch inside each CGR-block to further improve network performance. We build our classification and segmentation networks with CGR-block as the basic module to extract features hierarchically from the original point cloud. Our network achieves overall accuracies of 94.1% and 83.5% on the ModelNet40 and ScanObjectNN benchmarks, respectively, and an instance mIoU of 85.5% on the ShapeNet-Part benchmark, demonstrating the superiority of our method.
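The instance mIoU reported for part segmentation averages a per-shape IoU over all test shapes. A minimal sketch of that metric follows. It assumes the widely used convention that a part absent from both the prediction and the ground truth of a shape counts as IoU 1; the abstract does not confirm the exact evaluation protocol, and the function names and toy labels are illustrative.

```python
def shape_iou(pred, target, part_ids):
    """Mean IoU over the parts of one shape.

    pred, target: per-point part labels (same length)
    part_ids: the part labels that belong to this shape's category
    """
    ious = []
    for part in part_ids:
        inter = sum(1 for p, t in zip(pred, target) if p == part and t == part)
        union = sum(1 for p, t in zip(pred, target) if p == part or t == part)
        # Assumed convention: a part missing from both pred and target scores 1.
        ious.append(1.0 if union == 0 else inter / union)
    return sum(ious) / len(ious)


def instance_miou(shapes):
    """Average the per-shape IoU over all shapes (instances)."""
    return sum(shape_iou(pred, target, parts) for pred, target, parts in shapes) / len(shapes)


# Toy example: two shapes with four points each, parts labeled 0 and 1.
shape_a = ([0, 0, 1, 1], [0, 1, 1, 1], [0, 1])  # one point mislabeled
shape_b = ([0, 0, 1, 1], [0, 0, 1, 1], [0, 1])  # perfect prediction
iou_a = shape_iou(*shape_a)
miou = instance_miou([shape_a, shape_b])
```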