Purpose
The purpose of this paper is to apply deep learning, currently the most popular approach, to detect vehicles in urban areas of MiniSAR images and thereby provide a reliable means of ground monitoring.
Design/methodology/approach
An accurate detector called the rotation region-based convolutional neural network (CNN) with multilayer fusion and multidimensional attention (M2R-Net) is proposed in this paper. Specifically, M2R-Net adopts a multilayer feature fusion strategy to extract feature maps that carry richer information. Next, the authors implement a multidimensional attention network to highlight target areas. Furthermore, a novel balanced sampling strategy for hard and easy positive-negative samples and a global balanced loss function are applied to deal with spatial imbalance and objective imbalance. Finally, rotation anchors are used to predict and calibrate the minimum circumscribed rectangle of vehicles.
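To make the multilayer feature fusion step concrete, the following is a minimal, illustrative PyTorch sketch of top-down fusion of backbone feature maps. It is not the authors' released implementation; the layer layout, channel widths (256-2048) and the upsample-and-add fusion order are assumptions.

```python
# Minimal sketch (assumed, not the authors' code) of multilayer feature fusion:
# lateral 1x1 convolutions project each backbone level to a common channel width,
# then deeper maps are upsampled and added to shallower ones so every fused level
# mixes semantic and spatial detail.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultilayerFusion(nn.Module):
    def __init__(self, in_channels=(256, 512, 1024, 2048), out_channels=256):
        super().__init__()
        # 1x1 lateral convolutions align every backbone level to out_channels.
        self.lateral = nn.ModuleList(
            nn.Conv2d(c, out_channels, kernel_size=1) for c in in_channels
        )
        # 3x3 convolutions smooth the fused maps.
        self.smooth = nn.ModuleList(
            nn.Conv2d(out_channels, out_channels, kernel_size=3, padding=1)
            for _ in in_channels
        )

    def forward(self, feats):
        # feats: backbone maps ordered shallow -> deep (decreasing resolution).
        laterals = [conv(f) for conv, f in zip(self.lateral, feats)]
        # Top-down pass: upsample the deeper map and add it to the next shallower one.
        for i in range(len(laterals) - 1, 0, -1):
            laterals[i - 1] = laterals[i - 1] + F.interpolate(
                laterals[i], size=laterals[i - 1].shape[-2:], mode="nearest"
            )
        return [conv(x) for conv, x in zip(self.smooth, laterals)]

if __name__ == "__main__":
    # Dummy feature pyramid for a 512x512 input (strides 4, 8, 16, 32).
    feats = [torch.randn(1, c, 512 // s, 512 // s)
             for c, s in zip((256, 512, 1024, 2048), (4, 8, 16, 32))]
    fused = MultilayerFusion()(feats)
    print([f.shape for f in fused])
```

In this kind of scheme, every fused level keeps the spatial resolution of its shallow input while inheriting context from deeper levels, which is what allows small vehicles to be localized against cluttered urban backgrounds.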
Findings
Many groups of experiments verify the validity and generality of the proposed model. More importantly, comparisons with SSD, LRTDet, RFCN, DFPN, CMF-RCNN, R3Det and SCRDet demonstrate that M2R-Net achieves state-of-the-art detection performance.
Research limitations/implications
Progress in the field of MiniSAR applications has been slow due to strong speckle noise, phase errors, complex environments and a low signal-to-noise ratio. In addition, four kinds of imbalance in CNN-based object detection, i.e. spatial imbalance, scale imbalance, class imbalance and objective imbalance, greatly inhibit the optimization of detection performance.
Originality/value
This research can not only enrich the means of daily traffic monitoring but also be used for enemy intelligence reconnaissance in wartime.
Drone detection is a significant research topic due to the potential security threats posed by the misuse of drones in both civilian and military domains. However, traditional drone detection methods are challenged by drastic scale changes and the ambiguity of complex scenes during drone flight, making it difficult to detect small target drones quickly and efficiently. We propose an information-enhanced model based on improved YOLOv5 (TGC-YOLOv5) for fast and accurate detection of small target drones in complex environments. The main contributions of this paper are as follows. First, a Transformer encoder module is incorporated into YOLOv5 to augment attention toward regions of interest. Second, the Global Attention Mechanism (GAM) is adopted to mitigate information diffusion among distinct layers and amplify global cross-dimensional interaction features. Finally, the Coordinate Attention mechanism (CA) is incorporated into the bottleneck part of C3, enhancing the extraction of local information for small targets. To enhance and verify the robustness and generalization of the model, a small target drone dataset (SUAV-DATA) is constructed covering all-weather, multi-scenario and complex environments. The experimental results show that on the SUAV-DATA dataset, the AP of TGC-YOLOv5 reaches 0.848, 2.5% higher than the original YOLOv5, and its recall reaches 0.823, a 3.8% improvement over the original YOLOv5. The robustness of the proposed model is also verified on the open-source Real-World image dataset, where it achieves the best accuracy on images degraded by light, fog, stain and saturation pollution. The findings and methods of this paper are of significant value for improving the efficiency and precision of drone detection.
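For readers unfamiliar with the Coordinate Attention (CA) block mentioned above, the following PyTorch sketch illustrates the general idea of direction-aware pooling followed by reweighting. It is not the TGC-YOLOv5 code; the reduction ratio, activation choice and layer layout are assumptions.

```python
# Minimal sketch (assumed) of a Coordinate Attention block: the feature map is
# pooled separately along height and width, the two pooled descriptors share a
# 1x1 bottleneck, and the resulting per-direction attention maps reweight the input.
import torch
import torch.nn as nn

class CoordinateAttention(nn.Module):
    def __init__(self, channels, reduction=32):
        super().__init__()
        mid = max(8, channels // reduction)
        self.conv1 = nn.Conv2d(channels, mid, kernel_size=1)
        self.bn1 = nn.BatchNorm2d(mid)
        self.act = nn.ReLU(inplace=True)
        self.conv_h = nn.Conv2d(mid, channels, kernel_size=1)
        self.conv_w = nn.Conv2d(mid, channels, kernel_size=1)

    def forward(self, x):
        b, c, h, w = x.shape
        # Direction-aware pooling: one descriptor per row and one per column.
        x_h = x.mean(dim=3, keepdim=True)                       # (b, c, h, 1)
        x_w = x.mean(dim=2, keepdim=True).permute(0, 1, 3, 2)   # (b, c, w, 1)
        y = torch.cat([x_h, x_w], dim=2)                        # (b, c, h+w, 1)
        y = self.act(self.bn1(self.conv1(y)))
        y_h, y_w = torch.split(y, [h, w], dim=2)
        a_h = torch.sigmoid(self.conv_h(y_h))                   # (b, c, h, 1)
        a_w = torch.sigmoid(self.conv_w(y_w.permute(0, 1, 3, 2)))  # (b, c, 1, w)
        # Reweight the input with both directional attention maps.
        return x * a_h * a_w

if __name__ == "__main__":
    feat = torch.randn(1, 64, 80, 80)
    print(CoordinateAttention(64)(feat).shape)  # torch.Size([1, 64, 80, 80])
```

Because each attention map preserves position along one axis, this kind of block keeps the positional cues that small targets such as distant drones depend on, which is the motivation for placing it in the C3 bottleneck.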