Infrared small target detection remains a challenge in the field of object detection. Although there are many related research achievements, further improvement is still needed. This paper introduces a new application: detecting severely occluded vehicles against complex wild backgrounds in weak infrared aerial images, where more than 50% of each vehicle's area is occluded. We used YOLOv4 as the detection model. By applying secondary transfer learning from visible datasets to an infrared dataset, the model achieved good average precision (AP). First, we trained the model on the UCAS_AOD visible dataset; then we transferred it to the VIVID visible dataset; finally, we transferred the model to the VIVID infrared dataset for a second round of training. Meanwhile, we added a hard negative example mining block to the YOLOv4 model, which suppresses the disturbance of the complex background and thus further decreases the false detection rate. In experiments, the average precision improved from 90.34% to 91.92% and the F1 score improved from 87.5% to 87.98%, demonstrating that the proposed algorithm produces satisfactory and competitive vehicle detection results. INDEX TERMS Infrared aerial image, occlusion, vehicle detection, hard negative example mining, YOLOv4.
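The abstract describes the hard negative example mining block only at a high level. A minimal, framework-free sketch of the general idea (re-focusing training on the background regions the detector scores most confidently as vehicles) might look like the following; the function name, threshold, and top-k selection are illustrative assumptions, not the paper's implementation.

```python
# Illustrative sketch of hard negative example mining (not the paper's exact code).
# Given detector confidence scores on regions known to contain no vehicle,
# keep the top-k most confidently wrong ones so the next training round
# focuses on these confusing background patches.

def mine_hard_negatives(negative_scores, k, score_threshold=0.5):
    """Return indices of the k highest-scoring negatives above the threshold.

    negative_scores: detector confidence scores for background-only regions
                     (higher score = more confidently wrong false positive).
    """
    # Pair each score with its index, keeping only confident false positives.
    candidates = [(s, i) for i, s in enumerate(negative_scores) if s > score_threshold]
    # Hardest negatives first: sort by score, descending.
    candidates.sort(reverse=True)
    return [i for _, i in candidates[:k]]
```

For example, with scores `[0.1, 0.9, 0.6, 0.3, 0.8]` and `k=2`, the regions at indices 1 and 4 would be selected for the next training round.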
Infrared target detection is a popular application of object detection, as well as a challenging one. This paper proposes the focus and attention mechanism-based YOLO (FA-YOLO), an improved method for detecting infrared occluded vehicles against the complex backgrounds of remote sensing images. First, we use a generative adversarial network (GAN) to create infrared images from visible datasets, building sufficient data for training as well as enabling transfer learning. Then, to mitigate the impact of useless and complex background information, we propose a negative sample focusing mechanism that concentrates training on confusing negative samples, suppressing false positives and increasing detection precision. Finally, to enhance the features of small infrared targets, we add a dilated convolutional block attention module (dilated CBAM) to the CSPDarknet53 backbone of YOLOv4. To verify the superiority of our model, we carefully select 318 infrared occluded vehicle images from the VIVID-infrared dataset for testing. The detection accuracy (mAP) improves from 79.24% to 92.95%, and the F1 score improves from 77.92% to 88.13%, demonstrating a significant improvement in detecting small occluded infrared vehicles.
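The channel-attention stage of CBAM (global average and max pooling followed by a shared MLP and a sigmoid gate) can be sketched in NumPy as follows. This is only the convolution-free half of the module; the spatial-attention stage, which the paper extends with dilated convolutions, is omitted, and the layer shapes and weights here are illustrative assumptions rather than the paper's configuration.

```python
import numpy as np

def channel_attention(feature_map, w1, w2):
    """Sketch of CBAM-style channel attention on a (C, H, W) feature map.

    w1 (shape (C, C//r)) and w2 (shape (C//r, C)) form the shared two-layer
    MLP with reduction ratio r. Returns the feature map reweighted per channel.
    """
    # Global average and max pooling over the spatial dimensions -> shape (C,).
    avg_pool = feature_map.mean(axis=(1, 2))
    max_pool = feature_map.max(axis=(1, 2))

    # Shared MLP applied to both pooled descriptors (ReLU between the layers).
    def mlp(x):
        return np.maximum(x @ w1, 0.0) @ w2

    # Sum the two MLP outputs and squash to (0, 1) with a sigmoid.
    attention = 1.0 / (1.0 + np.exp(-(mlp(avg_pool) + mlp(max_pool))))

    # Reweight each channel of the input feature map.
    return feature_map * attention[:, None, None]
```

In CBAM, the resulting channel weights emphasize informative channels before the spatial stage refines where in the image the network should attend.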
The regression loss function plays an important role in training object detection models. IoU-based loss functions, such as CIoU loss, achieve remarkable performance, but they still have inherent shortcomings that can slow convergence. This paper proposes a Scale-Sensitive IoU (SIOU) loss for multi-scale object detection, especially in remote sensing images. It addresses the problem that the gradients of current loss functions tend to be smooth and cannot distinguish certain special bounding boxes during training, which can lead to unreasonable loss values and slow convergence. A new geometric factor affecting the loss calculation, namely the area difference, is introduced to extend the three factors already considered by CIoU loss. By introducing an area regulatory factor γ into the loss function, the loss values of different bounding boxes can be adjusted and distinguished quantitatively. Furthermore, we apply our SIOU loss to oriented bounding box detection and obtain better optimization. Extensive experiments show that the detection accuracies of YOLOv4, Faster R-CNN and SSD with SIOU loss improve more than with previous loss functions on two horizontal bounding box datasets, NWPU VHR-10 and DIOR, and on the oriented bounding box dataset DOTA, all of which are remote sensing datasets. The proposed loss function therefore achieves state-of-the-art performance on multi-scale object detection.
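The abstract does not give the exact SIOU formula, but the general idea of augmenting an IoU-based loss with an area-difference penalty scaled by a regulatory factor γ can be sketched as follows. The specific penalty form, its normalization, and the default γ here are illustrative assumptions, not the paper's definition.

```python
def siou_style_loss(pred_box, gt_box, gamma=0.5):
    """Illustrative IoU loss plus an area-difference penalty (not the exact SIOU).

    Boxes are axis-aligned (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
    """
    def area(b):
        return (b[2] - b[0]) * (b[3] - b[1])

    # Intersection rectangle of the two boxes.
    ix1, iy1 = max(pred_box[0], gt_box[0]), max(pred_box[1], gt_box[1])
    ix2, iy2 = min(pred_box[2], gt_box[2]), min(pred_box[3], gt_box[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = area(pred_box) + area(gt_box) - inter
    iou = inter / union if union > 0 else 0.0

    # Illustrative area-difference term, normalized by the union so that
    # scale-mismatched boxes are penalized even when their IoU is similar;
    # gamma regulates how strongly the area mismatch contributes.
    area_diff = abs(area(pred_box) - area(gt_box)) / union if union > 0 else 0.0
    return 1.0 - iou + gamma * area_diff
```

With this sketch, a perfectly matched box gives a loss of 0, while a box with the same overlap but a very different area receives a larger loss than one matched in scale, which is the intuition behind making the loss scale-sensitive.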