Detecting small targets in aerial images remains difficult because the targets have low resolution and often resemble the background. Recent advances in object detection have produced efficient, high-performance detectors; among them, the YOLO series is a representative family that is lightweight and performs well. In this paper, we propose a method to improve small target detection in aerial images by modifying YOLOv5. First, the backbone was modified by applying the efficient channel attention module, and a channel attention pyramid method was proposed. We call the resulting model efficient channel attention pyramid YOLO (ECAP-YOLO). Second, to optimize the detection of small objects, we eliminated the module for detecting large objects and added a detect layer to find smaller objects, reducing the computing power used for detecting small targets and improving the detection rate. Finally, we use transposed convolution instead of upsampling. Compared with the original YOLOv5, the proposed method improved mAP by 6.9% on the VEDAI dataset, 5.4% when detecting small cars in the xView dataset, 2.7% when detecting the small vehicle and small ship classes in the DOTA dataset, and approximately 2.4% when finding small cars in the Arirang dataset.
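To make the efficient channel attention step named in the abstract more concrete, the following is a minimal pure-Python sketch of the general idea: pool each channel to a single value, mix neighbouring channel values with a small 1D convolution, squash the result with a sigmoid, and rescale each channel by its weight. It is not the authors' implementation. The function names `eca_weights` and `eca_apply`, the nested-list tensor layout, and the fixed kernel values are assumptions for illustration only; in a trained network the kernel would be learned, and the abstract does not specify how the module is placed in the ECAP-YOLO backbone.

```python
import math


def eca_weights(feature_map, kernel):
    """Return one attention weight per channel.

    feature_map: list of C channels, each an H x W list of lists (assumed layout).
    kernel: 1D convolution weights of odd length k (hypothetical fixed values).
    """
    num_channels = len(feature_map)
    pad = len(kernel) // 2

    # Global average pooling: collapse each H x W channel to a single scalar.
    pooled = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]

    # 1D convolution across the channel axis with zero padding, so each
    # channel's weight depends only on its k nearest neighbouring channels.
    weights = []
    for i in range(num_channels):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i + j - pad
            if 0 <= idx < num_channels:
                acc += w * pooled[idx]
        # Sigmoid maps the mixed value to an attention weight in (0, 1).
        weights.append(1.0 / (1.0 + math.exp(-acc)))
    return weights


def eca_apply(feature_map, kernel):
    """Rescale every value in each channel by that channel's attention weight."""
    weights = eca_weights(feature_map, kernel)
    return [
        [[value * w for value in row] for row in channel]
        for channel, w in zip(feature_map, weights)
    ]
```

A call such as `eca_apply(fm, [0.25, 0.5, 0.25])` on a three-channel `fm` returns a feature map of the same shape with each channel scaled by a weight between 0 and 1. The appeal of this design is that the only parameters are the k kernel values, so the attention adds very little cost to a lightweight detector.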