Geospatial object detection has been studied extensively. However, many existing methods focus on either accuracy or speed, not both. Methods that are both fast and accurate are of great importance in scenarios such as search and rescue and military information acquisition. In remote sensing images, some targets are small, have few textures, and show low contrast with the background, all of which make detection difficult. In this paper, we propose an accurate and fast single shot detector (AF-SSD) for high spatial resolution remote sensing imagery to address these problems. Firstly, we design a lightweight backbone to reduce the number of trainable parameters of the network. In this backbone, we also use some wide and deep convolutional blocks to extract more semantic information and maintain high detection precision. Secondly, a novel encoding–decoding module is employed to detect small targets accurately. Using up-sampling and summation operations, the encoding–decoding module adds strong high-level semantic information to low-level features. Thirdly, we design a cascade structure with spatial and channel attention modules for targets with low contrast (low-contrast targets) and few textures (few-texture targets). The spatial attention module extracts long-range features for few-texture targets. By weighting each channel of a feature map, the channel attention module guides the network to concentrate on easily identifiable features for low-contrast and few-texture targets. Experimental results on the NWPU VHR-10 dataset show that the proposed AF-SSD achieves superior detection performance: 5.7 M parameters, 88.7% mAP, and an average of 0.035 s per image on an NVIDIA GTX-1080Ti GPU.
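The two operations described in this abstract, adding up-sampled high-level features to low-level features and weighting each channel of a feature map, can be illustrated with a minimal NumPy sketch. This is not the AF-SSD implementation: the abstract does not give the layer design, so the nearest-neighbour up-sampling, the squeeze-and-excitation style of channel weighting, the function names, the tensor shapes, and the random weights `w1` and `w2` are all assumptions chosen for illustration.

```python
import numpy as np


def upsample_nearest(x, factor=2):
    """Nearest-neighbour up-sampling of a (C, H, W) feature map."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)


def fuse_top_down(low, high):
    """Add up-sampled high-level features to low-level features.

    Assumes both maps have the same channel count and that `low`
    is exactly twice the spatial size of `high`.
    """
    return low + upsample_nearest(high, 2)


def channel_attention(x, w1, w2):
    """Weight each channel of x by a learned score in (0, 1).

    Squeeze-and-excitation style (an assumption, not the paper's design):
    global average pool -> small bottleneck MLP -> sigmoid -> rescale.
    """
    squeeze = x.mean(axis=(1, 2))                    # (C,) one value per channel
    hidden = np.maximum(w1 @ squeeze, 0.0)           # ReLU bottleneck
    weights = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))   # sigmoid, one weight per channel
    return x * weights[:, None, None], weights


# Hypothetical shapes: 8 channels, 16x16 low-level map, 8x8 high-level map.
rng = np.random.default_rng(0)
low = rng.standard_normal((8, 16, 16))
high = rng.standard_normal((8, 8, 8))
w1 = rng.standard_normal((2, 8)) * 0.1   # stand-in for trained weights
w2 = rng.standard_normal((8, 2)) * 0.1

fused = fuse_top_down(low, high)
out, weights = channel_attention(fused, w1, w2)
print(fused.shape, out.shape, weights.shape)
```

In a trained network `w1` and `w2` would be learned, so channels carrying easily identifiable features would receive weights closer to 1 and the others would be suppressed.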
Pedestrian detection plays an important role in areas such as autonomous driving, but it remains challenging because of heavy occlusion and large variation in scale. In this paper, we propose an improved pedestrian detection method called DA-Net, based on the two-stage detector Feature Pyramid Network (FPN). DA-Net adds Dense Connected Blocks (DCBs) and a combination of a channel-wise attention module (CWAM) and a global attention module (GAM) to the network. FPN produces features at various scales with rich semantic information, which benefits the detection of pedestrians at different scales. Because pedestrian detection involves many small-scale targets, we use only the low layers of FPN, which retain enough target detail, as prediction layers. After several DCBs deepen the network, the prediction layers encode richer semantic information about targets, which allows targets to be localized more precisely. To highlight the visible parts of occluded pedestrians and ignore the occluded parts, CWAM weights each feature channel according to its importance. GAM aggregates global information and long-range dependencies for small-scale and occluded targets. Thus, the combination of CWAM and GAM not only helps with the occlusion problem in pedestrian detection but also supplies environmental context for small-scale targets. Evaluation results on the CUHK and CityPersons datasets show that the proposed method improves on FPN, reducing the log-average miss rate by 9.6% on the CUHK dataset and by 6.1% on the Heavy subset of the CityPersons dataset.
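For readers unfamiliar with the metric reported above, the log-average miss rate can be sketched as follows. The abstract does not state the exact evaluation protocol, so this assumes the commonly used Caltech-style definition: the miss rate is sampled at nine false-positives-per-image (FPPI) values spaced evenly in log space between 0.01 and 1, and the geometric mean of those samples is reported. The function name and the convention of using a miss rate of 1.0 when the curve has no point at or below a reference FPPI are assumptions of this sketch.

```python
import numpy as np


def log_average_miss_rate(fppi, miss_rate, num_points=9):
    """Geometric mean of miss rates sampled at log-spaced FPPI values.

    fppi and miss_rate describe one detector's curve, with fppi ascending.
    """
    fppi = np.asarray(fppi, dtype=float)
    miss_rate = np.asarray(miss_rate, dtype=float)
    refs = np.logspace(-2.0, 0.0, num_points)   # 0.01 ... 1.0
    sampled = np.ones(num_points)               # default: every target missed
    for i, ref in enumerate(refs):
        idx = np.where(fppi <= ref)[0]
        if idx.size > 0:
            # Take the curve point with the largest FPPI not exceeding ref.
            sampled[i] = miss_rate[idx[-1]]
    # Clamp to avoid log(0), then average in log space.
    return float(np.exp(np.mean(np.log(np.maximum(sampled, 1e-10)))))


# Hypothetical curve for illustration only (not data from the paper).
example_fppi = [0.005, 0.02, 0.1, 0.5, 1.0]
example_mr = [0.60, 0.45, 0.30, 0.20, 0.15]
print(log_average_miss_rate(example_fppi, example_mr))
```

A lower value is better, which is why the abstract reports improvements as reductions in this metric.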