The Single Shot MultiBox Detector (SSD) is one of the fastest algorithms in the current target detection field. It achieves good detection results, but it suffers from poor feature extraction in shallow layers and loss of features in deep layers. In this paper, we propose an accurate and efficient target detection method, named Single Shot Object Detection with Feature Enhancement and Fusion (FFESSD), which enhances and exploits the shallow and deep features in the feature pyramid structure of the SSD algorithm. To achieve this, we introduce a Feature Fusion Module and two Feature Enhancement Modules and integrate them into the conventional structure of the SSD. Experimental results on the PASCAL VOC 2007 dataset show that FFESSD achieves 79.1% mean average precision (mAP) at 54.3 frames per second (FPS) with an input size of 300 × 300, while FFESSD with a 512 × 512 input achieves 81.8% mAP at 30.2 FPS. The proposed network achieves state-of-the-art mAP, higher than that of the conventional SSD, the Deconvolutional Single Shot Detector (DSSD), the Feature-Fusion SSD (FSSD), and other advanced detectors. In an extended experiment, FFESSD also outperformed the conventional SSD in fuzzy target detection.
The ability to detect small targets and the speed of the target detector are very important for remote sensing image detection. In this paper, we propose an effective and efficient method, named CISPNet, with high detection accuracy and a compact architecture. In particular, according to the characteristics of the data, we apply a context information scene perception (CISP) module to obtain contextual information for targets of different scales, and we use k-means clustering to set the aspect ratios and sizes of the default boxes. The proposed method inherits the network structure of the Single Shot MultiBox Detector (SSD) and introduces the CISP module into it. We create a dataset in the Pascal Visual Object Classes (VOC) format, annotated with three types of detection targets: aircraft, ship, and oil tanker. Experimental results on our remote sensing image dataset and on the Northwestern Polytechnical University very-high-resolution (NWPU VHR-10) dataset demonstrate that the proposed CISPNet performs much better than the original SSD and other detectors, especially for small objects. Specifically, our network achieves 80.34% mean average precision (mAP) at 50.7 frames per second (FPS) with an input size of 300 × 300 pixels on the remote sensing image dataset. In extended experiments, CISPNet also outperforms SSD in fuzzy target detection in remote sensing images.
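The abstract does not say how the k-means step is configured, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes plain Euclidean k-means over the (width, height) pairs of ground-truth boxes, and it assumes each cluster center is turned into a default-box scale sqrt(w·h) and aspect ratio w/h. The function names `kmeans_boxes` and `default_box_params`, the distance metric, and the sample box sizes are all hypothetical.

```python
import random


def kmeans_boxes(boxes, k, iters=50, seed=0):
    """Cluster (width, height) pairs with plain Euclidean k-means.

    Assumption: the paper's distance metric is not given in the abstract;
    Euclidean distance is used here purely for illustration.
    """
    rng = random.Random(seed)
    centers = rng.sample(boxes, k)  # pick k distinct boxes as initial centers
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for w, h in boxes:
            # Assign each box to its nearest center.
            nearest = min(
                range(k),
                key=lambda i: (w - centers[i][0]) ** 2 + (h - centers[i][1]) ** 2,
            )
            clusters[nearest].append((w, h))
        new_centers = []
        for i, members in enumerate(clusters):
            if members:
                mean_w = sum(w for w, _ in members) / len(members)
                mean_h = sum(h for _, h in members) / len(members)
                new_centers.append((mean_w, mean_h))
            else:
                # Keep the old center if a cluster ends up empty.
                new_centers.append(centers[i])
        if new_centers == centers:
            break  # assignments are stable
        centers = new_centers
    return centers


def default_box_params(centers):
    """Convert each center to (scale, aspect_ratio), sorted by scale."""
    return sorted(((w * h) ** 0.5, w / h) for w, h in centers)


# Hypothetical box sizes in pixels: small near-square targets and
# larger elongated ones. Real input would come from dataset annotations.
boxes = [(10, 10), (12, 12), (11, 9), (40, 20), (42, 22), (38, 18)]
centers = kmeans_boxes(boxes, k=2)
params = default_box_params(centers)
for scale, ratio in params:
    print(f"scale={scale:.2f}px  aspect_ratio={ratio:.2f}")
```

In practice the number of clusters and the mapping from cluster centers to per-layer default boxes are design choices that depend on the dataset and the detector's feature map sizes; the sketch only illustrates how clustering annotated box shapes can replace hand-picked aspect ratios.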