Object detection has made significant progress in many real-world scenes. Despite this progress, detection in remote sensing images remains challenging even for leading object detectors because of complex backgrounds, arbitrarily oriented objects, and large differences in object scale. In this paper, we propose RADet, a novel rotation detector for remote sensing images that is mainly inspired by Mask R-CNN. RADet derives the rotated bounding box of each object from the shape mask predicted by the mask branch, which is a novel, simple, and effective way to obtain rotated bounding boxes. Specifically, a refine feature pyramid network is devised with an improved building block for constructing the top-down feature maps, to address the large differences in scale. Meanwhile, a position attention network and a channel attention network are jointly explored: they model the spatial position dependence between global pixels and highlight object features, which helps detect small objects surrounded by complex backgrounds. Extensive experiments on two public remote sensing datasets, DOTA and NWPU VHR-10, show that our method outperforms existing leading object detectors in the remote sensing field. (Remote Sens. 2020, 12, 389)
... for each region of interest (RoI). Then, the features are used to perform category-specific classification and regression for the corresponding proposals. Finally, the detection result is obtained through post-processing, such as non-maximum suppression. Faster R-CNN is a classical two-stage object detector composed of a Region Proposal Network (RPN) and a detection network consisting of classifiers and regressors; it can detect objects quickly and accurately in an end-to-end manner. Building on Faster R-CNN, further improved two-stage object detectors such as Region-based Fully Convolutional Networks (R-FCN) [7] and Mask R-CNN [8] were proposed.
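The non-maximum suppression step mentioned above can be sketched as follows. This is a minimal greedy variant for axis-aligned boxes, not the implementation used in any of the cited detectors; the function names `iou` and `nms`, the `(x1, y1, x2, y2)` box layout, and the default threshold of 0.5 are assumptions made for illustration.

```python
def iou(a, b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: return indices of kept boxes, highest score first.

    The threshold value is an illustrative assumption, not a value
    taken from the text.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))
```

For rotated bounding boxes, the overlap would have to be computed between polygons rather than axis-aligned rectangles; the greedy loop itself stays the same.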
To further improve the efficiency of object detectors, Joseph Redmon et al. proposed a regression-based single-stage object detector called You Only Look Once (YOLO) [9]. Owing to its simple structure, YOLO is extremely fast, but its accuracy is lower than that of two-stage detectors. Based on YOLO, YOLO v3 [10] and YOLO 9000 [11] were subsequently proposed. To balance detection speed and accuracy, the Single Shot MultiBox Detector (SSD) [12] was proposed; its speed and accuracy fall between those of the YOLO series and the R-CNN series.
Object detection has always been a challenging task in computer vision because of complex backgrounds, large scale variation, and many small objects, all of which are especially pronounced in remote sensing imagery. In recent years, driven by the development of deep learning, object detection in remote sensing has made great breakthroughs. At present, almost all state-of-the-art object detectors for remote sensing imagery rely on pre-defined anchor boxes. However, a large number of anchor boxes introduces many hyper-parameters, which increases both the memory footprint and the computational redundancy of the detection model. In contrast, we propose an anchor-free single-stage detector for object detection in remote sensing imagery, which avoids the many anchor-related hyper-parameters that usually affect the performance of the detection model. Specifically, considering the large scale differences between objects and the characteristics of small objects in remote sensing imagery, we design a dense path aggregation feature pyramid network (DPAFPN), which makes full use of high-level semantic information and low-level location information and, to a certain extent, avoids information loss during shallow feature transfer. Extensive experiments were conducted on two public remote sensing datasets, DOTA and NWPU VHR-10. The experimental results demonstrate that our detector performs well and is meaningful for object detection in remote sensing imagery.
INDEX TERMS Remote sensing, deep learning, anchor-free, object detection.
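To give a sense of how quickly anchor boxes accumulate, the sketch below counts the candidates produced by a generic anchor-based pyramid detector. All concrete values are assumptions chosen for illustration and do not come from the abstract: an 800 x 800 input, five pyramid levels with strides 8 to 128, and three scales times three aspect ratios per location. An anchor-free head that predicts once per location would produce `count_locations` candidates instead.

```python
import math


def count_locations(image_size, strides):
    """Number of feature-map locations summed over all pyramid levels."""
    height, width = image_size
    return sum(math.ceil(height / s) * math.ceil(width / s) for s in strides)


def count_anchors(image_size, strides, scales, ratios):
    """Number of anchor boxes when every location carries scales x ratios anchors."""
    return count_locations(image_size, strides) * len(scales) * len(ratios)


# Hypothetical configuration, chosen only for illustration.
IMAGE_SIZE = (800, 800)
STRIDES = (8, 16, 32, 64, 128)
SCALES = (1.0, 2 ** (1 / 3), 2 ** (2 / 3))
RATIOS = (0.5, 1.0, 2.0)

locations = count_locations(IMAGE_SIZE, STRIDES)
anchors = count_anchors(IMAGE_SIZE, STRIDES, SCALES, RATIOS)
print(f"locations: {locations}, anchors: {anchors}")
```

Under these assumptions, each of the strides, scales, and ratios is a hyper-parameter that has to be tuned for the dataset, which is the overhead an anchor-free design avoids.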
Object detection in remote sensing images is widely used in military and civilian fields and is a challenging task because of complex backgrounds, large scale variation, and densely arranged objects in arbitrary orientations. In addition, existing object detection methods rely on increasingly deep networks, which adds considerable computational overhead and many parameters and hinders deployment on edge devices. In this paper, we propose a lightweight keypoint-based oriented object detector for remote sensing images. First, we propose a semantic transfer block (STB) for merging shallow and deep features, which reduces noise and restores semantic information. Then, the proposed adaptive Gaussian kernel (AGK) adapts to objects of different scales and further improves detection performance. Finally, we propose a distillation loss tailored to object detection to obtain a lightweight student network. Experiments on the HRSC2016 and UCAS-AOD datasets show that the proposed method adapts to objects of different scales, obtains accurate bounding boxes, and reduces the influence of complex backgrounds. Comparison with mainstream methods shows that our method achieves comparable performance while remaining lightweight.
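The idea of a Gaussian kernel that adapts to object size can be illustrated with a keypoint heatmap whose spread follows the object's width and height. The sketch below is not the paper's AGK, whose exact formulation is not given in the abstract; the function name `gaussian_heatmap`, the axis-aligned kernel, and the rule `sigma = size / k` with `k = 6` are all assumptions made for illustration.

```python
import math


def gaussian_heatmap(height, width, cx, cy, box_w, box_h, k=6.0):
    """Render a center-keypoint heatmap with size-dependent spread.

    sigma_x and sigma_y scale with the object's width and height
    (assumed rule: sigma = size / k), so large objects get a broad
    peak and small objects a sharp one.
    """
    sigma_x = max(box_w / k, 1e-6)
    sigma_y = max(box_h / k, 1e-6)
    heat = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            dx, dy = x - cx, y - cy
            exponent = dx * dx / (2 * sigma_x ** 2) + dy * dy / (2 * sigma_y ** 2)
            heat[y][x] = math.exp(-exponent)
    return heat


# A wide, flat object (24 x 6 pixels) centered at (20, 10).
heat = gaussian_heatmap(height=21, width=41, cx=20, cy=10, box_w=24, box_h=6)
print(heat[10][20], heat[10][24], heat[14][20])
```

In this example the response four pixels away along the long axis stays much higher than the response four pixels away along the short axis, which is the behavior a scale-adaptive kernel is meant to provide. An oriented variant would additionally rotate the kernel by the object's angle.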
The coronavirus disease (COVID-19) has been spreading rapidly around the world. As of August 25, 2020, 23.719 million people had been infected in many countries, and the cumulative death toll exceeded 812,000. Early detection of COVID-19 is essential to provide patients with appropriate medical care and to protect uninfected people. Leveraging a large computed tomography (CT) database from 1,112 patients provided by the China Consortium of Chest CT Image Investigation (CC-CCII), we investigated multiple solutions for detecting COVID-19 and distinguishing it from other common pneumonia (CP) and normal controls. We also compared the performance of different models on complete and segmented CT slices. In particular, we studied how the depth at which CT slices are superimposed into volumes affects the performance of our models. The results show that the optimal model can identify COVID-19 slices with 99.76% accuracy (99.96% recall, 99.35% precision, and 99.65% F1-score). For three-way classification, the overall performance reached 99.24% accuracy and an area under the receiver operating characteristic curve (AUROC) of 0.9986. To the best of our knowledge, our method achieves the highest accuracy and recall on the largest publicly available COVID-19 CT dataset. Our model can help radiologists and physicians perform rapid diagnosis, especially when the healthcare system is overloaded.
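The reported metrics are related by standard definitions, and the F1-score can be checked directly from the stated precision and recall. The sketch below uses hypothetical helper names (`f1_score`, `metrics_from_counts`); only the 99.35% precision and 99.96% recall figures come from the abstract, and the confusion-matrix counts in the second part are made up for illustration.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def metrics_from_counts(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 from binary confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1_score(precision, recall),
    }


# Precision and recall as reported in the abstract.
reported_f1 = f1_score(precision=0.9935, recall=0.9996)
print(f"F1 from reported precision and recall: {reported_f1 * 100:.2f}%")

# Illustrative counts only; these are not taken from the study.
print(metrics_from_counts(tp=8, fp=2, fn=2, tn=8))
```

The value computed from the reported precision and recall rounds to 99.65%, which agrees with the F1-score stated in the abstract.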