With the ever-increasing resolution of optical remote-sensing images, extracting information from them efficiently and effectively has become a challenging problem. Because it is prohibitively expensive to label every object in these high-resolution images manually, only a small number of high-resolution images with detailed object labels are available, which is highly insufficient for common machine-learning-based object detection algorithms. Another challenge is the huge range of object sizes: it is difficult to locate large objects, such as buildings, and small objects, such as vehicles, simultaneously. To tackle these problems, we propose a novel neural-network-based remote-sensing object detector called the full-coverage collaborative network (FCC-Net). The detector employs several tailored designs, such as hybrid dilated convolutions and multi-level pooling, to enhance multiscale feature extraction and improve its robustness to objects of different sizes. Moreover, by using asynchronous iterative training that alternates between strongly supervised and weakly supervised detectors, the proposed method requires only image-level ground-truth labels for training. To evaluate the approach, we compare it against several state-of-the-art techniques on two large-scale remote-sensing image benchmarks. The experimental results show that FCC-Net significantly outperforms other weakly supervised methods in detection accuracy. Through a comprehensive ablation study, we also demonstrate the efficacy of the proposed dilated convolutions and multi-level pooling in increasing the scale invariance of an object detector.
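The two building blocks named in the abstract, hybrid dilated convolutions and multi-level pooling, are standard multiscale feature-extraction techniques. The sketch below illustrates how they are typically combined; the dilation rates, pooling scales, and channel widths are assumptions for illustration, not the authors' actual FCC-Net configuration.

```python
# Illustrative sketch only: a feature block combining hybrid dilated
# convolutions with multi-level pooling. Dilation rates, pooling scales,
# and channel widths are assumptions, not the FCC-Net design itself.
import torch
import torch.nn as nn
import torch.nn.functional as F

class HybridDilatedBlock(nn.Module):
    """Parallel 3x3 convolutions with different dilation rates, fused by a 1x1 conv."""
    def __init__(self, in_ch, out_ch, dilations=(1, 2, 5)):
        super().__init__()
        self.branches = nn.ModuleList(
            [nn.Conv2d(in_ch, out_ch, 3, padding=d, dilation=d) for d in dilations]
        )
        self.fuse = nn.Conv2d(out_ch * len(dilations), out_ch, 1)

    def forward(self, x):
        feats = [F.relu(b(x)) for b in self.branches]
        return F.relu(self.fuse(torch.cat(feats, dim=1)))

class MultiLevelPooling(nn.Module):
    """Pool the feature map at several scales and fuse the upsampled results."""
    def __init__(self, in_ch, scales=(1, 2, 4)):
        super().__init__()
        self.scales = scales
        self.reduce = nn.Conv2d(in_ch * (len(scales) + 1), in_ch, 1)

    def forward(self, x):
        h, w = x.shape[-2:]
        pooled = [
            F.interpolate(F.adaptive_avg_pool2d(x, s), size=(h, w),
                          mode="bilinear", align_corners=False)
            for s in self.scales
        ]
        return F.relu(self.reduce(torch.cat([x] + pooled, dim=1)))

# Example: multiscale features for a single 512x512 RGB image patch.
feats = MultiLevelPooling(64)(HybridDilatedBlock(3, 64)(torch.randn(1, 3, 512, 512)))
print(feats.shape)  # torch.Size([1, 64, 512, 512])
```

The parallel dilated branches enlarge the receptive field without losing resolution (helping with large objects such as buildings), while the multi-level pooling aggregates context at several scales so that small objects such as vehicles are not washed out.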
To address the poor realism and lack of clarity in short-term typhoon prediction, we propose a self-attentional spatiotemporal adversarial network (SASTA-Net). First, we introduce a multi-spatiotemporal feature fusion method to fully extract and fuse multichannel spatiotemporal feature information, effectively enhancing feature expression. Second, we propose the SATA-LSTM prediction model, which incorporates spatial memory cells and attention mechanisms to capture spatial features and important details in sequences. Finally, a spatiotemporal 3D discriminator is designed to distinguish generated predicted cloud images from real cloud images, producing more accurate and realistic typhoon cloud images through adversarial training. Evaluation on a typhoon cloud image data set shows that the proposed SASTA-Net achieves 67.3 in mean squared error, 0.878 in structural similarity, 31.27 in peak signal-to-noise ratio, and 56.48 in sharpness, outperforming state-of-the-art prediction algorithms.
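The adversarial component described above hinges on a discriminator that judges entire predicted sequences rather than single frames. The following is a minimal sketch of such a spatiotemporal 3D-convolutional discriminator; the layer count, channel widths, and input shape are assumptions for illustration and not the exact SASTA-Net architecture.

```python
# Illustrative sketch only: a spatiotemporal discriminator built from 3D
# convolutions that scores whether a predicted cloud-image sequence looks
# real. Layer sizes and input shape are assumptions, not the SASTA-Net design.
import torch
import torch.nn as nn

class SpatiotemporalDiscriminator(nn.Module):
    def __init__(self, in_ch=1, base=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(in_ch, base, kernel_size=3, stride=(1, 2, 2), padding=1),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv3d(base, base * 2, kernel_size=3, stride=2, padding=1),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv3d(base * 2, base * 4, kernel_size=3, stride=2, padding=1),
            nn.LeakyReLU(0.2, inplace=True),
            nn.AdaptiveAvgPool3d(1),   # collapse time and space
            nn.Flatten(),
            nn.Linear(base * 4, 1),    # real/fake logit for the whole sequence
        )

    def forward(self, x):  # x: (batch, channels, time, height, width)
        return self.net(x)

# Example: a batch of 4 predicted sequences, 8 frames of 64x64 grayscale imagery.
disc = SpatiotemporalDiscriminator()
logits = disc(torch.randn(4, 1, 8, 64, 64))
print(logits.shape)  # torch.Size([4, 1])
```

Because the 3D convolutions span the time axis as well as the spatial axes, the discriminator can penalize temporally inconsistent predictions, which is what pushes the generator toward sharper, more realistic cloud-motion sequences.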