Large intra-class variance and small inter-class variance are the key factors affecting fine-grained image classification. Recent algorithms have become more accurate and efficient, but they ignore the multi-scale information in the network, which limits their ability to capture subtle changes. To solve this problem, a weakly supervised fine-grained classification network based on a multi-scale pyramid is proposed in this paper. It replaces the ordinary convolution kernels in the residual network with pyramid convolution kernels, which expands the receptive field and exploits complementary information at different scales. Meanwhile, the weakly supervised data augmentation network (WS-DAN) is used to prevent overfitting and improve the performance of the model. In addition, a new attention module, combining spatial attention and channel attention, is introduced to focus on the object parts in the image. Comprehensive experiments on three public benchmarks show that the proposed method extracts subtle features and performs classification effectively.
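To make the pyramid convolution idea concrete, the following is a minimal sketch, assuming a PyConv-style block in which several kernel sizes run in parallel with grouped convolutions and their outputs are concatenated; the kernel sizes, group counts, and channel split here are illustrative choices, not the paper's exact configuration.

```python
# Minimal sketch of a pyramid convolution block (assumed PyConv-style design):
# parallel kernel sizes whose outputs are concatenated along the channel axis.
import torch
import torch.nn as nn

class PyramidConv(nn.Module):
    def __init__(self, in_ch, out_ch, kernel_sizes=(3, 5, 7), groups=(1, 2, 4)):
        super().__init__()
        split = out_ch // len(kernel_sizes)  # illustrative equal channel split
        self.branches = nn.ModuleList([
            nn.Conv2d(in_ch, split, k, padding=k // 2, groups=g, bias=False)
            for k, g in zip(kernel_sizes, groups)
        ])

    def forward(self, x):
        # Each branch sees the same input with a different receptive field;
        # concatenating them mixes complementary multi-scale information.
        return torch.cat([b(x) for b in self.branches], dim=1)

x = torch.randn(1, 64, 32, 32)
y = PyramidConv(64, 96)(x)
print(y.shape)  # torch.Size([1, 96, 32, 32])
```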
Semantic segmentation is a high-level computer vision task that paves the way toward complete scene understanding and is widely used in autonomous driving, human-computer interaction, virtual reality, and other applications. Recently, semantic segmentation methods based on deep convolutional neural networks have become more accurate and efficient than other approaches. However, these methods still suffer from the information loss caused by down-sampling, the underuse of image context information, and the neglect of the relationship between spatial and channel features. To solve these problems, a novel self-attention network based on a series-parallel structure is proposed in this paper. Firstly, a multi-scale dilated convolution backbone is constructed by combining dilated convolutions with the residual network, which compensates for the information loss caused by the limited receptive field of an ordinary network and enriches the extracted features. Secondly, self-attention modules are stacked in serial and parallel structures, which effectively extract and fully integrate spatial, channel, and space-channel contextual information. Finally, the proposed algorithm is tested extensively and compared with existing classical algorithms. The experimental results show that it achieves state-of-the-art performance on the public dataset.
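As an illustration of the serial-parallel stacking, below is a minimal sketch assuming DANet-style position (spatial) and channel self-attention without learned projections; the module names and the exact fusion of the serial and parallel branches are assumptions rather than the paper's definitive design.

```python
# Minimal sketch of serial-parallel self-attention over space and channels.
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Self-attention over channels: channel-to-channel affinity re-weights features."""
    def forward(self, x):
        b, c, h, w = x.shape
        q = x.view(b, c, -1)                                   # (b, c, hw)
        attn = torch.softmax(q @ q.transpose(1, 2), dim=-1)    # (b, c, c)
        return x + (attn @ q).view(b, c, h, w)

class SpatialAttention(nn.Module):
    """Self-attention over positions: pixel-to-pixel affinity aggregates context."""
    def forward(self, x):
        b, c, h, w = x.shape
        q = x.view(b, c, -1)                                   # (b, c, hw)
        attn = torch.softmax(q.transpose(1, 2) @ q, dim=-1)    # (b, hw, hw)
        return x + (q @ attn).view(b, c, h, w)

def serial_parallel(x):
    # Parallel branch: spatial and channel attention applied independently, then summed.
    parallel = SpatialAttention()(x) + ChannelAttention()(x)
    # Serial branch: channel attention on top of spatial attention (space-channel context).
    serial = ChannelAttention()(SpatialAttention()(x))
    return parallel + serial

y = serial_parallel(torch.randn(1, 8, 16, 16))
```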
Instance segmentation is more challenging than object detection and semantic segmentation. It paves the way toward complete scene understanding and is widely used in robotics, autonomous driving, medical care, and other applications. However, existing instance segmentation methods suffer from poor detection performance on low-resolution objects and slow detection speed on images with complex backgrounds. To solve these problems, this paper proposes an instance segmentation method with multi-scale attention, called Hybrid Kernel Mask R-CNN. Firstly, a hybrid convolution kernel is constructed by combining different kernel sizes and group counts, whose branches complement each other to extract rich information. Secondly, a multi-scale attention mechanism is designed by assigning weights to the different convolution kernels, which retains the more important information. With this strategy, the network is more inclined to focus on the low-resolution objects in an image. The proposed method achieves better accuracy than anchor-based methods. To verify the generality of the model, we test Hybrid Kernel Mask R-CNN on the Balloon, xBD, and COCO datasets. The results exceed state-of-the-art methods, and the visualizations show that our method extracts low-resolution objects effectively.
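A minimal sketch of weighting parallel convolution kernels follows, assuming a selective-kernel-style mechanism in which a globally pooled descriptor produces softmax weights over the kernel branches; the kernel sizes, reduction ratio, and the class name HybridKernelAttention are illustrative, not the paper's implementation.

```python
# Minimal sketch: attention weights assigned to parallel convolution kernels.
import torch
import torch.nn as nn

class HybridKernelAttention(nn.Module):
    def __init__(self, channels, kernel_sizes=(3, 5), reduction=4):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv2d(channels, channels, k, padding=k // 2, bias=False)
            for k in kernel_sizes
        ])
        hidden = max(channels // reduction, 8)
        self.fc = nn.Sequential(nn.Linear(channels, hidden), nn.ReLU(inplace=True))
        self.heads = nn.ModuleList([nn.Linear(hidden, channels) for _ in kernel_sizes])

    def forward(self, x):
        feats = [b(x) for b in self.branches]                  # per-kernel feature maps
        fused = sum(feats)                                     # aggregate all branches
        desc = self.fc(fused.mean(dim=(2, 3)))                 # global descriptor
        logits = torch.stack([h(desc) for h in self.heads])    # (K, B, C)
        weights = torch.softmax(logits, dim=0)                 # kernels compete per channel
        return sum(w.unsqueeze(-1).unsqueeze(-1) * f for w, f in zip(weights, feats))

y = HybridKernelAttention(16)(torch.randn(2, 16, 32, 32))
```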
The semantic segmentation of remote sensing images is a critical and challenging task, and how to easily and reliably segment useful information from vast remote sensing images is a significant issue. Many methods based on convolutional neural networks have been widely explored to obtain more accurate segmentation of remote sensing images. However, due to the peculiarities of remote sensing images, such as dramatic changes in the scale of the target objects, the results are not satisfactory. To solve this problem, a special network is designed: (1) a new backbone network is created that extracts features of varying sizes more effectively than ResNet50; (2) spatial information loss is reduced by building a hybrid location module that compensates for the position loss caused by the down-sampling operation; and (3) the discrimination ability of the model is improved by designing a novel auxiliary loss function that constrains the inter-class and intra-class distances. The proposed algorithm is tested on remote sensing datasets (NWPU-45, DLRSD, and WHDLD). The experimental results show that the method achieves state-of-the-art performance.
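To illustrate how an auxiliary loss can constrain inter-class and intra-class distances, here is a minimal sketch assuming a center-loss-plus-margin formulation; the margin value and the function name intra_inter_loss are hypothetical, not the paper's exact loss.

```python
# Minimal sketch: pull same-class embeddings to their class center, push class
# centers apart until they are separated by at least a margin.
import torch
import torch.nn.functional as F

def intra_inter_loss(features, labels, margin=1.0):
    """features: (N, D) embeddings; labels: (N,) integer class ids."""
    classes = labels.unique()
    centers = torch.stack([features[labels == c].mean(dim=0) for c in classes])
    # Intra-class term: squared distance of each embedding to its own class center.
    intra = torch.stack([
        (features[labels == c] - centers[i]).pow(2).sum(dim=1).mean()
        for i, c in enumerate(classes)
    ]).mean()
    # Inter-class term: penalize pairs of class centers closer than the margin.
    dists = torch.cdist(centers, centers)
    off_diag = dists[~torch.eye(len(classes), dtype=torch.bool)]
    inter = F.relu(margin - off_diag).mean() if off_diag.numel() else dists.new_zeros(())
    return intra + inter

feats = torch.randn(64, 32)
labels = torch.randint(0, 5, (64,))
loss = intra_inter_loss(feats, labels)
```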
Remote sensing image change detection analyzes the change information between two images of the same area acquired at different times. It has wide applications in urban expansion, forest monitoring, and natural disaster assessment. In this paper, a Feature Fusion Network is proposed to address the problems of slow change detection and low accuracy. The MobileNetV3 block is adopted to extract features efficiently, and a self-attention module is applied to investigate the relationship between heterogeneous feature maps (image features and concatenated features). The method is tested on the SZTAKI and LEVIR-CD datasets. With 98.43% correct classification, it outperforms the other compared networks, and its space complexity is reduced by about 50%. The experimental results show that it performs better and can improve the accuracy or speed of change detection.
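The fusion-plus-attention idea can be sketched as follows, assuming a shared (siamese) encoder over both dates and plain self-attention over the concatenated bi-temporal features; TinyEncoder is a stand-in for MobileNetV3, and the head design is an assumption rather than the paper's Feature Fusion Network.

```python
# Minimal sketch: siamese encoder, concatenation of bi-temporal features,
# self-attention over positions, and a 1x1 change/no-change classifier.
import torch
import torch.nn as nn

class TinyEncoder(nn.Module):
    def __init__(self, in_ch=3, out_ch=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )
    def forward(self, x):
        return self.net(x)

class ChangeHead(nn.Module):
    def __init__(self, ch=32):
        super().__init__()
        self.attn = nn.MultiheadAttention(embed_dim=2 * ch, num_heads=4, batch_first=True)
        self.classify = nn.Conv2d(2 * ch, 1, 1)  # change / no-change logits

    def forward(self, f1, f2):
        fused = torch.cat([f1, f2], dim=1)             # (B, 2C, H, W)
        b, c, h, w = fused.shape
        tokens = fused.flatten(2).transpose(1, 2)      # (B, HW, 2C)
        tokens, _ = self.attn(tokens, tokens, tokens)  # relate all positions
        fused = tokens.transpose(1, 2).view(b, c, h, w)
        return self.classify(fused)

enc = TinyEncoder()
t1, t2 = torch.randn(1, 3, 64, 64), torch.randn(1, 3, 64, 64)
change_map = ChangeHead()(enc(t1), enc(t2))  # (1, 1, 16, 16) change logits
```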