Texture feature is an important visual cue for an image, which is the unified description of human visual attributes and sensory attributes. The inherent problem of texture image is that the difference of intra-class images is large and the disparity of inter-class images is small. This problem increases the difficulty of texture image recognition. Therefore, improving the relevance embedding of intra-class images can reduce the classification errors caused by this problem. To solve this problem, this paper proposes a multi-scale information fusion network algorithm, which adopts a cascade structure. It combines multi-scale feature information with the corresponding background information. The shallow background information guides the next stage of feature formation and enhances the similarity of intra-class images. The intra-class feature information obtained is more general. The algorithm has been tested on data sets describable texture database (DTD) and Flickr material dataset (FMD), which has achieved good results.
Herein, a dual-branch semantic segmentation model based on depth-separable convolution and attention mechanism is proposed for the real-time and accuracy requirement of semantic segmentation. The proposed approach overcomes the problems of poor segmentation effect and over-simplification of feature fusion arising from the constant downsample operations in semantic segmentation. The network is divided into spatial detail and semantic information paths. The spatial detail path utilizes a smaller downsample multiplier to maintain resolution and efficiently extract spatial information. The semantic information path is constructed by a non-bottleneck residual unit with dilated convolution; it extracts semantic features. For the feature aggregation problem, the feature-guided fusion module is designed to assign different weights to the parts of the two paths and fuse them to obtain the final output. The proposed algorithm achieves a segmentation accuracy of 69.6% and speed of 70 fps on the Cityscapes dataset, with a model parameter count of only 0.76 M, thus indicating some advantages over recent real-time semantic segmentation algorithms. The proposed method with depth separable convolution and attention mechanism can effectively extract features and compensate for the loss of accuracy caused by downsampling. The experiments demonstrate that the proposed fusion module outperforms other methods in fusing different features.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.