Objective: To use deep learning models for tongue image-based disease location recognition, focusing on two problems: (1) general convolutional networks are weak at modeling detailed regional tongue features; (2) the group relationships between convolution channels are ignored, which leads to high model redundancy. Methods: To enhance convolutional neural networks, a stochastic region pooling method is proposed to capture detailed regional features. An inner-imaging channel relationship modeling method is also proposed to model multi-region relations across all channels, and it is combined with a spatial attention mechanism. Results: A tongue image dataset with clinical disease-location labels is established, and extensive experiments are carried out on it. The experimental results show that the proposed method can effectively model the regional details of tongue images and improve the performance of disease location recognition.
Conclusion: In this paper, we construct a tongue image dataset with disease-location labels to mine the relationship between tongue images and disease locations. A novel fully-channel regional attention network is proposed to model local detailed tongue features and improve modeling efficiency. Significance: The application of deep learning to tongue image disease-location recognition, together with the proposed model's efficient modeling of detailed tongue features, offers guidance for other auxiliary diagnosis tasks.
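The abstract above names stochastic region pooling without giving its details. As a rough illustration only, the following sketch assumes it means replacing global average pooling with an average over one randomly placed rectangular region of each feature map. The function name `stochastic_region_pool`, the region size, and the use of NumPy are assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def stochastic_region_pool(features, region_hw, rng):
    """Average each channel over one randomly placed rectangular region.

    features:  array of shape (C, H, W)
    region_hw: (height, width) of the sampled region (hypothetical parameter)
    rng:       a numpy.random.Generator, so sampling is reproducible
    Returns a length-C vector of regional channel descriptors.
    """
    c, h, w = features.shape
    rh, rw = region_hw
    if not (1 <= rh <= h and 1 <= rw <= w):
        raise ValueError("region must fit inside the feature map")
    # Sample the top-left corner so the region stays inside the map.
    top = rng.integers(0, h - rh + 1)
    left = rng.integers(0, w - rw + 1)
    region = features[:, top:top + rh, left:left + rw]
    return region.mean(axis=(1, 2))

rng = np.random.default_rng(0)
feats = rng.standard_normal((8, 6, 6))           # 8 channels, 6x6 feature maps
desc = stochastic_region_pool(feats, (3, 3), rng)
full = stochastic_region_pool(feats, (6, 6), rng)  # region covers the whole map
```

When the region covers the whole map, the result reduces to ordinary global average pooling, which is one way to sanity-check the sketch.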
Despite their tremendous success in computer vision, deep convolutional networks suffer from serious computation costs and redundancies. Although previous works address this by enhancing the diversity of filters, they have not considered the complementarity and completeness of the internal convolutional structure. To respond to this problem, we propose a novel inner-imaging (InI) architecture, which allows the relationships between channels to meet these requirements. Specifically, we organize the channel signal points in groups using convolutional kernels to model both the intragroup and intergroup relationships simultaneously. A convolutional filter is a powerful tool for modeling spatial relations and organizing grouped signals, so the proposed method maps the channel signals onto a pseudoimage, like putting a lens into the internal convolution structure. Consequently, not only is the diversity of channels increased, but the complementarity and completeness can also be explicitly enhanced. The proposed architecture is lightweight and easy to implement. It provides an efficient self-organization strategy for convolutional networks to improve their performance. Extensive experiments are conducted on multiple benchmark datasets, including CIFAR, SVHN, and ImageNet. Experimental results verify the effectiveness of the InI mechanism with the most popular convolutional networks as the backbones.
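The pseudoimage idea described above can be illustrated with a small sketch. It assumes that each channel contributes a row of R descriptors (for example, one per pooled region), that the rows are stacked into a C x R pseudoimage, and that a single k x R kernel slides along the channel axis to produce one attention weight per channel. The function name `inner_imaging_weights`, the kernel shape, and the sigmoid gating are assumptions for illustration and are not the authors' exact architecture.

```python
import numpy as np

def inner_imaging_weights(descriptors, kernel):
    """Convolve a channel pseudoimage and return per-channel gates in (0, 1).

    descriptors: (C, R) pseudoimage, one row of R signals per channel
    kernel:      (k, R) filter with odd k; it spans k neighboring channels
                 (a group) and all R signals at once
    """
    c, r = descriptors.shape
    k, kr = kernel.shape
    if kr != r or k % 2 == 0:
        raise ValueError("kernel must have shape (odd k, R)")
    pad = k // 2
    # Zero-pad along the channel axis so the output keeps C entries.
    padded = np.pad(descriptors, ((pad, pad), (0, 0)))
    responses = np.array(
        [np.sum(padded[i:i + k] * kernel) for i in range(c)]
    )
    return 1.0 / (1.0 + np.exp(-responses))  # sigmoid gate

rng = np.random.default_rng(0)
features = rng.standard_normal((8, 6, 6))    # (C, H, W) feature maps
pseudo = rng.standard_normal((8, 4))         # C=8 channels, R=4 signals each
weights = inner_imaging_weights(pseudo, rng.standard_normal((3, 4)))
rescaled = features * weights[:, None, None]  # reweight each channel
neutral = inner_imaging_weights(pseudo, np.zeros((3, 4)))
```

In a trained network the kernel would be learned; here it is random, so the sketch only shows how grouped channel signals can be read by an ordinary convolutional filter.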