The Transformer relies on a self-attention mechanism to model long-range dependencies, focusing on relationships among global elements; however, it is comparatively insensitive to the local details of foreground information. Local detail features help to identify blurred boundaries in medical images more accurately. To compensate for this shortcoming of the Transformer and capture richer local information, this paper proposes HEA-Net, a hybrid attention-and-MLP encoder architecture that combines an Efficient Attention Module (EAM) with a Dual-channel Shift MLP module (DS-MLP). Specifically, we connect the convolution block with the Transformer through the EAM to enhance the foreground and suppress invalid background information in medical images. Meanwhile, the DS-MLP further enhances foreground information via channel and spatial shift operations. Extensive experiments on public datasets confirm the strong performance of the proposed HEA-Net: on the GlaS and MoNuSeg datasets, the Dice scores reach 90.56% and 80.80%, and the IoU scores reach 83.62% and 68.26%, respectively.
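To illustrate the kind of channel and spatial shift operations the DS-MLP description refers to, the following is a minimal PyTorch sketch. The shift size, the two-branch layout, and the residual fusion are assumptions for illustration; the paper's exact DS-MLP design may differ.

```python
# Minimal sketch of a dual-branch shift MLP: channel groups are shifted along
# height or width, then mixed by a pointwise MLP. Details are assumptions.
import torch
import torch.nn as nn


class ShiftMLPSketch(nn.Module):
    def __init__(self, channels: int, shift: int = 1):
        super().__init__()
        self.shift = shift
        self.mlp = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=1),
            nn.GELU(),
            nn.Conv2d(channels, channels, kernel_size=1),
        )

    def _spatial_shift(self, x: torch.Tensor, dim: int) -> torch.Tensor:
        # Shift the two halves of the channels in opposite directions.
        g = x.shape[1] // 2
        a, b = x[:, :g], x[:, g:]
        a = torch.roll(a, shifts=self.shift, dims=dim)
        b = torch.roll(b, shifts=-self.shift, dims=dim)
        return torch.cat([a, b], dim=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h_branch = self.mlp(self._spatial_shift(x, dim=2))  # shift along height
        w_branch = self.mlp(self._spatial_shift(x, dim=3))  # shift along width
        return x + h_branch + w_branch                      # residual fusion


if __name__ == "__main__":
    feats = torch.randn(1, 64, 32, 32)
    print(ShiftMLPSketch(64)(feats).shape)  # torch.Size([1, 64, 32, 32])
```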
Doctors usually diagnose disease by evaluating the pattern of abnormal blood vessels in the fundus. Deep-learning-based segmentation of fundus vessels has achieved great success, but it still suffers from low accuracy and broken (discontinuous) capillaries in the segmentation results. Because a good vessel segmentation method can guide the early diagnosis of eye diseases, we propose a novel hybrid Transformer network (HT-Net) for fundus image analysis. HT-Net improves vessel segmentation quality by capturing detailed local information and modeling long-range interactions, and it mainly consists of the following blocks. The feature fusion block (FFB), embedded at the shallow levels, enriches the feature space. The feature refinement block (FRB), also placed at a shallow position in the network, addresses vessel scale variation by fusing multi-scale feature information to improve segmentation accuracy. Finally, the bottom level of HT-Net captures long-range dependencies by combining the Transformer with a CNN. We evaluate HT-Net on the DRIVE, CHASE_DB1, and STARE datasets. The experiments show that the FFB and FRB effectively improve microvessel segmentation quality by extracting multi-scale information, and that embedding an efficient self-attention mechanism in the network effectively improves vessel segmentation accuracy. HT-Net outperforms most existing methods, indicating that it can perform the vessel segmentation task competently.
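The bottom-level combination of a CNN and self-attention can be pictured as follows. This is only a hedged sketch of the general pattern (convolutional features flattened into spatial tokens and refined by multi-head self-attention); the layer sizes, residual layout, and the specific efficient attention used in HT-Net are assumptions.

```python
# Sketch of a CNN + self-attention bottleneck: local features from convolution,
# long-range interactions from attention over spatial tokens. All details assumed.
import torch
import torch.nn as nn


class AttentionBottleneckSketch(nn.Module):
    def __init__(self, channels: int, heads: int = 4):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
        )
        self.norm = nn.LayerNorm(channels)
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.conv(x)                                  # local detail via convolution
        b, c, h, w = x.shape
        tokens = self.norm(x.flatten(2).transpose(1, 2))  # (B, H*W, C) spatial tokens
        attended, _ = self.attn(tokens, tokens, tokens)   # long-range interactions
        attended = attended.transpose(1, 2).reshape(b, c, h, w)
        return x + attended                               # residual fusion of both views


if __name__ == "__main__":
    print(AttentionBottleneckSketch(64)(torch.randn(1, 64, 16, 16)).shape)
```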
Insulators are widely used throughout the power system and play a crucial role in ensuring the safety and stability of power transmission. Insulator detection is therefore an important measure for guaranteeing the safety and stability of the transmission system, and accurate localization of insulators is a prerequisite for detection. In this paper, we propose an improved method based on the YOLOv5s model to address the slow localization speed and low accuracy of insulator detection in power systems. In our approach, we first re-cluster the insulator image samples using the k-means algorithm to obtain anchor-box parameters of different sizes. Then, we add the non-local attention module (NAM) to the feature extraction module of the YOLOv5s algorithm; the NAM improves the attention mechanism using the weights' contribution factors and scaling factors. Finally, we replace the ordinary convolution modules in the neck network of the YOLOv5 model with the recursive gated convolution (gnConv). These improvements enhance the feature extraction capability of the network and improve the detection performance of YOLOv5s, increasing the accuracy and speed of insulator defect localization. We train and evaluate the model on a publicly available insulator defect dataset. Experimental results show that the improved YOLOv5s model achieves a 1% improvement in localization accuracy over YOLOv5. The proposed method balances accuracy and speed, meeting the requirements of online insulator localization in power system inspection.
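The anchor re-clustering step can be sketched as below: ground-truth box widths and heights are clustered with k-means so the anchors match the dataset's insulator shapes. The IoU-based distance and k = 9 are common YOLO conventions and are assumptions here, not details taken from the paper.

```python
# Hedged sketch of k-means anchor re-clustering over (width, height) box sizes.
import numpy as np


def iou_wh(boxes: np.ndarray, anchors: np.ndarray) -> np.ndarray:
    """IoU between (N, 2) box sizes and (K, 2) anchor sizes, both as (w, h)."""
    inter = np.minimum(boxes[:, None, 0], anchors[None, :, 0]) * \
            np.minimum(boxes[:, None, 1], anchors[None, :, 1])
    union = boxes[:, 0:1] * boxes[:, 1:2] + (anchors[:, 0] * anchors[:, 1])[None, :] - inter
    return inter / union


def kmeans_anchors(boxes: np.ndarray, k: int = 9, iters: int = 100, seed: int = 0) -> np.ndarray:
    rng = np.random.default_rng(seed)
    anchors = boxes[rng.choice(len(boxes), k, replace=False)]
    for _ in range(iters):
        assign = np.argmax(iou_wh(boxes, anchors), axis=1)  # nearest anchor by 1 - IoU
        new_anchors = np.array([
            boxes[assign == i].mean(axis=0) if np.any(assign == i) else anchors[i]
            for i in range(k)
        ])
        if np.allclose(new_anchors, anchors):
            break
        anchors = new_anchors
    return anchors[np.argsort(anchors.prod(axis=1))]        # sort by area, small to large


if __name__ == "__main__":
    wh = np.abs(np.random.default_rng(1).normal(80, 30, size=(500, 2)))  # toy (w, h) samples
    print(kmeans_anchors(wh, k=9).round(1))
```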
With the development of deep learning, convolutional neural networks (CNNs) and Transformer-based methods have become key techniques for medical image classification. However, many current models suffer from high complexity, large parameter counts, and large model sizes; they obtain higher classification accuracy at the expense of being lightweight, and such large-scale models pose a great challenge for practical clinical applications. Meanwhile, Transformer and multi-layer perceptron (MLP) methods have shortcomings in local modeling capability and model complexity, and need large datasets to perform well, which makes them difficult to use in clinical medicine. Motivated by this, we propose Eff-PCNet, a lightweight and efficient pure-CNN network for medical image classification. On the one hand, we propose a multi-branch multi-scale CNN (M2C) module, which splits the feature map along the channel dimension into four parallel branches according to a scale factor and applies depthwise convolutions with kernels of different sizes. This multi-branch multi-scale operation effectively replaces large-kernel convolution: it reduces the computational cost of the module while fusing feature information across channels, yielding richer features. The four feature maps are then concatenated along the channel dimension to fuse the multi-scale and multi-dimensional feature information. On the other hand, we introduce structural reparameterization and propose the structurally reparameterized CNN (Rep-C) module. Specifically, it uses multiple linear operators to generate different feature maps during training and fuses them into a single operator through parameter fusion, achieving fast inference while providing a more effective solution for feature reuse. Extensive experiments show that Eff-PCNet outperforms current CNN-, Transformer-, and MLP-based methods on three publicly available medical image datasets: we achieve 87.4% accuracy on the HAM10000 dataset, 91.06% on the SkinCancer dataset, and 97.03% on the Chest-Xray dataset. Meanwhile, our approach achieves a better trade-off among parameter count, computation, and the other performance metrics.
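The M2C idea described above (split into four channel branches, apply depthwise convolutions with different kernel sizes, concatenate) can be sketched as follows. The kernel sizes and the pointwise fusion layer are assumptions for illustration, not the paper's exact configuration.

```python
# Minimal sketch of a multi-branch multi-scale depthwise-convolution module.
import torch
import torch.nn as nn


class M2CSketch(nn.Module):
    def __init__(self, channels: int, kernels=(3, 5, 7, 9)):
        super().__init__()
        assert channels % 4 == 0, "channels must split evenly into four branches"
        c = channels // 4
        self.branches = nn.ModuleList([
            nn.Conv2d(c, c, k, padding=k // 2, groups=c)  # depthwise conv per branch
            for k in kernels
        ])
        self.fuse = nn.Conv2d(channels, channels, kernel_size=1)  # cross-channel fusion

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        chunks = torch.chunk(x, 4, dim=1)  # four parallel channel groups
        out = torch.cat([b(c) for b, c in zip(self.branches, chunks)], dim=1)
        return self.fuse(out)


if __name__ == "__main__":
    print(M2CSketch(64)(torch.randn(1, 64, 56, 56)).shape)  # torch.Size([1, 64, 56, 56])
```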
Deep learning is currently the mainstream approach to person re-identification. With the rapid development of neural networks in recent years, a number of network frameworks have emerged for this task, so exploring a simple and efficient baseline algorithm has become increasingly important. In practice, the performance of the same module varies greatly depending on its position in the network architecture. After exploring how modules can play their maximum role in the network and studying and summarizing existing algorithms, we designed an adaptive multiple loss baseline (AML) with a simple structure but strong performance. In this network we use an adaptive mining sample loss (AMS), together with other modules, which can mine more information from the input samples. Building on triplet loss, the AMS loss optimizes the distances between an input sample and its positive and negative samples while preserving structural information within the sample. In our experiments, we conducted several groups of tests, and the results confirm the high performance of the AML baseline on three commonly used datasets. Its two evaluation metrics on CUHK-03 are 25.7% and 26.8% higher than those of BagTricks.
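Since AMS builds on triplet loss, the following sketch shows the standard batch-hard triplet loss it starts from: each anchor is pulled toward its hardest positive and pushed away from its hardest negative. The adaptive weighting and structure-preserving term of AMS are not specified in the abstract, so they are not reproduced here; the margin value is an assumption.

```python
# Hedged sketch of batch-hard triplet loss, the basis AMS is described as extending.
import torch
import torch.nn.functional as F


def batch_hard_triplet_loss(feats: torch.Tensor, labels: torch.Tensor, margin: float = 0.3) -> torch.Tensor:
    """feats: (N, D) embeddings; labels: (N,) identity labels."""
    dist = torch.cdist(feats, feats, p=2)                        # pairwise distances
    same = labels[:, None] == labels[None, :]
    pos_mask = same & ~torch.eye(len(labels), dtype=torch.bool)  # positives, excluding self
    neg_mask = ~same
    hardest_pos = (dist * pos_mask).max(dim=1).values            # furthest positive
    hardest_neg = dist.masked_fill(~neg_mask, float("inf")).min(dim=1).values  # closest negative
    return F.relu(hardest_pos - hardest_neg + margin).mean()


if __name__ == "__main__":
    emb = F.normalize(torch.randn(8, 128), dim=1)
    ids = torch.tensor([0, 0, 1, 1, 2, 2, 3, 3])
    print(batch_hard_triplet_loss(emb, ids).item())
```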