MALUNet: A Multi-Attention and Light-weight UNet for Skin Lesion Segmentation

Ruan, Jiacheng; Xiang, Suncheng; Xie, Mingye; Liu, Ting; Fu, Yuzhuo

doi:10.1109/bibm55620.2022.9995040

Cited by 72 publications

(20 citation statements)

References 25 publications

Supporting

Mentioning

Contrasting

Order By: Relevance

“…Compared with most of the most advanced methods, better segmentation results are obtained. Methods Dice Jaccard Accuracy U-Net [ 1 2 ] 77.03 67.15 90.52 SwinUNet [ 1 9 ] 77.29 66.51 91.21 UCTransNet [ 2 0 ] 78.90 69.35 91.29 TransUNet [ 1 8 ] 83.58 74.33 92.84 SkinNet [2 7 ] 85.50 76.70 93.20 DCL-PSI [ 2 8 ] 85.60 77.70 94.00 PA-Net [ 2 9 ] 85.80 77.60 93.60 iMSCGnet [ 3 0 ] 85.83 77.75 93.58 FrCN [ 3 1 ] 87.08 77.11 94.03 UTNetV2 [ 3 2 ] 87.23 77.35 95.84 UNeXt-S [ 3 3 ] 87.80 78.26 95.95 MALUNet [ 3 4 ] 88.13 78.78 96.18 TransFuse [ 3 5 ] 88 [12] 83.62 75.17 91.75 UCTransNet [20] 86.41 78.60 93.13 SwinUNet [19] 86.02 78.21 92.96 UNet++ [36] 87.83 78.31 94.02 Attention-UNet [37] 87.91 78.43 94.13 UTNetV2 [32] 88.25 78.97 94.32 MSRF-Net [38] 88.13 ----TransUNet [18] 88.32 81.25 93.67 UNeXt-S [33] 88.33 79.09 94.39 SANet [39] 88.59 79.52 94.39 TransFuse [35] 89.27 80.63 94.66 MALUNet [34] 89.04 80.25 94.62 MCGU-Net [40] 89.50 --95.50 DoubleU-Net [41] 89.62 ----FAuNet 90.42 84.23 94.93…”

Section: Compared With the Representative Methodsmentioning

confidence: 99%

Feature augmentation with transformer for medical image segmentation based on self-attention mechanism

Fan,

Peng,

Xie

et al. 2024

Fifteenth International Conference on Graphics and Image Processing (ICGIP 2023)

View full text Add to dashboard Cite

Melanoma is one of the most malignant tumors in the world, with high morbidity and mortality, and poses a serious threat to human health. Thanks to the powerful learning ability of deep learning technology, deep learning method based on convolutional neural network has made remarkable progress in the field of melanoma image segmentation, but the segmentation results are poor when the skin condition is complex and the image is not ideal. In this paper, a feature augmentation network with Transformer termed FAuNet is designed for melanoma image segmentation by embedding reconstruction feature (Refa) module, enhanced feature pyramid (EFP) module, and communicating shallow features with the encoder segment. At the encoder stage, the features are first passed through the Transformer model to make it have global information while retaining the shallowest information to establish the association between the encoder and decoder. After that, Refa module is used to recover the features step by step, and EFP module is combined to connect the features horizontally to enrich the feature information. Then, the shallow features of encoding layer and decoding layer are fused to make up for the spatial information loss caused by downsampling and upsampling. Experiments on ISIC2017 and ISIC2018 datasets demonstrate the proposed method performs favorably against the state-of-the-art methods.

show abstract

Section: Compared With the Representative Methodsmentioning

confidence: 99%

Feature augmentation with transformer for medical image segmentation based on self-attention mechanism

Fan,

Peng,

Xie

et al. 2024

Fifteenth International Conference on Graphics and Image Processing (ICGIP 2023)

View full text Add to dashboard Cite

show abstract

“…The mask segmentor identifier ( SI ) ( Figure 3c ) takes the output from the FiLM decoder

as input and generates predicted segmentation mask

0,1

, where

is the number of categories (RV, LV, LV-Myo, and background) in the training dataset. We exploit a novel supervised loss, weighted soft background focal (WSBF) loss,

for the base model, which is a combination of background focal dice loss (BFD) and weighted soft focal loss (WSFL):

where

and

are designed to account for class imbalance and are treated as hyperparameters, the term

is used to down-weigh examples with backgrounds, where

varies in the range [ 1 , 3 ]. The term

11

denotes the cross-entropy loss.…”

Section: Methodsmentioning

confidence: 99%

“…The emerging success of deep convolutional neural networks (CNNs) has rendered them the de facto model in solving high-level computer vision tasks [ 1 – 3 ]. However, such approaches mostly rely on large amounts of annotated data for training, the acquisition of which is expensive and laborious, especially for medical imaging/diagnostic radiology data.…”

Section: Introductionmentioning

confidence: 99%

Learning Deep Representations of Cardiac Structures for 4D Cine MRI Image Segmentation through Semi-Supervised Learning

Hasan

Linte

2022

Applied Sciences

View full text Add to dashboard Cite

Learning good data representations for medical imaging tasks ensures the preservation of relevant information and the removal of irrelevant information from the data to improve the interpretability of the learned features. In this paper, we propose a semi-supervised model—namely, combine-all in semi-supervised learning (CqSL)—to demonstrate the power of a simple combination of a disentanglement block, variational autoencoder (VAE), generative adversarial network (GAN), and a conditioning layer-based reconstructor for performing two important tasks in medical imaging: segmentation and reconstruction. Our work is motivated by the recent progress in image segmentation using semi-supervised learning (SSL), which has shown good results with limited labeled data and large amounts of unlabeled data. A disentanglement block decomposes an input image into a domain-invariant spatial factor and a domain-specific non-spatial factor. We assume that medical images acquired using multiple scanners (different domain information) share a common spatial space but differ in non-spatial space (intensities, contrast, etc.). Hence, we utilize our spatial information to generate segmentation masks from unlabeled datasets using a generative adversarial network (GAN). Finally, to reconstruct the original image, our conditioning layer-based reconstruction block recombines spatial information with random non-spatial information sampled from the generative models. Our ablation study demonstrates the benefits of disentanglement in holding domain-invariant (spatial) as well as domain-specific (non-spatial) information with high accuracy. We further apply a structured L2 similarity (SL2SIM) loss along with a mutual information minimizer (MIM) to improve the adversarially trained generative models for better reconstruction. Experimental results achieved on the STACOM 2017 ACDC cine cardiac magnetic resonance (MR) dataset suggest that our proposed (CqSL) model outperforms fully supervised and semi-supervised models, achieving an 83.2% performance accuracy even when using only 1% labeled data. We hypothesize that our proposed model has the potential to become an efficient semantic segmentation tool that may be used for domain adaptation in data-limited medical imaging scenarios, where annotations are expensive. Code, and experimental configurations will be made available publicly.

show abstract

“…A straightforward method with a small number of parameters and low complexity was proposed by Ruan et al. [38]. In order to improve model performance while significantly reducing model parameters and computational complexity, the model combines four unique attention modules with a UNet architecture.…”

Section: Related Workmentioning

confidence: 99%

MSPAN: Multi‐scale pyramid attention network for efficient skin cancer lesion segmentation

Ahmed,

Xin,

Lizhuang

2024

IET Image Processing

View full text Add to dashboard Cite

Skin cancer is common and deadly, needs to be detected and treated properly. Deep learning algorithms like UNet have shown potential results in medical imaging. Such approaches still struggle to capture fine‐grained details and scale differences in skin lesions‐based occlusions' appearance, size etc. This research proposes a redesign UNet, the Multi‐Scale Pyramid Attention Network (MSPAN), to improve skin cancer lesion segmentation. The input data is processed at numerous scales with varied receptive fields. This enhances the network's ability to identify lesion locations by capturing local and global context. Attention approaches also help the network to suppress noise by focusing on informative features. We have evaluated MSPAN model on the publicly available ISIC2018 benchmark dataset for skin lesion segmentation. The method surpasses traditional UNet and other current methods in accuracy and effectiveness. The model also has a post‐processing to estimate lesion area for fast inference, making it suitable for extensive screening. Redesigned UNet with the Multi‐Scale Pyramid Attention Network improves skin cancer lesion segmentation. The model's ability to collect fine‐grained information and handle occlusions allows for more accurate skin cancer diagnosis and treatment. The MSPAN design can improve computer‐aided diagnosis systems and help dermatologists make precise clinical decisions.

show abstract

MALUNet: A Multi-Attention and Light-weight UNet for Skin Lesion Segmentation

Cited by 72 publications

References 25 publications

Feature augmentation with transformer for medical image segmentation based on self-attention mechanism

Feature augmentation with transformer for medical image segmentation based on self-attention mechanism

Learning Deep Representations of Cardiac Structures for 4D Cine MRI Image Segmentation through Semi-Supervised Learning

MSPAN: Multi‐scale pyramid attention network for efficient skin cancer lesion segmentation

Contact Info

Product

Resources

About