2023
DOI: 10.1088/1361-6560/acc000

CTformer: convolution-free Token2Token dilated vision transformer for low-dose CT denoising

Abstract: Low-dose computed tomography (LDCT) denoising is an important problem in CT research. Compared to normal-dose CT (NDCT), LDCT images are subject to severe noise and artifacts. Recently, vision transformers have shown superior feature representation ability over convolutional neural networks (CNNs) in many studies. However, unlike CNNs, the potential of vision transformers for LDCT denoising has so far been little explored. To fill this gap, we propose a Convolution-free Token2Token Dilated Vision Transformer (CTformer) …
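The abstract describes a "Token2Token dilated" rearrangement, in which each patch token is fused with a dilated neighborhood of tokens to widen the receptive field without convolutions. As a rough, hedged illustration of that idea (not the paper's actual implementation, whose window sizes, overlap, and attention details differ), a minimal NumPy sketch of a dilated token-to-token gathering step might look like this; the function name `dilated_token2token` and all parameters are hypothetical:

```python
import numpy as np

def dilated_token2token(tokens, h, w, k=3, dilation=2):
    """Rearrange an (h*w, c) token sequence by concatenating each token
    with its dilated k x k neighborhood (zero padding at the borders).

    A simplified sketch of the "dilated soft split" idea: returns an
    (h*w, k*k*c) array in which every new token mixes information from
    tokens up to dilation * (k // 2) grid steps away.
    """
    c = tokens.shape[-1]
    grid = tokens.reshape(h, w, c)            # restore the 2-D patch grid
    pad = dilation * (k // 2)
    padded = np.pad(grid, ((pad, pad), (pad, pad), (0, 0)))
    out = np.empty((h, w, k * k * c))
    # Gather the k*k dilated neighbors for every grid position at once.
    offsets = [(i * dilation, j * dilation) for i in range(k) for j in range(k)]
    for idx, (di, dj) in enumerate(offsets):
        out[..., idx * c:(idx + 1) * c] = padded[di:di + h, dj:dj + w]
    return out.reshape(h * w, k * k * c)
```

With k=3 and dilation=2, each output token sees a 5x5 region of the original grid while only concatenating 9 neighbors, which is the trade-off dilation buys.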

Cited by 81 publications (18 citation statements). References 56 publications.
“…In the experiment, the CT images of 9 patients (760 pairs) were used as the training set, and the images of 1 patient (35 pairs) were used as the test set. To further verify the effectiveness of the method in this study, we compared the proposed network with seven denoising networks, namely REDCNN,14 EDCNN,28 the wavelet- and CNN-based wavelet domain residual network (WavResNet),37 the GAN-based WGAN_VGG,19 Pix2Pix,38 CNCL,22 and CTformer,23 for visual effect and quantitative evaluation. The quantitative metrics include SSIM,39 based on structural differences; peak signal-to-noise ratio (PSNR), based on pixel gray-scale differences; gradient magnitude similarity deviation (GMSD),40 based on gradient variation; feature similarity index measure (FSIM),41 based on feature differences; and the pixel-domain implementation of visual information fidelity (VIFS),42 based on visual perception.…”
Section: Results
confidence: 99%
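The citation statement above lists the standard full-reference metrics used to score denoised CT images. Two of them, PSNR and GMSD, are simple enough to sketch directly; the following NumPy/SciPy sketch is a simplified illustration (the published GMSD additionally downsamples both images by 2 before computing gradients, which is omitted here), and the constant `c=0.0026` assumes images scaled to [0, 1]:

```python
import numpy as np
from scipy.ndimage import convolve

def psnr(ref, img, data_range=1.0):
    """Peak signal-to-noise ratio: higher means the test image is
    closer to the reference in mean-squared-error terms."""
    mse = np.mean((ref.astype(np.float64) - img.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(data_range ** 2 / mse)

def gmsd(ref, img, c=0.0026):
    """Gradient magnitude similarity deviation (simplified sketch):
    the standard deviation of the local gradient-similarity map.
    Lower is better; 0 means identical gradient structure."""
    hx = np.array([[1, 0, -1]] * 3, dtype=np.float64) / 3.0  # Prewitt
    hy = hx.T

    def grad_mag(x):
        gx = convolve(x.astype(np.float64), hx, mode="nearest")
        gy = convolve(x.astype(np.float64), hy, mode="nearest")
        return np.sqrt(gx ** 2 + gy ** 2)

    g1, g2 = grad_mag(ref), grad_mag(img)
    gms = (2 * g1 * g2 + c) / (g1 ** 2 + g2 ** 2 + c)  # per-pixel similarity
    return float(np.std(gms))
```

PSNR rewards pixel-wise fidelity while GMSD penalizes structural (gradient) distortion, which is why papers typically report both alongside SSIM.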
“…The network has strong generalization ability and performs well on CT, magnetic resonance, and positron emission tomography images. Wang et al.23 proposed CTformer, a pure-transformer model for LDCT denoising; it pioneered the application of vision transformers to the LDCT denoising problem and introduced dilation and cyclic shift to capture longer-range interactions.…”
Section: Introduction
confidence: 99%
“…It is known that the core component of the Transformer model is the self-attention module, which captures global dependencies among patches by conducting message passing between patches to obtain more contextual representations. For example, Wang et al (2023a) propose a Convolution-free Token2Token Dilated Vision Transformer (CTformer) based on more powerful token rearrangement to encompass local contextual information. Wang et al (2021a) propose a learning model based on triple attention on both spatial and channel dimensions.…”
Section: Introduction
confidence: 99%
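The "message passing on different patches" described above is exactly what scaled dot-product self-attention computes: every output token is a softmax-weighted mixture of all patch tokens. A minimal single-head NumPy sketch (generic transformer attention, not CTformer's specific dilated variant; the projection matrices `wq`, `wk`, `wv` are illustrative):

```python
import numpy as np

def self_attention(tokens, wq, wk, wv):
    """Single-head scaled dot-product self-attention over patch tokens.

    tokens: (n, d) array, one row per image-patch embedding.
    Returns an (n, d) array where each row mixes all n patches.
    """
    q, k, v = tokens @ wq, tokens @ wk, tokens @ wv
    scores = q @ k.T / np.sqrt(k.shape[-1])       # (n, n) pairwise affinities
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over patches
    return weights @ v  # message passing: weighted mix of all patch values
```

Because the (n, n) score matrix couples every patch with every other, the receptive field is global in a single layer, which is the representational advantage the citing paper contrasts with CNNs.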
“…Common to all of these in vivo multi-channel imaging applications is the need to manage the ionizing radiation dose applied to the subject. In the past, iterative reconstruction techniques, including regularization based on prior assumptions, were the preeminent means of reducing dose and the associated image noise in CT. More recently, however, there has been an explosion of interest in supervised deep learning (DL) methods for CT image noise removal and dose management3–5 and for the augmentation of iterative reconstruction.6–8 Supervised DL methods offer several key advantages over classic denoising methods and iterative reconstruction techniques.…”
Section: Introduction
confidence: 99%