Tumour segmentation in medical images, especially in 3D, is highly challenging because tumours can resemble adjacent tissues, multiple tumours may occur together, and tumour shapes and sizes vary. Popular deep learning‐based segmentation algorithms generally rely on convolutional neural networks (CNNs) and Transformers. The former cannot extract global image features effectively, while the latter lack inductive bias and are computationally expensive on 3D volume data. Existing hybrid CNN‐Transformer networks provide only limited performance improvement, or even poorer segmentation performance than a pure CNN. To address these issues, a short‐term and long‐term memory self‐attention network is proposed. First, a distinctive self‐attention block uses the Transformer to explore the correlation among the region features that the CNN extracts at different levels. Then, the memory structure filters and combines this information to exclude similar regions and detect multiple tumours. Finally, multi‐layer reconstruction blocks predict the tumour boundaries. Experimental results demonstrate that the proposed method outperforms other methods in both subjective visual and quantitative evaluation. Compared with the most competitive method, it achieves a higher Dice score (82.4% vs. 76.6%) and a lower 95% Hausdorff distance (HD95) (10.66 vs. 11.54 mm) on KiTS19, as well as a higher Dice score (80.2% vs. 78.4%) and a lower HD95 (9.632 vs. 12.17 mm) on LiTS.
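To make the two reported metrics concrete, the following is a minimal sketch of how Dice and HD95 could be computed for a pair of binary 3D masks. It is not the authors' evaluation code. The function names `dice_score` and `hd95`, the `spacing` parameter, and the choice to measure distances between all foreground voxels (rather than extracted surfaces, as dedicated evaluation toolkits usually do) are assumptions made for illustration.

```python
import numpy as np


def dice_score(pred, target):
    """Dice overlap between two binary masks (1.0 means identical)."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    denom = pred.sum() + target.sum()
    if denom == 0:
        # Assumption: two empty masks count as a perfect match.
        return 1.0
    return 2.0 * np.logical_and(pred, target).sum() / denom


def hd95(pred, target, spacing=1.0):
    """95th-percentile symmetric Hausdorff distance between two masks.

    Simplification (assumption): distances are measured between all
    foreground voxels, not between extracted surfaces. The brute-force
    pairwise distance matrix is only practical for small masks.
    `spacing` is the voxel size, e.g. in mm (scalar or per-axis).
    """
    scale = np.asarray(spacing, dtype=float)
    a = np.argwhere(np.asarray(pred, dtype=bool)) * scale
    b = np.argwhere(np.asarray(target, dtype=bool)) * scale
    if len(a) == 0 or len(b) == 0:
        raise ValueError("HD95 is undefined when a mask is empty")
    # Pairwise Euclidean distances, shape (len(a), len(b)).
    dists = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    a_to_b = dists.min(axis=1)  # nearest target voxel for each pred voxel
    b_to_a = dists.min(axis=0)  # nearest pred voxel for each target voxel
    return float(max(np.percentile(a_to_b, 95), np.percentile(b_to_a, 95)))


# Toy example: a 2x2x2 cube and the same cube shifted by one voxel.
pred_mask = np.zeros((4, 4, 4), dtype=bool)
pred_mask[1:3, 1:3, 1:3] = True
target_mask = np.zeros((4, 4, 4), dtype=bool)
target_mask[1:3, 1:3, 2:4] = True

print(dice_score(pred_mask, target_mask))
print(hd95(pred_mask, target_mask))
```

A higher Dice score indicates better overlap, whereas a lower HD95 indicates that the predicted boundary lies closer to the reference, which is why the two comparisons above run in opposite directions.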
Optical coherence tomography (OCT) is widely used in the diagnosis of ophthalmic diseases, but the quality of OCT images is degraded by speckle noise. Convolutional neural network (CNN) based methods have attracted much attention in OCT image despeckling. However, these methods generally need noisy-clean image pairs for training, and they struggle to capture global context information effectively. To address these issues, we propose a novel unsupervised despeckling method. This method uses a cross-scale CNN to extract local features and an intra-patch and inter-patch based transformer to extract and merge local and global feature information. Based on these extracted features, a reconstruction network produces the final denoised result. The proposed network is trained using a hybrid unsupervised loss function composed of three terms: the Neighbor2Neighbor loss, the structural similarity between the despeckled results of the probabilistic non-local means method and those of our method, and the mean squared error between the features that the VGG network extracts from these two results. Experiments on two clinical OCT image datasets show that our method performs better than several popular despeckling algorithms in terms of visual evaluation and quantitative indexes.
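The hybrid loss described above can be written schematically as follows. This is only one plausible reading of the description: the additive combination, the weights $\lambda_1$ and $\lambda_2$, and the use of $1 - \mathrm{SSIM}$ as the similarity penalty are assumptions, not details given in the abstract. Here $\hat{x}$ denotes the network output, $\hat{x}_{\mathrm{PNLM}}$ the result of the probabilistic non-local means method, and $\phi(\cdot)$ the VGG feature extractor.

```latex
\mathcal{L}_{\mathrm{hybrid}}
  = \mathcal{L}_{\mathrm{N2N}}
  + \lambda_1 \bigl( 1 - \mathrm{SSIM}(\hat{x}, \hat{x}_{\mathrm{PNLM}}) \bigr)
  + \lambda_2 \, \mathrm{MSE}\bigl( \phi(\hat{x}), \phi(\hat{x}_{\mathrm{PNLM}}) \bigr)
```

Under this reading, the first term provides the self-supervised denoising signal, while the second and third terms pull the output towards the probabilistic non-local means result at the pixel-structure level and the feature level respectively.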