Medical image analysis techniques such as data augmentation and domain adaptation require large numbers of realistic medical images, and generating them with machine learning is a feasible way to meet that need. We propose L-former, a lightweight Transformer for realistic medical image generation. L-former generates more reliable and realistic medical images than recent generative adversarial networks (GANs), while incurring a lower computational cost than conventional Transformer-based generative models. It uses Transformers to generate low-resolution feature vectors in its shallow layers and convolutional neural networks to generate high-resolution realistic medical images in its deep layers. In experiments, L-former outperformed conventional GANs, with FID scores of 33.79 and 76.85 on two datasets, respectively. We further conducted a downstream study in which images generated by L-former were used for a super-resolution task. A high PSNR score of 27.87 demonstrated L-former's ability to generate reliable images for super-resolution and showed its potential for applications in medical diagnosis.
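For readers unfamiliar with the PSNR figure quoted in this abstract, the following is a minimal sketch of the standard peak signal-to-noise ratio formula, PSNR = 10 log10(MAX^2 / MSE). The abstract does not state the intensity range or implementation used, so the function name `psnr`, the nested-list image representation, and the default `max_value` of 255 are assumptions made for illustration only.

```python
import math


def psnr(reference, generated, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Images are nested lists of pixel intensities (an assumption for this
    sketch). max_value is the largest possible intensity, assumed 255 here.
    """
    ref = [p for row in reference for p in row]
    gen = [p for row in generated for p in row]
    if not ref or len(ref) != len(gen):
        raise ValueError("images must be non-empty and have the same size")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(ref, gen)) / len(ref)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Higher values indicate that the generated image is closer to the reference. Scores are only comparable when the same intensity range is used for both images.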
This paper proposes an automated classification method for COVID-19 chest CT volumes using an improved 3D MLP-Mixer. Novel coronavirus disease 2019 (COVID-19) has spread across the world, causing a large number of infections and deaths. A sudden increase in the number of COVID-19 patients causes a manpower shortage in medical institutions. A computer-aided diagnosis (CAD) system provides quick and quantitative diagnosis results, so a CAD system for COVID-19 enables an efficient diagnosis workflow and helps reduce this manpower shortage. In image-based diagnosis of viral pneumonia, including COVID-19, both local and global image features are important because viral pneumonia causes many ground-glass opacities and consolidations over large areas of the lung. We therefore propose an automated classification method for chest CT volumes to assist COVID-19 diagnosis. MLP-Mixer is a recent image classification method with a Vision Transformer-like architecture that performs classification using both local and global image features. To classify 3D CT volumes, we developed a hybrid classification model that consists of both a 3D convolutional neural network (CNN) and a 3D version of the MLP-Mixer. The proposed method was evaluated on a dataset of 1205 CT volumes and achieved a classification accuracy of 79.5%. This accuracy was higher than that of conventional 3D CNN models consisting of 3D CNN layers and simple MLP layers.
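To illustrate how an MLP-Mixer-style block can combine information across patches with information within each patch, here is a simplified pure-Python sketch of one generic Mixer block. It is not the authors' improved 3D model: layer normalization is omitted, ReLU stands in for the usual GELU activation, and all function names, weight shapes, and the `tokens x channels` layout are assumptions for illustration. In a 3D setting, each token would presumably correspond to a flattened 3D patch or a CNN feature vector.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def mlp(x, w1, w2):
    """Two-layer perceptron applied to every row of x (ReLU assumed)."""
    hidden = [[max(0.0, v) for v in row] for row in matmul(x, w1)]
    return matmul(hidden, w2)


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def mixer_block(x, token_w1, token_w2, chan_w1, chan_w2):
    """One simplified Mixer block on x of shape tokens x channels.

    token_w1: tokens x hidden, token_w2: hidden x tokens
    chan_w1: channels x hidden, chan_w2: hidden x channels
    """
    # Token mixing: transpose so each row holds one channel across all
    # tokens, letting the MLP exchange information between patches.
    x = add(x, transpose(mlp(transpose(x), token_w1, token_w2)))
    # Channel mixing: the MLP acts on each token's features independently.
    return add(x, mlp(x, chan_w1, chan_w2))


def random_matrix(rows, cols, rng):
    """Small random weights, for demonstration only."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]
```

Both steps use residual connections, so the block preserves the input shape and can be stacked. A call such as `mixer_block(x, random_matrix(4, 5, rng), random_matrix(5, 4, rng), random_matrix(3, 5, rng), random_matrix(5, 3, rng))` with `rng = random.Random(0)` processes 4 tokens of 3 channels each.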